First Look at Dynamic M Query Parameters using SQL Server

With the PowerBI February 2022 release, we can finally use dynamic M parameters with SQL Server in Direct Query mode. I was really excited, as I had a couple of patterns where I used M parameters with BigQuery to do calculations on the fly that are not supported natively in PowerBI, for example geospatial calculations.

My first example was dynamically changing a dimension. It is relatively simple and it just works, see example here. Very exciting.

Then I tried to port this example from BigQuery. Basically you select some points on a map and you get back the polygon and its area. The calculation has to be done on the fly: precalculating the results is not practical, as there are simply too many possible combinations of points.

The first step, getting the selected points as a nice list, was very easy. Here is the code:

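let
    // tag_selection is a list when several points are selected, otherwise a single text value
    TagsList =
        if Type.Is(Value.Type(tag_selection), List.Type) then
            // join the selected values into one quoted, comma-separated string: 'a,b,c'
            Text.Combine({"'", Text.Combine(tag_selection, ","), "'"})
        else
            // a single value only needs the surrounding quotes
            Text.Combine({"'", tag_selection, "'"}),

    // string_split turns the string back into one row per selected value
    finalQuery = "select 1 as poly, value from string_split(" & TagsList & ", ',')",

    Source = Sql.Database("XXXXXXXX", "DB", [Query = finalQuery])
in
    Source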
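To make the string handling easier to follow, here is a rough Python sketch of the same logic (Python is used only for illustration; the report itself uses the M query above). The function names are hypothetical, and doubling embedded single quotes is an extra safeguard that the M query does not have.

```python
def to_split_literal(selection):
    """Return one single-quoted, comma-separated SQL string literal."""
    # A single selection arrives as plain text, a multi-selection as a list.
    values = selection if isinstance(selection, list) else [selection]
    # Assumption: doubling single quotes keeps the literal well formed.
    joined = ",".join(v.replace("'", "''") for v in values)
    return "'" + joined + "'"


def build_query(selection):
    """Build the same statement as finalQuery in the M code."""
    literal = to_split_literal(selection)
    return "select 1 as poly, value from string_split(" + literal + ", ',')"


print(build_query(["p1", "p2", "p3"]))
print(build_query("p1"))
```

Both the list case and the single-value case end up as one string literal, which is what string_split expects as its first argument.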

I selected some points in Icon Map and the dynamic M parameter got populated. I was really excited: the hard part was done and all I needed was to write some T-SQL.

T-SQL Rabbit hole

I am no SQL expert by any means. By some weird coincidence, my first database was BigQuery (I used MS Access a long time ago), so this is the first time I have tried to use T-SQL in a non-trivial way. At work I use T-SQL to retrieve data, maybe with some joins and things of that nature, but no GIS for sure.

The good thing is that the amount of resources available on SQL Server is phenomenal. I got some pointers on Stack Overflow, but then something weird happened.

I wrote the T-SQL code in SSMS and it worked fine, but when I copied it to PowerBI it generated errors. I was really angry and couldn't understand what was going on; I thought it was something weird about PowerBI.

I know that PowerBI embeds any custom SQL inside a subquery. That's very standard; Tableau does the same, as does Google Data Studio.

It turns out SQL Server doesn't support a CTE inside a subquery.

Chris has blogged about it here, which was very kind of him. Basically his point is to just write a view in the database, as it is better to have the logic upstream anyway. That totally makes sense, except it is not a realistic solution: business users don't just get write access to the database, and actually they are very lucky to get even read access.

But for me, the elephant in the room is: why doesn't SQL Server support a CTE inside a subquery in the first place? Especially given that this is the way PowerBI works, that seems very odd to me!

And suggesting that it is what it is is not good enough. It is Microsoft, for God's sake!

I am just a PowerBI developer, and I simply can't understand why this works with BigQuery and not with SQL Server.
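A small Python sketch can illustrate why the wrapping matters. It only manipulates strings and does not talk to any database; the wrapper text and the alias `t` are assumptions for illustration, not the exact SQL PowerBI generates.

```python
def wrap_as_subquery(native_sql, alias="t"):
    """Mimic a BI tool embedding custom SQL in a derived table (alias is hypothetical)."""
    return "select * from (\n" + native_sql.strip() + "\n) as " + alias


def starts_with_cte(native_sql):
    """True when the statement begins with a WITH clause."""
    words = native_sql.split()
    return bool(words) and words[0].lower() == "with"


# A CTE version and a derived-table version of the same toy query.
cte_sql = "with pts as (select value from string_split('a,b', ',')) select value from pts"
derived_sql = "select value from (select value from string_split('a,b', ',')) as pts"

for sql in (cte_sql, derived_sql):
    if starts_with_cte(sql):
        print("WITH ends up inside the subquery:")
    else:
        print("no CTE, safe to wrap:")
    print(wrap_as_subquery(sql))
```

Once wrapped, the first query has its WITH clause inside the parentheses, which is the shape SQL Server refuses. Rewriting the CTE as a nested derived table, as in the second query, is one way to stay inside a single statement without needing a view.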

Calculate the shortest Path in a network in PowerBI

This blog is just another use case where we can leverage M parameters to perform calculations that cannot be done in PowerBI; see example here for clustering and area calculation.

To be clear, it is always possible to precalculate outside PowerBI and just import the results, but in this case it can be tricky. Let's say you have 1000 points and you want to check the distance between any two points using an existing route network: you would need to calculate 1000 x 1000 combinations. Instead, the idea here is that you select two points, then the M parameter sends the selection to a database using Direct Query, which does the calculation and returns the result.

For this example I am using BigQuery, but you can use any database that supports M parameters (Snowflake, Azure ADX, etc.).

Import all the bus stop locations in a particular area

Because of the Covid situation, the OpenStreetMap dataset is free to query. Here is the SQL query; I am just using Brisbane as a reference.
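To put a number on the precalculation cost, here is a quick Python sketch. The function names are made up for illustration; the only input is the count of 1000 points mentioned above.

```python
from math import comb


def ordered_pairs(n):
    """Every (from, to) combination, including a point paired with itself."""
    return n * n


def unordered_pairs(n):
    """Distinct pairs when the A-to-B path is assumed equal to the B-to-A path."""
    return comb(n, 2)


points = 1000
print(ordered_pairs(points))    # 1000000
print(unordered_pairs(points))  # 499500
```

Even when each pair is counted only once, that is still about half a million shortest-path calculations to store in advance, compared with a single calculation per user selection.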

CREATE OR REPLACE TABLE
  `test-187010.gis_free.Brisbane_Bus_Stop` AS
SELECT
  ST_CENTROID(geometry) AS center_location,
  tags.key,
  tags.value
FROM
  `bigquery-public-data.geo_openstreetmap.planet_features`,
  UNNEST(all_tags) tags
WHERE
  key = 'highway'
  AND value='bus_stop'
  AND ST_INTERSECTS(geometry,
    -- Selecting Brisbane area
    ST_GEOGFROMtext("POLYGON((152.9432678222656 -27.33776203832722,153.2563781738281 -27.33776203832722,153.2563781738281 -27.594864493271448,152.9432678222656 -27.594864493271448,152.9432678222656 -27.33776203832722))"))

You can use this handy website to generate a polygon from a map.

Import Road network

In the same way, we will use a subset of the OpenStreetMap dataset. First we select key = 'highway', and to improve performance we keep only the main values (primary, secondary roads, etc.).

CREATE OR REPLACE TABLE
  `test-187010.gis_free.brisbane_Road_Network`
CLUSTER BY
  geometry AS
SELECT
  geometry,
  tags.key,
  tags.value
FROM
  `bigquery-public-data.geo_openstreetmap.planet_ways`,
  UNNEST(all_tags) tags
WHERE
  key = 'highway'
  AND value IN ("motorway",
    "motorway_link",
    "primary",
    "primary_link",
    "secondary",
    "secondary_link",
    "tertiary",
    "tertiary_link",
    "trunk",
    "trunk_link")
  AND ST_INTERSECTS(geometry,
    -- Select your area here:
    ST_GEOGFROMtext("POLYGON((152.9432678222656 -27.33776203832722,153.2563781738281 -27.33776203832722,153.2563781738281 -27.594864493271448,152.9432678222656 -27.594864493271448,152.9432678222656 -27.33776203832722))"))

Just to get an idea, here is how the road network looks.

Time for calculation

Unfortunately, as of this writing BigQuery GIS does not have a function to find the shortest path between two points in a network. Luckily I found this user-defined function written in JavaScript. The good news is that it works as expected, but JavaScript will always be slower than a native SQL function. Anyway, here is the SQL query:

WITH
  initial_parameter AS (
  SELECT
    *
  FROM
    UNNEST(['POINT(153.023194 -27.563979)','POINT(152.979212 -27.49549)'] ) AS element ),
  mynetwork AS (
  SELECT
    ARRAY_AGG(geometry) roads
  FROM
    `test-187010.gis_free.brisbane_Road_Network_cluster` ),
  calculation AS(
  SELECT
    `libjs4us.routing.geojson_path_finder`(roads,
      st_geogfromtext(a.element),
      st_geogfromtext(b.element)) AS tt
  FROM
    mynetwork,
    initial_parameter a,
    initial_parameter b
  WHERE
    a.element>b.element
  LIMIT
    100)
SELECT
  ST_ASTEXT (tt.path) AS GEO,
  tt.weight AS len
FROM
  calculation

And here's the result: a linestring and the length in km. The query took 7 seconds. To be honest, I have no idea about the calculation complexity, so I am not sure if that is fast or not 🙂

PowerBI M Parameter

Now that we have made sure the query works with two fixed points, we just need to make it interactive so the user can select any two points, and that is exactly what an M parameter does.

The table path is using Direct Query.

The table Bus_Stop is in import mode and is used as the parameter filter.

The parameter is Tag_Selection. For a very detailed explanation, please read this blog first.

And here is the M query:

let
TagsList = 
    if 
    //check to see if the parameter is a list
      Type.Is(
        Value.Type(tag_selection), 
        List.Type
      ) then 
        //if it is a list
        let
          //add single quotes around each value in the list
          AddSingleQuotes = List.Transform(
              tag_selection, 
              each "'" & _ & "'"
            ),
          //then turn it into a comma-delimited list
          DelimitedList = "[" & Text.Combine(
              AddSingleQuotes, 
              ","
            ) &"]"
          
        in
          DelimitedList
    else 
      //if the parameter isn't a list
      //just add single quotes around the parameter value
      "['" & tag_selection & "']",
    Source = 
    Value.NativeQuery(GoogleBigQuery.Database([BillingProject="testing-bi-engine"]){[Name="test-187010"]}[Data],
     "WITH
  initial_parameter AS (
  SELECT
    *
  FROM
    UNNEST("& TagsList &" ) AS element ),
  mynetwork AS (
  SELECT
    ARRAY_AGG(geometry) roads   FROM   `test-187010.gis_free.brisbane_Road_Network_cluster` ),
  calculation AS(
  SELECT
    `libjs4us.routing.geojson_path_finder`(roads, st_geogfromtext(a.element), st_geogfromtext(b.element)) AS tt
  FROM
    mynetwork,    initial_parameter a,    initial_parameter b
  WHERE
    a.element>b.element
  LIMIT
    100)
SELECT
  1 AS DUMMY,  CASE     WHEN ARRAY_LENGTH("& TagsList &") =2 THEN ST_ASTEXT (tt.path)  ELSE  NULL END  AS GEO,
  CASE    WHEN ARRAY_LENGTH("& TagsList &") =2 THEN tt.weight  ELSE  0 END   AS len
FROM  calculation"
       , null, [EnableFolding=true])
   
    in
       Source

Notice I added the condition ARRAY_LENGTH("& TagsList &") = 2, which returns NULL for the geometry and 0 for the length otherwise, just to reduce the calculation when the user selects only one point. Currently in PowerBI there is no query reduction option for cross filtering.
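The effect of the self-join on a.element > b.element together with the ARRAY_LENGTH guard can be sketched in Python. This is an illustration of the query logic only, with hypothetical function names; the real work happens in BigQuery.

```python
def candidate_pairs(points):
    """Mirror the self-join: keep each pair once via the a > b text comparison."""
    return [(a, b) for a in points for b in points if a > b]


def should_return_path(points):
    """Mirror the ARRAY_LENGTH(...) = 2 guard in the final SELECT."""
    return len(points) == 2


one_point = ["POINT(153.023194 -27.563979)"]
two_points = ["POINT(153.023194 -27.563979)", "POINT(152.979212 -27.49549)"]

for selection in (one_point, two_points):
    pairs = candidate_pairs(selection)
    print(len(selection), "selected:", len(pairs), "pair(s), return path:",
          should_return_path(selection))
```

With a single point there is no pair to route between, and with exactly two points there is exactly one, which is the case the report is designed for.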

Icon Map

Icon Map is the only visual that can render WKT geometry in PowerBI. This previous blog explains how we simulate multi-layer interaction.

Performance

The performance is unfortunately a bit disappointing, around 20 seconds. The JavaScript UDF is slow and PowerBI is very chatty, which is a nice way to say that PowerBI sends a lot of SQL queries: every time I select two points, PowerBI sends 4 queries.

The first query is sent when I select the first point. Hopefully one day we will have an option to trigger cross filtering only after we finish the selection.

Queries 2 and 3 are identical and are used to check the field types of the table. I wonder why Power Query is sending the same query twice.

Query 4 is the real query that brings back the result.

You can download the pbix here.

Edit: Carto, the same company that released the JavaScript function, is now offering native SQL functions, which should be substantially faster. I have not used them as they are commercial, but if you have a massive network, maybe it is worth it. Just to be clear, I have no affiliation with them.

Edit : added the same report but now using Tableau

Using PowerBI M Parameter to calculate a polygon Area

TL;DR: PowerBI does not support GIS area calculation. In this blog we use an M parameter to leverage a third-party database to do the calculation. It works only in Direct Query mode.

Link to public report

With the August 2021 update, M parameters support multi-value selection (see this previous blog for a little background). This opens up some interesting new use cases that were not possible before in PowerBI.

Again, before you get too excited, it works only with Direct Query and databases that support M parameters (BigQuery, Snowflake, Azure ADX, etc.).

One example is calculating the area of a group of arbitrary points

  • In Icon Map, I select a group of points (notice that Icon Map is acting here as a parameter action filter)
  • The list of points is used in an M parameter
  • A SQL query is sent to a database that supports M parameters, in my case BigQuery
  • BigQuery generates a WKT and calculates the area; you can calculate the distance too, or any metric that uses geometry
  • The results are plotted in another Icon Map

The query is straightforward, and the best part is that because we are not running the query against any table in BigQuery, it does not cost anything. You can register here for free, and no credit card is required.

Here is the M Query

let
TagsList = 
    if 
    //check to see if the parameter is a list
      Type.Is(
        Value.Type(tag_selection), 
        List.Type
      ) then 
        //if it is a list
        let
          //add single quotes around each value in the list
          AddSingleQuotes = List.Transform(
              tag_selection, 
              each "'" & _ & "'"
            ),
          //then turn it into a comma-delimited list
          DelimitedList = "[" & Text.Combine(
              AddSingleQuotes, 
              ","
            ) &"]"
          
        in
          DelimitedList
    else 
      //if the parameter isn't a list
      //just add single quotes around the parameter value
      "['" & tag_selection & "']",
      
Source = Value.NativeQuery(GoogleBigQuery.Database([BillingProject="xxxxxx"]){[Name="xxxxx"]}[Data], "WITH
  xxx AS (
  SELECT
    *
  FROM
    UNNEST( "& TagsList &" ) AS element),
  yyy AS (
  SELECT
    ST_CONVEXHULL( ST_UNION_AGG(ST_GEOGFROMTEXT(element))) AS geo
  FROM
    xxx)
SELECT
  ST_ASTEXT(geo) AS WKT,
  st_area(geo) AS area
FROM
  yyy", null, [EnableFolding=true])
in
    Source
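For readers curious about what ST_CONVEXHULL and ST_AREA are doing conceptually, here is a minimal Python sketch using Andrew's monotone chain for the hull and the shoelace formula for the area. It assumes flat (planar) coordinates, whereas BigQuery works on geographies on the sphere, so the numbers are not comparable; the function names are illustrative only.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for a simple polygon given as ordered vertices."""
    total = 0.0
    for i in range(len(vertices)):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


selected = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
hull = convex_hull(selected)
print(hull)                # [(0, 0), (4, 0), (4, 3), (0, 3)]
print(polygon_area(hull))  # 12.0
```

The interior point (2, 1) is dropped from the hull, which is the same idea as the polygon returned for the points selected in Icon Map.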

And here is the result.

Future Improvement

One thing I would really like is the possibility to show the result in the same map. Unfortunately, to the best of my knowledge, a table in DAX cannot filter itself; see this example.


In Icon Map it is possible to display WKT and points at the same time, but as you can see from the screenshot, the WKT geometry in the table does not change based on the internal filter selection, while the other visuals work fine.

M parameters have very interesting applications, and I am excited to try other tricks 🙂

Edit: I would appreciate some votes here for the option to pass the filter selection to the same visual.

Dynamic Geospatial Clustering using BigQuery GIS

I was reading this blog post and thought of a new use case: using OpenStreetMap data to generate polygons based on the user selection.

First, to reduce cost, we will select only a subset of the OpenStreetMap data; you can use this post as a reference.

My base table is OPENSTREETMAPAUSTRALIAPOINTS, which contains 614,111 rows.

The idea is to provide some tag selection (school, cafe, etc.) and let BigQuery generate new polygons on the fly. The key function in this SQL script is ST_CLUSTERDBSCAN.

WITH
  z AS (
  SELECT
    *
  FROM
    `test-187010.GIS.OPENSTREETMAPAUSTRALIAPOINTS`
  WHERE
    value IN UNNEST(@tags_selection)),
  points AS (
  SELECT
    st_geogpoint(x,
      y) AS geo_point,
    value AS type
  FROM
    z ),
  points_clustered AS (
  SELECT
    geo_point,
    type,
    st_clusterdbscan(geo_point,
      200,
      @ct) OVER() AS cluster_num
  FROM
    points),
  selection AS (
  SELECT
    cluster_num AS spot,
    COUNT(DISTINCT(type))
  FROM
    points_clustered
  WHERE
    cluster_num IS NOT NULL
  GROUP BY
    1
  HAVING
    COUNT(DISTINCT(type))>=@ct
  ORDER BY
    cluster_num)
SELECT
  spot AS Cluster,
  st_convexhull(st_union_agg(geo_point)) as geo_point,
  "Cluster" as type
FROM
  selection
LEFT JOIN
  points_clustered
ON
  selection.spot=points_clustered.cluster_num
  group by 1
union all
SELECT
  spot AS Cluster,
  geo_point ,
type
FROM
  selection
LEFT JOIN
  points_clustered
ON
  selection.spot=points_clustered.cluster_num

Technically you can hardcode the values for the tags, but the whole point is to have a dynamic selection.

I am using Data Studio, and because the query is not accelerated by BI Engine, in order to reduce the cost I made only 6 tags available for user selection and hardcoded the distance between two points to 200 m.

Here is an example: when selecting the tags (restaurant, school and fuel), I get 136 clusters.

Here, when I zoom in on one location, the results are pretty accurate.

I think it is a good use case for parameters. GIS calculations are extremely heavy, and sometimes all you need from a BI tool is to send parameter values to a database and get back the result.

You can play with the report here.

Edit: August 2021, the same report using PowerBI.
