I was reading this blog post and thought of a new use case, using OpenstreetMap Data and generate polygons based on the user Selection
First to reduce cost, we will select only all a subset of OpenstreetMap Data, you can use this post as a reference
my base table is OPENSTREETMAPAUSTRALIAPOINTS , which contains 614,111 rows
The idea is to provide some tag selection ( School, cafe etc) and let BigQuery generate a new polygons on the fly, the key function in this SQL script is ST_CLUSTERDBSCAN
WITH z AS ( SELECT * FROM `test-187010.GIS.OPENSTREETMAPAUSTRALIAPOINTS` WHERE value IN UNNEST(@tags_selection)), points AS ( SELECT st_geogpoint(x, y) AS geo_point, value AS type FROM z ), points_clustered AS ( SELECT geo_point, type, st_clusterdbscan(geo_point, 200, @ct) OVER() AS cluster_num FROM points), selection AS ( SELECT cluster_num AS spot, COUNT(DISTINCT(type)) FROM points_clustered WHERE cluster_num IS NOT NULL GROUP BY 1 HAVING COUNT(DISTINCT(type))>=@ct ORDER BY cluster_num) SELECT spot AS Cluster, st_convexhull(st_union_agg(geo_point)) as geo_point, "Cluster" as type FROM selection LEFT JOIN points_clustered ON selection.spot=points_clustered.cluster_num group by 1 union all SELECT spot AS Cluster, geo_point , type FROM selection LEFT JOIN points_clustered ON selection.spot=points_clustered.cluster_num
Technically you can hardcode the values for Tags, but the whole point is to have a dynamic selection
I am using Data Studio and because the Query is not accelerated by BI Engine , and in order to reduce the cost, I made only 6 Tags available for user selection and hard code the distance between two points to 200 m.
Here is an example when selecting the tags (restaurant, school and fuel), I get 136 cluster
here when I zoom on 1 location, the result are pretty accurate
I think it is a good use case for parameters, GIS calculation are extremely heavy and sometimes all you need from a BI tool is to send Parameter values to a Database and get back the result.
you can play with the report here
edit : August 2021, The Same report using PowerBI
One thought on “Dynamic Geospatial Clustering using BigQuery GIS”