Analyzing GIS data using BigQuery and PowerBI

TL;DR: world data here, pbix file (Publish to web has a limit of 1 GB, so only points are used)

Australia report with polygons, pbix file

Australia report using Data Studio Google Maps

Edit (14 April 2020): updated the report to load all the amenity tags in the world. I am using this formula to dynamically calculate the distance between two points.
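
The formula linked in the edit note is not reproduced in this text. As a stand-in, and purely as an assumption, here is a minimal Python sketch of the haversine formula, one common way to compute the distance between two latitude/longitude points. The function name, the argument order (longitude first, like ST_GeogPoint) and the Earth radius are choices made for this illustration, not the report's actual formula.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Reference point used in the post (longitude, latitude).
center = (153.020749, -27.467539)
one_degree_north = haversine_km(center[0], center[1], center[0], center[1] + 1.0)
```

On a sphere, one degree of latitude is roughly 111 km, which is a handy sanity check for any distance measure of this kind.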

Due to the COVID-19 pandemic, Google has made some public datasets free to query. One of them is OpenStreetMap, and I thought it was an excellent opportunity to play with the BigQuery GIS functions.

Using the existing documentation, I came up with this query, which returns all the geometries that carry an amenity tag and lie within a radius of 100 km of an arbitrary point (for some reason I chose the Microsoft office building in Brisbane as the reference):

-- Reference point and search radius.
WITH params AS (
  SELECT
    ST_GeogPoint(153.020749, -27.467539) AS center,  -- longitude, latitude
    100000 AS maxdist_m                              -- 100 km, in meters
)
SELECT
  ar.key,
  ar.value,
  feature_type,
  osm_id,
  osm_way_id,
  geometry,
  ST_CENTROID(geometry) AS center_location,
  -- Distance from the reference point, converted from meters to km.
  ST_Distance(ST_CENTROID(geometry), params.center) / 1000 AS distance
FROM
  `bigquery-public-data.geo_openstreetmap.planet_features`,
  params,
  UNNEST(all_tags) AS ar  -- one output row per tag
WHERE
  -- Keep only the features that carry an amenity tag.
  'amenity' IN (SELECT key FROM UNNEST(all_tags))
  AND ST_DWithin(ST_CENTROID(geometry), params.center, params.maxdist_m)
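
To make the shape of the output easier to picture, here is a small Python sketch of what the UNNEST(all_tags) join and the 'amenity' IN (...) filter do, ignoring the spatial part. The sample rows and the function name are hypothetical and only mimic the structure of planet_features; they are not real OpenStreetMap data.

```python
# Hypothetical rows shaped like planet_features (values made up for illustration).
features = [
    {"osm_id": 1, "all_tags": [{"key": "amenity", "value": "cafe"},
                               {"key": "name", "value": "Example Cafe"}]},
    {"osm_id": 2, "all_tags": [{"key": "highway", "value": "residential"}]},
]


def unnest_amenities(rows):
    """Yield one output row per tag for every feature that has an 'amenity' key."""
    for row in rows:
        keys = {tag["key"] for tag in row["all_tags"]}
        if "amenity" not in keys:    # WHERE 'amenity' IN (SELECT key FROM UNNEST(all_tags))
            continue
        for tag in row["all_tags"]:  # FROM ..., UNNEST(all_tags) AS ar
            yield {"osm_id": row["osm_id"], "key": tag["key"], "value": tag["value"]}


result = list(unnest_amenities(features))
```

Each matching feature contributes as many rows as it has tags, which is why the result arrives in a long key/value layout.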

The query returns one row per tag (key, value) for each matching feature.

WARNING

The query processed 245 GB in 16 seconds!!! It cost $0, at least until 14 Sept 2020; after that it will incur a cost ($5 per TB).

You can explore the result using the built-in Geo Viz, but you can't share the data.

PowerBI does not support custom queries when connecting to BigQuery, so I had to save the query as a view; after that, the connection to PowerBI is straightforward.
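
As a rough idea of what this query would cost once the free period ends, here is a back-of-the-envelope sketch in Python. It assumes the $5 per TB rate quoted above and that 1 TB means 1024 GB; both the function name and the unit convention are assumptions of this sketch.

```python
def query_cost_usd(gb_processed, usd_per_tb=5.0, gb_per_tb=1024):
    """Estimate the on-demand cost of a query from the volume it scans."""
    # gb_per_tb=1024 is an assumption; pass 1000 for decimal units.
    return gb_processed / gb_per_tb * usd_per_tb


cost = query_cost_usd(245)  # about 1.20 USD per run at the rate above
```

That is small for a single run, but a report that refreshes the view many times a day would multiply it quickly.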

The query results are returned as key/value pairs.

Using the Power Query pivot, it is trivial to denormalize the table (I could not find how to do that in SQL), and the results look much easier to analyze.
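
For readers who want to see the reshaping outside Power Query, here is a minimal Python sketch of the same pivot: long key/value rows become one wide row per feature. The sample rows, the function name and the use of osm_id as the grouping field are assumptions for illustration; when a key repeats for the same feature, this sketch keeps the last value.

```python
from collections import defaultdict


def pivot_key_value(rows, id_field="osm_id"):
    """Turn long (id, key, value) rows into one wide dict per id."""
    wide = defaultdict(dict)
    for row in rows:
        wide[row[id_field]][row["key"]] = row["value"]  # last value wins on duplicates
    return [{id_field: fid, **tags} for fid, tags in wide.items()]


# Hypothetical long-format rows, as produced by the query.
long_rows = [
    {"osm_id": 1, "key": "amenity", "value": "cafe"},
    {"osm_id": 1, "key": "name", "value": "Example Cafe"},
    {"osm_id": 2, "key": "amenity", "value": "school"},
]
wide_rows = pivot_key_value(long_rows)
```

Features that lack a given tag simply have no entry for it, which corresponds to the empty cells a pivot produces.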

By the way, be careful: PowerBI supports a maximum of 32766 characters per text value. There is an easy workaround: split the column every 32766 characters and then concatenate the pieces in a calculated column. Yes, it will increase the memory size, but it works.

And here are the final results using the beta version of Icon Map, for example filtering all the data closer than 4 km. If you want a print-quality map, you can always use an R visual; see example here.

The custom visual is still in beta: polygons and multipolygons render perfectly, points work but with a visual discrepancy, and I don't think linestrings are supported at all.

Icon Map is a very versatile visual. I hope the author will release an official update that fixes the rendering bugs and adds an option for color per category.

BigQuery GIS is very powerful and easy to use, and the documentation is excellent. I only wish they would release a smaller public GIS dataset to play with.
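
The split-then-concatenate idea is easy to check outside PowerBI. The Python sketch below only illustrates the logic (in the report itself the split happens in Power Query and the concatenation in a calculated column); the function names are hypothetical.

```python
MAX_CHARS = 32766  # text length limit mentioned above


def split_text(text, size=MAX_CHARS):
    """Cut a long string into consecutive chunks of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def join_chunks(chunks):
    """Rebuild the original string from its chunks, in order."""
    return "".join(chunks)


# A made-up long geometry string, standing in for a large polygon.
long_text = "x" * 70000
chunks = split_text(long_text)
restored = join_chunks(chunks)
```

The round trip is lossless as long as the chunks are concatenated in their original order.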
