How to build Compact layout Pivot Table in Google Data Studio

TL,DR : the report is here. and I appreciate a vote on this bug report

First, don’t be excited, it is a silly workaround, and introduce it is own problem, but if you are like me and need to deliver a nice-looking pivot table in Google Data Studio, it may be worth the hassle.

The Problem.

Show the spent and budget by Campaign and country, the spent is at the country level, the budget at the country level, here is a sample data set.

The Solution, First try

Probably you are saying, it is too easy, why you are writing a blog about it, GDS support pivot Table !!, let’s see the result

We have three Problems already (1 bug, 1 limitation and 1 by design)

Bug:            you can return not return a null in the metric spent

By design:  GDS does not understand hierarchy, country null is all good.

Limitation: The Famous Excel compact View is not supported

Here is the deal, contrary to what you may read in the internet, Pivot table is the most used viz in reporting ( ok, maybe second after table) and users will want their pivot table to look exactly like their beloved Excel, my own experience, if you show a user a map for example and he ask for a feature which is not possible, you can say, I can’t do it and people will tolerate that, but their Excel looking Pivot table, zero tolerance, if you can’t reproduce it, they will either think :

  1. Your BI is not good
  2. You don’t know the tool

The Solution, SQL!!.

Write a SQL that return a column that show the campaign and country in the same field, using union

Assuming your data is on google sheet.

  1. Link Google Sheet to an External table in BigQuery

2-Write the Query

Connect to that table using a custom Query

SELECT project,sum(budget) as budget,sum(spent) as spent FROM `test-187010.work.factraw`

group by 1

union all

SELECT Concat(“\U0001f680”,country), budget, spent FROM `test-187010.work.factraw` where country IS NOT NULL

3-BI engine does not support External table.

Every time you open the report, GDS studio will issue a new query which cost 10 mb minimum !!!, even if the data is 1 kb ( it is a big data thing after all),  to avoid that we extract the data

4-Profit 🙂

 We use conditional formatting to highlight the row campaign.

needless to say !!!! you should not use it unless you have to, cross filtering will be a mess , Hopefully GDS will improve pivot table formatting in the near future.

Is Google Data Studio Ready for Complex Business Reports ?

 I love Google data Studio, I am using it for this project nemtracker.github.io and it is perfect for this use case.


For other Dashboard at Work, I am using PowerBI, for no reason, I tried a new experiment, can GDS be used instead, it is an academic exercise only.
the use case is a typical Business Reports, a lot of different small datasets with different granularity , something like budget vs items sold etc.


I am just recording the pain points in GDS , the good thing, most of them are under development, and some of them overlap, for example if we have Parameters controls, probably, there will be less need for blending ( which is very limited at the moment).


it is not a critique, GDS has some killer features, I particular like custom visuals as there are no limit the number of data plotted which is a pain in other Software.


the assumption is all the data-sources is already loaded and cleaned and ready to be analysed in BigQuery.


TL,DR : the pain points are at the Calculation level , Obviously if all you data is at the same granularity, then everything is easy
my conclusion, nearly there !!!, but will revise when Parameter controls are supported.

Instead of Writing a full blog, I thought showing a report is a more practical approach

How to Model Primavera Activity ID and Quantity Measurement System using Multiple to Multiple in PowerBI

whenever I need to join Primavera Activity id to the quantity measurement system, I use this pattern, it did serve me well all those years, recently I started a new project where for the first time, I don’t get an extract using Excel but a proper live connection to SQL server 🙂

To get something quickly running, I started using the same approach, load Primavera export, unpivot the date and normalize it, every activity has a spread from 0 to 100 % then merge it to a Table from SQL server, all working as expected.

Although it works well, it is a bit clunky , specially that the export from Primavera does not change frequently, for the baseline maybe once a year and the forecast once a month,  so instead of merging the data using Powerquery, I loaded the Primavera data as a separate table, here what the model looks like

As you have guessed the Activity id is duplicated in both tables

Now the Metric I am looking for is how to spread the budget hours from the table BOQ using the spread ( 0-100 %) from Primavera, let’s say I filter 1 row from the BOQ the result should be something like this

As it is multiple to multiple if you simply multiply the hours X spread you get duplicate values

Planned_Hours_no_filter = sumx(Primavera, [remaining_hrs]*Primavera[Spread]) = 950K hours

Obviously, it is the wrong, the total remaining hours is 49K only, the maximum spread should be 49K (or less if some activities ID are not mapped.)

The solution is to create an explicit filter and get the hours only for the specific activiy ID

Planned_Hours = sumx(Primavera, CALCULATE([remaining_hrs],filter(BOQ,BOQ[Activity ID]=Primavera[Activity ID]))*Primavera[Spread])

And here is the result

I checked with the old model and all the results match, to be honest I am not a huge fan of multiple to multiple but in this case, it is worth it, less refresh time and got rid of two big tables.

you can download the pbix here

Analyzing GIS data using BigQuery and PowerBI

TLDR, world data here , pbix file (Publish to web has a limit of 1 GB, only points are used)

Australia Report with polygons , pbix file

Australia report Using Datastudio Google Map

Edit : 14 April 2020, Updated the report to load all the tags amenity in the world, I am using this formula to dynamically calculate the distance between two points

Due to the COVID19 pandemic Google has made some public dataset free to query, one of them is openstreetmap, I thought it is an excellent opportunity to play with BigQuery GIS functions.

Using the existing documentation, I come up with this Query which return all the geometries in a radius of 100 Km from an arbitrary point ( for some reason I choose Microsoft office building in Brisbane as a reference) and with a tag =amenity

WITH
params AS (
SELECT
ST_GeogPoint(153.020749,
-27.467539) AS center,
100000 AS maxdist_m )
SELECT
ar.key,
ar.value,
feature_type,
osm_id,
osm_way_id,
geometry,
ST_CENTROID(geometry) AS center_location,
ST_Distance(ST_CENTROID(geometry),
params.center)/1000 AS distance
FROM
bigquery-public-data.geo_openstreetmap.planet_features,
params,
UNNEST(all_tags) AS ar
WHERE
('amenity') IN (
SELECT
(key)
FROM
UNNEST(all_tags))
AND ST_DWithin(ST_CENTROID(geometry),
params.center,
params.maxdist_m)

the query return

WARNING

the query processed 245 GB in 16 seconds !!!, and it did cost 0 $ at least till 14 Sept 2020, after that it will incur cost ( 1 TB/5 $)

you can explore the result using the built in Geoviz, but you can’t share the data.

PowerBI does not support custom queries when connecting to Bigquery , I had to save the query results in a view, then the connection to PowerBI is straightforward.

the query results is returned as a Key, Value

using PowerQuery pivot, it is trivial to denormalize the table ( I could not find how to do that in SQL), anyway the results looks much easier to analyze.

by the way just be careful , PowerBI support a maximum of  32766 characters , but there is an easy workaround, split the column by 32766 and then concatenate in a calculated column, yes it will increase the memory size, but it works.

and here is the final results using the beta version of icon Map, for example filtering all the data less than 4 Km, if you want print quality map you can always use R visual, see example here

the custom visual is still in beta, polygons and multipolygons render perfectly, point works but with a visual discrepancy, and I don’t think linestring is supported at all.

Icon map is a very versatile visual, I hope the author will release an official update and fix the rendering bugs and add an option for color per category.

Bigquery GIS is very powerful and easy to use, the documentation is excellent, I wished only they release a smaller public GIS dataset to play with.