Top Performing items by category using Google Data Studio SQL Parameter

a very common pattern in Data Analytics is to calculate top performing items by category, for example Top 5 car sales by Brand and Country , now that GDS support SQL Parameter I thought it is a nice opportunity to try it.

I am using a data set that show sales of items by country and geographic region for a hypothetical Company

to get the rank, I am using Row_Number () analytics functions and to make the example a bit more complicated, I presume, the user want to filter by Sales_Channel , Online/Offline or Both

here is the query

WITH
xx AS (
SELECT
Region,
Country,
SUM(Total_Revenue) AS Total_Revenue
FROM
Sales_Summary_rand
WHERE
Sales_Channel IN UNNEST(@Sales_Channel)
GROUP BY
1,
2)
SELECT
*,
ROW_NUMBER() OVER(PARTITION BY Region ORDER BY Total_Revenue DESC) AS rnk
FROM
xx

now you define the parameter ‘@Sales_Channel’, notice you have to type the values manually, for this example there are only two Values, Online/Offline

and here is the report, notice, I am using vega-lite custom viz as the native visual shows only max of 20 categories, the user will have the option to filter the Top 5,10 etc, it is a very powerful visual analytic tool

Unfortunately as of August 2020, BigQuery BI Engines does not support Analytic functions nor UNEST ( which is required for multiple selection of the parameter)

ok, someone may say ( I am looking at you piedatastudio ) that it was possible before and we don’t need parameter, which is true, technically we can show the rank for the three possibilities (Online Sales/Offline Sales, and both) and just use a UNION them add a filter to select the Sale Channel.

now to make it more interesting, let’s say they want to see the top performing based on a date selection, this year, this month etc, very easy using Dates Parameter

WITH
xx AS (
SELECT
Region,
Country,
SUM(Total_Revenue) AS Total_Revenue
FROM
GIS.salesdates
WHERE
Sales_Channel IN UNNEST(@Sales_Channel)
AND Ship_Date >= PARSE_DATE('%Y%m%d',
@DS_START_DATE)
AND Ship_Date <= PARSE_DATE('%Y%m%d',
@DS_END_DATE)
GROUP BY
1,
2)
SELECT
*,
ROW_NUMBER() OVER(PARTITION BY Region ORDER BY Total_Revenue DESC) AS rnk
FROM
xx

in this cases you can not pre calculate the results, as they are thousands of different combination for Date selection.

Please notice, as the Query is not accelerated by BI Engine in Data Studio, the report will incurs BigQuery cost, very negligible for small tables but still 🙂

I added a new report based on Covid19 dataset which is provided for free by Google.

How to calculate percentage of row using Google Data Studio Parameter

TL;DR : Report here

let’s say you work with sales data, and you have being asked to do this analysis

1- Give me sales number by region and Item type

very easy using a pivot table

2- Give me percentage by item and Region

It is very easy using comparison calculation

and here is the report

We Have a problem the percentage is compared to the overall sales, usually we are interested in the ratio of item sold by region !!!

It is a very common request, see this question in the forum

3- Workaround using Blending

One approach is to use blending to calculate the total sales per region then divide by items, see tutorial here , it is fair to say, it is one of the most popular report I built, it seems it is a common pain point

4- I want to see the sales per region and by Country

The workaround using blending is hard coded to one dimension, let’s say region, but usually users want to drill down to multiple level,

I guess you can write a SQL Query to achieve this scenario, but you need to duplicate the data which is something I don’t particularly like !!!

5- Parameter and SQL, you can do anything

we do have parameter now, I am using the same technique as described here

basically, I write a SQL query to calculate the total sales per region and total sales per country, and using filter control , the user can switch to either dimension, as August 2020 BI Engine does not support Sum Over(), I am using self join

WITH
xx AS (
SELECT
region,
Country,
Item_Type,
SUM(Total_Revenue) AS Total_Revenue
FROM
Sales_Summary_randxx
GROUP BY
1,
2,
3)
SELECT
xx.region,
xx.Country,
Item_Type,
Total_Revenue,
Total_Revenue_Region,
Total_Revenue_Country,
CASE
WHEN @details="region" THEN xx.region
ELSE
xx.country
END
AS Level_details,
CASE
WHEN @details="region" THEN Total_Revenue/Total_Revenue_Region
ELSE
Total_Revenue/Total_Revenue_Country
END
AS Percent,
FROM
xx
LEFT JOIN (
SELECT
Region,
SUM(Total_Revenue) AS Total_Revenue_Region
FROM
Sales_Summary_rand
GROUP BY
1) yy
ON
xx.region = yy.region
LEFT JOIN (
SELECT
Country,
SUM(Total_Revenue) AS Total_Revenue_Country
FROM
Sales_Summary_rand
GROUP BY
1) vv
ON
xx.Country = vv.Country

and Voila the final results, notice you can use any number of dimensions

Change the color of a bar chart based on a value using Google Data Studio Parameter

This one I took it from a Tableau Presentation, let’s say you want some element to have a particular color when they reach a value.

For example in this report, I want to make it clear which generators is producing more than 1000 MW.

Create a parameter MWREF

Data Studio formula engine you can not compare dimension to parameter, instead we create first a helper calculated field “diff”

then we create a second calculated field “color”

in the bar chart, I use “color” as as a breakdown Dimension

and here is the result

Create Dynamic goal line on a bar Chart using Google Data Studio Parameter

TL;DR : The report is here

trying to reproduce a visual I saw before, Probably in a tableau forum, it is quite simple but give a very nice visual clue, the idea is the user input a target and the color will change based if it is higher or lower than the Target

Probably you can do it using Parameter in Google Data Studio, but using BigQuery was much easier. ( solution using only GDS , courtesy of Nimantha )

I built this Query, which generate two values, Firstsection of the bar and the secondsection

SELECT
*,
CASE
WHEN MW < @target THEN MW
ELSE
@target
END
AS firstsection,
CASE
WHEN MW < @target THEN null
ELSE
MW -@target
END
AS secondsection
FROM
datastudio.table

define parameter “Target” , currently BigQuery parameter does not accept range, instead you have to type a number

then Create Stacked Combo Chart

Make sure first section and second section are bars and target is a line and make sure bars are stacked

when you change the parameter values, the bars value change