Rendering 1.2 Million points in PowerBI using Icon Map

TL;DR, the report is here , pbix here , download Icon Map here

I blog about it really before here, but in a colorful discussion on twitter after they closed this bug report, (Max rows in PowerBI is 30 K), I recalled a bug report about R ggplot, when the authors suggested to use multi points instead of individual points to speed the plotting of the map !!! all I had to do is to check if icon map support it and it did !!!.

WKT format for point is POINT (x,y), all you need to do is to concatenate multiple rows of point in MULTIPOINT format MULTIPOINT ((x y),(x1 y1), etc).

now we need to deal with three thing:

1-The maximum row numbers returned to a visual is 30K.

2-The maximum length of a column in PowerQuery is 32766

3-The Maximum length of DAX function is 2.1 Million

so a calculated column is a no go, I am using this DAX measure to concatenate the text from (x y) to a Multipoint format, again using Chris blog, I got this measure

WKT = var concat =CONCATENATEX(values(openstreetmap[point]),openstreetmap[point],",") var wkt ="MULTIPOINT ("&concat&")"return if (concat =BLANK(),BLANK(),wkt)

using this table that contains all the items tagged as amenity in Openstreet Map, the table contain 17 Million records, got it from BigQuery dataset

remember you don’t want to group all points in 1 row for two reason.

1- Concatenax Max is 2.1 Million

2-it is better still to group by common attribute, at you can color code for example by country or category or both if you want, in this case, I added a third factor in MULTIPOINTS, just a number that change very 30k rows to make sure I will not end up with a multipoints > 2.1 million, Initially It was 100K, but I notice icon map become extremely slow

now in icon Map, you need to assign three fields.

Category : multipoints

Icon URL/WKT/SVG: the measure wkt, this extremely clever and flexible, as it is a measure, that will render using the filter context of category, you may want to be creative and implement drill using different level of details, as the geometry is calculated on the fly.

Circle/line/WKT/Geojson Outline Color : a color in hex format, in my case, coded by country, ( at work for a different use case, I use a measure instead to show change of status per time)

because, the data set is relative big, I use this option in PowerBI

Just to be clear, this only proof of concept, rendering a big dataset will be slow and will eat all your memory, and probably you will get errors in shared workspace, or if you are in a premium workspace, probably you will end up in a trouble, but it is cool, personnaly, I use it to render a 58K points and it is very smoth.

anyway here is the result filtering the tag place of worship 1.1 Million, I tried parking which is 3.2 Million but my laptop crashed !!!, I know it is subjective, but that looks very beautiful for me.

here tag : School and University color by Country

edit : got a nice feedback from Reddit user data_Crucher, just to improve the performance I materialized the results using a calculated table, the drawback is you increase the size of the model, but I guess it is worth it, and I changed the decimal precision for the lat and long to 4 digits just to reduce the size, the pbix is around 600 MB.

again for production scenarios, I think around 100K points should be doable.

Integrated EPCM Management – Engineering Progress

How is an engineering deliverable list a bit of a misnomer? Find out of a web page with editable fields can streamline the tracking of progress on list items.

I am a firm believer in the use of a standard project website portal from which the project team at large can quickly access key data and metrics about the project. This is not meant to be confused with a project dashboard, or PowerBI visualization. This is simple html file linked to a database with key flat-file information. Using smart JavaScript code, it is also possible to EDIT some of the key information.

With everything, before we even begin, we want to focus on what core digital strategies we are trying to tackle. Again, the project website is just a tactical approach, underlying it are the more relevant strategic goals we want to operate under.

Digital Strategy – Make Information easy to access

Digital Strategy – Agile Construction Management

Digital Strategy – Allow people to EDIT key data

The features discussed in this blog can be seen showcased in the following video. This is not a pipe dream. This is a functional application.

The video can be viewed in a separate window at https://www.youtube.com/watch?v=0XNA9xJS2yY

The Data

The key data sources involved with engineering progress will be:

  • Engineering Deliverable List
  • Manhours per WBS

Your engineering deliverable list is a bit of a misnomer. A lot of engineering and design tasks are not specifically related to a “deliverable.” In my view of the world, while you will have specific deliverables and need to track them, your progress list should also include everything else you are doing too – up to a point where perhaps the level of detail is too small.

For analysis, productivity factors for engineering are critical. As such, this post would not be complete without a discussion and visibility into hours and productivity factors.

The Menu

Our menu follows many of the key functions of project controls. However this specific post will dive into the ENGINEERING section only.

If we expand our menu, we see various views into our data: by project, discipline, contracts, schedule IDs and a unique view for controls engineering

Detail Views / EDIT MODE

Perhaps before we look at our summary and drill downs, the main control screen we would use on a day to day basis would be the Deliverables by WBS and Discipline

In the below screen, we can see all out progress items, our budget hours, progress% as well as the mapping to P6 ID (as all progress items should be mapped to into your schedule)

This view into deliverables by itself is not specifically unique. Where the magic happens is when we enter EDIT MODE.

Inside our EDIT MODE, we can directly update the progress %, and update our mapping to schedule ID. These features in my mind are your killer features that distinguish this from any other app. The shear ease in updating items in this way is a breath of fresh air, not to mention the live feedback and visibility throughout your team.

JIRA

Each combination of WBS and Discipline can be mapped to a JIRA task. Obviously we know our WBS can be considered an EPIC. The power in using a tool such as JIRA is that we can now track more detailed L4/5 tasks using JIRA Subtasks. These are completely flexible to allow the user to manage their own tasks.

Additionally, the ability to embed commentary and status is a brilliant way to distribute and communicate key status and blockers if any.

 

Summary Pages

Now that we have seen how we will actually interact with a project website, we can now showcase the summary level reporting. Specifics of how you report can often times be better captured using an analytics platform like PowerBI; however, more often then not, simple summaries and metrics are what drive our business.

In the summary by project, we can easily see key metrics per project. Each project would have a link to allow us to drill into the details for each project

In the above, we are now looking at the individual functional delivery areas (disciplines) for our project. Again we can see key budgets, spent hours, progress and productivity measures too. By clicking each discipline, we further dive into our project by now looking into the detailed WBS elements for that Discipline

It is in this view that we can see detailed metrics per WBS/Discipline pair.

Understanding our Spent Hours from time sheets is a critical management function. As such, the SAP actuals here is a link that takes us to a screen where we can view the detailed weekly time sheet data to see who (and when) has book to this element.

Conclusion

In summary we started with some key strategic thinking and built out a tool that ticks a lot of boxes in the EPCM construction world.

More often then not, I am confronted with technology that “shows me” but doesn’t allow me to interact, edit, or collaborate. Everyone wants to solve the “one source of truth” however, information changes and is updated by people. That is the missing link in a lot of our data analytics. Look into the work processes that generate information in the first place. Look at what people need to better capture their raw information into a data format and platform that others can now use.

This approach to put management of engineering items to plain sight, with ease of access is just one approach. There are 100s of ways to approach this, but if you stick to core strategic ideals, you can’t go wrong.

PowerBI Progress & Schedule Dashboard – By Darrin Kinney

I have recently been developing a series of videos that highlight the key features utilized in a progress and schedule dashboard. The videos showcase the capabilities of PowerBI dashboards in the Project Controls space. I have not seen dashboards effectively used in this way and want to share the valuable knowledge.

 

I have recently been developing a series of videos that highlight the key features utilized in a progress and schedule dashboard. The videos showcase the capabilities of PowerBI dashboards in the Project Controls space. I have not seen dashboards effectively used in this way and want to share the valuable knowledge.

This series is not meant to be a step by step guide. There are subtleties about this demo that may cause difficulties in the production environment. I would simply recommend you share this with your development team and discuss the pros and cons of your approach. Oftentimes, a more straight forward approach is more valuable when compared to endless development polishing an inferior product.

 

Part 1: The Showcase

This video gives an insight into the key capabilities of the dashboard. Having the ability to seamlessly review schedule activities, and how these contribute to the overall progress and forecast, is invaluable.

The ability to quickly dive into your schedule, without having to deal with the confines and limitations of your actual scheduling tool are also key features.

 

Part 2: The Excel Feeder Sheets

This highlights a simple Excel feeder sheet. Too often the time phased data that our schedules produce are not easily accessible in a digital format. I have built an excel file around a typical structure that project controls deals with. This structure will lend itself nicely to the steps that follow in converting the into a database format.

 

Part 3: PowerBI PowerQuery

Here we import the data from the Excel feeder sheets into the PowerBI platform. The use of PowerQuery is so embedded with the way PowerBI works. The steps you need to follow here are the similar to the steps you would need to follow in inserting the progress and schedule data into any formal data structure. The way we think about data is sometimes not compatible with the format that databases need. This is specifically around the need to “unpivot” time phased data.

 

Part 4:  PowerBI Measures and Dax

With all the data now structured and available to PowerBI, we need to now dive into the use of DAX to create Measures. A perfect example in the use of measures is in the generation of progress curves.

What might seem line a straight forward approach to drawing simple progress curves, is in fact (within the realms of PowerBI) not that simple. However, if you follow a logic approach and know what calculations are needed, the world is your oyster.

 

Part 5: Integration of JIRA and Agile Methods

In the (as of now) final installment of this series, I showcase a way in which we can integrate our PowerBI dashboard with a JIRA project. This approach is completely different from what you might expect. I don’t want to put a PowerBI dashboard ontop of my JIRA task list. I want to put a JIRA task list on top of my schedule.

The purpose of the dashboard is to extract the SCHEDULE data from the scheduling tool. When variations to prior forecasts occur, or where further detail is needed, we are often constrained because pictures, running commentary and discussion about each activity is not something that resides in our schedule. However, we can use JIRA to easily capture those elements and use our PowerBI dashboard and a linking tool to integrate everything together.

 

The Future: ???

There are still a lot of features and extensions that I have yet to formally discuss. The next steps are likely going to be a showcase of a SQL Server backend for this data. There is a lot of information that is missed in the way this dashboard imports data (specifically past budgets). Therefore visibility into changes is restricted.

Another interesting feature is the use of saw tooth graphs when budget changes occur. I have a clear vision for how this is possible, but perhaps not the development know how. I will definitely be passing my ideas on to a few more tech savvy than I, and will hopefully have something to say about that in the near future.

In general, the way in which dashboards and data are embedded into our work processes, is a field ripe for growth. It is also an endeavor that can greatly increase the visibility into project controls data and can also bring teams together using integrated tools like JIRA. As such, the future is bright and where we should always have half an eye looking.

 

Analyzing GIS data using BigQuery and PowerBI

TLDR, world data here , pbix file (Publish to web has a limit of 1 GB, only points are used)

Australia Report with polygons , pbix file

Australia report Using Datastudio Google Map

Edit : 14 April 2020, Updated the report to load all the tags amenity in the world, I am using this formula to dynamically calculate the distance between two points

Due to the COVID19 pandemic Google has made some public dataset free to query, one of them is openstreetmap, I thought it is an excellent opportunity to play with BigQuery GIS functions.

Using the existing documentation, I come up with this Query which return all the geometries in a radius of 100 Km from an arbitrary point ( for some reason I choose Microsoft office building in Brisbane as a reference) and with a tag =amenity

WITH
params AS (
SELECT
ST_GeogPoint(153.020749,
-27.467539) AS center,
100000 AS maxdist_m )
SELECT
ar.key,
ar.value,
feature_type,
osm_id,
osm_way_id,
geometry,
ST_CENTROID(geometry) AS center_location,
ST_Distance(ST_CENTROID(geometry),
params.center)/1000 AS distance
FROM
bigquery-public-data.geo_openstreetmap.planet_features,
params,
UNNEST(all_tags) AS ar
WHERE
('amenity') IN (
SELECT
(key)
FROM
UNNEST(all_tags))
AND ST_DWithin(ST_CENTROID(geometry),
params.center,
params.maxdist_m)

the query return

WARNING

the query processed 245 GB in 16 seconds !!!, and it did cost 0 $ at least till 14 Sept 2020, after that it will incur cost ( 1 TB/5 $)

you can explore the result using the built in Geoviz, but you can’t share the data.

PowerBI does not support custom queries when connecting to Bigquery , I had to save the query results in a view, then the connection to PowerBI is straightforward.

the query results is returned as a Key, Value

using PowerQuery pivot, it is trivial to denormalize the table ( I could not find how to do that in SQL), anyway the results looks much easier to analyze.

by the way just be careful , PowerBI support a maximum of  32766 characters , but there is an easy workaround, split the column by 32766 and then concatenate in a calculated column, yes it will increase the memory size, but it works.

and here is the final results using the beta version of icon Map, for example filtering all the data less than 4 Km, if you want print quality map you can always use R visual, see example here

the custom visual is still in beta, polygons and multipolygons render perfectly, point works but with a visual discrepancy, and I don’t think linestring is supported at all.

Icon map is a very versatile visual, I hope the author will release an official update and fix the rendering bugs and add an option for color per category.

Bigquery GIS is very powerful and easy to use, the documentation is excellent, I wished only they release a smaller public GIS dataset to play with.