How to use Excel as a Front End to Power BI Desktop

You can download the data model, the excel frond ends and the data sets in this folder.

The techniques used in the spreadsheet for connecting to PowerBI Desktop are not officially supported by Microsoft,  but I think are useful to understand how the MS BI stack works, especially if you are coming from an Excel Background, and Installing SQL server developer is either intimidating or not allowed.

I really appreciate if you vote for this idea to make this scenario offically supported by Microsoft

As far I am aware there are two approach to access PowerBI desktop, or more correctly the SSAS instance launched by Power BI Desktop.

  • Connecting using a live connection, detailed here
  • Connecting using PowerQuery, detailed here

 

The live connection is very interesting but it has the drawback that if you close PowerBI Desktop the pivot table stop working, so you can not share the spreadsheet, one solution is to use cube formula as they are persistent, if you don’t use know what a cube formula is then you are missing of the most powerful feature of Excel.

But what if you want to have a pivot chart, or a pivot table that you can keep using even if you lose the connection, or if you want to share the results with people that have not access to the data model,  turn out it is possible, welcome to Excel Pivot cache

 

  • Invoke the function SSAS_QUERY
  • 1
  • The Parameter is optional:
    • Either you write a DAX Query to retrieve the fields you want, MDX is supported too, personally I find MDX more suitable to import measures with different dimensions, I understand that DAX support this scenario too using crossjoin but I never manage to make it works correctly.
    • or Just click ok, than you can browse the SSAS cube, you can select any dimensions and measures you want, but mind that for a big cube, a query fetch the result faster.
  • Keep Powerquery as a connection only.
  • Insert a pivot table, use an external data source, choose connection, select the Powerquery Query, and voila

2

 

 

Excel cache the data in the Pivot, not only that it is extremely compressed.

Notice here PowerPivot is not used at all, Excel is acting as visualisation layer to PowerBI Desktop, leveraging two well known capabilities cube formula and Pivot cache.

Microsoft plan to release SQL server v.next this year, and then we can deploy the data model built with Power BI /Powerpivot /Powerquery into a production system, that’s what I call a natural growing path from self service to corporate BI.

 

Advertisements

Thin and Core workbook without Sharepoint or SSAS

edit : 1/1/2017, another option is described here, the thin workbook connect directly to Power BI desktop, very handy if: your client are still on Excel 2010, or you don’t want to share the data in the data model but just the results.

 

The concept of thin and core workbook was familiarized by the guys at PowerPivotPro, but they assume, either you have SharePoint installed or a connection to the cloud.

Let’s assume the only resource you have are:

  • Your PC.
  • A local server used for shared folders.

Backend and Front End approach

if you have small data sets then load everything to PowerPivot and do the reports in the same workbook, but if you are dealing with a lot of data sets, and you need to provide tailored reports to different people, typical example your client wants a particular format, the PM want a different format, the construction manager wants something different altogether, we all know what the commissioning guy wants 🙂

Then a sensible approach is to separate the data model from the reports itself, the backend will be used to store the data and a front end to report and visual the data.

Front End:

Excel: natural choice for many obvious reasons.

Tableau: has a killer feature in this scenario.

Backend:

There are many options, depending on the size and the complexity of your data.

1-MS Access:  is a solid option, but if you have tables that have more than 500 000 rows, then perhaps that’s not really a good solution, I know it depends on filed type, number of columns etc., the other issue there is no ETL for MS Access.

2- MS SQL SERVER EXPRESS: it is a free and a kick ass Database server, probably it will load all your data, but if your data is not well structured then it became a pain to load it, there is no free ETL for SQL express, and Powerquery does not natively load to SQL, you can vote for this idea here .

3- PowerBI Desktop: load the data in PowerBI Desktop and access the data from Excel using this hack. It is highly experimental but very promising, it is potentially a game changer.  You can vote here if you want Microsoft to officially support it.

4-PowerPivot: it is already there with Excel, Powerquery can literally load any data you have, it can clean transform with a simple click, and it is free, and don’t forget PowerPivot is a simple SSAS instance embedded in Excel, ok there is a problem 😦

We cannot access PowerPivot Model from another Excel workbook.

PowerPivot data can only be consumed from the same workbook :), I don’t know the reason but definitely it is not technical, all we can do is just to vote and hope Microsoft will listen

But fortunately there are some workaround.

1- Tableau: tableau can read data from a PowerPivot data model, and it has some very amazing visualisation, (not free)

2- PowerUpdate: Powerupdate can load the data directly from the PowerPivot Model to an SQL Server, and bonus point, it can refresh any workbook automatically no need for manual refresh (not free)

3- Reverse Linked Table: instead of using pivot table to read the data from the Pivot tables, we just used tables to get the same result, the Italians provided a detailed explanation here, and to avoid Excel limitation of 1 Million record, you can split the result in new tabs, you can even create summarized data that feed your different report, the only inconvenient here, the size of the workbook will increase with a lot of data.

Conclusion 

a simple option to deals with a lot of data is to load all the data source in one Master workbook, then write back the result in the same workbook using Reverse Linked Tables.

Now you can use that Master workbook as your backend for all your reports.

or use PowerBI Desktop as your backend, officially it is not support but Excel can perfectly connect to a local instance of  PowerBI desktop.

 

 

 

How to Build a Level 5 schedule using PowerPivot

Download the file Here, keep reading, if you want to see how it works

There is no industrial standard that define rigorously the levels of schedules, every company has it is own definition, but for the purpose of this post, the level 5 is at the tag level.

From working in different projects, I reach the conclusion that regardless of how much you details your schedule, you will have always some stakeholders that wants to see a more detailed schedule, or wants a different view, some examples:

  • The commissioning guys wants to see the construction schedule by subsystem.
  • The field supervisor, want his own scope by tag with physical quantity to install.
  • PM wants a level 2 by SP and phase ( construction, Precom & Com )

a planning software can easily accommodate those requirement using activities code, you can always make new layout and groups by activity code, but some request are impossible to satisfy unless you start breaking down your activities to the lowest level possible, which I think is counterproductive and may lower the schedule quality if not done properly, and put a lot of pressure on the planners that need to maintain this level of details.

Goal

Taking the “pressure” from P6, but produce multiple Gantt chart report with different granularity to satisfy our customers 😦 Management, client, the guy in the field etc.)

Solution  

the solution involve using Excel to mashup the data from Primavera P6 and Cobra and generate Gantt chart and S curve by any grouping possible, notice here how the conversation shifted from ” it will take a lot of work in P6″ to ” tell me how do you want your Gantt chart”.

P6 and Cobra are used as an example, it can be used with any other tools, the data transfer is using Excel files, so anything works as long the proper codes are used, as a matter of fact I don’t have access to cobra at all,all I need is an export.

architectures

How it works

Export Resource spread from P6

Export the remaining labor units from P6 per day, the reason to use days, so can group later using different calendar ( week, financial period, calendar month etc.).

export_p6

Make the export database friendly

P6 spread the data horizontally by default, this format cannot be used as a source for a database, we need to make the date as list.

export_p6_unpivot

Now the date are “vertical”, next step generate Forecast_%

Forecast_% = (remaining labor units by day)/ (total remaining labor units)

Here we go

export_p6_unpivot_normalized

Notice here, the sum of all the forecast % per activity id is always 100%

(1.2 means the field’s data type is decimal)

Now we have a proper Table that we can load to the data model, I used to do a lot of crazy tricks to have the same results using the tradition Excel approach (VBA, formula) but now with powerquery it is one click J

Import the MTO from COBRA

You need the detailed breakdown, no aggregation, with all the dimensions (tags, pay code, subsystems, separable portion, PCWBS L1,L2,L3, weight, etc.), the point is:  the MTO will be always more details than P6 and that’s the way it should be. Don’t try to replicate the same level of details in P6, it does not work. (at least that was my experience),

Here is a sample format (dummy data for presentation only, in big project, it can be millions of lines)

mto

Mapping between COBRA and P6

Ok that’s the hardest part, and will not be covered in this post, in Cobra for every line item you need to assign a P6 activity ID, no compromise here, it has to be 100 % done.

Link the MTO to P6 Spread

We use P6 activity ID as the lookup code between the two tables, and again with one click in PowerQuery it is done (try doing a vlookup with a 300 000 line item MTO)

merge_1

merge_2

We have now a table that have the level of details of the MTO and the spread from P6

mto_spread_tables

Unfortunately many people still think it is Excel, they miss the fact we are talking about a real database working behind the scene, here an example from an actual project

mto_spread

Generate the dimensions tables

The dimension table is how you want to slice your pivot table (BI people will cry at this explanation), in our case we want a Gantt by tag, subsystem, wbs, drawing etc.

It is trivial to get a dimension table in Powerquery, select a column, remove the rest, and remove duplicate, the key should be unique

Here an example for tag, the rest is the same method.

 dimension

Link the data in the PowerPivot Data model

Load everything into the data model, and link using the common keys

A simple straightforward star schema

data_model

Add measures

Let’s add two basic measures here, forecast qty and forecast hours

measure_rem_qty

The beauty of the measure is that they automatically change, when you add new dimensions to the pivot table, in this case, the formula here is the same as using sum product in Excel (please google the difference between sum and sumx)

Generate the chart In Excel

The Gantt chart will be generated In Excel using Pivot Tables, with conditional formatting to simulate a Gantt chart, the end users can slice and dice as they wish, you can add many combination as you want, and trust me, people will ask for some combination that you never thought about before, the way a field supervisor want to see his Gantt, is not necessary the same as the planner expect a schedule to be organized,

A supervisor want to filter his Gantt by drawing number, go do that in P6?

gantt_chart

Just for fun, Construction Gantt chart by subsystem

gantt_chart_subsystem

Conclusion

Using PowerPivot as a reporting tool can offload a lot of pressure from P6, and keep the different stakeholders engaged with the schedule as they get customized views.

Keep P6 for what P6 is good at, critical path, date, spend that time on making the logic sounds, the duration that reflect reality and let reporting be done in a better suited tools.

Edit 27-04-2017 : if you don’t have the spread from P6 and all you have is activities start and finish, you can use this approach 

 

Completion Skyline template

download the template here.

skyline charts are very popular in the precommissioning and commissioning phase, as in one page, you can see the remaining subsystems to be done, and how much ITRS to be completed, here is an example.

skyline

Excel does not support this chart natively, even power bi,( you can vote for this idea) , some people manually draw this chart in Excel, which is really time consuming or using complex vlookup formula, in this blog post i will share with you a template i built using powerquery and powerpivot, as you would expect that chart is generated with a simple refresh.

what data source you need 

for convenience, the data source are included in the same file, i used a dummy data for obvious reason.

Export from your completion system :completion_system_export

Export from your Planning Software :p6_export

A Master Table for Dates:mst_date

a master table is included as you needed to aggregate the date, there is no point in showing a skyline per day, here i am using week, but you can change the formula to be per month.

refresh

go to Powerquery, show pane, and refresh

refresh 

some consideration 

  • Although I personally use P6, any Planning software is fine, basically we need only 1 forecast date by subsystem per phase,  if the subsystem is spread in multiple PCWBS, then use the Maximum date
  • P6 here is used only for the forecast date, the actual data ( ITR and completion date) are from the completion system ,  in some cases, i overwrite P6 forecast date if it is earlier then the completion subsystem cut off date, if P6 does not provide a date for a particular subsystem,  I use the latest date from the equivalent system.
  • in the slicer Phase, select either Precom or Commissioning, if  you select both, th skyline will not work properly.

Introduction to Modern Excel

Download the .pbix file, Click  here to open on the web.

Untitled

Modern Excel is a term  first coined to distinguish between Excel 2007, and Excel 2010  that saw  the introduction of PowerPivot and PowerQuery,  Excel has become a first class enterprise Business intelligence solution, Millions of row can be loaded, using PowerQuery , data can be linked using PowerPivot interactive, reports are easy to generate using PivotCharts and Cube Formula, and your audience will have more confidence in your reports.

Basically excel become a personal datawarehouse that store and analyze millions of records from multiple data sources.

A sample dashboard In Modern Excel

A “traditional” excel dashboard has no real distinction between the data source and the report itself, everything is mixed together with the results generated by a mixtures of vlookup and sumproduct, and maybe some VBA code,  if the boss ask for a different report, you have to start from scratch, but most importantly, it is error prone, if you move one cell, you may change the results without noticing, and there is no obvious way to audit the results.

The Dashboard was first built in Excel using PowerPivot and Powerquery then imported into PowerBI desktop then  published to the web.

I could have simply distribute the excel file with the reports  but I wanted to emphasis it is not simply another Excel spreadsheet with zillion of cryptic formulas that no one can understand except the author but a proper database solution based on columns not cells.

 

How it works

Anyone who used MS Access will find modern Excel very familiar, so let see how the steps needed to generate the sample  dashboard, it is only a very high level description, not a detailed tutorial.

Data Source: most of SQL Servers, excel, text files, web, ms access, basically any data source you may think of

Data Preparation: PowerQuery is a Powerful ETL Tool for cleansing data and prepare it to be loaded in the data Model, for example if your export the spread of hours from P6, the data will be spread horizontally, in a couple of clicks  you can unpivot the data and make it as a table

Data Model: is where the relationship between the table are defined, here is a screenshot from the sample dashboard

DataModel

Dax:  the language to query the data model and define your measures, it is not that different from sql queries

Report and Charts: Pivot table, Pivot chart and cube formula that read from the data model

Here is an example of a pivot table that query from the data model, notice how are all the tables are visible, a classical pivot table will show only one table.

Pivot

Powerpivot vs PowerBI

It can be a little confusing when you hear all those terms but basically The core technology in Microsoft Business intelligence stack is in an in-memory analytic engine called Vertipaq, the same technology is present in different products intendent to service different market segment

  • PowerBI : the cloud offer from Microsoft to provide business intelligence for enterprise, it mainly compete with the like of tableau and Qlik, PowerBI come in two editions free and professional.
  • SQL Server Analysis Services : On premise Enterprise offer come with MS SQL Server
  •  PowerPivot : in Excel 2010  and up for self-service Business Intelligence

Wrap Up

Self service business intelligence is changing how to deal with data, microsoft has manage to bring a powerful technology to the end users,  previously a word like datawarehouse was exclusively a heavy IT investment, now with PowerPivot and PowerQuery you can do it by yourself.