Using PowerQuery to read Primavera Xer file

You can download the file here  or view the reports online

The last time I tried parsing an Xer file, I was using VBA and it was a horrible experience, anyway that was three years ago, The problem was not VBA per se, but supporting the different combination of Excel and Windows, I still receive sometimes emails from users complaining that Xer reader do not work for them, after a while I concluded it is not worth the effort.

Yesterday for no reason, I got this idea why not trying Powerquery , in 10 minutes I had a working prototype without writing a single line of code!!!  even better it reads in bulk, copy a bunch of xer in the same folders, and there are loaded, only conditions they needs to have the same revisions( don’t mix P6 rev 7 with rev 8), as different revisions of P6 introduce some minor changes in the Xer schema, ( they add or remove some columns)

It is only a proof of concepts, I only loaded some tables, (activities and activities code), the data is loaded in the Data Model, and you need at least Excel 2013 with the free PowerQuery Installed.

1

Make sure you write the path to where your Xer is stored

2

You need to refresh the Queries here

6-6

Here is a pivot with some measures, now you can group only by Project and Activity codes and obviously by activity id.

6-3

I test it with two Xer that contains 47 projects

How to import Progress to Primavera

It is quick post showing how to import activities status to Primavera using Excel.

Cobra is as an example for a progress measurement system, it can be any other systems.

In P6 always use new template, never trust an old file as things may have changed in the meantime and you end up with discrepancies.

Assuming the schedule is resource loaded, you need first to import

  • The new activities status ( actual start, actual finish, remaining duration )
  • The new resource status ( Budget hours, Earned Hours, at completion hours)

You need to import first the activity status then the hours.

  • Export activities status from P6

Make sure you select Export type, Activities

0

The Activity ID, activity status and WBS code are selected by default, you need to add Actual start, Actual Finish and Remaining duration

1

P6 will export an excel file LIKE THIS

7

Now you can transfer the progress data from  COBRA, using the activity id as a reference, don’t change the WBS code.

In some cases you may let P6 calculate the remaining duration, personally I prefer to do the calculation outside P6, as it is more flexible, for example you can calculate remaining duration based the actual productivity, which is very easy to do in Excel

Import back to Excel, Make sure to save any error message

  • Import the hours

The same approach, make sure you select, resource assignment

2

3

8

Make sure to update only the column (Budget units, Actual unit, At Completion Units),

The activity status cannot be updated from this template that’s why you need first to import the activity status and then the resource status.

Note: if the activity status is completed, make sure actual unit = at Completion units.

P6 can assign multiple resources to the same activity, so make sure you don’t have duplicate resources assigned.

Why you must automate the whole thing ?

As you can see the import to P6 is rather straightforward, probably you are asking why I am blogging about it in the first place?  And the reason is how do you know your import was successful?

Ok for the hours it is easy, if the total hours match, then the import was probably successful, but what if the total hours do not match? Or how do you know the activity status is updated correctly

One approach is to design a system for example in Excel , Modern Excel or Access that do the following.

  • Automatically map P6 activity ID to Cobra.
  • Generate the activity status and compare it to the current P6 status
  • Import the new status to P6
  • Redo step 2 and 3 until the status in P6 match exactly the status in Cobra.

Now if you are spending too much time manually updating P6 ( we all done that), I think it is unfortunate, and honestly it a waste of time better spend in analysis the schedule, ok it is easy said than done, and there are many technical and non technical roadblock to have such a system in place, but I think the most important one, the Planner responsible for the initial preparation of the baseline is more concerned to have his schedule approved than to worry about how the schedule will be maintained later. ( usually it is other planners problems).

but if you stick to this general rules, probably you will do fine

  • Only use, contractual Milestones, Access Milestones and resources loaded activities.
  • Schedule activities should never be more detailed than your Bill of Quantities
  • if you can not measure a task, don’t use it
  • don’t use level of effort just for presentation purpose.

Edit : 17-04-2017, for complex project, we just use one generic resource to load the total hours and earned hours per activity,  trying to load individual resource type is not feasible  and become a nightmare to manage and honestly even getting the “spent Hours” at the activity level is not possible,  that’s why I think Earned valued and productivity analysis should be done outside P6

How to use Excel as a Front End to Power BI Desktop

You can download the data model, the excel frond ends and the data sets in this folder.

The techniques used in the spreadsheet for connecting to PowerBI Desktop are not officially supported by Microsoft,  ( as of  24 may 2018, Microsoft don’t mind people using this approach ) but I think are useful to understand how the MS BI stack works, especially if you are coming from an Excel Background, and Installing SQL server developer is either intimidating or not allowed.

I really appreciate if you vote for this idea to make this scenario offically supported by Microsoft

As far I am aware there are two approach to access PowerBI desktop, or more correctly the SSAS instance launched by Power BI Desktop.

  • Connecting using a live connection, detailed here
  • Connecting using PowerQuery, detailed here

 

The live connection is very interesting but it has the drawback that if you close PowerBI Desktop the pivot table stop working, so you can not share the spreadsheet, one solution is to use cube formula as they are persistent, if you don’t use know what a cube formula is then you are missing of the most powerful feature of Excel.

But what if you want to have a pivot chart, or a pivot table that you can keep using even if you lose the connection, or if you want to share the results with people that have not access to the data model,  turn out it is possible, welcome to Excel Pivot cache

 

  • Invoke the function SSAS_QUERY
  • 1
  • The Parameter is optional:
    • Either you write a DAX Query to retrieve the fields you want, MDX is supported too, personally I find MDX more suitable to import measures with different dimensions, I understand that DAX support this scenario too using crossjoin but I never manage to make it works correctly.
    • or Just click ok, than you can browse the SSAS cube, you can select any dimensions and measures you want, but mind that for a big cube, a query fetch the result faster.
  • Keep Powerquery as a connection only.
  • Insert a pivot table, use an external data source, choose connection, select the Powerquery Query, and voila

2

 

 

Excel cache the data in the Pivot, not only that it is extremely compressed.

Notice here PowerPivot is not used at all, Excel is acting as visualisation layer to PowerBI Desktop, leveraging two well known capabilities cube formula and Pivot cache.

Microsoft plan to release SQL server v.next this year, and then we can deploy the data model built with Power BI /Powerpivot /Powerquery into a production system, that’s what I call a natural growing path from self service to corporate BI.

Edit Dec-17 : i add a new file that use only VBA, so Powerquery is not required, it can be useful if you are still on Excel 2007, or you can’t install Powerquery.

 

Thin and Core workbook without Sharepoint or SSAS

edit : 1/1/2017, another option is described here, the thin workbook connect directly to Power BI desktop, very handy if: your client are still on Excel 2010, or you don’t want to share the data in the data model but just the results.

 

The concept of thin and core workbook was familiarized by the guys at PowerPivotPro, but they assume, either you have SharePoint installed or a connection to the cloud.

Let’s assume the only resource you have are:

  • Your PC.
  • A local server used for shared folders.

Backend and Front End approach

if you have small data sets then load everything to PowerPivot and do the reports in the same workbook, but if you are dealing with a lot of data sets, and you need to provide tailored reports to different people, typical example your client wants a particular format, the PM want a different format, the construction manager wants something different altogether, we all know what the commissioning guy wants 🙂

Then a sensible approach is to separate the data model from the reports itself, the backend will be used to store the data and a front end to report and visual the data.

Front End:

Excel: natural choice for many obvious reasons.

Tableau: has a killer feature in this scenario.

Backend:

There are many options, depending on the size and the complexity of your data.

1-MS Access:  is a solid option, but if you have tables that have more than 500 000 rows, then perhaps that’s not really a good solution, I know it depends on filed type, number of columns etc., the other issue there is no ETL for MS Access.

2- MS SQL SERVER EXPRESS: it is a free and a kick ass Database server, probably it will load all your data, but if your data is not well structured then it became a pain to load it, there is no free ETL for SQL express, and Powerquery does not natively load to SQL, you can vote for this idea here .

3- PowerBI Desktop: load the data in PowerBI Desktop and access the data from Excel using this hack. It is highly experimental but very promising, it is potentially a game changer.  You can vote here if you want Microsoft to officially support it.

4-PowerPivot: it is already there with Excel, Powerquery can literally load any data you have, it can clean transform with a simple click, and it is free, and don’t forget PowerPivot is a simple SSAS instance embedded in Excel, ok there is a problem 😦

We cannot access PowerPivot Model from another Excel workbook.

PowerPivot data can only be consumed from the same workbook :), I don’t know the reason but definitely it is not technical, all we can do is just to vote and hope Microsoft will listen

But fortunately there are some workaround.

1- Tableau: tableau can read data from a PowerPivot data model, and it has some very amazing visualisation, (not free)

2- PowerUpdate: Powerupdate can load the data directly from the PowerPivot Model to an SQL Server, and bonus point, it can refresh any workbook automatically no need for manual refresh (not free)

3- Reverse Linked Table: instead of using pivot table to read the data from the Pivot tables, we just used tables to get the same result, the Italians provided a detailed explanation here, and to avoid Excel limitation of 1 Million record, you can split the result in new tabs, you can even create summarized data that feed your different report, the only inconvenient here, the size of the workbook will increase with a lot of data.

Conclusion 

a simple option to deals with a lot of data is to load all the data source in one Master workbook, then write back the result in the same workbook using Reverse Linked Tables.

Now you can use that Master workbook as your backend for all your reports.

or use PowerBI Desktop as your backend, officially it is not support but Excel can perfectly connect to a local instance of  PowerBI desktop.