This post is an extension to that which instead of looking at engineering model development, instead looks at construction development. I don’t want to delve too much into the details about exactly how this was built (again see the post above).
Some big differences is that I have used a resource assignment view. in addition to the date metrics This allows for resources histogram and progress curves to be quickly sorted down to an activity level. This approach also follows a prior post Resource Analysis Dashboard .
The underlying data is very similar to our engineering progress example. We can use a flat file export direct from P6 with a standard set of columns. As I have mentioned before, you can achieve this in a SQL query as part of a larger data model, although with everything, a delicate balance is needed (balancing database formalism and easy excel solution)
We will also have the resource assignment data
The WBS Slicer and Area Selection
This design element doesn’t work for project with too many WBS elements. For this example, each major area only has about 10 WBS elements, therefore I could pull this off with no drama. I really prefer this selection as opposed to drop downs where it is often difficult to quickly make selection.
The Pie and Metrics
Here we follow much of the look and feel I used with the engineering progress; however instead of just using activity count metrics, I have also inserted hour and percent complete metrics. There is nothing fancy about these.
The Data Table
I’ll sound like a broken record again, when you have a good design with one aspect of a project, you can likely take that and run with it for many other areas. In a following post I will detail this systems engineering aspect to nearly everything we touch.
Obviously the key inclusion into the table is the budget units and %’s. I still prefer these tables views vs the GANTT views. Having clear visibility into the last month dates, the prior month dates, and variances is the purpose of this view.
Again, the extension of this are endless. At this stage, we are starting to see how pre filtered views provide more focused dashboard as compared to a one size fits all. Sitting in an EPCM world, most of the detailed activities and schedules are managed by our contractors. Thus, this construction view is more suited to using an export from a contractor Level 4 schedule.
At some point, we will need to begin to discuss an overarching design where a user can navigate to our various dashboard in a logic way.
I have been asked to produce a simple construction report, we need to show the last 4 weeks of actual progress data and 6 weeks of forecast and to make thing a little bit complex the average installation since the start of the project, nothing special three measures, average to date, install per week and forecast per week
Obviously, it is trivial to be done in Excel using named sets,
if you don’t know what’s named set and cube formula is, you are missing the
most powerful reporting paradigm in Excel, a good introduction is here,
and there are plenty of resources here.
Unfortunately named set is not supported yet in PowerBI, you
can vote here,
Just for demonstration purpose, if you try to add those three
measures to a matrix visual, PowerBI just repeat them for every time period, obviously
that’s not good at all, the actual
installation make sense only in the past and the forecast has to be in the future,
there is no option to hide a measure if there is no value in a column and even
if it was possible we need to show the average installation independently of
the time period, anyway this the report when you add the three measures
and because I already learned a new trick on how to dynamically add measures to a matrix visual in PowerBI, I was tempted to try and see if it works in this scenario.
So, let’s see how it can be done using the disconnected table
Create a disconnected table with two columns
Order and status
Add a calculated column,
As the cut-off date change at least three times a week, the week number change accordingly, we can’t simply hard code the dates, instead let’s add a new calculated column, which will just lookup the week date from a master calendar table based on the order, when the order is -4 it will return “average to date”, I added a dummy 0.5 order just to add an empty space between actual and forecast ( cosmetic is important)
Who doesn’t love the glossy Level 1 reports our project produce. But really, when you look into these beauties, really understand the difficulty that goes into them. What follows is first a description of what a typical Level 1 report is, and how we can structure our excel based data to be a bit smarter.
This is by no means a fully comprehensive guide on this subject. It is instead just a primer to get us thinking about how we feed data into our reports.
Who doesn’t love the glossy Level 1 reports our mega construction projects produce. But really, when you look into these beauties, do you really understand the difficulty that goes into them. What follows is first a description of what a typical Level 1 report is, and how we can structure our excel based data to be a bit smarter (which is the real message to this article).
Interspersed with hopefully be a few key strategy points which can guide your work.
I’ll then showcase how you can take what will now be structured data and upload into a powerBI visual (although the process to capture the data into any database and drive any visualization tool would be the same)
Strategy – Don’t be afraid to use excel (not everything needs to be automated)
Key Elements of a Level 1 Report
Cost and Progress
Here we are presented with:
Overall progress curve
Cost & Commitment curves
Some may argue what to lead with – for me its always %. No bigger value highlights where your are more than what % are we. Not displayed on the image above is a data series reflecting how many people are have and comparison against planned. People achieve progress. Its impossible to talk progress without talking how many people we have. The graphs provide enough enough context to allow for discussions about productivity without having to muddy the waters
The cost sections should include visibility into what our final forecast costs will be (and comparison against baseline). Underneath that key metric are a few sub items such as how much contingency we have, a few cost curves associated with spend profiles and commitment profiles.
Schedule and Narrative
The schedule aspects of a Level 1 report are always tricky. Do we need to only display the final project milestone? For me, on major projects no single DATE has any meaning. Thus even on a Level 1, I still prefer to include 10-15 dates that represent some key aspect of the project. All dates should be compared against what we said last month to highlight current month variances, and dates should be compared against our project baseline (or whatever current approved version thereof).
The narrative section of a Level 1 can nearly always be updated by simply reading the progress, cost and schedule tables. Just put words to the graphs. Key adders here are insights into RISKs. What may come in the future that will alter what we are saying today!
As always, safety metrics are also usually front and center. For me, this has always been a difficult aspect of our jobs. A political correctness that is forced into our reporting. Don’t get me wrong, safety is the most important aspect of a project. So, including a safety table somewhere on the Level 1 is always done. For this article, I want to instead focus on the key project control elements and data integration.
Level 1 Data Structures
So, we all know what a Level 1 report looks like, and I would fathom we can all mostly agree these are the elements included and can be rolled out as a standard for any major construction contractor. Most of our reports likely already report this information in some manner or another. The entire point of this article is that we should really focus on entering the data in a smart data centric way so that if you want to automate anything down the line, you have the foundations to do so.
At this stage, I don’t want to talk about the source data used to generate your summaries. We can leave that for a later discussion.
Key Data Domains
We are aiming towards consistency here and want to actually represent all the data required for our key Level 1 chart to be housed in a database. Therefore we need to have structure.
Strategy – Do not focus on systems, focus on DATA
A critical strategic element in my approach is that I do not care what systems you use. Our reporting is not a function of our systems (at least in this step 1 phase). We instead need a structure from which we can extract data and as easily as possible, move that data into a template or format in which we can drive our level 1.
If you go down the path to seamlessly integrate source systems with a Level 1, you unwittingly constrain yourself.
Typically our (time phased) progress data will be sourced from Primavera. There are other systems where the progress data may live, but again, that isn’t the focus of this article – I don’t care where it lives and neither will any seasoned project controls manager. We just need to know it exists and has a common structure
Here, a few key notes, use a consistent data format. The above structure is how all your progress data should be housed, not just Level 1. All time phased data, all the way down to Level 5 detail items should be managed in a data structure, not a fancy formatted excel file. Trust me, updating a table such as the above will serve you in the long run. Even if your data is fully managed inside a system such as P6 or PRISM or ECOSYS or COBRA, you should be able to at least extract Level 1 into the format defined above.
You guessed, we can capture our Level 1 cost data in exactly the same format
In the graphs we are building, there are only 11 specific datasets. Only 4 of these require update on a period basis. So again, we really boil this down to something simple.
Strategy – Do not over complicate anything in your Level 1 layer
The implementation of the specific data model I have outlined above fits the strategic approach to keep your level 1 simple. Any project can implement this data model for Level 1 with without any integration into source systems. Level 1 can be updated by the project controls team doing a few copy-pastes into excel to capture project wide data. Again, I would assume your teams already do this, but perhaps end up copying this data into various other corporate systems as well.
Again, we are keeping a simple approach and only capture the required information.
Here, we are forced into a different structure. So whereas the cost and progress data can fit the same data model (as seen above), we will need a different template for schedule dates. We will typically be using Primavera, as such this model fits P6, but the idea is universal.
I do not believe this information can ever be fully automated from our scheduling systems. These paths will continually be adjusted. The planning lead will always refine what activities are being tracked to be displayed on the Level 1. Behind the scenes, there are tricks upon tricks to pull the dates, however, again, we are talking about the data layer here, not necessarily HOW you get the data into this format.
It is entirely possible to have the assignments encoded into P6 activity codes. Therefore, it would be possible to integrate your Level 1 data directly into either the source P6 database, or an XER export. In my experience, any automation that is attempted in this arena (for Level 1 data), is futile. We are only talking 10-15 key activities. Let you lead planner sort out how they get the data into this format. Again, our strategy is to not over complicate this. If the data is provided to a digital team in the format about, you are for all intents done.
The model above only captures the finish dates. If added visuals with simplified GANTT charts are needed in your Level 1 (and will be discussed in my next Level 2 article), you would have to edit the above.
The nice value of the above structure is that we have effectively created an interface, an integration layer, between what will be P6 data and our dashboard. The list of what activities can easily be edited by way of a sharepoint list. Then, in your data model, you can link on scheduleID to pull the relevent date data (I suspect many do this).
Too often, narrative comments are shuffled between parties via email, entered into several documents, edited, customized, etc. The project controls team is always struggling sourcing commentary from various sources, and in my experience, we end up entering in something ourselves.
Level 1 data structures have to fit into these complications. In this realm, sharepoint offers a canned solution by way of sharepoint lists.
Strategy – If Technology already exists, use it
Strategy – Technology can be used in innovative ways – use a mashup mindset to use existing technology in a new way
I find that sharepoint lists offer unparalleled capabilities for commentary. However, for lists to be really functional, they need to be embedded into FORMS or some routines that provide export functionality
In this example, I have mocked up a simple INFOPATH form that could represent our sharepoint fields. The sky is the limit when it comes to existing technology that can automate the capture of this type of commentary.
The value adder here is that instead of allowing unstructured comments (via email or manually marking up a word , excel or power point file), we have structured comments that are housed in a database and that database can be updated in a distributed manner using WEB based technologies.
The above would be a web based form which will be updated by the associated responsible parties. However, we can’t quite import a form into our data model. When the above form is filled out, the data will be stored in a data model (which we will have to design first before we can even build the form above). Thus, what we are looking for is something akin to the below
The above is just a table in an excel file, but again, when we house data in this format, it can naturally flow into a database. That is what we need to focus on. Even in our excel reporting world, if you can capture commentary in this tabular data centric way, you can still link to it from your main dashboard tabs to be “smarter” in how information is managed.
Strategy – Focus on the DATA! (I can’t say this enough)
Everything we do can be captured in a data model. Every report we design should be able to pull direct data out of a data structure. Thus, before we add anything to reports, first consider the entire flow of data required.
Putting it all together
At no point in time in the above have I had to rely on a source system. However, I have been able to take a typical Level 1 report and extract everything from it. I have taken this data and outlined a data model (in simple form) that can drive not just 1 project, but an entire corporate endeavor in this space.
As with everything, nothing novel here. Many companies already have systems that capture some of this information. This is more just a thought experiment for those that perhaps do not have a clear data model that supports level 1 reporting. It also highlights the discussion topics of “what are the manual steps” – because there will be manual steps in getting the data into the right format.
For me, everything above has to be manual at some point up or down the food chain. Your projects and portfolios need to have the discussions about where this type of Level 1 data is housed. If all projects already have this data in consistent databases, all you need to do is query that source. Everything discussed here is system independent. You can easily generate these data tables by way of query a source system directly (if you can), but I have not limited or require that approach
Strategy – Whatever you do, allow for flexibility
Even though my data model is entirely excel based, the data structure is very powerful. I can, in quite automated steps, import and convert these datesets into a more database model and thus gain value from dashboards that wouldn’t be custom for your project, but could drive an entire portfolio (and when you see how this scales to Level 2 data and beyond, the worlds your oyster).
If you actually want to proceed with a dashboard, and if you have your data as outlined above, here is what you can do with it. In fact, I would recommend that your source tab in excel that is driving your dashboard looks like the below.
The above data isn’t “immediately” friendly for digital reporting. A few transformations are required. The key steps involved are (the below was done as just an example using PowerQuery)
Unpivot the Timephase date columns
Pivot the the “SeriesName” column to create a unique “Column” for each dataset (this is need to create unique lines on our dashboard graphs)
At this stage, we have a nicely formatted table and we can now import into PowerBI. The intent here is not to showcase a beautiful Level 1 dashboard in PowerBI. My intent is more to showcase the data structures need to drive a dashboard. With the above data, we get pull each data series into graphs, tables, cards, KPI metrics, etc.
Our model has tagged each record with a “As-Of” date. Thus you can utilize this structure to have your dashboard display ALL prior months by way of a slider or select. Given more advanced skills, you can also pull out metrics about current incremental values vs what we said last month. Although, I feel those metrics are best served in Level 2 report where more detail is available.
Apologies for the look and feel below, I just pulled in the data to showcase that indeed you can drive a dashboard with what is effectively just a few lines of data that every project already has. We can bring together cost, schedule, progress, and commentary quite easily and in a very data friendly way.
For me, there is no substitute for an excel based dashboard. The value in this for me is ensuring that when I produce a Level 1 Dashboard (in Excel), I should give consideration to ensuring my data is structured appropriately. This gives us a fighting change to perhaps go down the path of creating a more digital world. It also allows for perhaps more flexibility in dealing with Level 2 data to maybe have some real automation of rolling up of data.
Level 2 obviously. I hope to showcase how the same ideas and concepts here can also help you structure your raw excel based Level 2 data to perhaps be better utilized in a more digital world
When I start that AEMO Dashboard , I had a hard time dealing with PowerBI gateway, it is just setting there, my laptop has to be online whenever I need to schedule a refresh, it just annoyed me, and I could not understand how cloud based data needs on-premises gateway anyway, obviously later I learned that strictly speaking it was not required, there was just undocumented feature to get away of it ( the trick is in the first blog, thanks @Rad_Reza).
but before I was aware of that, I went to some rabbit holes dealing with new tools that are out of my comfort zone, and I think they are worth sharing.
my first thought was instead of accessing the data directly from the website, let’s instead copy the data to a cloud storage then read it from there, I have already a google storage account, it is very generous with a free 5GB storage, my data is not big around 2 GB of zipped csv..
first setback, there is no native connector to Google storage and even if there was ,we have something called egress fees, in a nutshell, cloud storage is really low cost, loading data is free, but getting your data out is not free, unless it is for the same provider and the same region, most of the cloud vendors use the same model, as my data will be processed in PowerBI, the clear choice is azure blob storage
Azure Blob Storage
the setup is very simple I used the following options :
the same region as my PowerBI region ( otherwise your pay the egress fees)
for replication I used LRS
as PowerBI don’t support data lake V2, I used the classical Blob Storage.
When you want to copy data, the official tool in azure is data factory, I tried to play around with copy activities, it is straightforward, my first attempt did work and it was fast , actually too fast 😊, no zip was transferred but rather an HTML
probably copy data just handle this case just fine, but when you use your own credit card on a cloud tool and you don’t know what you are doing, better stay back and take the time to understand how it works, I deleted the new created resources and went to the second option, Python !!!
Normally I go with R but blob storage has no API for R, I have very limited experience with Python , just using it for the excellent package altair , let’s try something new.
I was very pleasantly surprised, the amount of documentation for Python is just amazing, actually once I asked a question on stackoverflow and got a very succinct answer in less than a minute, no one was judgemental or downvoted my question ( the question was very basic). the only drawback is that sometimes the code works well for python 2, but I am using Python 3 anyway enough talking let’s show some pseudo code.
step 1 : get a list of files name from the web site
url = “web address where the files are saved”
result = urlopen(url).read().decode(‘utf-8’)
pattern = re.compile(r'[\w.]*.zip’)
here is a snapshot of the results, the full list is 60 items.
step 2 : get a list of files name from the blob storage
in the first run, the list is empty as we did not load anything yet, I load a couple of files manually just to test if it is working, the API for blob storage are very simple, you only need to provide your storage account name and key and I love that.
step 4 : Upload the new files to Azure Blob Storage
the same here, the Azure API are very simple and clear, I had only when issues, when the script upload in a loop, it does not wait until the transfer is completed before jumping to the next file, my workaround was just to use sleep ( sync is supported but not in this scenario where the input is from an url), anyone i got the answer in stackoverflow
ok, so we do have a script that works, now we need to run it on a schedule, once per day at 5 AM, keep in mind the whole purpose of this workflow is not to use on-premise software, I just need to find the service that runs a script on the cloud on a schedule, as I am already on azure, let’s stick in that ecosystem.
and it is a personal project, I prefer a free solution, my script runs only every 24 hours, for a couple of minutes, a quick google search and i find this little treasure, I will not repeat here the steps, WebJobs is a service that just do that.
note that the package azure-blob-storage is not a base package in Python you need to install first in WebJobs, the schedule functionality is very flexible as it is using CRON, I wish we had something like that in PowerBI Dataflows.
Every day at exactly 5 AM, a new file show up in the azure storage, although I don’t need those files, I am using now another approach to load the files directly in PowerBI, it is important to build a data lake ( yes, I just said that, I am just joke, data lake is folder in the cloud where you save the raw files, nothing more), storage is cheap but most importantly the requirement may change, I may need to report on another dimension and it is crucial to keep the raw unprocessed data.
Python is awesome
Azure API for python are straightforward
Azure is awesome.
Be careful of Egress fees
CRON is awesome wish it was supported in PowerBI dataflows.
Wish PowerBI dataflows could save a raw file, Powerquery is amazing but it does not copy raw files.
I was looking for the Power Production of a particular solar farm, and I couldn’t find any public dashboard that show this level of details, all I could find was high level aggregated data (Later after I built the dashboard I found this excellent resources Nemlog)
Current, last 60 days of data ( current day not included, Updated at 4 AM)
Archive : the last 13 Months of data ( current month not included, Updated Monthly)
Pulling data from a website and building a dashboard in PowerBI is straightforward, it took me a couple of hours on a weekend to do it, the problem is how to maintain it.
Ideally, you build a dashboard and all the refresh is done by the service, which was not the case here
Pulling the data directly from the archive is very slow, it takes nearly 3 hours ( unzip, filters only the data we are interested in), and is not sustainable as the earliest month will be removed from the website, I like to keep the history, and it is really bad practise to download the same data every day
To keep the history, we need to save the archive somewhere else, too easy , just save it on a local laptop
History issues solved, now we created a new problem, on-premise data require a gateway, basically you need to install a software on your laptop, and obviously the laptop must be on when you do the refresh
After playing around of some options, I come up with this workflow
Create a local folder that contains all the archive files.
Create a PowerBI data model on the desktop just to process the archive data
Export to clean tables ( price and Production ) to CSV using DAX studio !!!!!!
Load the CSV to azure blob storage ( to get rid of the gateway)
Load the current zip files from the web site , it does not require the gateway, but you need the following consideration – Use relative path in web.content functions ( see Chris Blog) and @TheBIccountant
Append the data from azure blob storage and the current folder from the web site, the refresh is now very fast, as PowerBI just read the csv without any transformation
Publish to web
Good so far, I manage to get rid of the gateway, the dashboard is refreshed automatically in the service, no maintenance for 60 days.
as the current folders contains data for the current 60 days only, you need to update the initial CSV files.
Download the pbix from the service, export the csv , and upload to blob storage, you need to do that only once every 60 days.
PowerBI Publish to web is an amazing service and it is totally free
Powerful solution without writing any codes
PowerBI free license is free 🙂
Publish to web is not suitable for real time, as it takes nearly 1 hours to propagate the update to the web site, that’s why I can’t publish the current day data, which is updated every 5 minutes.
Publish to web does not include export data from the visual
pricing for azure blob storage can be tricky : storage itself is very cheap, data upload is free, download in the same region is free ( for example blob to PowerBI service), but when you read data from the blob to PowerBI desktop you incurs charges, so just be careful, it is not your Onedrive model, where download is free.
we showed here a simple workflow using PowerBI free license and azure blob storage (Dropbox), it is very easy but with one inconvenient you need manual operation once every two months, that’s a bit annoying.
edit 23-June-2019 :after I published this blog, I got an excellent feedback from Maxim Zelensky, actually using PowerBI dataflows ( require a PRO license), we can fully automated the whole process, as with dataflows we can have a self reference query, I am not going to repeated here, go and read it
edit 24-June-2019: as it is a personal project, and the data is public, I am not really excited about using a paid service to host the CSV files, I moved the two csv files from blob storage to dropbox, it is totally free, so the whole dashboard infrastructure is free, Good work Microsoft
edit 26-June-2019 : a proper solution will be to save the raw data in a data lake, see here
Feel free to provide comments directly to my LinkedIn Post that references this article and contact me directly – Darrin Kinney
Analyzing resources in P6 is a common responsibility of all planners. However, ultimately the data typically stays in the realm of the planner, and never properly given to the specific project managers and project engineers who actually need to execute the work. Thus, there exist a huge opportunity in the digital world to extract data from P6, and present it to the masses. Planners are penultimate professionals in this. We live for this. We take the resource assignment data from P6 and work our magic in excel creating a suite of reports and graphs (s-curves).
However, all the excel work is customized. The ability to quickly drill into the data from the data table or the graph is just too difficult for management. The need to create custom graphs, takes time. Instead, we can create something quite amazing using PowerBI.
Using PowerBI we can create a simply easy to use dashboard that provide nearly unlimited flexibility to display both schedule and resource information to any user on a project. The below is a walk through to create a simple view with a simple schedule. However, I have run this through a schedule with 5,000 activities and multiple resources including both manhour resources and quantity resources. It is situations where you have data overload that PowerBI shines. Get the data out from the planner, and into the hands of the project management group.
Step 1 – Get resource assignment data
Using the resource assignment tab in P6, remove all the grouping and just display everything. You will want to ensure some key fields are available: Start Data, End Data, Resource Data, Resource ID and Type, Activity Name and ID, plus some WBS and grouping data (ex contract).
Step 2 – Copy-Paste Resource Assignment Data into Excel
A simply copy-paste from P6 into excel will suffice. It is really that easy. Sometimes the data from P6 may not contain the right descriptions, so this is an opportunity to use some vlookups (or Index(Match()). The data also needs to be presented in a table. Select All and use CNTL-T to convert to a table.
Step 3 – Unpivot Data using Power Query
This step can be done directly inside PowerBI Desktop, or in perhaps a more flexible excel environment. I prefer to do as much data handling in excel to reduce the complications once inside PowerBI; however, many options exist.
Inside Excel (ensure you have power query add in), select the columns with the dates, and click “UnPinvot Columns”. Close and Exit.
After running the unpivot, you will see each data column turned into a row. This is an easier data format for use in databases and a trick universal not just for powerBI, but anytime you might be dealing with databases.
Step 4 – Import Into PowerBI
PowerBI has very quick and easy import routines. This post is not meant to be a specific click by click guide. There are easy import routines you can find to import excel files. Again, as indicated above, just make sure your data is converted into a table. You can directly create a datasource using something similar to the below
The parameters you will want to use are roughly as noted below
The PowerBI file use AREA CHART. This is an easy to use built in graph feature of PowerBI. You can use potential extension of line graphs to show a % Complete curve. However, from a base usage, the area chart provides the easiest visual.
A nice trick in the overall dashboard is to align the data range of the GANTT with the GRAPH. I have found some limitations to this exist, but in this example, it worked quite nicely.
The real power of this visual is through the use of slicers. Slicers allow for immediate filtering based on a variety of selections. This visuals uses a range of filters for Contract, Facility, SubFacility and ResourceID. Depending on the structure and data you extracted from the P6 resource assignment, your options here are unlimited.
This example just scratches the surface of what is possible. I found the above to be immediately useful to our project team in clearly visualizing the resources required over time. There is some manual effort to keep the source excel updated after changed to P6 – this is not a live linked datasource. There are numerous possible development extension people can use to customize this.
The biggest critique to the dashboard is the lack of visibility into the Baseline dates and plan curves. For this, we can only hope someone builds a more robust custom visual to accommodate this. In the end, happy planning.
One of the most popular discussion in planning forum is how to have an integrated project control system, every practitioner has a different opinion how it must be done, and of course you get a lot of marketing from people trying to sells their systems
In this blog, we share high level description of a data warehouse built using PowerBI desktop.
Data warehouse are not widespread in the construction industry, because the reporting specifications are different from project to project, and every client has a different systems and tools, and trying to have a standard system works only if you are the main contractor but if you are a subcontractor you have to adopt the client system.
Another reason is; it required a specialised IT skills, we are just business users not programmers, we do understand data very well, but not necessarily having the skills and tools to manage it, The good news is, with the rise of self-service Business intelligence, we have exactly that, Powerful data management tools yet accessible (assuming you want to learn something else than Excel).
So at high level this is how it works.
The Data warehouse was built using PowerBI desktop, I know it should be called sematic model, (for me data model, data warehouse are fundamentally the same thing), initially it was using Excel PowerPivot but it did not scale well with the increase of the volume of data.
As the data is not always in the format we want, PowerQuery is very handy in this case, as virtually it can transform any source of data, example lookup the subsystem using the tag field, trying to do that if you have 8 Million rows using Excel or Access is not feasible.
We maintain Master tables to integrate all the different source of data (tags, WBS, subsystems etc)
Every week, we get new Export from the source systems (Cobra, proprietary database systems etc), we load the new data and keep the historical records, it took 15 minute to refresh, which is quite impressive, Cobra alone is a folder 60 Excel file, and nearly 2 Giga in size.
Usually you publish your reports into PowerBI.com service to end users once your refresh your data model, in our case we can’t use the cloud for privacy reason, instead we use Excel as a reporting tool that pull the data from PowerBI desktop, the advantage of this approach is that we have different reports for different users, Skyline, Gantt chart, Client reports (in their required format), management reports etc.
As you can see Primavera P6 is used only as a forecasting tool and to calculate the critical path, the earned value calculation is done in the data model, personally I think P6 should not be used as the centre of your project control system, I remember the first time I start learning Primavera P3 ( a long time ago:), we kept asking the trainer how it is possible to track the spent hours at the activity level, the answer is we don’t, actually deciding at which level you track you spent hours it is the most important decision to make when you start a new project.
the basic idea here is in order to have an integrated project controls system is stop trying to have one, data will be always in silo, don’t try to change other department how they manage their specific data, it will not work and they will not listen to you anyway, So instead of trying to have one system to rule them all, just use the existing systems and build a data warehouse for reporting and Integration with P6.