CPM is Broken – The WHY

I spend a lot of time thinking about project controls; it is my love and passion. Specifically, in the planning world, P6 is the weapon we are tasked with using. While I firmly believe the way P6 is used on projects needs to be expanded (extended) with the use of Agile (JIRA), nothing tops the user base and acceptance P6 has in our business.

P6 is, fundamentally, just a CPM platform. We have activities and we have relationships (and constraints). It would seem simple to use at first; however, in practice, it is anything but simple, and the way we build schedules is not helping matters.

What has specifically changed over the past 15-20 years is the DATA SOURCE from which we derive our data. 20 years ago, we would run progress reports from P6 (we still do, but we have complicated matters). Today there is much better visibility into deliverables on a project. What follows are some examples from a construction project; however, I feel the issue is even more visible on the engineering and study side of things. Additionally, what has changed is the global adoption of EPR tools (Enterprise Progress Tools, aka Ecosys).

And that is the problem: we now have two or three sources of data that everyone looks at, and that everyone thinks are integrated. Everyone has clear visibility into document progress by way of our document control tools, everyone has clear visibility into our EPR reports, and everyone struggles to understand what is in the schedule and why it is different. EPR is forcing document progress to drive overall reporting, but P6 does not track documents. We do not build schedules bottom-up from a document world. We (should) build schedules based on overall workflows. I will demonstrate this in the examples that follow, along with what we have been forced to do inside P6 to try to solve this conundrum, and exactly why our solutions all fail: because we have fundamentally deviated from the CPM roots that used to make schedules valuable.

This is what a schedule is meant to look like

As Planned

In reality, below is what the schedule would look like today. Extended durations on everything, broken relationships, out-of-sequence work and, worst of all, NO CRITICAL PATH! This is not a great example, but I am confident you have dealt with a schedule like this and know this isn't just an isolated issue anymore.

As Built

What is driving this insanity is the method (the data source) by which our progress arrives. The concept of "site prep" in the original schedule was not the full site prep scope, but only the key site prep work required to commence the excavation.

What changed is the method and data source from which the progress for our activities comes. Because we are not progressing the schedule, and instead are progressing an Excel file to load into our EPR systems (most likely to drive a progress payment), the way data then flows into the schedule becomes meaningless.

There are many who would counter that we are simply running our progress system wrong, and that indeed the site prep should have been 100% earlier. That is likely true. In the above I am just being silly, but we have all seen this and we all know the issue.

So what is the solution?

The solution is to understand that when progress is measured outside the schedule, and at a finer level of detail than the schedule, the schedule becomes meaningless at that level. You therefore need to build schedules AT A HIGHER LEVEL.

The Correct Schedule

Instead of tracking the work at a lower level, for the schedule to regain meaning, we need to represent only the high-level workflows and leave the details in our progress system.

Keep your detail in Excel or your EPR System

The project should retain clear visibility of the detail and the detailed % complete; however, the detail items should not flow into the schedule because they cannot be scheduled in a proper CPM way.

Again, what is driving this is the method of progress measurement no longer aligning with the method of scheduling. I understand this is severely controversial, but when you start looking at your schedules these days and see countless in-progress parallel activities that are simply being driven by interfaces with EPR systems to mirror a %, and not by a proper schedule workflow, something is broken in our business. We would like to think our schedules are all nice, logically driven CPM schedules, and obviously I am not suggesting all our schedules are broken. However, the more I see issues such as this popping up in our as-built schedules, the more I want our industry to understand that we are simply building crap schedules that are not agile enough to capture the intricate data now available in our other systems.

This article is only meant to better illustrate the problem: the why behind CPM being broken. Solutions (using CPM) are available. In the above I have actually suggested one solution, but several different models are possible that retain alignment with detailed progress items while keeping a workflow-based CPM model. Hopefully I will dive into those in subsequent discussions.

PowerBI Direct Query Modeling Using a Flat Table

Flat table modeling in PowerBI can generate some very heated arguments. Every time someone suggests that it may be useful for a particular use case, the reaction is nearly universal: flat tables are bad. I think it may be useful in some very specific scenarios.

Let's say you have a nice wide fact table generated by dbt and hosted in a fast cloud DWH, with all dimensions pre-joined. To be very clear, you will not need to join it with another fact table; it is a simple analysis of one fact table at a very specific grain.

I will use power generation in the Australian market over the last 5 years as an example.

Import Mode

When using Import Mode, PowerBI imports the data into its internal database, Vertipaq. It is a columnar database with some very specific behaviors: because the whole table is loaded into memory, fewer columns mean less memory, which is faster; and because it uses index joins between fact and dimension tables when you define relationships, the cost of a join is, counterintuitively, lower than the cost of loading a whole extra column into the base table.

In Import Mode, it is a no-brainer: a star schema is nearly always the right choice.

Direct Query Mode

In Direct Query Mode, the whole way of thinking changes. PowerBI just sends SQL queries to the source system and gets back results, so you optimize for the source system. Contrary to popular belief, a star schema is not always the most performant option (it is rather the exception); see this excellent blog for more details. Basically, pre-joining will often give you better performance.

Let's test it with one fact table (the table has 80 million rows, with a materialized view to aggregate the data).

And the glorious model in PowerBI: yes, just one table.

And let's build some visuals.

Now let's check the data source performance.

Slow Slicer

The slicer is rather slow. You will probably say that, of course, scanning a column across 80 million rows is not very smart; actually, that's not the problem.

For example, when I expand the state NSW, PowerBI generates two SQL queries.

The first one gets the station names and took 481 ms.

And the second query gets the regions: 361 ms.

The PowerBI formula engine then takes some time to stitch the results together (1023 ms, which seems very slow to me).

In this case it is only 5 states, so it is not a big deal, and the query results will eventually be cached once the report users have expanded all the options.

Is 3 seconds a good enough experience for an end user? I don't think so; slicers have to be instantaneous. A visual can take up to 5 seconds and I don't mind personally, but as a user I have a higher expectation for slicer responsiveness, so I will still use Dual mode with a star schema.

Take Away

If your database can give you sub-second response times for slicer selections, you have a very limited and clear analysis to do, and you have to do it in Direct Query Mode, then a flat, wide table is just fine, as long as you are happy with the SQL generated.

PowerBI Custom Chart Ranges

Slicers are not my specialty; actually, PowerBI is not my specialty. So it was with a little frustration that I added a slicer to a page only to find the slicer altering all my data.

Set The Stage

My go-to progress dashboard calculates a lot of progress % metrics and graphs. However, for larger projects, it's often difficult to zoom into the graph. This is exactly why slicers exist. I am sure there are other nice graph tools that allow for custom date ranges, but again, this is so not my specialty.

What didn’t work

I simply added a slicer on my "weekending" field. However, in doing so, all my measures now calculate based on the filtered date range. This is likely an issue with my measures, but alas, I wanted something to just adjust the graph axis and not affect anything else.

Below we can see that my measures calculate a progress curve from 0-100%. Thus, when the date range is adjusted, the entire dashboard is now just wrong. My budgets and %s are also incorrect on the cards (which are likewise driven by all the slicers).

The Solution – Create a Duplicate DIM_Date Table

The problem was caused by the slicer being based on the live master dimension table linked to my data. Just as I want my graph to adjust when the WBS dimension tables are filtered, if I insert a slicer linked in any way to my FACT table, I am in a world of hurt.

Thus, just create a duplicate DIM_Date table. Here I created a new table: DIM_Date_GraphRangeSlicer.
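
As a minimal Power Query sketch (the DIM_Date query name and its WeekEnding column are assumptions based on the "weekending" field mentioned above), the new table can simply reference the existing date dimension; the key is to leave it disconnected from the fact table so filtering it has no effect on the measures.

let
    // Reference the existing date dimension query (name assumed)
    Source = DIM_Date,
    // Keep only the column the range slicer needs; leave this table
    // disconnected from the fact table in the model
    SlicerDates = Table.Distinct(Table.SelectColumns(Source, {"WeekEnding"}))
in
    SlicerDates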

I insert a formula into the chart X-axis range to select the min and max dates from this new GraphRange table. I then set up a slicer that filters the range on this new table, not on the master DIM_Date.

With these new ranges linked to the dummy date table, I can now much more easily refine just the X-axis display of the graph without impacting any of the measures used to calculate the progress %.

The Result

Putting it all together, we can now customize the X-axis range without altering the measures or cards that calculate key metrics off the full data (or the data filtered by the WBS slicers).

PowerBI – Resource Profiles from P6

One of the biggest critiques/limitations of using P6 data is the lack of time-phased resource assignment data. The only effective way to pull time-phased resource assignment data into PowerBI (or even Excel) is to copy-paste from P6 into Excel. This is what I have recommended in the past and still what I would recommend for anyone moving forward. However, that does not mean that PowerBI can't produce time-phased data using a start date, an end date, and a profile. What follows is a simple guide to how I have tackled the problem (and the limitations I have run into).

Note: In all likelihood this problem has already been solved by many people in many different ways. I do not want to suggest this is "the way" to solve it. Rather, I simply want to raise awareness of my approach and welcome comments and feedback on how to really solve the problem.

Part 1: The Problem

P6 data does not contain time-phased distributions. Instead, the backend data (and the XER file) only contains information about the activity, the resource, and the profile applied. So our problem is to extract these data elements and generate a time-phased distribution of the resource according to the profile and the activity start and end dates.

Typical Activity Level Data
Here we have our Resource Details (including our "curv_id")
Resource Profile Data

I've taken a few liberties with the data above to try to focus on the key problem: how do we allocate the resources assigned to an activity according to a resource profile?

I've seen this done in Excel a lot, although I've never been comfortable with the Excel solutions. Typically, we count the weeks and distribute the hours equally across all the weeks. I know more complex files exist that allow for spreads using profiles. So the problem isn't necessarily just to spread per the profile, but to perform the operation inside PowerBI (or, perhaps more to the point, inside Power Query).

At this point, I'd again love to call upon anyone who has a nice solution to share links to how you tackled this, as what follows is just my initial stab at it.

Part 2: Getting the data into PowerBI

Firstly, although the data above is from a P6 XER file, I wanted to make this a bit more general. Therefore, I have created a toy model approach. Here is what my source data looks like.

Resource Profile Data:

Excel Profiles

We can run this through Power Query and convert it to a usable table. Note that in the above I have created a PeriodCum field. This will be used to calculate an end date for each of the 10 periods required (a sketch of this step follows below).

tbl_profiles
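
For illustration, here is a minimal Power Query sketch of how tbl_profiles could be built. The file path and the column names (ProfileDesc, Period, Allocation) are assumptions, and it assumes PeriodCum is simply the cumulative fraction of the activity duration at the end of each of the 10 equal periods (0.1, 0.2, ... 1.0).

let
    Source = Excel.Workbook(File.Contents("C:\Data\Profiles.xlsx"), null, true),
    Sheet = Source{[Item = "Profiles", Kind = "Sheet"]}[Data],
    Promoted = Table.PromoteHeaders(Sheet, [PromoteAllScalars = true]),
    Typed = Table.TransformColumnTypes(Promoted,
        {{"ProfileDesc", type text}, {"Period", Int64.Type}, {"Allocation", type number}}),
    // PeriodCum: cumulative fraction of the activity duration at the end of each period
    AddPeriodCum = Table.AddColumn(Typed, "PeriodCum", each [Period] / 10, type number)
in
    AddPeriodCum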

Activity and Resource Data:

In this example, I am combining the Activity and Resource data into just one table. Obviously, if you were doing this formally, you would need to build a schema to link the Resource Assignment data to the Activity-level data.

Excel Data

And running the above through Power Query, we end up with something as seen below. Note that I have added a calculation for the duration (in calendar days) and have converted the date formats to numbers. This makes the subsequent steps a little easier (a sketch follows below).

Note: a critically hard step (for me at least) came during this stage. Because we are breaking the duration into 10 periods and will ultimately be allocating a quantity per day to each, if a period boundary fell on a fraction of a day (for example, with a duration of 15 days), my method bombed: it double-counted the quantity on the overlapping day. As such, I have rounded the durations to the nearest multiple of 10 days.

tbl_activities
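
A hedged sketch of tbl_activities follows. The file path and column names (ActivityID, ProfileDesc, Start, Finish, hours_total) are assumptions; it adds the calendar-day duration, rounded to a multiple of 10 so each of the 10 periods covers a whole number of days, and converts the dates to serial numbers.

let
    Source = Excel.Workbook(File.Contents("C:\Data\Activities.xlsx"), null, true),
    Sheet = Source{[Item = "Activities", Kind = "Sheet"]}[Data],
    Promoted = Table.PromoteHeaders(Sheet, [PromoteAllScalars = true]),
    Typed = Table.TransformColumnTypes(Promoted,
        {{"ActivityID", type text}, {"ProfileDesc", type text},
         {"Start", type date}, {"Finish", type date}, {"hours_total", type number}}),
    // Calendar-day duration, rounded to the nearest multiple of 10 days
    AddDuration = Table.AddColumn(Typed, "Duration", each
        Number.Round(Duration.Days([Finish] - [Start]) / 10, 0) * 10, type number),
    // Dates as serial numbers, which makes the date arithmetic in later steps easier
    AddStartNum = Table.AddColumn(AddDuration, "StartNum", each Number.From([Start]), type number),
    AddFinishNum = Table.AddColumn(AddStartNum, "FinishNum", each Number.From([Finish]), type number)
in
    AddFinishNum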

Part 3: Time Phasing (where the magic happens)

The first step in generating the time phasing is to split the activities into the 10 periods of the resource allocation. We do this by first merging tbl_activities with tbl_profiles using the ProfileDesc field. After expanding the result, we end up with 10 records for each activity (one for each of the 10 periods).
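
A minimal sketch of that merge-and-expand step, continuing with the assumed table and column names from the sketches above:

let
    Merged = Table.NestedJoin(tbl_activities, {"ProfileDesc"},
                              tbl_profiles, {"ProfileDesc"},
                              "Profile", JoinKind.Inner),
    // Expanding yields 10 rows per activity, one per profile period
    Expanded = Table.ExpandTableColumn(Merged, "Profile",
        {"Period", "Allocation", "PeriodCum"})
in
    Expanded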

We will now want to calculate a start and a finish date unique to each of the 10 periods. In the profile table is the PeriodCum field, which we can multiply by the duration and then add to the start date to get a finish for each period. The resulting table will now look something like this. Note that, at this step, it is also a good time to take the profile allocation for each period and multiply it by the hours_total field. This gives us the hours for each period. The last step is to divide that by the period duration to arrive at an hours-per-day figure for that period.
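
Continuing the sketch (same assumed column names), one way of expressing those calculations is below; the minus-one-day adjustment on the period finish is my own tweak to keep adjacent periods from sharing a boundary day, which is the overlap issue mentioned earlier.

let
    // Each period covers one tenth of the (rounded) duration
    AddPeriodFinish = Table.AddColumn(Expanded, "PeriodFinish", each
        [StartNum] + [PeriodCum] * [Duration] - 1, type number),
    AddPeriodStart = Table.AddColumn(AddPeriodFinish, "PeriodStart", each
        [StartNum] + ([PeriodCum] - 0.1) * [Duration], type number),
    // Hours falling in this period, then spread evenly over its days
    AddPeriodHours = Table.AddColumn(AddPeriodStart, "PeriodHours", each
        [Allocation] * [hours_total], type number),
    AddHoursPerDay = Table.AddColumn(AddPeriodHours, "HoursPerDay", each
        [PeriodHours] / ([PeriodFinish] - [PeriodStart] + 1), type number)
in
    AddHoursPerDay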

Lastly, we want to perform one additional expansion to get the DAYS for each period. Here is the code I have used. It is a nice little bit of code that can generate a sequential list from a start to an end (we are using days, but it works for any beginning-to-end sequence):

= Table.AddColumn(#"Changed Type2", "Custom", each { Number.From([PeriodStart])..Number.From([PeriodFinish]) })
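
As a hedged follow-on sketch, the resulting list column ("Custom") can then be expanded to rows and the serial numbers converted back to dates (the previous step name here is an assumption):

let
    // Expand the day list so each period becomes one row per day
    ExpandedDays = Table.ExpandListColumn(#"Added Custom", "Custom"),
    RenamedDay = Table.RenameColumns(ExpandedDays, {{"Custom", "Day"}}),
    // Convert the serial numbers back to dates for joining to DIM_Date
    DayAsDate = Table.TransformColumns(RenamedDay, {{"Day", each Date.From(_), type date}})
in
    DayAsDate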

resulting resource allocation table with profiles applied

In our resulting table (see above), we can clearly see that the hours per day adjusts for each period based on the profile.

Part 4: Putting It All Together

We have our activity data and we have our time-phased resource data; the last step is to generate a DIM_Date table that can be used to bin the days into weeks or months (or quarters, years, or any custom grouping defined inside the DIM_Date table).

I don't want to get too deep into the DIM_Date table; effectively, all we need is each day assigned to a Month-Yr for the purpose of generating the nice little graph below, which is ultimately what we are after.
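
For completeness, here is a minimal DIM_Date sketch in Power Query (the date range is an assumption); it produces one row per day with a Month-Yr label, plus a sort key so the months plot in calendar order:

let
    StartDate = #date(2023, 1, 1),
    EndDate = #date(2025, 12, 31),
    Days = List.Dates(StartDate, Duration.Days(EndDate - StartDate) + 1, #duration(1, 0, 0, 0)),
    AsTable = Table.FromList(Days, Splitter.SplitByNothing(), {"Date"}, null, ExtraValues.Error),
    Typed = Table.TransformColumnTypes(AsTable, {{"Date", type date}}),
    AddMonthYr = Table.AddColumn(Typed, "MonthYr", each Date.ToText([Date], "MMM-yyyy"), type text),
    // Sort key so Month-Yr displays in calendar order rather than alphabetically
    AddMonthYrSort = Table.AddColumn(AddMonthYr, "MonthYrSort", each
        Date.Year([Date]) * 100 + Date.Month([Date]), Int64.Type)
in
    AddMonthYrSort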

Conclusion

Again, I do not recommend using this approach; instead, I am more interested in how others have decided to approach it. Personally, as I noted at the beginning, my recommendation is to copy-paste from the P6 Resource Assignment tab, although this time-phasing approach can be used for other (non-P6) applications. But alas, I believe there are much smarter ways to achieve the spread using the DIM_Date table and perhaps custom functions. In researching my method, I ideally wanted a "working day" spread as opposed to a calendar-day spread. There are some awesome custom functions that can provide an integer for the number of working days between two dates. However, even when taking that approach, I ran into additional complications in getting everything to work.

So, really keen for feedback!

thanks
