Agile in Construction

Will Agile ideas and approaches fit the construction world? Definitely.

This does not mean I think that a lot of the tried-and-true practices we have developed need to change. Quite the contrary: the construction world has for decades been a pioneer of quality project management.

What has changed is the way in which people communicate and the culture we live in. Agile has a lot of good properties around personal communication. So I believe the construction world needs to embrace some aspects of Agile, and the software tools that have been developed to support Agile management.

 

Introduction to Agile in Construction

In this presentation, we can see that indeed many of the terms used in Agile have existed for a long time in the construction PM space. So, in many ways, we already are Agile. Thus, if you go down a path to embrace new management approaches, you first need to understand the current overlaps and also where opportunities exist.

 

Using JIRA to improve communication on construction projects

There are some great possibilities in using JIRA in the construction world. I do not believe a typical Agile approach fits into the project space; however, many of the ideas are sound and specifically the tools that have been built to facilitate Agile can be molded to fit the construction project space.

Specifically, this relates to the way we communicate and track the work we are performing or the work we are overseeing. Applications such as JIRA and DevOps have capabilities that I feel are a perfect match for the construction world, and we need to be looking at how we can customize these new tools to really revolutionize the way construction projects are managed.
 

How to reduce data volume in PowerBI Maps by using WKT

In a previous blog, I showed how to load raster tiles into the PowerBI data model; in theory that should have solved all my issues with doing detailed maps in PowerBI.

Unfortunately, no. Even though the R and Python visuals support up to 150K points, the reality is that the implementation of R in the PowerBI service has a massive overhead and you can’t do anything about it, as it is literally a black box; all you can do is try to reduce the data passed to the R visual and hope it works.

Actually, in my case, the visual did not even show up, and I got an error message that resources were exceeded.

I am in a situation where I can’t filter the data, because the whole point of the visual is to show all the data; at the same time, if the visual does not work in the service then there is no point in the whole exercise.

The trick is using WKT: I simplify the geometry without losing any visual detail. For example:

Instead of showing all the points, I just group consecutive points with the same order and colour into a line. As you can see, 14 rows of data are reduced to 5 rows, and the visual representation is the same; it is like sampling, except we keep the exact shape of the data.

Now in PowerBI, all we need to do is automatically group those points together. It turns out the solution was very easy using RANKX. Keep in mind the WKT is dynamic: for every update, I get a new geometry.

After that I just added some calculated columns to create the WKT format:

For a point, POINT (X Y)

For a line, LINESTRING (start_X start_Y, finish_X finish_Y)

Keep in mind you can create polygons too, but the DAX becomes more complex (maybe for another blog).
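For illustration only, here is a minimal R/dplyr sketch of the same grouping logic (my implementation is RANKX plus calculated columns in DAX; the column names and sample coordinates below are made up): consecutive points that share a colour are collapsed into one LINESTRING, and isolated points stay as POINT.

# A sketch of the grouping idea outside PowerBI; not the DAX implementation.
library(dplyr)

points <- data.frame(
  order  = 1:7,
  x      = c(138.60, 138.61, 138.62, 138.63, 138.70, 138.71, 138.80),
  y      = c(-34.90, -34.91, -34.92, -34.93, -34.95, -34.96, -34.99),
  colour = c("red", "red", "red", "red", "blue", "blue", "red")
)

wkt <- points %>%
  arrange(order) %>%
  # a new run starts every time the colour changes
  mutate(run = cumsum(colour != lag(colour, default = first(colour)))) %>%
  group_by(run, colour) %>%
  summarise(
    wkt = if (n() == 1) sprintf("POINT (%s %s)", x, y)
          else sprintf("LINESTRING (%s)", paste(x, y, collapse = ", ")),
    .groups = "drop"
  )

wkt   # 7 input rows collapse to 3 WKT rows with the same visual shape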

You can create the WKT file in QGIS very easily, but as my data changes daily, that was not practical.

And here is the final result

The number of rows was reduced from 3,528 to 218.

That makes a massive difference in the PowerBI service. My real data is 58K rows, and I can’t tell you how happy I was when it finally worked in the service. Not only that, but the total number of rows using WKT keeps decreasing as I do more updates 🙂

There is a catch, though: unfortunately, as of Dec 2019, only R and Python scripts can render WKT geometry. There is a new custom visual by @james dales, but it is in a private beta and has some limitations on colors by category.

You can download the pbix file here

I hope that in 2020 Microsoft invests more in improving the Maps offering in PowerBI and optimizes R and Python scripts in the service. I am very optimistic.

Control Accounts for Cost – Houston, we have a (possible) problem

This article captures 2 issues that are related and perhaps not clearly discussed or understood (not even by me): level of detail, and transactional vs. account-based tracking.

Many cost tools, for right or wrong, are now almost entirely Control Account based. This leads to some conceptual issues for people who have been used to managing data in a more transactional way.

Additionally, when we begin to establish defined control accounts, picking the right level of detail, and deciding how changes are managed between control accounts, requires a lot of creative accounting (thoughtful process mapping through all your systems and digital strategy).

The below is just a primer for a potential discussion. This is a post on what it means to pick a correct level of detail and what that really means for how you manage your costs.

Transaction Management

This is my wheelhouse, the way my brain primarily works when dealing with projects. List Management. Everything is a new record.

We have contracts with detailed line items, we want to retain our budget line items, and each contract line item will have various columns for Committed, Forecast, Incurred and Paid values. We manage these detail items in this way. Below is an example of how we really manage costs (and progress and deliverables). Excel (SharePoint, or a simple flexible database) provides an ideal solution for users to manage this detail.

C01

However, the new age of cost tools wants us to view projects at a more “control account” level. In the above example, I have created a control account at a WBS and commodity code level of detail.

In the above example, we have 4 contractors working on this scope in various capacities. Although it’s more subtle than that: we actually only have 1 contract and 3 expected contracts. We have parsed our budget into what we expect to be 4 different contracts. Thus only 1 contract has a commitment, yet we have a forecast (and budget) for all 4. Each contract will have a full detailed list of detail items that we will manage. We will have items for specified growth, perhaps contingency, maybe a few site instructions. A transactional list!

Quantity Growth – new line or modify existing?

Here again is one of the conundrums of how we manage (specifically related to progress measurement). Consider a project with some concrete and steel.

If we have a change in quantity for a foundation, where do we capture the change and what does it look like in our database? Too often, we look at this and manage it using a simple Excel file – which can make the process easy. However, this is a very complex issue. If we add a line item and base our progress off committed quantities, we will have to update 2 line items with %’s. However, many other options exist to capture this.

And again, if instead this item is managed at the “control account” level, all we need is the total actual quantity, or simply the overall % for the control account. When you look at the above from a control account level, you capture all the detail, yet for % management you can disregard the %’s on the details and only insert a % at the control account.
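To make the two options concrete, here is a tiny hypothetical illustration (the quantities, line items and %’s are invented, and R is used purely to show the arithmetic, not as part of any cost tool):

library(dplyr)

lines <- data.frame(
  control_account = "CA-100 / Concrete",
  line_item       = c("Foundation F-01 (original)", "Foundation F-01 (qty growth)"),
  committed_qty   = c(100, 20),   # m3
  pct_complete    = c(80, 25)     # option 1: a % maintained on every line item
)

# Option 1 - transactional: the control account % is rolled up from the line %'s
lines %>%
  group_by(control_account) %>%
  summarise(earned_qty = sum(committed_qty * pct_complete / 100),   # 85 m3
            total_qty  = sum(committed_qty),                        # 120 m3
            pct        = 100 * earned_qty / total_qty)              # ~71 %

# Option 2 - control account: disregard the line %'s and record a single overall
# value, e.g. a total actual quantity of 85 m3, which gives the same ~71 %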

Which method is right, which method fits into your cost/progress system, which approach aligns with your specifications?

In the second method above, you lose the ability to calculate a specific % for variations. So again, we have taken a simple issue and created a complex nightmare. Obviously, we all solve these problems day in and day out. The point here is just to bring this topic to light, and to note that the new range of cost tools may not be flexible enough to really capture what we do – nor should they! The real answer is, as we all do now: some detail is managed inside Excel and some abstraction ends up in the system.

Control Account Management

In the above, we have seen the importance of picking the right control account level of detail, but we perhaps haven’t conceptualized what a control account is in the first place. For “control account” management, we want to manage “SCOPE”. For this package of scope, we sum up all the detail coded to the same codes.

C02

When we look at scope, it is much easier to compare against our estimate, which was built to this level before we had to deal with specified growth, claims, site instructions and even contractor commitments. This is why we want control account management and why so many cost tools are forcing us down this path.

What’s the Problem?

In the above example, the “control account” is meaningless. We cannot “manage” anything at the control account level. The control account is only a metric.

Instead we are going to manage our contracts in isolation. Each contract will have its own specifics and likely its own approvals when we modify a forecast or a commitment.

A solution to this conundrum is to split the above into 4 control accounts (or more). However, that creates a nightmare for everyone dealing with the new cost systems, where creating cost accounts and loading budgets and costs is not straightforward. Doubly so as we haven’t even begun to discuss at what level we manage our time-phased data.

All the new tools also allow us to manage “detail items”. But again, as soon as you start to push the level of management detail into the detail items, you may as well make the detail item its own control account.

The Problem is – What’s the right Level of Detail?

You can run this problem down rabbit holes with how we manage engineering deliverables, progress items, manhours, quantity management, etc. Here I have presented the problem strictly at the cost control level. Yet the dimensions for each of the items above are multiplied by each of the additional management datasets we also track.

Picking the right level of detail that goes into our cost tools is more of an art form than a science.

My view is that we need easy, flexible transactional capabilities from cost systems to ease the Excel-hell aspects. Yet at the same time, we need to understand how the transactional records join up to a perhaps more formal CTR or Control Account level of detail.

Load Raster tiles to PowerBI Data Model using R

In a previous blog, I showed how to use PowerBI to generate high-quality print maps, with the caveat that the R script does not work in the service unless all the packages are supported (the desktop uses your local R install, so there is no limitation there).

I am a huge fan of ceramic; for me it shows the best of the R philosophy: it does one thing and does it very well. You give it coordinates and it gives you back raster tiles.

I spent some time trying to figure out a workaround to make it work in the service, and I found this trick:

  • Generate the raster using ceramic outside PowerBI (using RStudio, for example).
  • Save the raster object using saveRDS with the option ascii = TRUE, so it is a text file; notice you need to write version = 2, otherwise it will not work in the service (see the sketch after this list).
  • Load the file into PowerBI using PowerQuery.
  • The maximum data you can pass to an R visual is 150K rows, which is not enough for my use case.
  • The trick is to group the data using concatenation; the limit is the number of rows, not the number of columns :). Please note the maximum length of a text value is 32,766 characters.
  • Merge the table for the raster with the table of data (coordinates, attributes, etc.). Unfortunately, you can pass only one dataframe to the R script, so I changed to an append so the visual can be sliced.
  • Now you have a dataframe with the coordinates and the raster data, which you pass to the R script visual.
  • Unnest the raster data into a dataframe; notice the dataframe holding the raster data is 1 million rows.
  • Save the dataframe to a “raster.rds”.
  • Load the “raster.rds” and, R magic, the raster is alive.
  • Plot the map (it is on the PowerBI service, not the Desktop).
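As a minimal sketch of the two halves of this trick (the extent, column names and chunk delimiter are my assumptions, and the PowerQuery grouping step in between is omitted):

# Step 1 - outside PowerBI (e.g. in RStudio): fetch the tiles and serialize them
# as plain text so PowerQuery can load the file.
library(ceramic)

Sys.setenv(MAPBOX_API_KEY = "your-mapbox-token")          # placeholder
tiles <- cc_location(cbind(138.6, -34.9), buffer = 5000)  # hypothetical extent

# ascii = TRUE and version = 2 as per the steps above; compress = FALSE is an
# extra assumption, to keep the file readable as text by PowerQuery.
saveRDS(tiles, "raster_ascii.rds", ascii = TRUE, version = 2, compress = FALSE)

# Step 2 - inside the R script visual, after PowerQuery has concatenated the
# text lines into chunks (assumed columns: chunk_id for order, ascii_chunk for
# the lines joined with "\n").
library(raster)

chunks <- dataset$ascii_chunk[order(dataset$chunk_id)]
writeLines(unlist(strsplit(chunks, "\n", fixed = TRUE)), "raster.rds")

tiles <- readRDS("raster.rds")   # the raster object is alive again
plotRGB(tiles)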

You can filter the status and hide the tiles if you want. As it is slow to render in the service, please use the query reduction option for the filters.

This workflow works with any R object; not only rasters, but any binary data can be passed to the PowerBI data model.

All the code is saved here.

I think the main takeaway is that you can circumvent the PowerBI limitation of 150,000 rows when you pass data to R or Python. But there is a catch: the resources available in the PowerBI Pro instances are limited and not documented, so your mileage may vary, but it is worth the try.

Now, you may ask, why bother with this tedious, slow visual? The answer is very easy: in some cases you want to control the exact look of a map, and R gives you just that. You can show multiple layers and text, and it supports more than 30K points in a map; it is worth the pain.

Edit: I just noticed that PowerBI caches the visual output; if you make the exact same selection again, the visual shows instantly!!!

Normalize multiple progress files using PowerQuery

A typical situation in the construction industry is that progress data is sourced from multiple systems with different formats. Generally we get two types of reports:

  • Time stamp items

My preferred one: the data is tracked at a very low level (cable, spool, pre-commissioning and commissioning tracking), and you get a date when the item is completed, something like this:

This format is very convenient as you need to maintain only 1 file; the history is recorded in the data itself. Unfortunately, this kind of report is not always available, for multiple reasons. The main one is that in some kinds of work, finishing one item takes a longer period of time; for example, completing 1 drawing will take 3 weeks, and no manager will wait 3 weeks to claim progress.

  • Cumulative Progress

This format reports the cumulative progress per time period (daily, weekly, or whenever there is progress), something like this:

This format is very common; it is very easy for the supervisor to update, and it works with any level of detail.

The challenges of this format are:

  1. To get the historical data you need to keep all the previous files.
  2. As it is cumulative data, calculating the progress per time period is a bit harder, and getting something like year to date is very awkward.

We need to normalize those files to be in the same format. One approach I use with PowerQuery is:

  • Load the cumulative files.
  • Reverse the cumulative totals into per-period values using a self-join (see the sketch after this list).
  • Filter only the values where there is progress.
  • Append to the time stamp file.
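The self-join itself lives in PowerQuery; just to show the shape of the result, here is the same reverse-cumulative logic as a small R/dplyr sketch (the item names, dates and quantities are made up):

library(dplyr)

cumulative <- data.frame(
  item    = c("Spool-01", "Spool-01", "Spool-01", "Spool-02", "Spool-02"),
  date    = as.Date(c("2019-11-01", "2019-11-08", "2019-11-15",
                      "2019-11-08", "2019-11-15")),
  cum_qty = c(10, 10, 25, 5, 12)
)

per_period <- cumulative %>%
  arrange(item, date) %>%
  group_by(item) %>%
  mutate(qty = cum_qty - lag(cum_qty, default = 0)) %>%  # reverse the cumulative
  ungroup() %>%
  filter(qty != 0)               # keep only the periods where there is progress

per_period   # ready to be appended to the time-stamped table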

Now we have a normalized Actual table, where quantity per period, year to date and all other date calculations are very easy to compute.

The pbix files and the source data are saved here.

I know it is tempting to just load the data, start making visuals and write some complex DAX calculations, but it is not sustainable and it will make your life miserable; a simple data model will make further development much easier.

PowerBI, generate multiple Print quality Maps using R

In this blog, I will present a workflow I have been using for the last 2 years, with rather good feedback.

Obviously, I like interactive dashboards; I want everyone to log in to the PowerBI service and start doing their own analysis. But to my dismay, not everyone is interested in doing that; a lot of users want only a report that they can print. It took me a while to understand that there is nothing wrong with that, and in a lot of use cases a printed report is the best medium to convey information.

In my case, we do a lot of maps, and users want print-quality maps, and because the data changes daily, you need automation.

In a previous blog, I wrote about how to integrate PowerQuery with R; in the current blog, I will show how to generate multiple PDFs with customized maps, using the R custom visual.

The PowerBI team has done a fantastic job: all you have to do is add the R script visual and add the fields you need, which automatically creates a dataframe, then write your code, and with one click you can edit your code in RStudio!!!

RStudio integration works by creating a temporary csv file that holds the dataframe data.

I personally prefer RStudio, but you can use any IDE

There are two caveats, though:

  1. The dataframe has a maximum of 150K rows.
  2. When you work in the Desktop, it will use your R installation, so all packages are supported; but when you publish to the service, only the packages in this list are supported (ceramic is not supported; I think packages that download external data are not supported). I found a workaround.

Let’s generate some maps.

I am using the excellent package tmap for the mapping; you can customize any aspect of the map: layout, text size, legend, titles. It is really an amazing product and shows the power of R. For tiles I am using ceramic.

You need a Mapbox token (their free tier is very generous). I will use South Australia car crash data as an example.

  • Fatalities >1

No code, just add a filter in the visual.

The code

library(sf)
library(raster)
library(dplyr)
library(tmap)
library(tmaptools)
library(ceramic)

# Rename the PowerBI fields to friendlier names
dataset = rename(dataset, y = lat, x = lng, status = "Crash Type", labels = "Total Fats")
dataset$color  <- as.character(dataset$color)
dataset$labels <- as.character(dataset$labels)

# All the points as an sf layer
map <- st_as_sf(dataset, coords = c("x", "y"), crs = 4326)

# Fetch the background tiles from Mapbox
Sys.setenv(MAPBOX_API_KEY = "get your own key")
background <- cc_location(map)

# Only the points that have a label
dataset[dataset == ""] <- NA
new_DF <- filter(dataset, !is.na(labels))
map1 <- st_as_sf(new_DF, coords = c("x", "y"), crs = 4326)

# One legend entry per status/colour
chartlegend <- unique(dataset[c("status", "color")])

m2 <- tm_shape(background) +
  tm_rgb() +
  tm_shape(map) +
  tm_symbols(col = "color", size = 0.04, shape = 19) +
  tm_shape(map1) +
  tm_text(text = "labels", col = "white") +
  tm_add_legend(type = 'fill', labels = chartlegend$status, col = chartlegend$color)

tmap_save(m2, "C:/Users/mimoune.djouallah/pdf/happyValey.pdf", width = 3508, height = 4961)

m2

  • Copy the same custom visuals and just change the filters

Here we go

The best part: the PDF files.

Now you can share those files by email or save them in a shared folder. The map shows only dots, but you can load polygons if you need to; see this blog for further details.

You can download the pbix here; you need R to be installed, and your own Mapbox token.

PowerBI Incremental refresh using Python or R

In this blog, I will show how to leverage Python (or R) to implement an incremental refresh in PowerBI using PowerQuery and Python. Nothing is really new here (I am sure Imke and Maxim have blogged about it before).

In a previous blog, I showed how to use the R & Python integration to load data to a database.

This approach makes sense only when you do a lot of heavy transformations and your data source changes based on time.

As an example, in my previous job we received a new Excel file every Monday (300K rows); this file got approved and corrected every Thursday.

The workflow was:

Save the files in a folder, then do the transformation. That was fine, but after the first year it was around 52 files, and although technically you only need to transform the last file, since PowerBI does not support incremental refresh we redid everything twice a week. After two years, the refresh took nearly 30 minutes and sometimes we got out-of-memory errors.

In the big picture, half an hour was not that bad (we had a desktop just for the refresh); the worst part was that you refresh the model and, once you finish, you get a new revision and you must refresh again.

Now, using a Python/R script, the idea is that every file gets transformed only once, regardless of how many times you refresh, just by exporting the result of the transformation of every file as a csv in a staging folder.

  • The first run is slow, as it will process all the existing files in Source Data, but subsequent runs will transform only new files.
  • Let’s say File 2 was revised: all you need to do is delete File2.csv and it will be transformed again, but only that file.
  • OK, if you look at step 4, the files are reloaded each time. I am not too worried about that, as batch loading of csv files from a folder using PowerQuery is relatively fast (yes, a bit slow compared to R); the bottleneck is rather the transformation.

The code for the Python script is here; as you can see, the PowerQuery integration is amazing: just add a new step and you get a dataframe, that’s all.

# 'dataset' holds the input data for this script
df_by_filename = dataset.groupby("filename")

for (filename, filename_df) in df_by_filename:
    # rename the output file: zip -> csv, and prefix the daily files
    filename = filename.replace("zip", "csv")
    filename = filename.replace("PUBLIC_DAILY", "UNIT_PUBLIC_DAILY")
    # export each file's transformed rows to the staging folder
    filename_df.to_csv("C:/results/" + filename, index=False)

The script splits the dataframe by the column filename and then exports each file separately. Currently it is saving into a local folder, but you can easily save those files into cloud storage.
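For the “or R” part of the title, a rough R equivalent of the same split-and-export step might look like this (same assumptions and output folder as the Python version above):

# 'dataset' holds the input data for this script
for (f in unique(dataset$filename)) {
  out_name <- gsub("zip", "csv", f, fixed = TRUE)
  out_name <- gsub("PUBLIC_DAILY", "UNIT_PUBLIC_DAILY", out_name, fixed = TRUE)
  # export each file's transformed rows to the staging folder
  write.csv(dataset[dataset$filename == f, ],
            file.path("C:/results", out_name),
            row.names = FALSE)
}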

To test it, I built a quick workflow using public data (PBIX here). The source data is zip files on a public website, with a new zip file daily. It is a relatively complex transformation, as you need to unzip the file, split it, delete some columns, etc. The first run is slow, as it processes all the files (62 files), but the next run will just process 1 file. You can simulate that by deleting some csv files in the staging folder; when you refresh again, only the deleted files will be processed again.

I think the main takeaway is that the Python and R integrations are amazing tools to implement new possibilities that will not necessarily be available in PowerBI, and you don’t need to be a programmer to use these integrations; a serious search on Stack Overflow will get you started quickly.