Tracking AEMO data using PowerBI

Edit: 7 -Sept-2019 : I was a bit tired of workaround, now using Google Cl0ud functions and BigQuery , see blog here

I was looking for the Power Production  of a particular solar farm, and I couldn’t find any public dashboard that show this level of details, all I could find was high level aggregated data (Later after I built the dashboard I found this excellent resources Nemlog)

The dashboard is published here  https://djouallah.github.io/AEMO-POWERBI/  , it is refreshed every day at 5 AM

Capture

How it works

Australian Energy Market Operator (AEMO) publish all kind of datasets,  one I believe is real time (require a  subscription ) but for my particular use case, I m interested in this dataset

http://www.nemweb.com.au/#daily-reports

there are two folders :

  1. Current, last 60 days of data ( current day not included, Updated at 4 AM)
  2. Archive : the last 13 Months of data ( current month not included, Updated Monthly)

Pulling data from a website and building a dashboard in PowerBI is straightforward,  it took me a couple of hours on a weekend to do it, the problem is how to maintain it.

Ideally, you build a dashboard and all the refresh is done by the service, which was not the case here

  • Pulling the data directly from the archive is very slow, it takes nearly 3 hours ( unzip, filters only the data we are interested in), and is not sustainable as the earliest month will be removed from the website, I like to keep the history, and it is really bad practise to download the same data every day
  • To keep the history, we need to save the archive somewhere else, too easy , just save it on a local laptop
  • History issues solved, now we created a new problem, on-premise data require a gateway, basically you need to install a software on your laptop, and obviously the laptop must be on when you do the refresh

After playing around of some options, I come up with this workflow

  • Create a local folder that contains all the archive files.
  • Create a PowerBI data model on the desktop just to process the archive data
  • Export to clean tables ( price and Production ) to CSV using DAX studio !!!!!!
  • Load the CSV to azure blob storage ( to get rid of the gateway)
  • Load the current zip files from the web site , it does not require the gateway, but you need the following consideration                                                                                 – Use relative path in web.content functions ( see Chris Blog) and @TheBIccountant 

Web.Contents("http://www.nemweb.com.au/REPORTS/CURRENT/",[RelativePath = "Daily_Reports/" ])

Don’t use Web.Page function but parse it using XML or csv , ( Thanks Reda Rad for the advise)

so you can use something like this

Table.FromColumns({Lines.FromBinary(Web.Contents("http://www.nemweb.com.au/REPORTS/CURRENT/",[RelativePath = "Daily_Reports/" ]), null, null, 65001)})

  • Append the data from azure blob storage and the current folder from the web site, the refresh is now very fast, as PowerBI just read the csv without any transformation
  • Publish to web

Good so far, I manage to get rid of the gateway, the dashboard is refreshed automatically in the service, no maintenance for 60 days.

as the current folders contains data for the current 60 days only, you need to update the initial CSV files.

  • Download the pbix from the service, export the csv , and upload to blob storage, you need to do that only once every 60 days.

PRO

  • PowerBI Publish to web is an amazing service and it is totally free
  • Powerful solution without writing any codes
  • PowerBI free license is free 🙂

Cons

  • Publish to web is not suitable for real time, as it takes nearly 1 hours to propagate the update to the web site, that’s why I can’t publish the current day data, which is updated every 5 minutes.
  • Publish to web does not include export data from the visual
  • pricing for azure blob storage can be tricky : storage itself is very cheap, data upload is free, download in the same region is free ( for example blob to PowerBI service), but when you read data from the blob to PowerBI desktop you incurs charges, so just be careful, it is not your Onedrive model, where download is free.

we showed here a simple workflow using PowerBI free license and azure blob storage (Dropbox), it is very easy but with one inconvenient you need manual operation once every two months, that’s a bit annoying.

edit 23-June-2019 :after I published this blog, I got an excellent feedback from Maxim Zelensky, actually using PowerBI dataflows ( require a PRO license), we can fully automated the whole process, as with dataflows we can have a self reference query, I am not going to repeated here, go and read it

edit 24-June-2019: as it is a personal project, and the data is public, I am not really excited about using a paid service to host the CSV files, I moved the two csv files from blob storage to dropbox, it is totally free, so the whole dashboard infrastructure is free, Good work Microsoft

edit 26-June-2019 : a proper solution will be to save the raw data in a data lake, see here

Advertisement

One thought on “Tracking AEMO data using PowerBI”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: