First Look at Fabric F2 

Edit, 26 December 2023: the bug has been fixed; changing capacity now works as expected.

TL;DR: a simple change of capacity generated a cascade of failures in my Fabric workspace.

Use case

For the last month, I have been testing Fabric in my personal tenant using the free F64 trial. After analyzing three different possible workflows for the same workload, I concluded the following:

  • OneLake shortcuts from an existing Azure storage account: I did not like the storage transaction cost, and it defeats the purpose anyway if I have to rely on an external service.
  • Dataflow Gen2: the auto-generated dataset and lakehouse consume a lot of compute, and there is no way yet to turn them off.
  • Spark notebook: the cheapest option in terms of compute usage, around 45K CU(s) per day, and it is pretty stable. Direct Lake for the custom dataset works, and everything seems OK.

The workload is straightforward: download a file every 5 minutes and append it to an existing Delta Lake table, run a compaction every 4 hours, and update a dimension table every day. Here is the overall diagram

And here is the final result; the key is to keep data freshness within 5 minutes at all times.
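To make the ingestion step concrete, here is a minimal sketch of what the 5-minute notebook can look like. The URL, paths, and table name are placeholders I made up for illustration, not my actual code.

```python
# Minimal sketch of the 5-minute ingestion step.
# SOURCE_URL and TABLE_NAME are made-up placeholders, not the real ones.
import io
import requests
import pandas as pd
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

SOURCE_URL = "https://example.com/feed/latest.csv"  # hypothetical source
TABLE_NAME = "fact_events"                          # hypothetical Lakehouse table

# Download the latest file into memory.
raw = requests.get(SOURCE_URL, timeout=60).content

# Parse it and append to the existing Delta table.
pdf = pd.read_csv(io.BytesIO(raw))
spark.createDataFrame(pdf).write.format("delta").mode("append").saveAsTable(TABLE_NAME)
```

The 4-hourly compaction can then be as simple as a second scheduled notebook running an OPTIMIZE statement on the same table.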

Day 1 : Changing capacity to Fabric F2

I know it may sound weird, but when I test something seriously, I usually pay for it. It is all fun to use something for free, but the real test is when you spend your own money: is the tool worth it?

I take Fabric seriously; I think it is an interesting take on what a data platform should look like.

First, I created a new Fabric F2 capacity in Azure.

306 USD, that's nearly 460 AUD. That's a lot of money for a service that does not auto-suspend, and in this case I can't manually pause and resume either, as the whole point is the 5-minute freshness. My thinking was that I could use it for 2 days and delete it later. It was a big mistake.

F2 must work, ignoring bursting and smoothing and all that stuff.

F2 total daily compute = 2 × 24 × 3600 = 172,800 CU(s); the workload consumes around 45,000 CU(s) per day, so it should be good to go.
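The back-of-the-envelope math, spelled out:

```python
# Daily CU-seconds budget for an F2 SKU vs the observed workload.
cu = 2                           # F2 = 2 capacity units
daily_budget = cu * 24 * 3600    # CU-seconds available per day
workload = 45_000                # CU(s) the pipeline consumes per day

print(daily_budget)                      # 172800
print(f"{workload / daily_budget:.0%}")  # 26% -> roughly a quarter of the budget
```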

The theory is that Fabric compute is stateless: you change the license and it seamlessly switches the compute.

Everything seems fine

Day 2 : 

I had a look at Fabric usage in F2. Spark usage for the main notebook was nil; it seems the notebook scheduler was stuck with the previous capacity (to be honest, I don't know for sure). Anyway, I stopped the scheduler, suspended F2, then restarted both the scheduler and the capacity, and the usage started to appear. Then I noticed something weird: the Direct Lake custom dataset kept failing. I sent the errors to the Microsoft devs to have a look. Long story short, apparently I hit a race condition and the database did not get evicted; it is a preview, and stuff happens.

Day 3 : 

Anyway, 13 Australian dollars for this experiment; that is OK. The good news is that even if F2 does not work, at least the cost per day is fixed.

As I was busy trying to understand why Direct Lake did not work, I did not notice the more critical problem: data was duplicated!

I don't know how it is possible; the notebook is very simple: download a file, append the result to a Delta table, and add that filename to a log (a simple Parquet file). I use the log to avoid reading the Delta table every time just to get the distinct filenames.
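The guard is roughly this pattern; a sketch reconstructed with made-up paths and names, not the actual code:

```python
# Sketch of the "have I already loaded this file?" guard.
# LOG_PATH is a made-up placeholder; the log is a single small Parquet file.
import os
import pandas as pd

LOG_PATH = "/lakehouse/default/Files/loaded_files.parquet"  # hypothetical

def already_loaded(filename: str) -> bool:
    """Check the log instead of scanning the Delta table for distinct filenames."""
    if not os.path.exists(LOG_PATH):
        return False
    return filename in set(pd.read_parquet(LOG_PATH)["filename"])

def record_loaded(filename: str) -> None:
    """Append the filename by rewriting the whole log file."""
    new = pd.DataFrame({"filename": [filename]})
    if os.path.exists(LOG_PATH):
        new = pd.concat([pd.read_parquet(LOG_PATH), new], ignore_index=True)
    new.to_parquet(LOG_PATH, index=False)
```

Note that, unlike the Delta append, this read-modify-write of a single file is not transactional.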

Maybe somehow there were concurrent writes from the two capacities (F2 and the trial both stuck on the same workspace somehow). Delta tables support that just fine, but maybe the log file was locked by one writer. Or maybe the issue is unrelated and it was just bad luck; I have never had any issue with that code, as I use it with a cloud function where there is a guarantee of only one writer at a time.

But that's not the real issue. After digging further into the recent runs, I learnt that the UI only tells you that the notebook ran, not that it ran successfully.

I started looking at runs with longer durations, and here is an example; it does not look like a successful run to me.

Errors happen, and in any production pipeline you would add another step to get rid of duplicate records, but I think it would be nice to show errors in the user interface.
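That extra step can be as simple as a periodic dedup pass. A sketch, assuming the duplicates are exact duplicate rows from re-appending the same file; the table name is a placeholder:

```python
# One possible cleanup step: rewrite the table keeping only distinct rows.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

deduped = spark.table("fact_events").dropDuplicates()
# Delta's snapshot isolation allows overwriting a table you are reading from.
deduped.write.format("delta").mode("overwrite").saveAsTable("fact_events")
```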

Parting Thoughts

I wrote a big rant, then I realized it doesn't matter; I am accepting the whole experience as it is. I deleted F2 though, and Synapse as retaliation. At the end of the day, it is my own unrealistic expectation: I was hoping for Fabric to be something that it is not. It is not a serverless product (which is fine for an enterprise setup with a constant workload), but for a scale-down scenario or personal usage, the tech is not there; even with reserved-instance pricing, the math doesn't add up.

My first impression did not change though: Fabric's sweet spot is F64 and above, and that's fine, but a man can dream.
