Save Fabric Delta Tables as a DuckDB file

The Files section of a Fabric Lakehouse is very interesting: although it is blob storage, it somehow behaves more or less like a real filesystem. We can leverage that to save all the Delta tables in one DuckDB file.

Limitations

  • The DuckDB storage format is experimental at this stage and doesn’t offer backward compatibility yet; every time a major version is released, you have to export the file to Parquet and import it back.
  • I am using the Python deltalake package to read the Delta tables. It currently supports only Delta reader version 1, which is what Microsoft uses, but this may change in a future update of Fabric.

How it works

Install DuckDB and Delta Lake (the duckdb and deltalake Python packages), then copy this function.
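A minimal sketch of such a function, assuming each Delta table is a sub-folder of the Tables directory containing a _delta_log folder. The function names list_delta_tables and delta_to_duckdb are hypothetical, chosen for illustration; the calls used (DeltaTable.to_pyarrow_table, duckdb.connect, register) come from the deltalake and duckdb packages. Note that this version loads each table fully into memory as an Arrow table before writing it, so it may not suit very large tables.

```python
from pathlib import Path


def list_delta_tables(tables_root):
    """Return the names of sub-folders that look like Delta tables.

    Assumption: a folder is a Delta table if it contains a _delta_log directory.
    """
    root = Path(tables_root)
    return sorted(p.name for p in root.iterdir() if (p / "_delta_log").is_dir())


def delta_to_duckdb(tables_root, duckdb_path):
    """Copy the current version of every Delta table under tables_root
    into a single DuckDB file at duckdb_path (hypothetical helper)."""
    # Third-party packages are imported here so that list_delta_tables
    # can be used without them installed.
    import duckdb
    from deltalake import DeltaTable

    con = duckdb.connect(duckdb_path)
    try:
        for name in list_delta_tables(tables_root):
            # Reads only the files referenced by the latest table version.
            arrow_table = DeltaTable(str(Path(tables_root) / name)).to_pyarrow_table()
            # Expose the Arrow table to DuckDB under a temporary name.
            con.register("delta_src", arrow_table)
            con.execute(f'CREATE OR REPLACE TABLE "{name}" AS SELECT * FROM delta_src')
            con.unregister("delta_src")
    finally:
        con.close()
```

In a Fabric notebook with a default lakehouse attached, a call might look like delta_to_duckdb("/lakehouse/default/Tables", "/lakehouse/default/Files/lakehouse.duckdb"). Both paths are assumptions about how the lakehouse is mounted, and the file name is made up; adjust them to your environment.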

Only the current version of each table is saved; old data files that have not yet been removed by vacuum are ignored.

A note about concurrency

You can have multiple readers using the file at the same time (using read_only=True), but multiple writers are not a supported scenario, and neither is one writer with multiple readers; use at your own risk 🙂 Having said that, in the case of one writer and multiple readers, the worst-case scenario is reading inconsistent data 🙂
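As a rough illustration of the reader side, here is a small sketch that opens the file with read_only=True and counts the rows of one table. The function name read_table_count and its parameters are hypothetical; duckdb.connect with read_only=True is the only DuckDB feature it relies on.

```python
def read_table_count(db_path, table_name):
    """Open the DuckDB file read-only and return the row count of one table."""
    # Third-party package, imported inside the function for this sketch.
    import duckdb

    # read_only=True allows several processes to open the same file at once.
    con = duckdb.connect(db_path, read_only=True)
    try:
        return con.execute(f'SELECT count(*) FROM "{table_name}"').fetchone()[0]
    finally:
        con.close()
```

Each reader process would call this with the same file path; closing the connection promptly keeps the window for seeing a half-written file as small as possible.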
