Sharing Public Data using Onelake, Delta and Cloudflare R2 

The use case is simple: sharing public data continuously and as cheaply as possible, especially when the consumers are in a different geographic region. 

Note: This is not an officially supported solution, and the data can be inconsistent while it is being copied to R2, but it is good enough for public data.

How to 

1- The data is prepared and cleaned using Fabric and saved in OneLake. 

2- Copy the data to Cloudflare R2 using code. As of today, shortcuts to S3 do not support write operations. Although I did not test it, Dataflow Gen2 (data pipeline) supports S3 as a destination; I used code because I already had it from a previous project. You pay egress fees for this operation, plus R2 storage and transaction costs.

3- Provide an access token to users or make the data public. You don't pay egress fees from Cloudflare to end users, but the throughput is not guaranteed. 
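Step 2 can be sketched in Python with boto3, since R2 exposes an S3-compatible API. This is a minimal sketch, not my exact script: the bucket name, object prefix, and local OneLake folder path below are placeholders, and the endpoint and keys are read from environment variables.

```python
import os


def object_key(local_path: str, root_dir: str, prefix: str = "") -> str:
    """Map a local file path under root_dir to an R2 object key."""
    rel = os.path.relpath(local_path, root_dir).replace(os.sep, "/")
    return f"{prefix}/{rel}" if prefix else rel


def copy_tree(root_dir: str, bucket: str, prefix: str = "") -> None:
    """Upload every file under root_dir (e.g. a Delta table folder) to R2."""
    import boto3  # R2 speaks the S3 API, so the standard S3 client works

    s3 = boto3.client(
        "s3",
        endpoint_url=os.environ["R2_ENDPOINT"],  # https://<account-id>.r2.cloudflarestorage.com
        aws_access_key_id=os.environ["R2_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["R2_SECRET_ACCESS_KEY"],
    )
    for dirpath, _, filenames in os.walk(root_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            s3.upload_file(path, bucket, object_key(path, root_dir, prefix))


if __name__ == "__main__":
    # Placeholder paths: a locally mounted OneLake table folder and a hypothetical bucket.
    copy_tree("/lakehouse/default/Tables/mytable", bucket="public-data", prefix="mytable")
```

Note that copying file by file like this is not transactional, which is exactly why the data can look inconsistent to readers mid-copy.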

Today, Fabric shortcuts require the ListBuckets permission; please vote for this idea to remove the requirement. 

For example, I am writing public data in Fabric US-West and consuming it in Fabric Melbourne. 

Make sure you turn on caching for OneLake; it helps performance a lot.

You can try it  

You can try it yourself using these credentials; they are temporary, and I may delete them at any time. 

Access Key ID 

3a3d5b5ce8c296e41a6de910d30e7fb6 

Secret 

9a080220941f3ff0f22ac93c7d2f5ec1d73a77cd3a141416b30c1239efc50777 

Endpoint 

https://c261c23c6a526f1de4652183768d7019.r2.cloudflarestorage.com
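With the endpoint and keys above, the shared tables can be read directly over R2's S3 API, for example with the `deltalake` Python package. A minimal sketch; the bucket and table names (`public-data/mytable`) are placeholders I made up, since the post does not list them.

```python
def r2_storage_options(access_key: str, secret: str, endpoint: str) -> dict:
    """Build the storage_options dict the deltalake package expects for R2."""
    return {
        "AWS_ACCESS_KEY_ID": access_key,
        "AWS_SECRET_ACCESS_KEY": secret,
        "AWS_ENDPOINT_URL": endpoint,
        "AWS_REGION": "auto",  # R2 uses "auto" as its region
    }


if __name__ == "__main__":
    from deltalake import DeltaTable  # pip install deltalake

    opts = r2_storage_options(
        "3a3d5b5ce8c296e41a6de910d30e7fb6",
        "9a080220941f3ff0f22ac93c7d2f5ec1d73a77cd3a141416b30c1239efc50777",
        "https://c261c23c6a526f1de4652183768d7019.r2.cloudflarestorage.com",
    )
    # Hypothetical bucket/table path, not from the post.
    dt = DeltaTable("s3://public-data/mytable", storage_options=opts)
    print(dt.to_pandas().head())
```

The same endpoint and keys also work for a Fabric S3 shortcut, subject to the ListBuckets requirement mentioned above.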

One thought on “Sharing Public Data using Onelake, Delta and Cloudflare R2 ”

  1. Thanks for this article, “it helps a lot of performance” – would you be able to share any side by side numbers with and without cache?

