Integrate an Existing Data Lake to Microsoft Fabric

Valentin Loghin
3 min read · Feb 12, 2024

Today's scenario: we already have a Data Lake that lives outside a Microsoft Fabric lakehouse, and we want to integrate it into the Microsoft Fabric environment so that all the data that existed before Fabric came along remains accessible. To do that, we will work in both the Azure portal and the Microsoft Power BI service.

The Sales Data Lake (left side) is hosted on Azure Data Lake Storage Gen2 and follows the Medallion architecture, which consists of three main layers: Bronze, Silver, and Gold. Bronze data is stored in CSV format; Silver and Gold data are stored in Delta format.

  • Bronze (Raw): stores the source systems' data unchanged, so it may contain duplicates and uncleaned values.
  • Silver (Cleansed): stores deduplicated, cleansed data with missing values replaced, at the same grain as the Bronze (raw) source data.
  • Gold (Curated): stores data modelled for analysis, e.g. star schemas and aggregate tables. This is the highest-quality data, ready for reporting.
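To make the three layers concrete, here is a minimal, plain-Python sketch of the same idea on a handful of rows. The column names (order_id, product, quantity), the sample values, and the choice of 0 as the replacement for a missing quantity are all hypothetical and only serve the illustration; a real pipeline would typically do this with Spark over CSV and Delta files.

```python
from collections import defaultdict

# Hypothetical Bronze rows: raw and unchanged, including one exact
# duplicate and one missing quantity.
bronze = [
    {"order_id": 1, "product": "bike", "quantity": 2},
    {"order_id": 1, "product": "bike", "quantity": 2},       # duplicate
    {"order_id": 2, "product": "helmet", "quantity": None},  # missing value
    {"order_id": 3, "product": "bike", "quantity": 1},
]


def to_silver(rows, default_quantity=0):
    """Replace missing values and drop duplicates, keeping the source grain."""
    seen = set()
    silver_rows = []
    for row in rows:
        cleaned = dict(row)
        if cleaned["quantity"] is None:
            cleaned["quantity"] = default_quantity  # assumed replacement rule
        key = tuple(sorted(cleaned.items()))
        if key not in seen:
            seen.add(key)
            silver_rows.append(cleaned)
    return silver_rows


def to_gold(rows):
    """Aggregate to one total per product, ready for reporting."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["product"]] += row["quantity"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(len(bronze), len(silver), gold)
```

Silver keeps one row per order (the same grain as Bronze, minus the duplicate), while Gold changes the grain to one row per product.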

In this exercise, we will:

  • Create a Microsoft Fabric lakehouse named Demo_DeltaLake.
  • In Demo_DeltaLake, add four shortcuts, one for each table in the Sales Gold layer of the Data Lake.

Prerequisites:

  • An activated full or trial Microsoft Fabric license.
  • Access to an Azure account with the ability to work with an Azure Storage container.
  • Basic knowledge of SQL, database concepts, and objects.

Gathering Data Lake connection information from Azure Portal

1. Log in to the Azure portal.

2. Note the container name.

3. Get the shared access signature (SAS) for the storage account.
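Before pasting these values into Fabric, it can help to sanity-check them. The sketch below is only an illustration: the account name and the token are made-up placeholders, the helper names are hypothetical, and it assumes the token's expiry field (se) uses the full UTC timestamp form. It builds the Data Lake Storage Gen2 endpoint URL for an account and reads the permissions and expiry out of a SAS token's query string.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs


def dfs_endpoint(account_name):
    """Return the Data Lake Storage Gen2 (DFS) endpoint for a storage account."""
    return f"https://{account_name}.dfs.core.windows.net"


def describe_sas(sas_token, now=None):
    """Extract permissions and expiry from a SAS token query string.

    Assumes 'se' looks like 2099-01-01T00:00:00Z; other accepted date
    formats are not handled in this sketch.
    """
    now = now or datetime.now(timezone.utc)
    fields = parse_qs(sas_token.lstrip("?"))
    expiry = datetime.strptime(fields["se"][0], "%Y-%m-%dT%H:%M:%SZ")
    expiry = expiry.replace(tzinfo=timezone.utc)
    permissions = fields["sp"][0]
    return {
        "permissions": permissions,
        "expiry": expiry,
        "expired": expiry <= now,
        # Assumption: read (r) and list (l) are the least a shortcut needs;
        # check the Fabric documentation for the exact requirement.
        "has_read_and_list": "r" in permissions and "l" in permissions,
    }


# Placeholder values, not real credentials.
example_account = "contoso"
example_token = (
    "?sv=2022-11-02&ss=b&srt=sco&sp=rl"
    "&se=2099-01-01T00%3A00%3A00Z&sig=PLACEHOLDER"
)

print(dfs_endpoint(example_account))
print(describe_sas(example_token))
```

An expired token or one without read and list permissions is a common reason for a shortcut to fail at creation time, so checking both up front saves a round trip.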

Integrate Data Lake to Microsoft Fabric

1. Log in to Microsoft Fabric using the URL: https://fabric.microsoft.com/

2. Create a workspace and a lakehouse

  • From the Microsoft Fabric home page, select Synapse Data Engineering.
  • From the left bar menu, select Workspaces, then New workspace, enter Sales_Fabric as the name, and click Apply.
  • Inside the newly created Sales_Fabric workspace, create the lakehouse Demo_DeltaLake.
  • Select the Demo_DeltaLake lakehouse from the Sales_Fabric workspace.

Create shortcuts

1. Locate the Tables section, click the three dots (...) to its right, and then select New shortcut (follow the 7 steps).

Fact_Sales has been added to the Tables section, and its data is visible too.

Dim_Customer, Dim_Date, and Dim_Product are added in the same way. The only difference is the Sub Path value in step 6; in order, the values are sales/gold/dim_customer, sales/gold/dim_date, and sales/gold/dim_product. At the end of the process, it should look like the picture below.
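If you would rather script the four shortcuts than click through the dialog four times, the sketch below shows how the request bodies could be assembled. It rests on several assumptions: the field names follow my understanding of the OneLake shortcuts REST API and should be checked against the current documentation; the container name, storage endpoint, and connection ID are placeholders; and the Fact_Sales sub path is assumed by analogy with the three dimension paths, since it is not spelled out in the text. The code only builds and prints the payloads and sends nothing.

```python
import json

# Sub paths per table. The three Dim_* values come from the article;
# the Fact_Sales value is an assumption made by analogy.
GOLD_SUB_PATHS = {
    "Fact_Sales": "sales/gold/fact_sales",  # assumed, not stated in the text
    "Dim_Customer": "sales/gold/dim_customer",
    "Dim_Date": "sales/gold/dim_date",
    "Dim_Product": "sales/gold/dim_product",
}


def shortcut_payload(name, sub_path, container, location, connection_id):
    """Build one shortcut request body (field names are an assumption)."""
    return {
        "path": "Tables",  # create the shortcut under the Tables section
        "name": name,
        "target": {
            "adlsGen2": {
                "location": location,
                "subpath": f"/{container}/{sub_path}",
                "connectionId": connection_id,
            }
        },
    }


# Placeholder values, to be replaced with your own.
payloads = [
    shortcut_payload(
        name=table,
        sub_path=sub_path,
        container="my-container",
        location="https://contoso.dfs.core.windows.net",
        connection_id="00000000-0000-0000-0000-000000000000",
    )
    for table, sub_path in GOLD_SUB_PATHS.items()
]

print(json.dumps(payloads, indent=2))
```

Keeping the table-to-path mapping in one dictionary makes it easy to see that only the name and the sub path change between the four shortcuts, which mirrors the manual process.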

We have just finished the integration between the Data Lake and Microsoft Fabric, and the data is ready to be analyzed.

See you again :) !!
