Microsoft Fabric OneLake: The Ultimate Data Platform for All Your Needs đź’Ż

Gulce Ekin
7 min read · Jun 12, 2023


Large-scale data analytics solutions typically rely on data warehouses, where data is stored in relational tables and processed using SQL queries. However, the increasing volume of data, cost-effective storage options, and the availability of cloud computing technologies have given rise to alternative approaches in analytical data storage, known as data lakes.

In recent times, data engineers and analysts have embraced a unified approach: storing data as files in a data lake and adding a metadata layer on top so the files can be treated as relational tables. OneLake follows this pattern, offering scalable file storage combined with metadata and the open Delta Lake table format, so data stored in the lake can be queried with SQL.

Microsoft recently introduced Microsoft Fabric, an end-to-end analytics platform whose unified storage layer is OneLake. To follow along with the example in this article, you’ll need a Microsoft Fabric license (or an active trial).

Ensure that the Microsoft Fabric trial is enabled in your company’s tenant settings. Once you’ve activated the Fabric trial, you’ll see the Power BI logo in the bottom-left corner when you log in to your portal.

This logo allows you to access services supported by Microsoft Fabric. If you don’t see the logo, you may need to request your organization’s administrator to enable the Fabric trial process.

Let’s begin by creating a workspace. As an example, let’s name it “OneLakeTest” and apply the changes to complete the process.

Once you have a workspace, you can switch to the Data Engineering experience of the portal and create a lakehouse for your files.

Click on the Power BI icon in the bottom left corner of the Power BI portal and navigate to the “Data Engineering” section.

On the Data Engineering page, you’ll find options to create commonly used data engineering assets.

On the main Data Engineering page, create a new lakehouse by entering a name such as “OneLakeLH” and clicking “Create.” This process may take a few minutes, and once it completes, a new, empty lakehouse will be ready.

The Lakehouse Explorer pane on the left lets you browse the tables and files stored in OneLake and view their contents.

The Tables folder contains tables that can be queried using SQL semantics. The tables in Microsoft Fabric OneLake use the open-source Delta Lake file format commonly used in Apache Spark. On the other hand, the Files folder contains the data files stored in the OneLake storage, and you can create shortcuts to reference externally stored data.
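Because lakehouse tables are just Delta files, you can also work with them from a Fabric notebook. Here is a minimal sketch (the table name demo_table is illustrative, and spark is the SparkSession a Fabric notebook provides):

```python
# Minimal sketch from a Fabric notebook attached to the lakehouse.
# `spark` is the SparkSession the notebook provides; names are illustrative.
from pyspark.sql import Row

# Save a tiny DataFrame as a managed table; in a notebook with a default
# lakehouse this lands in the Tables folder as Delta (Parquet + log) files.
demo = spark.createDataFrame([Row(id=1, item="book"), Row(id=2, item="pen")])
demo.write.format("delta").mode("overwrite").saveAsTable("demo_table")

# Because the table is Delta, it can be queried with SQL semantics right away.
spark.sql("SELECT item, COUNT(*) AS n FROM demo_table GROUP BY item").show()
```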

Microsoft Fabric offers several ways to load data from the Data Engineering experience. One of these is dataflows, which can be defined as pipelines that transfer data from external sources into OneLake; in this article we’ll simply upload a file directly.

To upload a file, navigate to the Files folder in the Lakehouse Explorer pane, create a subfolder called “Data,” and upload the desired file. In this case, we will upload the CSV file containing Amazon sales data.

Data: https://www.kaggle.com/datasets/karkavelrajaj/amazon-sales-dataset

Once the file is uploaded, you can preview its contents.
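If you prefer a notebook over the preview pane, the same file can be inspected with PySpark. A sketch, assuming the file was uploaded as Files/Data/amazon.csv (the file name is an assumption):

```python
# Sketch: preview the uploaded CSV from a notebook attached to this lakehouse.
# "Files/Data/amazon.csv" is an assumed path for the uploaded Kaggle file.
sales_raw = (
    spark.read
    .option("header", "true")       # first row holds the column names
    .option("inferSchema", "true")  # let Spark guess the column types
    .csv("Files/Data/amazon.csv")
)

sales_raw.printSchema()
sales_raw.show(5, truncate=False)
```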

To query the uploaded data in SQL, you need to load it into a table. Navigate to the Files folder and select the “Load to Tables” option for the desired file.

Specify the table name, confirm the upload process, and wait for the table to be created and loaded. You can then view the data in the created table.

This process may take a few seconds. If the sales table doesn’t appear automatically, don’t forget to select the “Refresh” option in the menu of the Tables folder.
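For reference, the “Load to Tables” step has a rough notebook equivalent: read the CSV and save it as a Delta table. A sketch, reusing the assumed path above and the table name amazonsales used in the rest of this article:

```python
# Sketch: notebook equivalent of "Load to Tables" (path and names are assumptions).
csv_path = "Files/Data/amazon.csv"

amazon_df = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv(csv_path)
)

# Saving as a managed table stores the data as Delta files under the Tables folder.
amazon_df.write.format("delta").mode("overwrite").saveAsTable("amazonsales")

# Quick sanity check that the table is queryable.
spark.sql("SELECT COUNT(*) AS row_count FROM amazonsales").show()
```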

Now, go to the Tables section in the Lakehouse Explorer pane and select the newly created “amazonsales” table to view the data. From the table’s menu, you can choose to view the underlying files of the table. The data files of a Delta table are stored in Parquet format, alongside a subfolder called _delta_log that records the operations applied to the table.
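The transaction log can also be inspected from code: Delta’s DESCRIBE HISTORY command lists the operations recorded in the _delta_log folder. A sketch against the amazonsales table:

```python
# Sketch: list the operations Delta has recorded for the table.
# DESCRIBE HISTORY reads the transaction log kept in the table's _delta_log folder.
spark.sql("DESCRIBE HISTORY amazonsales") \
    .select("version", "timestamp", "operation") \
    .show(truncate=False)
```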

When you create a lakehouse, a SQL endpoint is generated automatically. You can switch to the SQL endpoint from the lakehouse menu; after a moment, a visual interface opens where you can query the tables.

To open a new Query Editor, use the “New SQL query” button. In this section, you can write any SQL statements you desire.
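For example, a simple aggregation over the amazonsales table might look like the query below. The SQL string itself is what you would paste into the query editor; here it is wrapped in spark.sql so it also runs from a notebook. The category and rating columns are assumptions about the Kaggle file:

```python
# Sketch: a simple aggregation over the loaded table.
# Column names (category, rating) are assumptions about the uploaded dataset,
# and AVG(rating) assumes the column was inferred as numeric.
top_categories = spark.sql("""
    SELECT category,
           COUNT(*)    AS products,
           AVG(rating) AS avg_rating
    FROM amazonsales
    GROUP BY category
    ORDER BY products DESC
""")

top_categories.show(10, truncate=False)
```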

While many data experts are familiar with SQL, data analysts with Power BI experience may prefer to use their Power Query skills to create visual queries.

Alternatively, you can use Power Query with OneLake to create visual queries. Simply select the “New visual query” option, drag and drop the desired table onto the Visual Query Editor pane, and start working with Power Query.

You can perform various operations on the data, such as removing unwanted columns or grouping rows, using the available tools and menus. Once you’re satisfied with the data preparation, you can save the changes.

To remove unwanted columns, select the “Manage Columns” option on the “Column tools” menu, then select the columns you want to keep.

You can use the Transform menu to group the data.
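If you ever want the same shaping steps outside the visual editor, the PySpark equivalent is short: drop the columns you don’t need and aggregate with groupBy. A sketch with assumed column names:

```python
# Sketch: notebook equivalent of the visual steps (column names are assumptions).
sales = spark.read.table("amazonsales")

# "Remove columns": keep only what the report needs (drop ignores missing names).
trimmed = sales.drop("about_product", "img_link", "product_link")

# "Group by": one row per category with a product count.
by_category = trimmed.groupBy("category").count()

by_category.show(truncate=False)
```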

Congratulations! You have learned how to create a Lakehouse, load data into tables, and query them. 🤗✨

Did you know that tables in a lakehouse are automatically added to a default dataset for reporting in Power BI? As you load data into a table, that default dataset is updated to include it, and you can connect to the dataset from Power BI to create a report. 🤩

Select the “Model” tab at the bottom of the page to see the data model for the default dataset.

To design a report, select the “Report” button on the menu bar and choose “Create a new report.” Add a table visualization and build the layout you want using the visuals in the “Visualizations” pane.

Once you’ve completed the report, save it from the File menu.

If you wish to return to the Lakehouse workspace, you can close the browser tab containing the report. You’ll find the created Lakehouse and the “ReportPB” report within the workspace.

Congratulations! 🎉 You have successfully created an end-to-end Lakehouse, transferred data, created tables, and queried them using SQL. Furthermore, you’ve integrated the tables into Power BI for reporting purposes.

We will continue exploring Microsoft Fabric. ⚡
