Published in Analytics Vidhya

Build a new-age Cost-Effective Business Intelligence System using Azure Data Lakehouse and Synapse Serverless SQL Pool

This is a continuation of my previous article; in this article, we will see how to build a new-age BI system for your enterprise using an Azure Data Lakehouse and the Azure Synapse Serverless SQL Pool.

The illustration below shows the data flow from source to end-user dashboards.

Now let's see how we can build each component in the data flow:

Step 1: Create an Azure Synapse Workspace
Step 2: Create a Synapse Pipeline
Step 3: Create the Data Lakehouse Database
Step 4: Create External Tables and Views
Step 5: Create a Tabular Model
Step 6: Create a Dashboard

Azure Synapse Workspace

→ Open the Azure Portal and click + Create a resource → Azure Synapse Analytics
→ Click Create and provide all the required details
→ You can create a new Data Lake Storage account or use an existing one
→ You can create a new file system or use an existing file system in the storage account
→ Once you are ready, click Review + create (keep the rest of the tabs at their defaults unless you want to change them; for this example, I used the default settings for the other tabs)

Once the resource is ready, you will see two primary resources (plus related resources) created in your resource group

→ Synapse Analytics Workspace
→ Data Lake Storage Account

Data Ingestion — Azure Synapse Pipeline Copy Data

→ For this example, I am using files from my local machine as the source
→ Copy the files from your local file system to Azure Data Lake Storage using the Azure Synapse Pipeline Copy Data tool

→ Click on the Copy Data tool
→ Click on the Built-in copy task

→ Follow the steps and create a Copy Data pipeline
→ If your data is located inside an on-premises network, you need to configure a self-hosted integration runtime to connect to it.

→ If the self-hosted IR is not already installed, install and configure it as shown below (it was already installed on my machine)

Install the self-hosted IR

→ The pipeline copied the data successfully

→ I can see the two .CSV files copied to my storage container

Create the Data Lakehouse using Serverless SQL Pool

→ Open the Synapse workspace and create a new SQL script from the left menu

→ In the editor, enter the script below and execute it against the Built-in serverless SQL pool to create the database

CREATE DATABASE SalesBI

→ Built-in is the default serverless SQL pool that comes with the Azure Synapse workspace
→ Once the database is created, the next step is to create a DATABASE SCOPED CREDENTIAL to enable access to the SAS-protected Azure storage account.
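
Before the credential can be created, the database needs a master key, and the statements below should run in the context of the new database. This is a minimal sketch; the password is a placeholder you should replace with your own:

USE SalesBI;
GO
-- A database master key is required before a database scoped credential can be created.
-- Replace the placeholder with a strong password of your own.
CREATE MASTER KEY ENCRYPTED BY PASSWORD = '<your-strong-password>';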

CREATE DATABASE SCOPED CREDENTIAL [aaaa]
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'sv=2018-03-28&ss=bf&srt=sco&sp=rl&st=2019-10-14T12%3A10%3A25Z&se=2061-12-31T12%3A10%3A00Z&sig=KlSU2ullCscyTS0An0nozEpo4tO5JAgGBvw%2FJX2lguw%3D';

→ Create an External Data Source that references the Azure storage account protected with the SAS key

CREATE EXTERNAL DATA SOURCE salesbieds WITH (
LOCATION = 'https://sqlondemandstorage.blob.core.windows.net',
CREDENTIAL = aaaa
);
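
The external tables in the next section also reference an external file format named CSVFormat, which has to exist in the database before they can be created. Here is a minimal sketch of that statement; the delimiter and header settings are assumptions based on typical comma-separated files with a header row:

CREATE EXTERNAL FILE FORMAT CSVFormat
WITH (
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS (
FIELD_TERMINATOR = ',',
STRING_DELIMITER = '"',
FIRST_ROW = 2 -- skip the header row; remove this option if your files have no header
)
);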

Create External Tables and Views

Create external tables in the SalesBI database over the .CSV files in the storage account

USE [SalesBI];
GO
CREATE EXTERNAL TABLE factSales
(
[Country] VARCHAR (100) COLLATE Latin1_General_BIN2,
[Item Type] VARCHAR (100) COLLATE Latin1_General_BIN2,
[Sales Channel] VARCHAR (100) COLLATE Latin1_General_BIN2,
[Order Priority] VARCHAR (100) COLLATE Latin1_General_BIN2,
[Order Date] DATETIME2,
[Order ID] VARCHAR (100) COLLATE Latin1_General_BIN2,
[Ship Date] DATETIME2,
[Units Sold] BIGINT,
[Unit Price] DECIMAL (10,2),
[Unit Cost] DECIMAL (10,2),
[Total Revenue] DECIMAL (10,2),
[Total Cost] DECIMAL (10,2),
[Total Profit] DECIMAL (10,2)
)
WITH (
LOCATION = 'dlhfs/FinancialDataCSV.csv',
DATA_SOURCE = salesbieds,
FILE_FORMAT = CSVFormat
);

CREATE EXTERNAL TABLE dimRegion
(
[name] VARCHAR (100) COLLATE Latin1_General_BIN2,
[alpha-2] VARCHAR (100) COLLATE Latin1_General_BIN2,
[alpha-3] VARCHAR (100) COLLATE Latin1_General_BIN2,
[country-code] VARCHAR (100) COLLATE Latin1_General_BIN2,
[iso_3166-2] VARCHAR (100) COLLATE Latin1_General_BIN2,
[region] VARCHAR (100) COLLATE Latin1_General_BIN2,
[sub-region] VARCHAR (100) COLLATE Latin1_General_BIN2,
[intermediate-region] VARCHAR (100) COLLATE Latin1_General_BIN2,
[region-code] VARCHAR (100) COLLATE Latin1_General_BIN2,
[sub-region-code] VARCHAR (100) COLLATE Latin1_General_BIN2,
[intermediate-region-code] VARCHAR (100) COLLATE Latin1_General_BIN2
)
WITH (
LOCATION = 'dlhfs/CountryRegionCSV.csv',
DATA_SOURCE = salesbieds,
FILE_FORMAT = CSVFormat
);
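
As a quick sanity check, you can query both external tables to confirm that the files, data source, and file format resolve correctly:

USE SalesBI;
GO
SELECT TOP 10 * FROM factSales;
SELECT TOP 10 * FROM dimRegion;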

→ Now that the database and the tables are ready, it's time to create views (if required) to add a few custom columns and apply business logic

→ I created a database view to add a new calculated column

CREATE VIEW vwFactSales
AS
SELECT
[Country]
,[Item Type]
,[Sales Channel]
,[Order Priority]
,[Order Date]
,[Order ID]
,[Ship Date]
,[Units Sold]
,[Unit Price]
,[Unit Cost]
,[Total Revenue]
,[Total Cost]
,[Total Profit]
,([Total Profit]*100)/[Total Cost] AS [Profit Percentage]
FROM factSales;
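
One thing to watch in the view above: if [Total Cost] can be zero for any row, the division will fail at query time. A guarded variant of the calculation using NULLIF returns NULL for those rows instead of raising a divide-by-zero error; a quick sketch against factSales:

SELECT TOP 10
[Country]
,[Total Profit]
,([Total Profit]*100)/NULLIF([Total Cost],0) AS [Profit Percentage]
FROM factSales
ORDER BY [Profit Percentage] DESC;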

Create a Tabular Model using the Data Lakehouse

→ Connect to the Lakehouse using the serverless SQL pool endpoint (the endpoint format is shown after this list) and import the table data
→ Create the model and deploy it to the Power BI workspace
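
The serverless SQL endpoint is listed on the Synapse workspace Overview page in the Azure Portal and follows the pattern below, where the workspace name is a placeholder; use it as the server name and SalesBI as the database when importing the tables and views into the model:

<your-workspace-name>-ondemand.sql.azuresynapse.net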

I am not covering how to create and deploy the model in this article; I will cover that in a separate blog post

→ Once the model is successfully deployed, refresh it to get the data into the model

→ Wait for the model to refresh, then connect from Power BI Desktop to create the reports
→ The model I deployed has been refreshed and is ready to access.

Create a Power BI Dashboard and Publish

→ Connect to the newly deployed model and check the fields

→ Let's play with it and create some beautiful visuals

→ Publish the dashboard to Power BI Portal and share it with your team

→ We will talk about the cost-effectiveness of this solution in my next article. Until then, stay safe and healthy.
