Power BI — Snowflake Performance Enhancements with Horizontal Fusion

Many times customers say that their Power BI report performance gets slower when their reports are running against their Snowflake instance especially while using the Direct Query mode. Often, the recommendation is to use the Import mode but that’s not always possible given the dataset sizes or data freshness requirements. This typically leads to scenarios where customers feel that the Snowflake connector for Power BI is not optimized to handle large-scale workloads.

In reality, the issue is mainly with the fact that Power BI is generating multiple SQL queries to the backend Snowflake engine where we have to wait for all the query results to be returned back to the Power BI engine. This can increase exponentially with more reports within a dashboard issuing these queries. Microsoft identified this as one of the issues for the performance degradation and decided to address it with a new feature enhancement called Horizontal fusion. Horizontal fusion(HF) is the process of fusing(Reducing) multiple “similar shaped” queries into a single complex query and sending it to Snowflake eliminating any unnecessary round trip waits. This significantly improves performance and overall reporting experience for end users. In other words, we are pushing all the computations to the Snowflake engine as one large complex query instead of multiple simple queries. Microsoft officially announced this feature in this blog article. Microsoft did deploy this capability for Power BI Gen2 workspaces globally and they are in the process of rolling it out globally by end of Q4 CY 2022.

We have a toggle switch for Power BI Desktop to enable the preview feature. To enable this capability in Power BI desktop, you can check the Horizontal fusion check box in the Preview features tab under Options and Datasource settings as shown below.

Enabling Horizontal fusion preview feature in Power BI Desktop

Please note that Power BI Desktop doesn’t have the horizontal fusion capability enabled as of now even if you enable the toggle button under Options, and the tentative timeline for this feature to be rolled out to all the builds as per Microsoft is by end of Oct ‘2022. So it depends on when you are testing this feature.

In most cases, customers would deploy their models to a Power BI premium workspace and there is no easy way to know if this feature has been enabled on your workspace or tenant and the only we can know is by doing a simple test. The purpose of this blog is to provide you with the step by step instructions on validating if your Power BI premium workspace or local PBI desktop is enabled with Horizontal fusion or not.

In order to test this, lets leverage the TPC-H Samples database which is provided as part of the free Private Sharing option within Snowflake. You can select “Shared with me” tab on top as shown below. You can select the Samples(SFC_SAMPLES*) which are shown under Direct shares option and Create a database from the share. In my demo, I have created a database named SAMPLE_DATA

Snowflake Private Sharing page

Once the database is created, you can see the ORDERS table within the TPCH_SF1 schema immediately.

TPCH schema showing Orders table

Open Power BI Desktop and Import some data using the “Get Data” widget. Provide the credentials to your Snowflake instance and Select OK. Please select the ORDERS table in the SAMPLE_DATA schema and click Load.

Select Snowflake as the Datasource, and Load the ORDERS table which would look like the below image in the Data Model tab. No need to transform anything. You can load the table as is. You can now publish this model to your Power BI Premium workspace(if you have one) or just save it in Power BI desktop(if you don’t have Power BI premium)

Orders table loaded into PBI Desktop using Power Query

Once you have this file open, Download DAX studio utility on the same machine. By default, it would connect to the PBI desktop model which you have loaded earlier and you can connect directly to that model. Incase your model is already deployed to PBI Premium workspace, you can also change the connection to point to the PBI Premium workspace as well under the Tabular server option in DAX studio. Here is a link which shows how to obtain the PBI premium workspace URL and you can point the DAX studio to that URL under the Tabular server option as shown below.

Connection to DAX studio

Once you are able connect and authenticate to PBI Desktop model or PBI Premium workspace, please paste the below DAX query in DAX studio console and hit Run.

Executing a DAX query within DAX studio

Please run the below DAX query within DAX Studio query window as shown in the above image.

DEFINE

MEASURE ‘ORDERS’[Total Price F] = CALCULATE(sum(ORDERS[O_TOTALPRICE]), ORDERS[O_ORDERSTATUS] = “F”)

MEASURE ‘ORDERS’[Total Price O] = CALCULATE(sum(ORDERS[O_TOTALPRICE]), ORDERS[O_ORDERSTATUS] = “O”)

MEASURE ‘ORDERS’[Total Price P] = CALCULATE(sum(ORDERS[O_TOTALPRICE]), ORDERS[O_ORDERSTATUS] = “P”)

MEASURE ‘ORDERS’[Total Order Amount] = [Total Price F] + [Total Price O] + [Total Price P]

VAR __DS0Core =

SUMMARIZECOLUMNS(

ROLLUPADDISSUBTOTAL(‘ORDERS’[O_CLERK], “IsGrandTotalRowTotal”),

“Total_Price__F_”, ‘ORDERS’[Total Price F],

“Total_Price__O_”, ‘ORDERS’[Total Price O],

“Total_Price__P_”, ‘ORDERS’[Total Price P],

“Total_Order_Price”, ‘ORDERS’[Total Order Amount]

)

VAR __DS0PrimaryWindowed =

TOPN(502, __DS0Core, [IsGrandTotalRowTotal], 0, ‘ORDERS’[O_CLERK], 1)

EVALUATE

__DS0PrimaryWindowed

ORDER BY

[IsGrandTotalRowTotal] DESC, ‘ORDERS’[O_CLERK]

In DAX studio, you can click on Server Timings in the ribbon and then on the Server Timings tab, you can see the source queries(Snowflake) which are generated when executing the DAX query. So the above DAX expression would produce 3 different SQL queries of the same “shape” to Snowflake as part of the Direct query mode.

Query execution and Server timings option on DAX studio

Here are the 3 SQL queries which we can capture from DAX studio “Server timings” tab which were sent to Snowflake for the above DAX query execution. They are all of the same shape which means that if Horizontal fusion capability is enabled, then we should be able to fuse these 3 queries into a single query and send it to Snowflake.

  1. SELECT TOP (1000001) *FROM (SELECT [t0].[O_CLERK],SUM([t0].[O_TOTALPRICE]) AS [a0]FROM [ORDERS] AS [t0]WHERE ([t0].[O_ORDERSTATUS] = N’F’)GROUP BY [t0].[O_CLERK]) AS [MainTable]WHERE (NOT(([a0] IS NULL )))
  2. SELECT TOP (1000001) *FROM (SELECT [t0].[O_CLERK],SUM([t0].[O_TOTALPRICE]) AS [a0]FROM [ORDERS] AS [t0]WHERE ([t0].[O_ORDERSTATUS] = N’O’)GROUP BY [t0].[O_CLERK]) AS [MainTable]WHERE (NOT(([a0] IS NULL )))
  3. SELECT TOP (1000001) *FROM (SELECT [t0].[O_CLERK],SUM([t0].[O_TOTALPRICE]) AS [a0]FROM [ORDERS] AS [t0]WHERE ([t0].[O_ORDERSTATUS] = N’P’)GROUP BY [t0].[O_CLERK]) AS [MainTable]WHERE (NOT(([a0] IS NULL )))

This is what is shown on the Snowflake side within the QUERY HISTORY confirming that these 3 queries got executed by Snowflake.

If Horizontal Fusion is enabled on the workspace or Power BI Desktop, then these 3 should merge into a single complex query as shown below which will be sent to Snowflake.

  1. SELECT TOP (1000001) *FROM (SELECT [t0].[O_ORDERSTATUS],[t0].[O_CLERK],SUM([t0].[O_TOTALPRICE]) AS [a0],SUM([t0].[O_TOTALPRICE]) AS [a1],SUM([t0].[O_TOTALPRICE]) AS [a2]FROM [ORDERS] AS [t0]WHERE (([t0].[O_ORDERSTATUS] IN (N’F’,N’O’,N’P’)))GROUP BY [t0].[O_ORDERSTATUS],[t0].[O_CLERK]) AS [MainTable]WHERE (NOT(([a0] IS NULL )) OR NOT(([a1] IS NULL )) OR NOT(([a2] IS NULL )))

And that’s pretty much it. If you see more than 1 SQL query being issued to Snowflake then Horizontal Fusion isn’t enabled yet. If its just 1 query which shows up in DAX Studio, then Voila you have Horizontal Fusion enabled.

Imagine a scenario where your Power BI reports\dashboard sends out multiple similar queries to Snowflake and they have to wait for all the results to be returned from Snowflake (including any queuing, processing times, network waits etc for each query) before being sent out to the Power BI report\dashboard. This adds significant latencies to user experience and often comes across as that Power BI is running slow while using Snowflake. With the introduction of Horizontal fusion, we should be able to up to see up to 80% improvement in the performance and user experience for Snowflake.

As Microsoft mentioned in the official blog, you can see that Horizontal can reduce data load times significantly.

Performance improvements of using Horizontal fusion

--

--