Connect Power BI to a Storage Account using a Shared Access Signature
I thought “Easy!” but…
Azure Storage Account is a service widely used to store data for analytics pipelines. Each Storage Account has two keys that can be used to access its containers, and they grant full control.
Every application that needs to access data in a Storage Account must be authorized, but sharing key1 or key2 is considered bad practice (explaining the reasons is out of scope for this article).
One of the best practices for granting access to your blobs is to provide applications with a Shared Access Signature (SAS).
A shared access signature (SAS) provides secure delegated access to resources in your storage account. With a SAS, you have granular control over how a client can access your data. For example:
- What resources the client may access.
- What permissions they have to those resources.
- How long the SAS is valid.
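To make that concrete, a blob SAS URL is just the resource URI followed by the SAS token as query parameters. Here is a made-up example (account, container, and token values are purely illustrative, not a real signature):

```
https://myaccount.blob.core.windows.net/mycontainer/report.csv   <- resource URI
    ?sv=2021-06-08               <- storage service version
    &sr=c                        <- signed resource (c = container)
    &sp=rl                       <- permissions (read + list)
    &st=2023-01-01T00:00:00Z     <- start time
    &se=2024-01-01T00:00:00Z     <- expiry time
    &spr=https                   <- allowed protocol
    &sig=abc123...               <- signature; everything after "?" is the SAS token
```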
Our customer, an insurance player in Europe, contacted us to build some Power BI reports to help their consultants analyze their policies and better approach the market. The data is generated by Machine Learning models, and the output lands in Azure Storage Account containers.
According to the customer's policies we would be given a SAS to access the Storage Account, which seemed fine to me: it's a fairly common approach, even though I had never tried it with Power BI.
The problem (the first…)
A SAS is considered a best practice when accessing Storage Accounts, but you'll find out that the standard Power BI Storage Account connector doesn't support it.
If you try to use it you’ll see it only supports:
- Anonymous authentication
- Account key (key1 or key2)
I figured this was a fairly common scenario and someone must have already found a solution, so I googled it and a couple of results came up:
- Microsoft’s doc (archived) that explains exactly the problem and the solution
- A more recent SQLServerCentral article that applies Microsoft's doc
Long story short: if you apply this approach, Power BI Desktop will successfully connect to the Storage Account using the SAS, but when you publish the report to the service you'll get a warning on the dataset stating that you're building a dynamic data source, one that Power Query can't evaluate before running the query itself, and that therefore the dataset can't be refreshed.
In my scenario, asking users to refresh the dataset in Power BI Desktop and then publish the data to the service was definitely not an option, so I started digging into the code to find a way to make my dataset non-dynamic and refreshable.
The problem (the second…)
First of all I tried to replace the concatenation of variables in the Source step with the real URL as plain text, but this still yielded a dynamic data source.
What really causes the dataset to be dynamic is that Web.Contents at the "Added Custom1" step uses a URL composed from the "Name" column, which is obviously not available before the query is executed, so Power Query can't evaluate it.
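For context, here is a minimal sketch of the pattern those articles suggest (account name, container, and token are placeholders of mine, and the navigation step is simplified): list the container's blobs via the List Blobs REST call, then fetch each blob by concatenating its Name into the URL.

```m
let
    baseUrl  = "https://myaccount.blob.core.windows.net/mycontainer",
    sasToken = "sv=2021-06-08&sr=c&sp=rl&se=2024-01-01T00:00:00Z&sig=XXXX",

    // List Blobs REST call: returns an XML listing of the container
    Source = Xml.Tables(Web.Contents(baseUrl & "?restype=container&comp=list&" & sasToken)),
    Blobs = Source{0}[Table],   // navigation to the blob list, simplified here

    // "Added Custom1": the URL depends on the [Name] column, which only
    // exists at runtime, so Power Query flags the data source as dynamic
    #"Added Custom1" = Table.AddColumn(Blobs, "Content",
        each Web.Contents(baseUrl & "/" & [Name] & "?" & sasToken))
in
    #"Added Custom1"
```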
If you google "dynamic data source" in Power BI you'll find two great blog posts by Chris Webb explaining how you can get rid of it using the RelativePath and Query options of Web.Contents.
At first sight I wasn't sure I could solve my problem with this approach (I'm not proficient with APIs at all!), but Miguel Llopis confirmed I was on the right path, so I pushed myself to make it work.
According to the documentation, a SAS is composed of a resource URI and a SAS token, where the SAS token is a concatenation of query parameters.
Because of this, I revamped the code in the following way.
Using the RelativePath and Query options in the Web.Contents call, the data source isn't considered dynamic anymore, since Power Query can evaluate the base URL correctly.
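A sketch of what that revamped call can look like (same placeholder account, container, and token values as above): the base URL stays static, while the path and each SAS query parameter move into RelativePath and Query.

```m
let
    // Static base URL: Power Query can evaluate this before runtime
    baseUrl = "https://myaccount.blob.core.windows.net",

    Source = Xml.Tables(
        Web.Contents(baseUrl, [
            RelativePath = "mycontainer",
            Query = [
                restype = "container", comp = "list",
                // SAS token split into its individual query parameters
                sv = "2021-06-08", sr = "c", sp = "rl",
                se = "2024-01-01T00:00:00Z", sig = "XXXX"
            ]
        ]))
in
    Source
```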
The problem (the third…)
Even though the data source was no longer considered dynamic, the refresh didn't work, giving errors like "Access to the resource is forbidden". It could obviously have been something related to the SAS token, but the same SAS token was used in line 2 and line 19, so that didn't make sense.
Moreover, using the SAS URL in Google Chrome correctly downloaded the file, meaning the token was the correct one.
Third and decisive attempt
Telling my story to my PBIUG Italy friends, I got a decisive piece of advice from Amedeo Maggini: use Chrome developer tools to identify the payload sent by the browser, in order to mimic it in Power Query.
Chrome developer tools show you both the encoded and the decoded version of the payload, and I realized that the values in the Query options of Web.Contents must be passed decoded, since Power Query will encode them itself.
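Concretely, if the SAS token you received contains URL-encoded characters, you must put the decoded characters into the Query record, otherwise Web.Contents encodes them a second time and the signature no longer matches. A small example with hypothetical values:

```m
// Token as received:  ...&se=2024-01-01T00%3A00%3A00Z&sig=abc%2Bdef%3D
// Values to pass in the Query record, decoded by hand:
Query = [
    se  = "2024-01-01T00:00:00Z",   // %3A -> :
    sig = "abc+def="                // %2B -> +, %3D -> =
]
```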
With the dataset successfully refreshing both in Power BI Desktop and the Power BI service, I decided to parametrize the query using Power Query parameters; this is the final result.
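As a sketch of that final shape (the parameter names here are mine, not necessarily the ones in the report): the account URL, container, and SAS pieces become Power Query parameters, and both Web.Contents calls use RelativePath and Query with decoded values.

```m
// Assumes Power Query parameters defined via Manage Parameters:
//   StorageUrl, Container, SasVersion, SasExpiry, SasSignature
let
    Source = Xml.Tables(
        Web.Contents(StorageUrl, [
            RelativePath = Container,
            Query = [restype = "container", comp = "list",
                     sv = SasVersion, sr = "c", sp = "rl",
                     se = SasExpiry, sig = SasSignature]])),
    Blobs = Source{0}[Table],   // navigation simplified
    #"Added Custom1" = Table.AddColumn(Blobs, "Content",
        each Web.Contents(StorageUrl, [
            RelativePath = Container & "/" & [Name],
            Query = [sv = SasVersion, sr = "c", sp = "rl",
                     se = SasExpiry, sig = SasSignature]]))
in
    #"Added Custom1"
```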
Chris Webb's BI Blog: Using The RelativePath And Query Options With Web.Contents() In Power Query…