Process Azure Analysis Services Models with Azure Data Factory v2

James Coulter
Mar 23, 2020 · 5 min read

Azure Data Factory (ADFv2) is Microsoft’s cloud-based data movement, integration and transformation service. It offers a drag-and-drop UI for authoring extract-transform-load (ETL) processes and comes with built-in monitoring and management capabilities.

Azure Data Factory

However, ADFv2 lacks a native component to process Azure Analysis Services models. There are a number of articles on the web explaining how this can be achieved using other Azure services, such as Azure Logic Apps or custom .NET activities in Azure Batch. Our favoured method at Ricoh Digital Services has been to refresh the model using an Azure Automation job. This works well, but requires some initial setup and configuration.

I felt that there ought to be an easier way to accomplish this, and by chance I discovered an alternative approach developed by Greg Galloway that uses only built-in ADF components and nothing else — no Logic Apps, no .NET code.

The method he describes uses the Azure Analysis Services REST API. If you’re not familiar with the REST API, it enables data refresh operations to be carried out asynchronously on your tabular models.

Base URL

The base URL follows this format:

https://<region>.asazure.windows.net/servers/<server>/models/<model>

A model called Sales on a server called “myserver” in the UK South region would have the following server name:

asazure://uksouth.asazure.windows.net/myserver

The base URL for this server would be:

https://uksouth.asazure.windows.net/servers/myserver/models/Sales
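As a quick sketch, the base URL can be assembled from its three parts (the function name here is my own, not part of the API):

```python
def refresh_base_url(region: str, server: str, model: str) -> str:
    """Build the Azure Analysis Services REST API base URL from the
    region, server name, and model name."""
    return f"https://{region}.asazure.windows.net/servers/{server}/models/{model}"

# For the Sales model on myserver in UK South:
url = refresh_base_url("uksouth", "myserver", "Sales")
```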

From this base URL, refreshes and other operations can be carried out on your model as follows:

Azure Analysis Services REST API
  • Anything that ends in s is a collection.
  • Anything that ends with () is a function.
  • Anything else is a resource or object.

POST /refreshes

To perform a refresh operation, use the POST method on the /refreshes collection to add a new refresh item to the collection. The body of the request will look like the following:

{
  "Type": "Full",
  "CommitMode": "transactional",
  "MaxParallelism": 2,
  "RetryCount": 2,
  "Objects": [
    {
      "table": "DimCustomer",
      "partition": "DimCustomer"
    },
    {
      "table": "DimDate"
    }
  ]
}

Note: if the Objects array is empty ([]), the whole model will be refreshed.

The response from the POST method will include the refresh ID which can be used to check the status of the refresh later.
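A minimal sketch of this call using only the Python standard library (the helper name is mine; acquiring the AAD bearer token is not shown here):

```python
import json
import urllib.request

def build_refresh_request(base_url: str, token: str, objects=None):
    """Prepare a POST to the /refreshes collection of a model.
    `token` must be a valid AAD bearer token. An empty Objects array
    refreshes the whole model. Returns a urllib Request, ready to be
    sent with urllib.request.urlopen()."""
    body = {
        "Type": "Full",
        "CommitMode": "transactional",
        "MaxParallelism": 2,
        "RetryCount": 2,
        "Objects": objects if objects is not None else [],
    }
    return urllib.request.Request(
        f"{base_url}/refreshes",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Injecting the token and returning the unsent request keeps the helper testable without a live server.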

GET /refreshes

To get a list of historical refresh operations on a model, use the GET method on the /refreshes collection. The response body will look like this:

[
  {
    "refreshId": "1344a272-7893-4afa-a4b3-3fb87222fdac",
    "startTime": "2017-12-09T01:58:04.76",
    "endTime": "2017-12-09T01:58:12.607",
    "status": "succeeded"
  },
  {
    "refreshId": "474fc5a0-3d69-4c5d-adb4-8a846fa5580b",
    "startTime": "2017-12-07T02:05:48.32",
    "endTime": "2017-12-07T02:05:54.913",
    "status": "succeeded"
  }
]
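Picking out the most recent refresh from this response is straightforward; a sketch (the function name is my own):

```python
import json

def latest_refresh_id(response_body: str) -> str:
    """Return the refreshId of the most recent entry in a GET
    /refreshes response. ISO-8601 timestamps sort lexicographically,
    so max() on startTime picks the newest refresh."""
    refreshes = json.loads(response_body)
    return max(refreshes, key=lambda r: r["startTime"])["refreshId"]
```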

ADF Pipeline

Armed with this information from the REST API, Greg has wrapped the calls up into a very simple five-step ADF pipeline:

  1. Use a Web Activity to invoke a refresh operation on the Analysis Services model via the REST API
  2. Get the /refreshes collection of the model via the REST API
  3. Get the latest refresh ID
  4. Poll the REST API at 30-second intervals (or whatever interval you like) until the status of the last refresh operation is no longer “in progress” — that is, until it has either succeeded or failed.
  5. Take action depending on the outcome of the refresh operation e.g. log failure
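The polling logic in steps 2–4 can be sketched as follows (the function is mine; `fetch_refreshes` stands in for the Web Activity that calls GET /refreshes, and the in-progress status value is assumed to be "inProgress"):

```python
import time

def wait_for_refresh(fetch_refreshes, poll_seconds=30, max_polls=120):
    """Poll until the most recent refresh is no longer in progress.
    `fetch_refreshes` is any callable returning the parsed list from
    GET /refreshes (injected so the loop is easy to test offline)."""
    for _ in range(max_polls):
        latest = max(fetch_refreshes(), key=lambda r: r["startTime"])
        if latest["status"] != "inProgress":
            return latest["status"]  # e.g. "succeeded" or "failed"
        time.sleep(poll_seconds)
    raise TimeoutError("Refresh did not complete within the polling window")
```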

The whole pipeline looks like this:

ADFv2 Process Analysis Services Model Pipeline

You can download a sample pipeline from Greg’s GitHub repo. The pipeline has 5 parameters that allow for easy portability and re-use:

  • TenantID The GUID of your Azure AD (AAD) tenant. This is the Directory ID for your Azure Active Directory.
  • SubscriptionID The GUID identifier for the subscription the Azure Analysis Services instance is running in. This is in the Subscriptions tab of the Azure Portal.
  • Region The Azure region (e.g. uksouth) containing your Azure Analysis Services instance.
  • Server The name of your Azure Analysis Services instance.
  • DatabaseName The name of the model in Azure Analysis Services you want to process.

Authentication

Calls to the REST API must be authenticated with a valid Azure Active Directory (OAuth 2) token in the Authorization header and must meet the following requirements:

  • The token must be either a user token or a token for an application service principal.
  • The token must have the correct audience set to https://*.asazure.windows.net.
  • The user or application must have sufficient permissions on the server or model to make the requested call. The permission level is determined by roles within the model or the admin group on the server.

In Greg’s repo, he describes how he uses the ADF Managed Service Identity (MSI) for authentication - this identity is the “user account” under whose context the model will be processed.

I wasn’t aware that this Managed Service Identity existed, but finding it is trivial: it can be found in the Properties section of the Azure Data Factory blade in the Azure portal. You will need the Managed Identity Application ID and Managed Identity Tenant values.

ADF Managed Service Identity Values

Next, create an application identity like so:

app:<ApplicationID>@<TenantID>

Add this “user” as an Analysis Services Administrator via SQL Server Management Studio, as described here.
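For completeness, assembling that identity string is a one-liner (the function name is hypothetical):

```python
def adf_msi_identity(application_id: str, tenant_id: str) -> str:
    """Format the ADF managed identity in the app:<ApplicationID>@<TenantID>
    form that SSMS expects when adding an Analysis Services administrator."""
    return f"app:{application_id}@{tenant_id}"
```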

And that’s it. What I like about Greg’s solution is its simplicity. I knew that Azure Analysis Services had a REST API, but because all operations are asynchronous, I couldn’t work out a way to subsequently establish whether the refresh had succeeded or failed.

The key step for me was the Until activity which monitors the status of the most recent refresh until it has either succeeded or failed.

Credit to Greg Galloway for coming up with this innovative solution, and do go check out his GitHub repo for other Azure Analysis Services automation examples.

P.S. If you want to encourage Microsoft to develop a native ADFv2 component to process Azure AS models, please upvote the feature request on UserVoice here: https://feedback.azure.com/forums/270578-data-factory/suggestions/16939123-refreshing-azure-analysis-cube
