ADF pipeline on-demand from everywhere

You’ve just completed a beautiful Azure Data Factory v2 pipeline and you’re ready to schedule it, but you’ve been asked to make it “launchable” on-demand by non-technical people.

That means no scheduling at all, and we cannot ask them to connect to the portal and run it manually.

So, we have to find a way “to call it from the outside”.

Luckily, ADF v2 has a good set of REST APIs that enables any kind of application to interact with it, and most of them are also wrapped into specific PowerShell cmdlets, collected in the Az.DataFactory module, which you can add to your PowerShell environment.

The Az.DataFactory module is very handy, with a lot of commands, and one of them is my starting point: Invoke-AzDataFactoryV2Pipeline.

If we’re connected to an Azure Subscription, using this command we can invoke a pipeline by passing three parameters, as shown in the example below:

  • Resource Group Name
  • Data Factory Name
  • Pipeline Name
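
Assuming placeholder names for the three parameters, a minimal call could look like this (the cmdlet returns the ID of the run it starts):

# Placeholder names: replace with your own resources
$runId = Invoke-AzDataFactoryV2Pipeline `
    -ResourceGroupName "<my_resource_group>" `
    -DataFactoryName "<my_data_factory>" `
    -PipelineName "<my_pipeline>"
$runId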

With very few lines of PowerShell you can create a script that does the following:

  1. Prompts you for credentials in order to connect to Azure
  2. Locates your Subscription and Data Factory resources
  3. Invokes the pipeline
  4. Loops to check the pipeline’s status, based on a configurable interval (you can define how many seconds to wait between one check and the next)

You can wrap everything in a Try-Catch block to manage exceptions, and you can print some useful information for the user.

Here is an example of what you can do:
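
A minimal sketch, assuming placeholder resource names and a fixed number of seconds between status checks, might look like this:

# Prompt for credentials and connect to Azure (works for accounts without MFA)
$credential = Get-Credential
Connect-AzAccount -Credential $credential | Out-Null

# Placeholder values: replace with your own resources
$SubscriptionId    = "<my_subscription_id>"
$ResourceGroupName = "<my_resource_group>"
$DataFactoryName   = "<my_data_factory>"
$PipelineName      = "<my_pipeline>"
$SecondsBetweenChecks = 15

Try {
    # Select the subscription hosting the Data Factory
    Select-AzSubscription -SubscriptionId $SubscriptionId | Out-Null

    # Invoke the pipeline and keep the run ID to monitor it
    $runId = Invoke-AzDataFactoryV2Pipeline `
        -ResourceGroupName $ResourceGroupName `
        -DataFactoryName $DataFactoryName `
        -PipelineName $PipelineName
    Write-Output "Pipeline '$PipelineName' started with run ID $runId"

    # Loop until the run leaves the InProgress/Queued states
    While ($true) {
        $run = Get-AzDataFactoryV2PipelineRun `
            -ResourceGroupName $ResourceGroupName `
            -DataFactoryName $DataFactoryName `
            -PipelineRunId $runId
        Write-Output "Status: $($run.Status)"
        If ($run.Status -ne "InProgress" -and $run.Status -ne "Queued") { Break }
        Start-Sleep -Seconds $SecondsBetweenChecks
    }
    Write-Output "Pipeline finished with status: $($run.Status)"
}
Catch {
    Write-Error $_.Exception.Message
}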

Running something similar, you’ll see the run status printed at each check until the pipeline completes. Indeed, the pipeline has been run successfully, as you can verify from the portal.

So far so good but… we have some problems with this approach:

  1. Users must have rights to connect to Azure and to execute the ADF pipeline
  2. Authentication is not an unattended process
  3. A PowerShell script could be too technical for our non-technical users
  4. Other applications must be able to run PowerShell in order to execute the script

Do we need to throw everything away? Not at all: using Azure Automation we can encapsulate everything into a PowerShell-based Runbook and then expose it on the web through the Webhook feature. The steps are:

  1. Navigate to the Azure Portal
  2. Create an Azure Automation Account (or use an existing one)
  3. Create a Runbook of type PowerShell
  4. Since we use the Az.DataFactory module in our script, import it (and all its dependencies) into the Automation Account from the Modules gallery
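
The same setup can also be scripted. A sketch with the Az.Automation cmdlets (resource group, location, account and runbook names below are placeholders, and the modules are pulled from the PowerShell Gallery) could be:

# Placeholder names: adjust resource group, location, account and runbook names
New-AzAutomationAccount -ResourceGroupName "<my_rg>" -Name "<my_automation_account>" -Location "West Europe"

New-AzAutomationRunbook -ResourceGroupName "<my_rg>" -AutomationAccountName "<my_automation_account>" `
    -Name "InvokeADFPipeline" -Type PowerShell

# Az.DataFactory depends on Az.Accounts, so import that first, then the module itself
New-AzAutomationModule -ResourceGroupName "<my_rg>" -AutomationAccountName "<my_automation_account>" `
    -Name "Az.Accounts" -ContentLinkUri "https://www.powershellgallery.com/api/v2/package/Az.Accounts"
New-AzAutomationModule -ResourceGroupName "<my_rg>" -AutomationAccountName "<my_automation_account>" `
    -Name "Az.DataFactory" -ContentLinkUri "https://www.powershellgallery.com/api/v2/package/Az.DataFactory"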

It’s time to include our code in the Runbook but, since everything must work without user interaction and must be runnable by users with low or no permissions on Azure resources, we need to leverage some Runbook features.

Variable assets are values that are available to all runbooks in your Automation account, and in our case they are very useful for storing some information:

  • Subscription ID
  • Resource Group Name
  • Data Factory Name

An Automation Credential asset holds an object that contains security credentials, such as a user name and a password. In our case it is super useful to store the credentials used to connect to the Azure Data Factory resource.

So let’s create and set them up appropriately.
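
You can do this from the portal under the Automation account’s Shared Resources; as a sketch, the same assets can also be created with the Az.Automation cmdlets (the asset names match the ones read by the code below, while the resource names and values are placeholders):

# Placeholder resource names
$rg  = "<my_rg>"
$acc = "<my_automation_account>"

# Variable assets read by the runbook
New-AzAutomationVariable -ResourceGroupName $rg -AutomationAccountName $acc `
    -Name "AzureSubscriptionId" -Value "<my_subscription_id>" -Encrypted $false
New-AzAutomationVariable -ResourceGroupName $rg -AutomationAccountName $acc `
    -Name "DataFactoryResourceGroupName" -Value "<adf_resource_group>" -Encrypted $false
New-AzAutomationVariable -ResourceGroupName $rg -AutomationAccountName $acc `
    -Name "DataFactoryName" -Value "<my_data_factory>" -Encrypted $false

# Credential asset holding the user allowed to run the Data Factory pipeline
New-AzAutomationCredential -ResourceGroupName $rg -AutomationAccountName $acc `
    -Name "DataFactoryCredential" -Value (Get-Credential)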

We can now modify our code to get Credential to connect to Data Factory and all the needed info from these assets:

#Get and set configs
$CredentialName = 'DataFactoryCredential'
$ResourceGroupName = Get-AutomationVariable -Name 'DataFactoryResourceGroupName'
$DataFactoryName = Get-AutomationVariable -Name 'DataFactoryName'
$SubscriptionID = Get-AutomationVariable -Name 'AzureSubscriptionId'
#Get credentials
$AzureDataFactoryUser = Get-AutomationPSCredential -Name $CredentialName

In the end, we can transform the $PipelineName variable into a parameter in order to dynamically choose which pipeline to invoke:

param (
[Parameter(Mandatory=$true)][string]$PipelineName
)

The final version of the code could be something similar to this:
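
A sketch of that final version, combining the parameter, the Automation assets and the invoke/polling logic (the complete scripts are in the GitHub repo linked at the end), could be:

param (
    [Parameter(Mandatory=$true)][string]$PipelineName
)

#Get and set configs from the Automation assets
$CredentialName = 'DataFactoryCredential'
$ResourceGroupName = Get-AutomationVariable -Name 'DataFactoryResourceGroupName'
$DataFactoryName = Get-AutomationVariable -Name 'DataFactoryName'
$SubscriptionID = Get-AutomationVariable -Name 'AzureSubscriptionId'

#Get credentials
$AzureDataFactoryUser = Get-AutomationPSCredential -Name $CredentialName

Try {
    # Unattended login with the stored credential, no user interaction needed
    Connect-AzAccount -Credential $AzureDataFactoryUser | Out-Null
    Select-AzSubscription -SubscriptionId $SubscriptionID | Out-Null

    # Start the requested pipeline and keep the run ID to monitor it
    $runId = Invoke-AzDataFactoryV2Pipeline `
        -ResourceGroupName $ResourceGroupName `
        -DataFactoryName $DataFactoryName `
        -PipelineName $PipelineName
    Write-Output "Pipeline '$PipelineName' started with run ID $runId"

    # Poll the run status until it leaves the InProgress/Queued states
    While ($true) {
        $run = Get-AzDataFactoryV2PipelineRun `
            -ResourceGroupName $ResourceGroupName `
            -DataFactoryName $DataFactoryName `
            -PipelineRunId $runId
        Write-Output "Status: $($run.Status)"
        If ($run.Status -ne "InProgress" -and $run.Status -ne "Queued") { Break }
        Start-Sleep -Seconds 15
    }
    Write-Output "Pipeline finished with status: $($run.Status)"
}
Catch {
    Write-Error $_.Exception.Message
}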

The final step is to make our runbook runnable through POST calls, using another Runbook feature: Webhooks.
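
Creating the webhook from the portal gives you a URL that is shown only once, so copy it right away. As a sketch, it can also be created with the Az.Automation cmdlets (account, runbook and webhook names below are placeholders):

# Placeholder names; the webhook URI can only be retrieved at creation time
$webhook = New-AzAutomationWebhook -ResourceGroupName "<my_rg>" -AutomationAccountName "<my_automation_account>" `
    -RunbookName "InvokeADFPipeline" -Name "InvokeADFPipelineWebhook" `
    -IsEnabled $true -ExpiryTime (Get-Date).AddYears(1) -Force
$webhook.WebhookURI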

A POST call will be the only task applications need to perform in order to start the pipeline, specified as a parameter in the body of the message.

$pipelineName = "<my_pipeline_name>"
$bodyMsg = @(
    @{ Message=$pipelineName }
)
$URL = '<my_webhook>'
$body = ConvertTo-Json -InputObject $bodyMsg
$header = @{ message = "StartedByRik" }
$response = Invoke-RestMethod -Method post -Uri $URL -Body $body -Headers $header
$response

Conclusions:

We managed to create an ADF pipeline that can be launched by users on-demand, meeting all the constraints:

  • Runnable on-demand 🗸
  • No Azure permissions needed for users 🗸
  • No technical background needed 🗸
  • Unattended authentication 🗸
  • Runnable through other applications 🗸 (any application able to perform POST calls)
  • Azure resources hidden behind the scenes 🗸
  • “Data Factory User” credentials stored securely 🗸

All scripts are in the GitHub repo: https://github.com/R1k91/ADFPipelineOnDemandInvoke
