Automate Your Data Pipeline Deployments Across Tenants

FABRIC SERIES 07: ADVANCED — CROSS-TENANT DATA PIPELINE DEPLOYMENT AUTOMATION

RK Iyer
Microsoft Azure
5 min read · May 31, 2024

--

Microsoft Fabric Automated Data Pipeline Deployment

❑ Overview

Embarking on your journey with Microsoft Fabric? Or perhaps you’re already a seasoned explorer, pushing the boundaries of what’s possible? Have you come across a scenario where you want to automate your data pipeline deployments from one tenant to another?

For those familiar with Azure Data Factory (ADF), you know that its import and export capabilities have made pipeline migration a breeze, enabling customers and ISVs alike to reuse and redeploy pipelines across tenants.

In this blog post, I’m excited to share a utility I’ve developed that automates the cross-tenant deployment of Fabric Data Pipelines. I am confident this tool will prove invaluable when faced with cross-tenant migration scenarios. So, let’s dive in and explore this solution together!

❑ Problem Statement — the two pipelines below need to be migrated to another tenant

Data pipeline to be migrated

Details of the “PL_CopyOrdersData” data pipeline —

PL_CopyOrdersData Details

The above pipeline consists of two activities:

  1. ACT_MT_Copy_InventoryDataForFullLoad — copies data from an Azure PostgreSQL database into a Microsoft Fabric Lakehouse.
  2. ACT_GEN_EPIPE_invokeInventoryAuditSuccessForFullLoad — invokes auditing after a successful full load of the Inventory table. An Invoke Pipeline activity is used here to call a child pipeline, to showcase how child-pipeline invocations are handled.

Details of the “PL_WaitAct” data pipeline (invoked by “PL_CopyOrdersData”) —

PL_WaitActPipeline

The above pipeline consists of one activity:

  1. ACT_GEN_Wait — This activity is used to pause the execution of a pipeline for a specified period.

❑ Start with pre-requisites

1. Create a workspace in the destination Microsoft Fabric tenant (or identify an existing one). Get the Workspace ID, e.g. 66a92280-b91b-408e-bdd5-0276ad7d35a1.

2. Within this workspace, create a Lakehouse or identify an existing lakehouse where the data needs to be ingested in the destination workspace.

  • Get the Lakehouse ID, e.g. a1e8b2fe-7527-40b7-a137-648be36a6602
  • Get the SQL analytics endpoint, e.g. x6eps4xrq2xudenlfv6naeo3i4-qarkszq3xghebpovaj3k27jvue.msit-datawarehouse.fabric.microsoft.com
  • Get the Lakehouse name, e.g. CMFLHStore

3. For each of the above pipelines to be deployed, open the pipeline & select “View JSON code”. Click “Copy to clipboard” & “Close”. Save the JSON & upload it to the lakehouse in the destination workspace.

SaveJSONinDestinationWorkspace
UploadJSON

4. The next step is to identify the internal & external connections used in our source data pipelines by executing the “IdentifyExternalInternalConnections” notebook.

As a pre-requisite for this, update the “identifyconnections_deployment_config.yml” file with:

pipelinedisplayname — the name of the data pipeline

pipelinejsonfilepath — the file path of the data pipeline JSON, as shown in the figure below.

identifyconnections_deployment_config.yml
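For reference, here is a minimal sketch of what the connection-identification step can look like inside a Fabric (Python) notebook. This is an illustration, not the utility’s actual code: the config keys come from the file above, the /lakehouse/default/Files path assumes the config and pipeline JSON were uploaded to the attached lakehouse, and the idea that connection references appear under a "connection" key is an assumption about the export schema.

```python
# Minimal sketch of the connection-identification idea (not the utility's
# actual code). It loads the config above and walks the exported pipeline
# JSON, collecting any value stored under a "connection" key; that key name
# is an assumption about the export schema.
import json
import yaml

# The attached lakehouse is mounted at /lakehouse/default/ in a Fabric notebook.
with open("/lakehouse/default/Files/identifyconnections_deployment_config.yml") as f:
    cfg = yaml.safe_load(f)

with open(cfg["pipelinejsonfilepath"]) as f:
    pipeline = json.load(f)

def find_connection_ids(node, found=None):
    """Recursively collect connection IDs referenced anywhere in the pipeline JSON."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "connection" and isinstance(value, str):
                found.add(value)
            else:
                find_connection_ids(value, found)
    elif isinstance(node, list):
        for item in node:
            find_connection_ids(item, found)
    return found

print(cfg["pipelinedisplayname"], find_connection_ids(pipeline))
```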

5. Execute the “IdentifyExternalInternalConnections” notebook. Make sure that you have attached & pinned the right lakehouse before executing the notebook.

IdentifyExternalInternalConnections-Notebook

After executing the notebook, the output below is obtained.

Output of IdentifyExternalInternalConnections-Notebook

With this we have identified that the pipeline uses an external connection. Go to Settings in the source tenant -> Manage Connections and identify the connection type (e.g. PostgreSQL, SQL DB) based on the ID obtained in the step above.

6. Once the external connection type in the source tenant is identified, we need to create the corresponding external connection in the destination tenant and obtain the new ConnectionID. Please note that this ID will be used later.

7. In the destination tenant, the penultimate step is to update the “pipeline_deployment_config.yml” file as below & upload the updated file to the lakehouse.

pipeline_deployment_config.yml
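The exact keys in this file come from the utility’s repo; the sketch below is only a hedged illustration of how a deployment notebook might load it, with assumed key names mapped to the values gathered in the pre-requisite steps.

```python
# Hedged sketch of loading pipeline_deployment_config.yml in the deployment
# notebook. The key names below are illustrative assumptions; use the key
# names from the file shipped with the utility.
import yaml

# Files uploaded to the attached lakehouse appear under /lakehouse/default/Files/.
with open("/lakehouse/default/Files/pipeline_deployment_config.yml") as f:
    cfg = yaml.safe_load(f)

workspace_id   = cfg["destinationworkspaceid"]   # e.g. 66a92280-b91b-408e-bdd5-0276ad7d35a1
lakehouse_id   = cfg["destinationlakehouseid"]   # e.g. a1e8b2fe-7527-40b7-a137-648be36a6602
sql_endpoint   = cfg["sqlanalyticsendpoint"]     # SQL analytics endpoint of the lakehouse
lakehouse_name = cfg["lakehousename"]            # e.g. CMFLHStore
connection_id  = cfg["externalconnectionid"]     # new connection created in step 6
pipelines      = cfg["pipelines"]                # display name + JSON file path per pipeline
```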

8. The final step is to run the “FabricDataPipelineAutoDeployment” notebook. Make sure that you have attached & pinned the right lakehouse before executing the notebook.
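Under the hood, deploying an exported pipeline definition into another tenant can be done with the public Fabric REST API (Create Item), listed in the Reference section. The sketch below is an illustration of that call, not the notebook’s actual code; acquiring the Entra ID token for the destination tenant is left out, and the function name is hypothetical. Rewriting connection IDs and lakehouse references in the exported JSON to the destination-tenant values from the config above would happen before this call.

```python
# Illustrative sketch: create a DataPipeline item in the destination workspace
# from an exported pipeline JSON, via the Fabric REST API (Create Item).
# Token acquisition for the destination tenant is assumed to happen elsewhere.
import base64
import json
import requests

def deploy_pipeline(token: str, workspace_id: str, display_name: str, pipeline_json: dict):
    """Create a DataPipeline item from an exported JSON definition."""
    payload = base64.b64encode(json.dumps(pipeline_json).encode("utf-8")).decode("utf-8")
    body = {
        "displayName": display_name,
        "type": "DataPipeline",
        "definition": {
            "parts": [
                {
                    "path": "pipeline-content.json",
                    "payload": payload,
                    "payloadType": "InlineBase64",
                }
            ]
        },
    }
    response = requests.post(
        f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/items",
        headers={"Authorization": f"Bearer {token}"},
        json=body,
    )
    response.raise_for_status()
    return response
```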

After executing the notebook, the output below is obtained.

Notebookrunoutput

The data pipeline gets deployed in the target tenant.

Datapipelinedeployed

❑ GitHub Repo

Here’s the GitHub repo of the utility.

rkniyer999/FabricAutomatedDataPipelineDeployment: This repository is used to perform automated data pipeline deployment in another tenant. (github.com)

❑ Errors

If you come across the error below while executing the “MicrosoftFabricDataPipelineAutoDeployment” notebook, delete the pipeline, wait for 5 minutes, and re-execute the notebook.
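If you want to script that workaround, a small sketch like the following works against the same Fabric REST API (Delete Item); the item_id lookup and the redeploy callback are assumptions.

```python
# Sketch of the workaround above: delete the partially created pipeline,
# wait ~5 minutes, then re-run the deployment step.
import time
import requests

def delete_and_retry(token: str, workspace_id: str, item_id: str, redeploy):
    requests.delete(
        f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/items/{item_id}",
        headers={"Authorization": f"Bearer {token}"},
    ).raise_for_status()
    time.sleep(300)    # wait ~5 minutes before retrying
    return redeploy()  # re-execute the deployment step
```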

❑ Reference

Fabric data pipeline public REST API (Preview) — Microsoft Fabric | Microsoft Learn

❑ Conclusion

I hope this blog helps you migrate Fabric data pipelines across tenants. I will keep you posted if there are any feature enhancements. Till then, Happy Learning!!!

❑ Thanks

Special thanks to Abhishek Narain, Vedang Sharma & Gyani Sinha for brainstorming & guidance on automated deployment of data pipelines, enabling tenant migration for data pipeline resources.

Please Note — All opinions expressed here are my personal views and not of my employer.
