Integrate Databricks Data Lineage with Azure Purview
Image source: Intellishore
Create service principal for Azure
- Go to “Azure Active Directory”, then “App registrations”, and then “New registration”
- Give your service principal a name and click “Register”
- Note down the tenant ID and client ID, then create a client secret and note it down as well
- Go to “Azure Purview”, then “Access control (IAM)”. Under “Add role assignment”, assign the “Data Curator” role to the service principal you registered (purview_api in this example).
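The three values noted above (tenant ID, client ID, and secret) are what a client later uses to authenticate as the service principal. As a minimal sketch, assuming the standard Microsoft identity platform client-credentials flow and assuming that "https://purview.azure.net/.default" is the right scope for your Purview account, the token request could be built like this. The function names and placeholder values are hypothetical and do not come from the initialization notebook.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(tenant_id, client_id, client_secret):
    """Return the token endpoint URL and form fields for the service principal."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Assumption: scope for the Purview resource; adjust if yours differs.
        "scope": "https://purview.azure.net/.default",
    }
    return url, form


def fetch_token(tenant_id, client_id, client_secret):
    """POST the form to the token endpoint and return the access token string."""
    url, form = build_token_request(tenant_id, client_id, client_secret)
    data = urllib.parse.urlencode(form).encode("utf-8")
    request = urllib.request.Request(url, data=data, method="POST")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["access_token"]
```

Calling fetch_token with your own tenant ID, client ID, and secret is a quick way to confirm that the registration works before moving on to the Databricks side. Keep the secret out of notebooks and source control, for example in a Databricks secret scope.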
Create Databricks runtime with Spline
- Open Azure Databricks and create a new cluster. Size and configure it to suit your needs, but it must use the 6.4 runtime version. This is a limitation of Spline, which does not yet support newer runtimes.
- Then install the Spline packages from Maven.
- Upload the “Spark Lineage Harvest Init.ipynb” notebook to your Databricks workspace
- In each notebook you want to track, run the initialization notebook using the code shown in it
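Because the setup above depends on the 6.4 runtime, it can help to fail early when a tracked notebook is attached to a different cluster. The following is a minimal sketch, not part of the initialization notebook. It assumes the cluster exposes its runtime version through the DATABRICKS_RUNTIME_VERSION environment variable in a form such as "6.4"; the helper name is hypothetical.

```python
import os

# Runtime version required by the Spline setup described in this post.
REQUIRED_RUNTIME = "6.4"


def is_supported_runtime(version, required=REQUIRED_RUNTIME):
    """Return True if the version is the required runtime or a patch level of it."""
    if not version:
        return False
    return version == required or version.startswith(required + ".")


# Assumption: this variable is set on the cluster; it is absent elsewhere.
runtime = os.environ.get("DATABRICKS_RUNTIME_VERSION")
if not is_supported_runtime(runtime):
    print(f"Runtime {runtime!r} does not match {REQUIRED_RUNTIME}; lineage tracking may not work.")
```

Placing this check in the first cell of a tracked notebook, before the initialization notebook runs, makes a runtime mismatch visible immediately.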
In this blog, we explored how to integrate Databricks with Azure Purview to capture data lineage from Databricks notebooks using Spline.