<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Senthil Kumar Vijayaraghavan on Medium]]></title>
        <description><![CDATA[Stories by Senthil Kumar Vijayaraghavan on Medium]]></description>
        <link>https://medium.com/@say2senthil?source=rss-c37b43646b19------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*aW7rqCIRFkDmvPFyOWpI6Q.jpeg</url>
            <title>Stories by Senthil Kumar Vijayaraghavan on Medium</title>
            <link>https://medium.com/@say2senthil?source=rss-c37b43646b19------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Thu, 28 May 2026 00:58:03 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@say2senthil/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Achieving Cross Platform Lineage on Snowflake]]></title>
            <link>https://medium.com/snowflake/achieving-cross-platform-lineage-on-snowflake-f311eadd6811?source=rss-c37b43646b19------2</link>
            <guid isPermaLink="false">https://medium.com/p/f311eadd6811</guid>
            <category><![CDATA[snowflake-data-cloud]]></category>
            <category><![CDATA[snowflake-computing]]></category>
            <category><![CDATA[snowflake]]></category>
            <category><![CDATA[data-lineage]]></category>
            <dc:creator><![CDATA[Senthil Kumar Vijayaraghavan]]></dc:creator>
            <pubDate>Thu, 22 Jan 2026 15:18:00 GMT</pubDate>
            <atom:updated>2026-01-27T00:50:43.805Z</atom:updated>
            <content:encoded><![CDATA[<p><em>Understanding how your data flows from source to consumption has never been more important. Snowflake’s External Lineage feature bridges the gap between in-platform and cross-platform data visibility.</em></p><h3>Overview</h3><p>In today’s complex data ecosystems, organizations rarely operate within a single platform. Data flows through multiple systems, from source databases like PostgreSQL and MySQL, through orchestration tools like Apache Airflow and dbt, into Snowflake, and finally out to BI tools like Power BI, Tableau, etc. Understanding this complete data journey, known as data lineage, is critical for impact analysis, debugging, and governance.</p><p>Snowflake’s External Lineage extends the platform’s native lineage capabilities to include external data sources and destinations. By leveraging the open-source <a href="https://openlineage.io/">OpenLineage</a> standard, Snowflake now provides a unified view of how data moves through your entire data pipeline, not just within Snowflake itself.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*xtXMw6cit1JcRgpc.png" /><figcaption>External Objects in Snowflake Lineage graph</figcaption></figure><h3>What is OpenLineage?</h3><p>OpenLineage is an open standard for capturing and sharing metadata about data workflows. It defines a consistent model for emitting lineage events from tools like Airflow, dbt, and Spark. By standardizing lineage capture, OpenLineage enables interoperability across the modern data stack.</p><h3>How External Lineage Works in Snowflake</h3><p>External lineage works by accepting OpenLineage-compatible events through a REST endpoint:</p><pre>POST https://&lt;account_identifier&gt;.snowflakecomputing.com/api/v2/lineage/external-lineage</pre><p>External tools send lineage metadata to Snowflake, which incorporates this information into the native lineage graph displayed in Snowsight. 
This creates a connected view showing how data moves and transforms from source to final use.</p><h3>Key Use Cases</h3><p>External Lineage addresses several critical scenarios for data teams:</p><h4>1. Impact Analysis for Data Engineers</h4><p>Scenario: <em>dbt to Snowflake, Airflow to Snowflake, S3 to Snowflake</em></p><p>External ETL tools such as dbt and Airflow are popular for building data pipelines across multiple source systems. Data engineers need to understand:</p><ul><li>How changes to Snowflake tables will impact downstream datasets and dashboards outside Snowflake (Power BI, Tableau)</li><li>How changes to non-Snowflake sources (Amazon S3, PostgreSQL) will impact downstream datasets in Snowflake</li></ul><p>Example: A data engineer modifying a staging table in Snowflake can instantly see which dbt models, Airflow DAGs, and Power BI dashboards depend on that table.</p><h4>2. Debugging &amp; Troubleshooting for Analysts</h4><p>Scenario: <em>Snowflake to Power BI/Tableau</em></p><p>Analysts using Power BI or Tableau connect to Snowflake tables to create dashboards. When data quality issues arise, they can use Snowflake’s lineage feature to:</p><ul><li>Trace data back to its source</li><li>Identify transformation steps that may have introduced errors</li><li>Understand the complete data flow from source to visualization</li></ul><h4>3. Data Governance &amp; Compliance</h4><p>Scenario: <em>Cross-platform data flow visibility</em></p><p>Governance teams need visibility into data movement and transformation history to ensure compliance with internal policies. External lineage provides:</p><ul><li>Audit trails showing how data flows between platforms</li><li>Documentation of data transformations for regulatory requirements</li><li>Clear ownership and accountability across the data pipeline</li></ul><h3>Supported Data Sources by Platform</h3><p>External Lineage supports integration with major data orchestration platforms. 
Here’s a comprehensive breakdown:</p><h4>Apache Airflow</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*em-vnWeTBNo6Bic_kcbnfg.png" /></figure><h4>dbt (Data Build Tool)</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*min-XyYwOIOiqDlxoH3jGw.png" /></figure><h4>Apache Spark</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ckZlOTypGCvMb2c_aGN8qQ.png" /></figure><h3>Setting Up External Lineage</h3><h4>Prerequisites</h4><p>Before configuring external lineage, grant the INGEST LINEAGE privilege to the role that will send lineage events:</p><pre>-- Create a dedicated role for lineage ingestion<br>CREATE ROLE dbt_lineage_role;<br><br>-- Grant the INGEST LINEAGE privilege<br>GRANT INGEST LINEAGE ON ACCOUNT TO ROLE dbt_lineage_role;<br><br>-- Assign the role to your service user<br>GRANT ROLE dbt_lineage_role TO USER dbt_integration_user;</pre><h4>Configuring OpenLineage with dbt</h4><p>Step 1) Install the OpenLineage-dbt integration</p><pre>pip3 install openlineage-dbt</pre><p>Step 2) Configure the transport variables in your YAML configuration file (openlineage.yml)</p><pre>transport:<br>  type: http<br>  url: https://MYORG-DEV_ACCOUNT.snowflakecomputing.com<br>  endpoint: /api/v2/lineage/external-lineage<br>  auth:<br>    type: api_key<br>    apiKey: eyJ0eXAiOiJKV1QiL...  
# Your security token<br>  compression: gzip</pre><p>Step 3) Replace standard dbt commands with dbt-ol</p><pre># Instead of: dbt run<br>dbt-ol run<br><br># Instead of: dbt build<br>dbt-ol build</pre><p>The integration uses a wrapper script (dbt-ol) that runs dbt and then reads generated artifacts (manifest.json, run_results.json, and optionally catalog.json) to extract metadata and emit OpenLineage events.</p><h4>Configuring OpenLineage with Apache Airflow</h4><p>Step 1) Install the OpenLineage Airflow provider</p><pre># For Airflow 2.7+<br>pip install apache-airflow-providers-openlineage<br><br># For older versions<br>pip install openlineage-airflow</pre><p>Step 2) Configure the transport variables</p><p><strong>Option A:</strong> Via environment variables</p><pre>export OPENLINEAGE_URL=https://MYORG-DEV_ACCOUNT.snowflakecomputing.com<br>export OPENLINEAGE_ENDPOINT=/api/v2/lineage/external-lineage<br>export OPENLINEAGE_API_KEY=eyJ0eXAiOiJKV1QiL...<br>export OPENLINEAGE_NAMESPACE=my-airflow-instance</pre><p><strong>Option B:</strong> Via YAML configuration (openlineage.yml)</p><pre>transport:<br>  type: http<br>  url: https://MYORG-DEV_ACCOUNT.snowflakecomputing.com<br>  endpoint: /api/v2/lineage/external-lineage<br>  auth:<br>    type: api_key<br>    apiKey: eyJ0eXAiOiJKV1QiL...<br>  compression: gzip</pre><p>Step 3) Run your DAGs normally. 
The OpenLineage provider hooks into Airflow’s task lifecycle and emits lineage events automatically during DAG execution.</p><h3>Lineage Flow Architecture</h3><p>The following diagrams illustrate how data lineage flows through a typical enterprise data stack with Snowflake External Lineage enabled:</p><p>End-to-End Lineage Flow:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*XVXPXnAgA5IQTkYquOaNIg.png" /></figure><p>Data Flow Example:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*qkvVgrk3u5C2oY7LcMhJeg.png" /></figure><h3>Preview State &amp; Current Limitations</h3><blockquote><strong>Important</strong>: External Lineage is currently in open preview and is available to all accounts on Enterprise Edition or higher.</blockquote><h4>Current Capabilities</h4><ul><li>Accept OpenLineage-compatible events via a REST endpoint</li><li>Visualize external nodes in the Snowsight lineage graph</li><li>Distinguish between native Snowflake objects and external objects</li><li>Retain external lineage events for one year</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*FQnl0BA98srKkkWHsLS5ng.png" /></figure><h3>Authentication Options</h3><p>Snowflake supports multiple authentication methods for the external lineage endpoint:</p><ol><li>Key Pair Authentication (recommended for automation)</li><li>OAuth</li><li>API Keys</li></ol><p>Example curl request with key pair JWT:</p><pre>curl -i -X POST \<br>  -H &quot;Content-Type: application/json&quot; \<br>  -H &quot;Authorization: Bearer eyJ0eXAiOiJKV1QiL...&quot; \<br>  -H &quot;Accept: application/json&quot; \<br>  -H &quot;User-Agent: myApplication/1.0&quot; \<br>  -H &quot;X-Snowflake-Authorization-Token-Type: KEYPAIR_JWT&quot; \<br>  -d &quot;@request_body.json&quot; \<br>  &quot;https://MYORG-DEV_ACCOUNT.snowflakecomputing.com/api/v2/lineage/external-lineage&quot;</pre><h3>Removing External Lineage</h3><p>To remove lineage relationships, send a DELETE 
request:</p><pre>DELETE https://&lt;account_identifier&gt;.snowflakecomputing.com/api/v2/lineage/external-lineage?<br>  sourceNamespace={namespace}&amp;<br>  sourceName={FQN}&amp;<br>  sourceDatasetType={type}&amp;<br>  targetNamespace={namespace}&amp;<br>  targetName={FQN}&amp;<br>  targetDatasetType={type}</pre><blockquote>Note: Users must have the DELETE LINEAGE privilege on the account to remove lineage.</blockquote><h3>Conclusion</h3><p>External Lineage represents a significant step forward in Snowflake’s data governance capabilities. By embracing the OpenLineage standard, Snowflake enables organizations to achieve true end-to-end visibility across their entire data ecosystem.</p><h3>Key Takeaways</h3><blockquote>Unified Visibility — View lineage from source databases through Snowflake to BI dashboards in a single interface</blockquote><blockquote>Open Standards — Built on OpenLineage, ensuring compatibility with major orchestration tools</blockquote><blockquote>Easy Integration — Simple configuration for dbt and Airflow with minimal code changes</blockquote><blockquote>Enterprise Ready — Available on Enterprise Edition with robust security and access controls</blockquote><h3>Getting Started</h3><ol><li>Ensure you’re on Snowflake Enterprise Edition or higher</li><li>Grant INGEST LINEAGE privilege to your service accounts</li><li>Configure your dbt or Airflow environment to emit OpenLineage events</li><li>View your complete lineage in Snowsight</li></ol><p>As this feature evolves from Preview to General Availability, expect enhanced capabilities including column-level lineage, native BI tool connectors, and custom lineage editing for truly comprehensive data governance.</p><p><em>External Lineage is currently in Preview. 
For the latest information, visit the </em><a href="https://docs.snowflake.com/user-guide/external-lineage"><em>Snowflake Documentation</em></a><em>.</em></p><hr><p><a href="https://medium.com/snowflake/achieving-cross-platform-lineage-on-snowflake-f311eadd6811">Achieving Cross Platform Lineage on Snowflake</a> was originally published in <a href="https://medium.com/snowflake">Snowflake Builders Blog: Data Engineers, App Developers, AI, &amp; Data Science</a> on Medium.</p>]]></content:encoded>
        </item>
    </channel>
</rss>