Continuous Data Ingestion from Azure Blob Storage to Snowflake using Snowpipe

Gowtham
Published in BI3 Technologies
6 min read · Sep 15, 2021

In this blog, you will learn how to load data from an Azure storage account into Snowflake automatically using Snowpipe.

Process Flow

Snowpipe Process Flow Diagram
1. Prerequisite

Assume we have a table in our Snowflake database, and we want to copy data from Azure Blob Storage into that table as soon as new files are uploaded to the Blob Storage container.

Create table Script
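
The create-table script shown in the screenshot is not reproduced as text; a minimal sketch of the target table used later in this walkthrough (the column types here are assumptions) could be:

// Create the target table (sketch; column types are assumed)
CREATE OR REPLACE TABLE SNOWPIPE_DEMO_DB.PUBLIC.STUDENTS_JSON (
  STUDENT_ID NUMBER,
  NAME       VARCHAR
);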

2. Configure Azure services and Snowflake to build a data pipeline that auto-ingests files from Azure Blob Storage into the Snowflake table.

Create a storage account under the resource group. Example: “testportal1”

2.1 Create a container (Azure)

Example: “snowpipe-demo-container”

Create a container

2.2 Create a queue (Azure)

Example: “snowpipe-demo-queue”

Create a queue

2.3 Create an Event Grid Subscription (Azure)

Use the below steps to create an Event Grid Subscription for the container and set the endpoint to the Storage Queue.

Step 1: Navigate to Storage Account and click on Events, then click +Event Subscription.

Create Event Subscription

Step 2: In the EVENT SUBSCRIPTION DETAILS section, give a name and choose Event Grid Schema as the Event Schema.

Fill details in Event Subscriptions

Step 3: In the TOPIC DETAILS section, give a name for the Topic.

Step 4: In the EVENT TYPES section, choose only Blob Created and uncheck the others.

Step 5: In the ENDPOINT DETAILS section, choose Storage Queues as the endpoint type.

Step 6: Now click on Select an endpoint, then select your subscription, Storage Account, and the created Storage Queue; finally, click on Confirm Selection.

Select storage account

Step 7: Click on Create; the Topic and the Event Subscription are now created.

2.4 Create a Notification Integration in Snowflake (Snowflake)

For this, we need the Azure Tenant Id and the Storage Queue URL.

Step 1: To get the Storage Queue URL, navigate to the Storage Account, click on Queues, and copy the URL of the desired queue.

Copy Storage Queue URL

Step 2: To get the Tenant Id, navigate to the Azure Active Directory service and, in the Overview pane, copy the Tenant Id.

Copy Tenant ID
Storage Queue URL: https://***.core.windows.net/***-demo-queue
Tenant Id: ***-d49a-45bd-8ad1-88695***

Step 3: Now navigate to Snowflake and change the role to ACCOUNTADMIN.

Step 4: Use the below syntax to create the Notification Integration.

// Create Notification Integration
CREATE NOTIFICATION INTEGRATION INTEGRATION_NAME
  ENABLED = TRUE
  TYPE = QUEUE
  NOTIFICATION_PROVIDER = AZURE_STORAGE_QUEUE
  AZURE_STORAGE_QUEUE_PRIMARY_URI = 'STORAGE_QUEUE_URL'
  AZURE_TENANT_ID = 'AD_TENANT_ID';

Replace the STORAGE_QUEUE_URL with the actual Storage Queue URL and AD_TENANT_ID with the actual Tenant Id.

// Create Notification Integration
CREATE NOTIFICATION INTEGRATION SNOWPIPE_DEMO2_NOTFINT
  ENABLED = TRUE
  TYPE = QUEUE
  NOTIFICATION_PROVIDER = AZURE_STORAGE_QUEUE
  AZURE_STORAGE_QUEUE_PRIMARY_URI = 'https://***.core.windows.net/***-queue'
  AZURE_TENANT_ID = '***-d49a-45bd-8ad1-886951**';

Step 5: Now Snowflake knows that this integration is for Azure and where it will get the event notifications from, but Azure also needs to authenticate our Snowflake account one time.

Execute the below command to achieve this.

DESC NOTIFICATION INTEGRATION SNOWPIPE_DEMO2_NOTFINT;

It returns an AZURE_CONSENT_URL, which is used to establish trust between the Azure account and the Snowflake account.

Step 6: Copy the URL and paste it into a browser where you are already logged in to the Azure Portal.

Step 7: Click on Accept.

Once authorization is successful, you will be redirected to a Snowflake page like the one below.

Azure Portal

2.5 Confirm Snowflake is authorized

Navigate to the Azure Portal, go to the Azure Active Directory service page, and click on Enterprise Applications.

Enterprise Application

You can now see that Snowflake has been added as an Enterprise Application.

2.6 Grant Storage access to the Snowflake Application (Azure)

Step 1: Copy the name of the Snowflake Enterprise Application.

Name: SnowflakePACInt0181

Step 2: Navigate to the Azure Storage Account and choose Access Control (IAM).

Step 3: Click on Add, then Add role assignment.

Role Assignment

Step 4: Choose Storage Queue Data Contributor as the Role.

Step 5: Choose our created Snowflake Enterprise Application and click on Save.

Add role assignment

Now the Snowflake application is added to the Storage Queue Data Contributor role.

2.7 Create Stages in Snowflake for two different file formats (Snowflake)

For this, we need two pieces of information: the Blob Service URL and a Shared Access Signature (SAS) token, which we need to generate.

Step 1: To get the Blob Service Endpoint, go to the Storage Account, click on Endpoints, and copy the Blob Service Endpoint.

Copy Blob Service Endpoint
Blob Service Endpoint: https://***.blob.core.windows.***

Step 2: To get the Shared Access Signature token, click on Shared Access Signature.

Step 3: In the selection pane, select only the required permissions.

Step 4: Now click on Generate SAS and connection string and copy the SAS token.

Generate SAS
Generate SAS and Connection String and Copy the SAS Token
SAS Token: ?sv=2020-02-10***************8&se=2021-06-23T18:21:32Z&st=2021-06-23T10:21:32Z&s***************SUV76ygOOGyMh******RPkxpkqD5FDLaM%3D

Step 5: Navigate to Snowflake and create a stage for the JSON file using the below commands.

// Create Stage for a JSON File
CREATE OR REPLACE STAGE "SNOWPIPE_DEMO_DB"."PUBLIC".SNOWPIPE_STUDENT_STG_JSON
  URL = 'https://***.core.windows.net/snowpipe-demo-container/***'
  CREDENTIALS = (AZURE_SAS_TOKEN = '?sv=***se=2021-06-23T18:21:32Z&st=2021-06-*****')
  FILE_FORMAT = (TYPE = 'JSON');

Note: Replace the URL and AZURE_SAS_TOKEN with the actual values.

To see the available staging files, execute the below command.

LS @SNOWPIPE_STUDENT_STG_JSON;
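
The heading above mentions two file formats, but only the JSON stage appears in the screenshots. A comparable stage for a second format such as CSV might look like the following sketch (the stage name, container path, and file format options here are assumptions, not values from this walkthrough):

// Create Stage for a CSV File (sketch; name, path and format options are assumed)
CREATE OR REPLACE STAGE "SNOWPIPE_DEMO_DB"."PUBLIC".SNOWPIPE_STUDENT_STG_CSV
  URL = 'https://***.core.windows.net/snowpipe-demo-container/***'
  CREDENTIALS = (AZURE_SAS_TOKEN = '?sv=***')
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);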

2.8 Creating the Snowpipe (Snowflake)

Create the Snowpipe for the stage files using the below commands.

// Create pipe for Stage JSON File
CREATE OR REPLACE PIPE SNOWPIPE_DEMO_DB.PUBLIC.DEMO_SNOWPIPE_STUDENTS_JSON
  AUTO_INGEST = TRUE
  INTEGRATION = SNOWPIPE_DEMO2_NOTFINT
AS
COPY INTO SNOWPIPE_DEMO_DB.PUBLIC.STUDENTS_JSON (STUDENT_ID, NAME)
FROM (
  SELECT $1:STUDENT_ID, $1:NAME
  FROM @SNOWPIPE_DEMO_DB.PUBLIC.SNOWPIPE_STUDENT_STG_JSON
);
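
Before uploading any files, you can confirm that the pipe was created and is waiting for notifications by checking its status with Snowflake's SYSTEM$PIPE_STATUS function (the pipe name below matches the one created above):

// Check that the pipe is running and listening for notifications
SELECT SYSTEM$PIPE_STATUS('SNOWPIPE_DEMO_DB.PUBLIC.DEMO_SNOWPIPE_STUDENTS_JSON');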

2.9 The Results

Now that the pipe is created, let's try uploading a file; as of now, the record count in the table is 0.

Upload file in blob
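
The uploaded file itself is not shown in the post; assuming it contains one JSON object per line with the fields referenced by the COPY statement ($1:STUDENT_ID and $1:NAME), it might look like this (the values are purely illustrative):

{"STUDENT_ID": 1, "NAME": "John"}
{"STUDENT_ID": 2, "NAME": "Jane"}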

Use the below command to make sure our file is available in the stage.

LS @SNOWPIPE_STUDENT_STG_JSON;

If the table is not loaded with values, try refreshing the pipe using the below command.

ALTER PIPE DEMO_SNOWPIPE_STUDENTS_JSON REFRESH;
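
If the data still does not appear, Snowflake's COPY_HISTORY table function shows the recent load attempts and any errors for the target table; for example:

// Show recent load activity (and errors, if any) for the target table
SELECT *
FROM TABLE(SNOWPIPE_DEMO_DB.INFORMATION_SCHEMA.COPY_HISTORY(
  TABLE_NAME => 'STUDENTS_JSON',
  START_TIME => DATEADD(HOUR, -1, CURRENT_TIMESTAMP())
));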

Our file was added to the stage; now let's check that the target table is loaded with the new data by executing the below command.

SELECT * FROM SNOWPIPE_DEMO_DB.PUBLIC.STUDENTS_JSON;
Target table

Yes, our table is loaded with the new values. We have now successfully set up a data pipeline for continuous data ingestion from Azure Blob Storage to Snowflake.

Finally, we have accomplished loading data from Azure Blob Storage into Snowflake using Snowpipe. The pipeline uses a Microsoft Azure Storage Queue and Event Subscription to notify Snowflake whenever a file lands in the Blob container, which acts as the source stage for the Snowpipe, and the Snowpipe loads the data into the target table.

Have you ever tried loading data into a Snowflake table using Snowpipe? What issues did you face? Leave a comment and we will try to help you.

Reference Links:
https://docs.snowflake.com/en/user-guide/data-load-snowpipe-auto-azure.html

About Us

Bi3 has been recognized for being one of the fastest-growing companies in Australia. Our team has delivered substantial and complex projects for some of the largest organizations around the globe and we’re quickly building a brand that is well known for superior delivery.

Website : https://bi3technologies.com/

Follow us on,
LinkedIn : https://www.linkedin.com/company/bi3technologies
Instagram :
https://www.instagram.com/bi3technologies/
Twitter :
https://twitter.com/Bi3Technologies
