Automating SnowPipe To Load Data From S3 To Snowflake

Jayashree Baskaran
BI3 Technologies
Published in
5 min readSep 17, 2021

In this blog, you will be learning to automate the snowpipe while uploading multiple files. Snowpipe is Snowflake’s server less, automated ingestion service that allows you to load your continuously generated data into Snowflake automatically. Automated data loads are based on event notifications for cloud storage to notify Snowpipe at the arrival of new data files to load. Snowpipe copies files into a queue, from which they are loaded into the target table continuously and without a server depending on the settings defined in a specified pipe object. You can check the data in snowflake by querying the table.

Flow of SnowPipe

Flow of snowpipe process

Creating S3 Bucket in AWS
1. Log into the AWS Management Console.
2. From the home dashboard, choose buckets

Creating Bucket

3. Click on the create bucket option
4. Set the bucket name as s3-bucket-snowpipe and select the region for storing our bucket.
5. For block public access settings, select the option Block all public access and set Bucket versioning and Default encryption into Disable. And now create the bucket.

Bucket creation

Creating Policy in IAM
1. From the home dashboard, choose IAM (Identity & Access Management)
2. Select the policy and click on the create a new policy.
3. Select the JSON and use the below code.

JSON Code
{
“Version”: “2012–10–17”,
“Statement”: [
{
“Effect”: “Allow”,
“Action”: [
“s3:GetObject”,
“s3:GetObjectVersion”
],
“Resource”: “arn:aws:s3:::s3-snowpipe-bucket/ingest/*”
},
{
“Effect”: “Allow”,
“Action”: “s3:ListBucket”,
“Resource”: “arn:aws:s3:::s3-snowpipe-bucket”,
“Condition”: {
“StringLike”: {
“s3:prefix”: [
“*”
]
}
}
}
]
}
Creating Policy

4. Click on Next tags and Next review, Set the policy name as snowpipe-policy and create the policy.

Creating User in IAM
1. From the home dashboard, choose IAM (Identity & Access Management).
2. Select the user, click on create user and set the username as snowpipe-user.

Creating an user

3. Select Programmatic access in Access type and click on Next Permissions.
4. Click on Attach existing policies directly then select the policy which created earlier named as snowpipe_policy and click on Next tags and create the user and download the csv file which contains the access key ID and secret access key.

Attaching the policies

Creating table in snowflake
1. Redirect to Snowflake account.
2. Creating a table in snowflake to store the data.

create or replace table Prsnl_dtls(id int, Name varchar(100), City varchar(100),Country varchar(100));

3. Creating a stage using S3 bucket name with proper file format that is going to be loaded.

create or replace stage demo_db.public.snowpipe_stage
url = ‘s3://s3-bucket-snowpipe/ingest/’
credentials=(aws_key_id=’ ‘ aws_secret_key=’ )
file_format = (type = ‘CSV’);

4. Create a pipe and use copy query to copy the data into table from created stage.

create or replace pipe demo_db.public.snowpipe_pipe auto_ingest=true as
copy into demo_db.public.Prsnl_dtls
from @demo_db.public.snowpipe_stage
file_format = (type = ‘CSV’);
Snowflake Queries

5. Use the show pipes command and copy the arn code from the notification channel column values
show pipes;

Selecting ARN

6. Redirect to the s3 bucket page and select the event notification under the properties.
7. Click on the Event notifications and select create event notifications.
8. Set the event name as snowpipe-event and select the All object create event from Event Types.

Creating Event

9. Select SQS Queue under Destination and select Enter SQS Queue arn and paste the arn copied from snowflake and click on save changes.

Setting up SQS Queue

10. Upload the files into s3 bucket.

Files that are uploaded into s3 bucket

11. Redirect to snowflake and query the table to verify ingested data.

Data that copied from files

The data available in the uploaded files are successfully loaded into snowflake table.
Benefits of Snowpipe

Snowpipe eliminates the challenges associated with legacy systems and provides the following benefits to the organizations.

1) Constant experiences: Snowpipe consistently gives new business information across every one of the offices without making any responsibility issues.

2) Limits cost: It is extremely affordable and charges clients each second dependent on the calculation time.

3) Convenience: It is extremely easy to utilize. You should simply interface it with the S3 and information will stack naturally.

4) Adaptability: Snowpipe is profoundly adaptable and permits simple customizations to stack information utilizing automatic REST Programming interface, utilizing Python, and Python SDKs.

5) Zero Administration: Snowpipe consequently increases and scales down dependent on the information stacking necessities. There is no compelling reason to oversee anything.

Conclusion
SnowPipe is one of the great ways to ingest files and minimizes the cost when used with appropriate measures and planning.

For more reference click on Automating Snowpipe .

About Us

Bi3 has been recognized for being one of the fastest-growing companies in Australia. Our team has delivered substantial and complex projects for some of the largest organizations around the globe and we’re quickly building a brand that is well known for superior delivery.

Website : https://bi3technologies.com/

Follow us on,
LinkedIn : https://www.linkedin.com/company/bi3technologies
Instagram :
https://www.instagram.com/bi3technologies/
Twitter :
https://twitter.com/Bi3Technologies

--

--