Snowflake Partner Connect — 3 Steps for Connecting to Fivetran for Data Pipelining
A Quick Start Guide for Building Snowflake Connections from Scratch
Moving to OKC earlier this year has given me the opportunity to work with clients across a variety of industries, and I continue to hear a common ask from all of them…
“Can you help us simplify our data flow, data movement, data integration, and data pipelining process so that we can quickly deliver value and business outcomes?”
An increasingly popular approach to solving this problem is to use cloud-based ETL and data warehousing solutions. These tools make it easy, fast, and reliable to migrate your disparate, siloed data into a unified storage solution that scales with your needs.
Two leaders in this space are Fivetran for managed ETL and Snowflake for cloud data warehousing.
If you haven’t tried it yet, Fivetran is a fully-managed, cloud-based ETL service that seamlessly integrates a huge number of data sources and destinations. By now, you’ve probably heard a lot about Snowflake, but if not, I’ve provided some additional Snowflake stories at the end of this post.
Let’s Get To It
The purpose of this post is to quickly explain how to connect your Snowflake Cloud Data Warehouse instance to an existing Fivetran account. It also serves to show how much work is automated for you when you use Snowflake’s Partner Connect tool as we’ll be recreating that behavior manually with SQL commands.
Snowflake has seamless integration with Fivetran. Their partnership has led to a one-click setup within the Snowflake Partner Connect tool, but this only works for new Fivetran accounts.
Luckily, everything you can do in Snowflake can be expressed as SQL. Let’s dive into all the setup required to start streaming your Fivetran ETL data into Snowflake’s Cloud Data Warehouse!
#1 Snowflake Setup
Before connecting Fivetran, we need to make sure that Fivetran has a place to put data and permissions to do so.
Let’s start by opening a new Snowflake worksheet. You’ll need access to the SYSADMIN and SECURITYADMIN roles for this.
Use the following code to create the necessary objects:
// set role
USE ROLE SYSADMIN;

// database
CREATE DATABASE IF NOT EXISTS FIVETRAN_DATABASE
  COMMENT='Database for Fivetran ETL data ingestion';

// warehouse
CREATE WAREHOUSE IF NOT EXISTS FIVETRAN_WAREHOUSE
  COMMENT='Warehouse for Fivetran ETL data ingestion'
  AUTO_SUSPEND=60; // shut this bad boy down as fast as possible in between jobs

// set role
USE ROLE SECURITYADMIN;

// user
CREATE USER IF NOT EXISTS FIVETRAN_USER_ACCOUNT
  COMMENT='Account for Fivetran ETL data ingestion'
  PASSWORD='my super cool password' // use your own password, dummy
  MUST_CHANGE_PASSWORD=FALSE;

// role
CREATE ROLE IF NOT EXISTS FIVETRAN_ROLE
  COMMENT='Role for Fivetran ETL data ingestion';
NOTE: Please make sure that you choose a secure password for this. Your Fivetran integration won’t be able to change the password, so also ensure that you explicitly declare the password as not needing to be changed by setting MUST_CHANGE_PASSWORD=FALSE on the user.
Next, run the following SQL to manage permissions for the new objects we just created:
// set role
USE ROLE SECURITYADMIN;

// grant db and warehouse access to role
GRANT USAGE, CREATE SCHEMA ON DATABASE FIVETRAN_DATABASE TO ROLE FIVETRAN_ROLE;
GRANT USAGE ON WAREHOUSE FIVETRAN_WAREHOUSE TO ROLE FIVETRAN_ROLE;

// grant role access
GRANT ROLE FIVETRAN_ROLE TO USER FIVETRAN_USER_ACCOUNT;
GRANT ROLE FIVETRAN_ROLE TO ROLE SYSADMIN;

// set the default role and warehouse for the user account
ALTER USER FIVETRAN_USER_ACCOUNT SET
  DEFAULT_ROLE=FIVETRAN_ROLE
  DEFAULT_WAREHOUSE=FIVETRAN_WAREHOUSE;
It is really important to remember to set the default role and warehouse for FIVETRAN_USER_ACCOUNT. If you don’t, you’ll experience errors when connecting to your database: Fivetran connects with the PUBLIC role when no default role is specified in Snowflake, and it needs a warehouse to use when copying data.
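Before leaving Snowflake, it can be worth a quick sanity check that the grants and defaults landed as expected. A minimal check (assuming the FIVETRAN_* object names used in this post) might look like:

```sql
// confirm the role's privileges and the user's configuration
SHOW GRANTS TO ROLE FIVETRAN_ROLE;
SHOW GRANTS TO USER FIVETRAN_USER_ACCOUNT;
DESCRIBE USER FIVETRAN_USER_ACCOUNT; // check DEFAULT_ROLE and DEFAULT_WAREHOUSE here
```

If DEFAULT_ROLE or DEFAULT_WAREHOUSE come back empty in the DESCRIBE output, revisit the ALTER USER step above before moving on to Fivetran.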
With this done, you have successfully recreated the major functions that the Snowflake Partner Connect tool would have performed for you.
Now we’ll head over to Fivetran to finish the integration.
#2 Fivetran Setup
First, let’s add Snowflake as a warehouse in Fivetran.
Log in to your Fivetran account, go to the warehouse section, and create a new warehouse.
Name the warehouse whatever you like and select Snowflake as the warehouse type. Configure the Snowflake connection using the following:
Use your own host URL (this is the URL you use to sign in to the Snowflake web UI). The password field should match the value you set when creating the user account.
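If you’re not sure what your host URL is, you can look up the pieces from any Snowflake worksheet. This is a sketch using Snowflake’s built-in context functions; the exact host format depends on your account’s region and cloud provider:

```sql
// look up the account identifier and region that make up your host URL,
// e.g. <account>.<region>.snowflakecomputing.com for many deployments
SELECT CURRENT_ACCOUNT(), CURRENT_REGION();
```

The simplest option is still to copy the hostname straight from your browser’s address bar when signed in to the Snowflake web UI.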
You should then click SAVE & TEST to confirm your connection details. If you have any issues, check out the fantastic Fivetran docs here.
From this point, you should be able to begin sending data to Snowflake!
#3 Ingesting Spotify Data
To make sure everything works, let’s use Fivetran’s dead-simple CSV upload feature to ingest some Spotify Data from Kaggle.
First, download and extract the data from Kaggle. The CSV file should be named data.csv. This is a little generic, so I renamed mine to spotify_data.csv like a good data engineer.
Next, log in to your Fivetran account and open your new warehouse. From here, select the Uploads tab:
In Uploads, select +SCHEMA, and configure your new schema and table. Name them whatever you like, but I used the following:
These values will be used by Fivetran to create a new schema and table within your Snowflake database.
Next, select your new schema, select your table, then upload the spotify_data.csv file from earlier. The file will upload and Fivetran will offer to do some basic data cleaning and type inference. I found the default inference to be perfect for this data set.
Confirm your selection and wait for the file to be uploaded into Snowflake. The whole process should take about 10 minutes (depending on your internet speed).
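Once the sync finishes, you can confirm the rows actually landed with a quick query back in Snowflake. The schema and table names below are assumptions based on the values you chose in Fivetran, so adjust them to match your own:

```sql
// count the ingested rows (adjust names to your Fivetran schema/table)
USE ROLE FIVETRAN_ROLE;
USE WAREHOUSE FIVETRAN_WAREHOUSE;
SELECT COUNT(*) FROM FIVETRAN_DATABASE.SPOTIFY.SPOTIFY_DATA;
```

A non-zero count means the full pipeline, from CSV upload through Fivetran into Snowflake, is working end to end.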
You’re Ready — Start Exploring Your Datasets
At this point, you’re ready to begin exploring your newly uploaded data! See our previous blog post for how to convert this Spotify data into a live PowerBI dashboard directly from Snowflake.
Definitely use Snowflake Partner Connect for anything you can: it’s quick, easy, and ready to go. But if the connection is not available (as in this instance for existing Fivetran customers), I hope you can use this efficient 3-step guide to get up and running quickly.
Need Cloud Data Warehousing, Migration, and Data Engineering/Pipelining Assistance?
If you’d like additional help in this area, Hashmap offers a range of enablement workshops and assessment services, cloud migration services, and consulting service packages as part of our data and cloud service offerings — we would be glad to work through your specific requirements — please reach out.
Some of our other Snowflake stories and podcasts are below:
How to Improve Cloud Cost Monitoring — Snowflake + Tableau
A Guide for Using Snowflake‘s Account Usage Dataset to Create Automatically-Updating Cost Monitoring Dashboards
How to Ingest & Enrich IoT Data at Scale into Snowflake with Apache NiFi
No, You Don’t Need To Work Long Hours to Make This Happen
Business Won’t Wait — Migrating to Azure for Data & Analytics
Accelerating Business Outcomes Using Azure — Databricks, Data Factory, PowerBI, & Snowflake Cloud Data Warehouse
Amplifying Outcomes with Snowflake
Quickly Build and Use a Snowflake Cloud Data Warehouse with Stitch, PowerBI, and a Kaggle Dataset in AWS
SnowAlert! Data Driven Security Analytics using Snowflake Data Warehouse
This is Worth Trying Out — An Open Source Project for Security Analytics with Snowflake
6 Steps to Secure PII in Snowflake’s Cloud Data Warehouse
How to Use Built-In Snowflake Security Features to Secure Personally Identifiable Information (PII)
Quick Tips for Using Snowflake with AWS Lambda
A Working Example Using AWS Lambda Serverless Compute and Snowflake’s Cloud Data Warehouse Together
Snowflake’s Cloud Data Warehouse — What I Learned and Why I’m Rethinking the Data Warehouse
Achieving Performance and Simplicity in the Cloud for ORC Data Loads and Tableau Visualization
Randy Pitcher is a Cloud and Data Engineer (and OKC-based Regional Technical Expert) with Hashmap providing Data, Cloud, IoT, and AI/ML solutions and consulting expertise across industries with a group of innovative technologists and domain experts accelerating high value business outcomes for our customers.
Be sure and connect with Randy on LinkedIn and reach out for more perspectives and insight into accelerating your data-driven business outcomes or to schedule a hands-on workshop to help you go from Zero to Snowflake.