10 Minute Beginner’s Guide to Snowflake Cloud Data Warehouse

Randy Pitcher II
Hashmap, an NTT DATA Company
6 min readJun 27, 2019

Hashmap kicked off the OKC Snowflake User Group Meetup a couple of months ago in order to assist a rapidly growing group of OKC Snowflake users in coming up to speed quicker on Snowflake while accelerating an ability to deliver meaningful business outcomes.

One of the questions that we get asked a lot is:

How will my SQL knowledge and skills transition into our new Snowflake world and user experience?

Great question — here is the simple answer…

The Snowflake Cloud Data Warehouse is the best way to convert your SQL skills into cloud-native data solutions.

This guide will explain everything you need to know to get data into Snowflake and start running queries.

From this point on, we’ll assume you know what Snowflake is and are ready to dive right in. You should have access to a Snowflake instance and should be able to run queries. If not, you can get a trial account with free credits here.

Getting Started

First, let’s build our basic Snowflake environment.

Start by logging in to the Snowflake Web UI and open a new worksheet. Run the following SQL to build your first workspace:

//============================
// setup security objects
//============================
USE ROLE SECURITYADMIN;

CREATE ROLE BEGINNER_ROLE;

GRANT ROLE BEGINNER_ROLE TO ROLE SYSADMIN;
//============================


//============================
// setup snowflake objects
//============================
USE ROLE SYSADMIN;

CREATE DATABASE IF NOT EXISTS BEGINNER_DB;

CREATE WAREHOUSE IF NOT EXISTS
BEGINNER_WH
COMMENT='Warehouse for beginner activities'
WAREHOUSE_SIZE=XSMALL
AUTO_SUSPEND=60
INITIALLY_SUSPENDED=TRUE;

// grant ownership to our beginner role
GRANT OWNERSHIP ON DATABASE BEGINNER_DB TO ROLE BEGINNER_ROLE;
GRANT OWNERSHIP ON WAREHOUSE BEGINNER_WH TO ROLE BEGINNER_ROLE;
//============================


//============================
// create beginner resources
//============================
USE ROLE BEGINNER_ROLE;
USE WAREHOUSE BEGINNER_WH;

// build a home for our raw data
CREATE SCHEMA IF NOT EXISTS BEGINNER_DB.RAW_DATASETS;

// create a file format definition for loading raw data later
CREATE FILE FORMAT
BEGINNER_DB.RAW_DATASETS.CSV_WITH_HEADER_AND_TEXT_FIELDS
SKIP_HEADER = 1
FIELD_OPTIONALLY_ENCLOSED_BY = '\042';
//============================

Loading Data Into Snowflake

Start by downloading this CSV file. This file is a Kaggle dataset that categorizes episodes of The Joy of Painting with Bob Ross.

Next, in a new Worksheet in the Snowflake Web UI, run the following SQL to create a table where we can upload our data:

// set context
USE ROLE BEGINNER_ROLE;
USE WAREHOUSE BEGINNER_WH;

// define the table
CREATE TABLE IF NOT EXISTS
BEGINNER_DB.RAW_DATASETS.BOB_ROSS(
EPISODE STRING,
TITLE STRING,
APPLE_FRAME BOOLEAN,
AURORA_BOREALIS BOOLEAN,
BARN BOOLEAN,
BEACH BOOLEAN,
BOAT BOOLEAN,
BRIDGE BOOLEAN,
BUILDING BOOLEAN,
BUSHES BOOLEAN,
CABIN BOOLEAN,
CACTUS BOOLEAN,
CIRCLE_FRAME BOOLEAN,
CIRRUS BOOLEAN,
CLIFF BOOLEAN,
CLOUDS BOOLEAN,
CONIFER BOOLEAN,
CUMULUS BOOLEAN,
DECIDUOUS BOOLEAN,
DIANE_ANDRE BOOLEAN,
DOCK BOOLEAN,
DOUBLE_OVAL_FRAME BOOLEAN,
FARM BOOLEAN,
FENCE BOOLEAN,
FIRE BOOLEAN,
FLORIDA_FRAME BOOLEAN,
FLOWERS BOOLEAN,
FOG BOOLEAN,
FRAMED BOOLEAN,
GRASS BOOLEAN,
GUEST BOOLEAN,
HALF_CIRCLE_FRAME BOOLEAN,
HALF_OVAL_FRAME BOOLEAN,
HILLS BOOLEAN,
LAKE BOOLEAN,
LAKES BOOLEAN,
LIGHTHOUSE BOOLEAN,
MILL BOOLEAN,
MOON BOOLEAN,
MOUNTAIN BOOLEAN,
MOUNTAINS BOOLEAN,
NIGHT BOOLEAN,
OCEAN BOOLEAN,
OVAL_FRAME BOOLEAN,
PALM_TREES BOOLEAN,
PATH BOOLEAN,
PERSON BOOLEAN,
PORTRAIT BOOLEAN,
RECTANGLE_3D_FRAME BOOLEAN,
RECTANGULAR_FRAME BOOLEAN,
RIVER BOOLEAN,
ROCKS BOOLEAN,
SEASHELL_FRAME BOOLEAN,
SNOW BOOLEAN,
SNOWY_MOUNTAIN BOOLEAN,
SPLIT_FRAME BOOLEAN,
STEVE_ROSS BOOLEAN,
STRUCTURE BOOLEAN,
SUN BOOLEAN,
TOMB_FRAME BOOLEAN,
TREE BOOLEAN,
TREES BOOLEAN,
TRIPLE_FRAME BOOLEAN,
WATERFALL BOOLEAN,
WAVES BOOLEAN,
WINDMILL BOOLEAN,
WINDOW_FRAME BOOLEAN,
WINTER BOOLEAN,
WOOD_FRAMED BOOLEAN
);

Now that we have a table with a defined structure, let’s upload the CSV we downloaded.

In the Snowflake Web UI, do the following:

  1. click on your username in the top right of the page and switch your role to BEGINNER_ROLE
  2. click on the Databases tab in the top left of the page
  3. click on the BEGINNER_DB database
  4. click on the BOB_ROSS table
  5. click Load Table to launch the file upload wizard

Lastly, follow the steps in the wizard. Make sure to select the CSV file you downloaded earlier along with the file format we defined in our environment setup.

If successful, you should see that 403 rows were loaded.

Running Queries in Snowflake

With our fresh data ready to view, let’s go back to the Worksheets tab in the Snowflake Web UI and run the following queries in a new worksheet to explore our data:

// set context
USE ROLE BEGINNER_ROLE;
USE WAREHOUSE BEGINNER_WH;
// explore the data a bit
SELECT * FROM BEGINNER_DB.RAW_DATASETS.BOB_ROSS LIMIT 10;
// extract season and episode
SELECT
SUBSTRING(EPISODE,2,2) AS SEASON,
SUBSTRING(EPISODE,5,2) AS EPISODE
FROM
BEGINNER_DB.RAW_DATASETS.BOB_ROSS
LIMIT 10;
// get the number of episodes per season that involve trees
SELECT
SUBSTRING(EPISODE,2,2) AS SEASON,
COUNT(*) AS NUMBER_OF_EPISODES
FROM
BEGINNER_DB.RAW_DATASETS.BOB_ROSS
WHERE
TREES
GROUP BY
SEASON
ORDER BY SEASON ASC;

This is really basic SQL, but it should be enough to get you started asking your own questions (for example, what is the most popular feature across all episodes? across each season?)

Need Snowflake Cloud Data Warehousing and Migration Assistance?

If you’d like additional assistance in this area, Hashmap offers a range of enablement workshops and consulting service packages as part of our consulting service offerings, and would be glad to work through your specifics in this area.

How does Snowflake compare to other data warehouses? Our technical experts have implemented over 250 cloud/data projects in the last 3 years and conducted unbiased, detailed analyses across 34 business and technical dimensions, ranking each cloud data warehouse.

To listen in on a casual conversation about all things data engineering and the cloud, check out Hashmap’s podcast Hashmap on Tap as well on Spotify, Apple, Google, and other popular streaming apps.

Other Tools and Content You Might Like

Randy Pitcher is a Cloud and Data Engineer (and OKC-based Regional Technical Expert) with Hashmap providing Data, Cloud, IoT, and AI/ML solutions and consulting expertise across industries with a group of innovative technologists and domain experts accelerating high value business outcomes for our customers.

Be sure and connect with Randy on LinkedIn and reach out for more perspectives and insight into accelerating your data-driven business outcomes or to schedule a hands-on workshop to help you go from Zero to Snowflake.

--

--