Improving Cloud Cost Monitoring with Snowflake and Tableau

A Guide for Using Snowflake Account Usage Data to Create Automatically-Updating Cost Monitoring Dashboards

Randy Pitcher II
Hashmap, an NTT DATA Company
8 min read · May 1, 2019


The purpose of this post is to provide a starting point for cost monitoring in the Snowflake Cloud Data Warehouse using the power of Tableau for visual analytics.

Snowflake’s pricing is consumption-based and requires little to no upfront spending. This is incredibly valuable for organizations that are new to cloud computing (like many of our clients), and it is also a new way to think about IT spend. Having a detailed and up-to-date view of spending behavior can help ease the transition for organizations that are more comfortable with legacy IT cost control.

What Comes Standard?

Snowflake provides some very light dashboarding in the web interface, but I’ve often found the visualizations too basic for some of the cost details I’d like to review on a regular basis:

This dashboard fails to show my costs in dollars, doesn’t show how my usage is changing month-to-month, and doesn’t clearly communicate my annual or quarterly spending (which can be an important metric for larger organizations).

Snowflake does include some flexible alerting and throttling through the use of Resource Monitors, but these, too, are limited in what they can tell you about your organization’s spending behavior.
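For reference, a monitor only takes a few lines to stand up. Here is a minimal sketch that notifies at 75% of a monthly credit quota and suspends at 100% (the quota, thresholds, and the MY_WAREHOUSE / MY_CREDIT_MONITOR names are placeholders to adapt):

//==========================================================
// example resource monitor (requires ACCOUNTADMIN)
//==========================================================
USE ROLE ACCOUNTADMIN;

CREATE OR REPLACE RESOURCE MONITOR MY_CREDIT_MONITOR
WITH
CREDIT_QUOTA=10 // placeholder monthly quota
FREQUENCY=MONTHLY
START_TIMESTAMP=IMMEDIATELY
TRIGGERS
ON 75 PERCENT DO NOTIFY // warn before it gets expensive
ON 100 PERCENT DO SUSPEND; // throttle at the quota

ALTER WAREHOUSE MY_WAREHOUSE SET RESOURCE_MONITOR=MY_CREDIT_MONITOR;
//==========================================================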

Luckily, this underlying cost data is available in Snowflake through the fantastic ACCOUNT_USAGE schema.

In this guide, we’ll configure a live Tableau connection to Snowflake account usage data to create automatically-updating cost monitoring dashboards.

Set Up Snowflake Objects

First, we’ll need to create the Snowflake objects we’re going to use for storing and querying data. You’ll need access to the SYSADMIN role for this.

Snowflake makes automatically-updated account usage data available in the SNOWFLAKE shared database under the ACCOUNT_USAGE schema. For this guide, we’re going to focus on storage usage from the STORAGE_USAGE table and warehouse usage from the WAREHOUSE_METERING_HISTORY table.
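If you’d like to peek at the raw data before building anything, a couple of quick queries will show the most recent rows in each table (just a sketch; note that by default only ACCOUNTADMIN can read the SNOWFLAKE database, so you may need an ACCOUNTADMIN to run GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE SYSADMIN first):

//==========================================================
// peek at the raw account usage data
//==========================================================
// last week of daily storage usage, in bytes
SELECT USAGE_DATE, STORAGE_BYTES, STAGE_BYTES, FAILSAFE_BYTES
FROM SNOWFLAKE.ACCOUNT_USAGE.STORAGE_USAGE
ORDER BY USAGE_DATE DESC
LIMIT 7;

// most recent hourly warehouse credit consumption
SELECT START_TIME, WAREHOUSE_NAME, CREDITS_USED
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
ORDER BY START_TIME DESC
LIMIT 10;
//==========================================================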

Run the following SQL script to create the Snowflake objects necessary to view warehouse and storage data in a unified time log:

//==========================================================
// create objects
//==========================================================
USE ROLE SYSADMIN;

// db
CREATE DATABASE IF NOT EXISTS TABLEAU_DATABASE;

// schema
CREATE SCHEMA IF NOT EXISTS TABLEAU_DATABASE.SNOWFLAKE_MONITORING;

// views
CREATE OR REPLACE VIEW TABLEAU_DATABASE.SNOWFLAKE_MONITORING.STORAGE_USAGE AS SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.STORAGE_USAGE;
CREATE OR REPLACE VIEW TABLEAU_DATABASE.SNOWFLAKE_MONITORING.WAREHOUSE_USAGE AS SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY;
// unified usage log for Tableau; the full outer join keeps storage-only
// days and warehouse-only hours, and COALESCE picks whichever timestamp
// is present for each row
CREATE OR REPLACE VIEW TABLEAU_DATABASE.SNOWFLAKE_MONITORING.SNOWFLAKE_USAGE AS
SELECT
TO_TIMESTAMP_NTZ(CONVERT_TIMEZONE('UTC', COALESCE(START_TIME, USAGE_DATE))) AS DATE,
STORAGE_BYTES,
STAGE_BYTES,
FAILSAFE_BYTES,
WAREHOUSE_ID,
WAREHOUSE_NAME,
CREDITS_USED
FROM
TABLEAU_DATABASE.SNOWFLAKE_MONITORING.STORAGE_USAGE A
FULL OUTER JOIN
TABLEAU_DATABASE.SNOWFLAKE_MONITORING.WAREHOUSE_USAGE B
ON
A.USAGE_DATE = B.START_TIME;

// warehouse
CREATE WAREHOUSE IF NOT EXISTS
TABLEAU_WAREHOUSE
COMMENT='Warehouse for Tableau dashboard development'
WAREHOUSE_SIZE=XSMALL
AUTO_SUSPEND=60 // shut this bad boy down as fast as possible in between queries
INITIALLY_SUSPENDED=TRUE;
//==========================================================

We’re using views instead of pointing Tableau directly at the underlying tables so that Tableau’s read access stays tightly scoped to just the data in TABLEAU_DATABASE.

The target view for Tableau consumption will be SNOWFLAKE_USAGE.

We should quickly discuss the timezone mess in our SNOWFLAKE_USAGE view definition. Tableau is generally not great with timezones. This causes issues in the Tableau Snowflake connector when using dates for filters.

To avoid this, we first convert all date columns to UTC and then strip the timezone from the column using the TO_TIMESTAMP_NTZ function (NTZ = no time zone). This will make our final dashboards display in UTC (which is not ideal), but the filtering will work as expected. You could address this manually on the frontend with more work, but that’s a little beyond the scope of this discussion.
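If the behavior of that expression isn’t obvious, here’s a toy example you can run (the session timezone and timestamp literal are made up for illustration):

//==========================================================
// timezone conversion example
//==========================================================
// pretend our session runs in Chicago time
ALTER SESSION SET TIMEZONE='America/Chicago';

SELECT
'2019-05-01 08:00:00'::TIMESTAMP_LTZ AS LOCAL_TIME, // 08:00 -0500
CONVERT_TIMEZONE('UTC', '2019-05-01 08:00:00'::TIMESTAMP_LTZ) AS UTC_TIME, // 13:00 +0000, still timezone-aware
TO_TIMESTAMP_NTZ(CONVERT_TIMEZONE('UTC', '2019-05-01 08:00:00'::TIMESTAMP_LTZ)) AS UTC_NTZ; // 13:00, timezone stripped
//==========================================================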

Set Up Security Objects

Now, we need to create security objects that allow Tableau to securely access the data. You’ll need access to the SECURITYADMIN role for this.

Run the following SQL to create a user and a role, and to grant that role access to our recently created Snowflake objects:

//==========================================================
// create user and role
//==========================================================
USE ROLE SECURITYADMIN;

// user
CREATE USER IF NOT EXISTS
TABLEAU_DEV_SERVICE_ACCOUNT
COMMENT='Account for Tableau dashboard development'
PASSWORD='my super cool password.' // use your own password, dummy
MUST_CHANGE_PASSWORD=false;

// role
CREATE ROLE IF NOT EXISTS
TABLEAU_ROLE
COMMENT='Role for Tableau dashboard development';
//==========================================================


//==========================================================
// Assign permissions to new tableau objects
//==========================================================
USE ROLE SECURITYADMIN;

// grant db and warehouse access to role
GRANT USAGE ON DATABASE TABLEAU_DATABASE TO ROLE TABLEAU_ROLE;
GRANT USAGE ON SCHEMA TABLEAU_DATABASE.SNOWFLAKE_MONITORING TO ROLE TABLEAU_ROLE;
GRANT SELECT ON ALL VIEWS IN SCHEMA TABLEAU_DATABASE.SNOWFLAKE_MONITORING TO ROLE TABLEAU_ROLE;
GRANT SELECT ON FUTURE VIEWS IN SCHEMA TABLEAU_DATABASE.SNOWFLAKE_MONITORING TO ROLE TABLEAU_ROLE;
GRANT USAGE ON WAREHOUSE TABLEAU_WAREHOUSE TO ROLE TABLEAU_ROLE;

// grant role to user and sysadmin
GRANT ROLE TABLEAU_ROLE TO USER TABLEAU_DEV_SERVICE_ACCOUNT;
GRANT ROLE TABLEAU_ROLE TO ROLE SYSADMIN;

// Set the default role and warehouse for the user account
ALTER USER
TABLEAU_DEV_SERVICE_ACCOUNT
SET
DEFAULT_ROLE=TABLEAU_ROLE
DEFAULT_WAREHOUSE=TABLEAU_WAREHOUSE;
//==========================================================

NOTE: It is extremely important that you choose your own password for your service account. Remember this password so you can use it to connect from Tableau.

It is not necessary to set user defaults for your service account as Tableau’s Snowflake connector will let you choose a role and warehouse when connecting, but I find using defaults makes for less typing later on.
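Before heading to Tableau, you can sanity-check everything from SQL. Assuming your own user has SYSADMIN (which we granted TABLEAU_ROLE to above), something like this quick sketch should return rows:

//==========================================================
// sanity-check the new role
//==========================================================
// confirm the grants landed where we expect
SHOW GRANTS TO ROLE TABLEAU_ROLE;

// try a read exactly as Tableau will
USE ROLE TABLEAU_ROLE;
USE WAREHOUSE TABLEAU_WAREHOUSE;
SELECT * FROM TABLEAU_DATABASE.SNOWFLAKE_MONITORING.SNOWFLAKE_USAGE LIMIT 10;
//==========================================================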

At this point, you’re ready to configure Tableau.

Set Up Tableau

Make sure you’re using the professional version of Tableau (the free version doesn’t include the Snowflake connector).

From the “Data Source” tab on a new workbook, select the Snowflake connector (this will be in Connect -> To a Server -> More… -> Snowflake).

This will bring up the Snowflake connector configuration dialog. Enter the following information:

Make sure to use the correct URL for your server (this will be the same URL shown in the browser when you’re in the Snowflake Web UI).

The “Username” and “Password” values are the same that you used to create your Tableau service account. If you didn’t set a default role for your account, you’ll also want to enter TABLEAU_ROLE as the value for the “Role” field.

From here, select your warehouse, database, schema, and table (which is actually a view in our case):

At this point, you’re ready to begin visualizing data!

Visualizing Snowflake Usage Data

Download the Tableau workbook I’m using here if you’d like your own copy. You should be able to replace the data source with your own live data using the instructions here (as long as you’ve kept the same view schema for SNOWFLAKE_USAGE).

This workbook contains two dashboards. The first shows annual spending across the platform, with 30-day rolling averages for both compute and storage spending. The following image shows actual usage data from one of our demo Snowflake instances:

This shows that our average run rate on Snowflake is ~$4/day (not bad!).
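For reference, the math behind that number is just credits multiplied by your contracted credit price (plus storage). Here’s a sketch of the compute side with a 30-day rolling average; the $3.00-per-credit rate is a placeholder, so substitute your own contract’s rate:

//==========================================================
// daily compute spend with a 30-day rolling average
//==========================================================
SELECT
DATE_TRUNC('DAY', DATE) AS USAGE_DAY,
SUM(CREDITS_USED) * 3.00 AS COMPUTE_DOLLARS, // placeholder: $3.00 per credit
AVG(SUM(CREDITS_USED) * 3.00) OVER (
ORDER BY DATE_TRUNC('DAY', DATE)
ROWS BETWEEN 29 PRECEDING AND CURRENT ROW // assumes one row per day
) AS ROLLING_30_DAY_AVG
FROM TABLEAU_DATABASE.SNOWFLAKE_MONITORING.SNOWFLAKE_USAGE
WHERE CREDITS_USED IS NOT NULL
GROUP BY DATE_TRUNC('DAY', DATE)
ORDER BY USAGE_DAY;
//==========================================================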

The second dashboard is meant to allow for exploring the main driver of our costs in Snowflake: warehouse usage.

This shows our daily warehouse usage and a percent breakdown of which warehouses contributed to the usage.
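Under the hood, that breakdown is easy to reproduce in SQL with Snowflake’s RATIO_TO_REPORT window function, which handles the share-of-total math. A quick sketch:

//==========================================================
// share of total credits by warehouse, most expensive first
//==========================================================
SELECT
WAREHOUSE_NAME,
SUM(CREDITS_USED) AS TOTAL_CREDITS,
RATIO_TO_REPORT(SUM(CREDITS_USED)) OVER () AS PCT_OF_TOTAL
FROM TABLEAU_DATABASE.SNOWFLAKE_MONITORING.SNOWFLAKE_USAGE
WHERE WAREHOUSE_NAME IS NOT NULL
GROUP BY WAREHOUSE_NAME
ORDER BY TOTAL_CREDITS DESC;
//==========================================================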

The real value of this dashboard comes from being able to select peaks in the usage chart to explore which warehouses are driving the most cost over time.

Here you can see that the warehouse development I performed for our Snowflake + Stitch blog post drove a spike in usage at the end of February. You’ll also notice that I used some expensive ingestion schedules for a week before turning Stitch down to a cheaper frequency.

Use the Guide and Get Started

These dashboards are a simple but powerful way to begin understanding the cost drivers in Snowflake. They also serve as a useful starting point for building other Tableau dashboards on live Snowflake data.

The cloud-native nature of Snowflake, combined with its automatic micro-partitioning and caching, makes it a perfect fit for enabling complex reporting in Tableau with minimal latency. I hope you’ll give it a try!

Need Snowflake Cloud Data Warehousing and Migration Assistance?

If you’d like additional assistance in this area, Hashmap offers a range of enablement workshops and assessment services, cloud migration services, and consulting service packages as part of our Snowflake service offerings. We would be glad to work through your specific requirements, so please reach out.

Feel free to share on other channels, and be sure to keep up with all new content from Hashmap by following our Engineering and Technology Blog and subscribing to our IoT on Tap podcast.


Randy Pitcher is a Cloud and Data Engineer (and OKC-based Regional Technical Expert) with Hashmap, providing Data, Cloud, IoT, and AI/ML solutions and consulting expertise across industries with a group of innovative technologists and domain experts accelerating high-value business outcomes for our customers.

Be sure to connect with Randy on LinkedIn and reach out for more perspectives and insight into accelerating your data-driven business outcomes, or to schedule a hands-on workshop to help you go from Zero to Snowflake.
