Snowflake Warehouse Management for Small BI and Analytics Teams

In 2021, we embarked on our Snowflake rollout as an enterprise customer where we forecasted budget and credits based on actual use cases. Interestingly enough, I was formally instructed to not use Snowflake because it was thought to be “expensive.” 2 years later, I am running a Snowflake data platform that is more performant for 1/4 the cost of our legacy on-premise data infrastructure. Here is what I learned about Snowflake warehouse management.

Snowflake Credit Burn

What is a Snowflake Warehouse?

The term “Warehouse” in Snowflake means virtual compute resources and shouldn’t be confused with a “data warehouse”. A majority of the cost for Snowflake is based on warehouse (compute) utilization. The separation of compute from storage is an integral part of Snowflake’s architecture and design. Because credits and spend is attributed to the Warehouse utilization, you need to be thoughtful how you design and deploy your Warehouses.

Warehouses Segmentation by Use Case and Spend Categorization

Our Snowflake instance has 5 warehouses. In all cases, we started with X-Small or Small instances processing thousands up to tens of millions of records. I do not recommend over segmenting and creating many warehouses. You will unnecessarily run more concurrent warehouse instances making your detailed spend tracking a very expensive endeavor.

Here is the inventory of Warehouses I have employed and what I learned:

REPORTING-SM — This is the warehouse we assign to any BI / reporting applications where there is a system user for generalized executive and managerial dashboards and reports in PowerBI or Tableau.

Increased usage and spending in this area is a good thing. I explained to our executive team that if I am not increasing my spend 20% every year due to increased adoption of our Snowflake investment driving more reporting and analytics use cases then we are not doing a great job.

LOADING-XS — This is the warehouse we use for all processes that pull data into Snowflake or simply push data out. Unlike my Reporting warehouse, when my Loading usage skyrocketed, I was panicking. The value and ROI of loading and pushing data in and out of your Snowflake instance is low so burning a lot of credits for this activity due to poor design is not good.

At one point I realized my utilization increased 40% YoY because a vendor was using JDBC to push data rather than Snowpipe.

TRANSFORMING-SM — This warehouse is where we do our data preparation work, notebooks, and jobs that we want to isolate from reporting and simply loading data. For example, we linked our ETL and Datameer tools into Transforming warehouse so we can ensure there are no processes abusing and hanging. This is a low % of overall usage but important to monitor user or application abuse of credits.

EVALUATION-XS — I created this role specifically for new projects and vendor evaluation and proof of value. Poorly designed Snowflake vendor integrations will burn a hole through your wallet. If you evaluate a vendor, make sure you review their utilization of your snowflake credits as part of your total cost of ownership.

Warehouse Sizing Findings

I recommend reading all of the Warehouse setting and understand all of the options at your disposal, as Snowflake has added a number of options to fine tune your Warehouse

Start Extra Small and Grow

Use data to drive your sizing decisions. I run queries that load hundreds of millions of records daily on an XS instance. Snowflake provides analysis on your warehouse so you can thoughtfully manage your credit burn.

Tracking Spend

One of the limitations of Snowflake is the inability to monitor credit spend by individual users. If you need to establish SLAs and budgets you may find yourself creating warehouses by line of business.

Suspend After 5 Min

We had fairly low concurrency so it made sense to reduce the time to auto-suspend compute resources down to 5 min. This adds up over the course of a year.

I know my learning's are based on small enterprise deployment so I am very interested to learn what Warehouse schemes and lessons you have learned?

--

--

Ryan Goodman
Snowflake Builders Blog: Data Engineers, App Developers, AI/ML, & Data Science

I have been turning data into knowledge for 20 years. I am here to share my journey as entrepreneur, technologist and data geek. https://tinyurl.com/yp5urkrp