Is your warehouse half empty?

Snowflake Warehouse Utilization: What it is and why it matters

TL;DR: Install OpsCenter, reduce your auto-suspend times, stop creating new warehouses at the drop of a hat, and take utilization seriously. Your wallet will thank you.

I recently shared a statistic on LinkedIn: most Snowflake warehouses are idle half the time. Sure, it’s a little click-baity, but most Snowflake users we talk to are surprised by their own Snowflake utilization. The biggest reason for this surprise is the same as for any metric: until you monitor it, you shouldn’t assume you know what is going on (and you will definitely struggle to improve it).

Warehouse Utilization explained

Warehouse utilization quantifies how effectively you’re using the compute you pay for. For most organizations, warehouse compute is the largest component of the overall Snowflake bill. Each time you start a warehouse, you begin incurring costs until that warehouse is suspended. Warehouse utilization describes how well you use the warehouse while it is running. For example:

  • 0% Utilization => No queries ran on the warehouse while it was up.
  • 100% Utilization => Queries utilized all the resources on the warehouse while it was up.

Most warehouses sit somewhere in the middle. You’d rarely start a warehouse without running at least one query on it, and it’s very difficult to ensure every CPU cycle is used from the moment a warehouse starts to the moment it is suspended.

So let’s draw a more realistic example: a single query starts a warehouse. The query runs for 10 seconds, then completes. The warehouse has the Snowflake-default 10-minute auto-suspend value. In this case, if the query used all resources during those ten seconds, we’d calculate warehouse utilization as follows:

  • Query Seconds: 10
  • Warehouse Seconds: 610 (10 seconds running + 10 minutes idle)
  • Utilization: 1.6% (10/610)

However, note that above we said “if the query used all resources during those ten seconds”. In reality, a single query rarely saturates a warehouse. Warehouses are built to run concurrent queries, so when doing this calculation we also need to account for the query’s actual consumption of warehouse capacity. Let’s assume this query used 25% of the warehouse while it ran. This yields the following updated numbers:

  • Query Seconds: 10
  • Query-warehouse Seconds: 2.5
  • Warehouse Seconds: 610
  • Utilization: 0.4% (2.5/610)

If this were a Medium warehouse at standard rates, you would have spent ~$1.33 for this query, and of that, less than a penny would have gone toward actually running it.
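The arithmetic is easy to reproduce. A back-of-the-envelope sketch, assuming a Medium warehouse (4 credits/hour) and a rate of roughly $2.00 per credit (actual rates vary by edition, region, and contract):

```sql
-- Cost of the example above: 610 warehouse-seconds of uptime, of which only
-- 2.5 query-warehouse seconds did useful work.
SELECT
    610 / 3600 * 4         AS credits_billed,   -- ~0.68 credits for the uptime
    610 / 3600 * 4 * 2.00  AS dollars_billed,   -- ~$1.36 of total spend
    2.5 / 3600 * 4 * 2.00  AS dollars_useful,   -- ~$0.006: less than a penny
    2.5 / 610              AS utilization;      -- ~0.004, i.e. ~0.4%
```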

This really shows the impact that warehouse utilization can have on overall efficiency and spend.

Auto-suspend to the rescue, sort of?

Snowflake’s automatic suspension is great and drastically reduces resource waste. However, it can’t eliminate under-utilization. It’s important to remember:

  • Continuously running queries on a warehouse doesn’t mean that warehouse is fully utilized. A big warehouse running small queries may never use most of its capacity.
  • Suspension isn’t immediate. The auto-suspend value defaults to 10 minutes, and the timers backing suspension can only be tuned down to ~60 seconds.
  • A warehouse always costs at least 60 seconds of credits, even if a query runs for 1 second and the warehouse is then manually suspended.
  • One or two queries rarely saturate a warehouse. In most cases, you need multiple concurrent queries to leverage its entire capacity.
  • Spreading query activity across a larger number of warehouses increases the likelihood of under-utilization on any one of them.

So while auto-suspend helps, there are several dynamics that can still reduce utilization (and spend efficiency).
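If you want a quick read on where your timers stand today, you can scan the SHOW WAREHOUSES output. A minimal sketch (the quoted lowercase column names come from the SHOW result set):

```sql
-- Find warehouses whose auto-suspend timer is longer than 60 seconds.
SHOW WAREHOUSES;
SELECT "name", "size", "auto_suspend"
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
WHERE "auto_suspend" > 60
ORDER BY "auto_suspend" DESC;
```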

What’s a good utilization target?

Targeting low latency can hurt utilization

Utilization targets depend on your workloads. Maximizing warehouse utilization is a bin-packing problem with a deadline component, an age-old resource trade-off: either hold capacity in reserve for bursts of activity or don’t. In Snowflake, that translates to a question: how important is latency versus maximum completed work per credit? For most organizations, the answer depends on the workload. Most queries fall into one of the following categories:

  1. Machine-driven operations. People are not directly involved in immediately consuming the outputs. This includes things like ETL and dbt jobs.
  2. Somewhat latency sensitive human-driven operations. This includes things like interactive analysis and drilldown.
  3. Highly latency sensitive human-driven operations. This includes things like dashboards.

As latency and predictability needs increase, you’ll want to hold back additional capacity so you can respond quickly to bursts of activity. As such, we recommend different utilization targets for each type of workload. For machine-driven workloads (#1), target something close to 95% utilization. Achieving this usually means queueing workloads and running them in parallel while avoiding running warehouses until absolutely necessary (defer warehouse resume until queues are full); one way to bias a warehouse in this direction is sketched below. For human-driven operations, utilization will almost always be lower, since you want to minimize user latency, which means running queries even when there aren’t enough of them to fill the warehouse. Reasonable targets might be 80% for #2 and 60% for #3.
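To make the machine-driven end of that spectrum concrete, here is a hedged sketch of warehouse settings that bias toward packing work densely rather than answering instantly (the warehouse name and values are illustrative, not a universal recommendation):

```sql
-- Illustrative settings for a machine-driven (ETL) warehouse: suspend quickly
-- when idle and pack more concurrent queries instead of reserving latency headroom.
ALTER WAREHOUSE etl_wh SET
    AUTO_SUSPEND = 60            -- release the warehouse as soon as the queue drains
    MAX_CONCURRENCY_LEVEL = 12;  -- allow denser query packing (the default is 8)
```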

How do I monitor utilization?

Unfortunately, Snowflake doesn’t yet provide direct reporting on warehouse utilization. There has been some talk of a new metric, but nothing is generally available yet. We built Sundeck OpsCenter, a free and open Snowflake Native App, to help you understand this. You can install it from the Snowflake Marketplace and use it for free, and we strongly suggest you do.

If you aren’t familiar with native apps, they run entirely within your Snowflake account, so you don’t have to worry about your query activity being shared with anyone else while using OpsCenter.

One useful item OpsCenter provides is a utilization heatmap. It shows utilization over the last several days and weeks on a per-warehouse basis, letting you see where your biggest opportunities for improving utilization are.

Sundeck OpsCenter utilization heatmap

Our Detailed Calculation

Because OpsCenter is a community project, you can fully understand this calculation by reading the OpsCenter code on GitHub. To monitor utilization, OpsCenter calculates two key values:

  1. Metered compute credits per warehouse per day, a value we typically refer to as loaded cost. Loaded cost includes both query time and waste (e.g., idle time). It is calculated from the actual billing amounts per warehouse in WAREHOUSE_METERING_HISTORY. Note that we avoid the WAREHOUSE_EVENTS_HISTORY view for this calculation because it is notoriously unreliable (and it isn’t necessary).
  2. Load-weighted query seconds per warehouse per day, the unloaded cost. Using QUERY_HISTORY, we multiply the runtime of each query by its reported QUERY_LOAD_PERCENT and sum the result across all queries run during the day. If a query spans midnight, we count only the portion that falls within the day being computed.

We then join our unloaded and loaded cost rollups by warehouse_id and date to compute the utilization percentage by day. The specific queries can be found here, with supporting views here and here. A simplified sketch of this join follows.
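To make the shape of that join concrete, here is a simplified sketch. It is not the actual OpsCenter query: it hardcodes a warehouse size, ignores the midnight-splitting step, and skips edge cases the real views handle.

```sql
-- Simplified utilization sketch. Assumes every warehouse is a Medium
-- (4 credits/hour) to convert billed credits back into warehouse-seconds;
-- the real calculation derives the size per warehouse.
WITH loaded AS (    -- credits actually billed, per warehouse per day
    SELECT warehouse_id,
           TO_DATE(start_time) AS day,
           SUM(credits_used)   AS credits_billed
    FROM snowflake.account_usage.warehouse_metering_history
    GROUP BY 1, 2
), unloaded AS (    -- load-weighted query seconds, per warehouse per day
    SELECT warehouse_id,
           TO_DATE(start_time) AS day,
           -- Treat a missing QUERY_LOAD_PERCENT (e.g. result-cache hits) as zero load.
           SUM(total_elapsed_time / 1000 * COALESCE(query_load_percent, 0) / 100)
               AS query_warehouse_seconds
    FROM snowflake.account_usage.query_history
    WHERE warehouse_id IS NOT NULL
    GROUP BY 1, 2
)
SELECT l.warehouse_id,
       l.day,
       u.query_warehouse_seconds / NULLIF(l.credits_billed / 4 * 3600, 0)
           AS approx_utilization
FROM loaded l
JOIN unloaded u
  ON l.warehouse_id = u.warehouse_id AND l.day = u.day
ORDER BY l.day, l.warehouse_id;
```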

We’ve explored approaches to deriving total credits consumed from per-query cost estimates, but have since decided the question is not about Snowflake’s billing (which is quite accurate). The question is: “Out of the compute that we paid for, how much did we actually use?”

How accurate can we get?

Sundeck’s model reliably represents warehouse utilization in most cases. However, there are a few challenges with the metric that you should be aware of.

  • Poor QUERY_LOAD_PERCENT granularity: This metric helps us determine the relative consumption of a warehouse and whether more queries could have run. However, it is a single value for an entire query. Most queries vary their resource consumption over time, so a single value is presumably some form of average or peak.
  • Nested queries are difficult to account for. Snowflake’s QUERY_HISTORY view doesn’t indicate that one query triggered another (most common when using Snowflake Scripting blocks). In this scenario, the calculation may overestimate the actual load on the cluster (over-estimating utilization).

It’s also important to remember that warehouse compute is only one component of overall Snowflake spend, and utilization is solely focused on compute. If your scenario is unusual (say 99% of your bill is for storage), raising warehouse utilization probably won’t improve your bottom line as much as it does for most.

How does the OpsCenter metric compare with what Snowflake provides?

Snowflake recently announced plans to provide granular warehouse utilization insights. To date, that feature is still in a relatively narrow private preview. Unfortunately, this means we can’t yet report on how it compares or when it will become generally available.

What should I do today to increase my utilization?

  • Install OpsCenter. It’s free, easy, and will ensure you know what your utilization is. We even have a useful quickstart to get you going.
  • Set auto-suspend time on all warehouses to 60 seconds. It’s the easiest way for most organizations to improve their utilization (a one-line example follows this list).
  • Avoid creating new warehouses until absolutely necessary. As noted above, more warehouses almost always result in lower overall utilization. If you need fine-grained cost metrics, you can use Sundeck’s workload tagging instead of splitting warehouses just for cost attribution. (Use separate warehouses only when you have query efficiency/performance reasons to do so.)
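For the auto-suspend change, the statement is a one-liner per warehouse; a minimal sketch (the warehouse name is illustrative):

```sql
-- Drop the auto-suspend timer to 60 seconds. Auto-resume is on by default,
-- so the warehouse still wakes for the next query; this is usually a safe change.
ALTER WAREHOUSE my_wh SET AUTO_SUSPEND = 60;
```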

And if you’re interested in this kind of thing, follow our Snowflake blog. We spend a lot of time thinking about ways to make your Snowflake more efficient and more capable.
