Best practices to optimize Snowflake spend

Snowflake is constantly improving the value of a Snowflake credit through ongoing, transparent platform improvements. Customers benefit from these improvements, as quantified by the Snowflake Performance Index, without opting in or doing any additional work.

This post details best practices for customers to optimize Snowflake spend over and above the ongoing platform improvements delivered by Snowflake.

Components of Snowflake cost

Snowflake credits pay for the consumption of resources on Snowflake, which falls into three broad areas:

1/ Compute, which comprises: a/ Virtual Warehouses, b/ Cloud Services, and c/ Serverless services. Virtual Warehouses are typically responsible for most of an account’s spend. They are the primary compute primitive in Snowflake, with elastic & instant-on, scale up & out, and auto-suspend & resume properties, and are billed on a per-second basis. Cloud Services costs cover the footprint of Snowflake’s control plane, SHOW commands, query compilation, result caches, etc., and are only charged when that footprint exceeds 10% of the daily compute footprint from warehouses (which is rare). Serverless services include Query Acceleration, Automatic Clustering, Materialized Views, Search Optimization, Snowpipe, Replication, etc., which are paid services that provide meaningful value.
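As a minimal sketch (assuming the SNOWFLAKE.ACCOUNT_USAGE share is accessible to your role), the following query breaks daily credit usage down by service type and shows how rarely cloud services credits are actually billed:

```sql
-- Sketch: daily credits by service type, including the cloud services adjustment.
SELECT
    usage_date,
    service_type,
    credits_used_compute,
    credits_used_cloud_services,
    credits_adjustment_cloud_services,  -- credits waived under the 10% rule
    credits_billed
FROM snowflake.account_usage.metering_daily_history
WHERE usage_date >= DATEADD('day', -30, CURRENT_DATE())
ORDER BY usage_date DESC, service_type;
```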

2/ Storage is usually a small fraction of the overall bill and benefits from a “pass-through pricing” model where Snowflake passes along cloud provider costs. Storage on Snowflake can also benefit from compression, which can reduce footprint by up to 7x.

3/ Data Transfer & Egress is also usually a small fraction of the overall bill. It is likewise cloud-provider specific, follows a “pass-through” model, and is accrued when features such as COPY, replication, and external functions are used.

Best practices to optimize spend

Our best practices to optimize Snowflake spend include:

1/ Obtain visibility into cost & implement chargeback

We recommend that customers track usage at multiple granularities (such as organization, account, workloads, warehouses, users, tasks, etc.) and attribute spending to specific teams, customers, or cost centers (via tagging or other cost attribution features) to realize “chargeback” or “showback” scenarios.

Snowflake provides granular usage data via the Account Usage and Organization Usage views. This data serves both finance teams and Snowflake practitioners, enabling a robust FinOps practice.

Granular usage data for both Snowflake practitioners and finance personas.

A number of pre-built queries can be used for common actions around gaining visibility into spend at multiple levels of granularity (per-warehouse, per-user, per-job, etc.). Additionally, tagging provides a mechanism to attribute and reason about spend in logical groupings, as sketched below.
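For illustration, here is a minimal tagging sketch; the tag cost_center and the warehouse marketing_wh are hypothetical names, while the commands follow Snowflake’s object tagging and ACCOUNT_USAGE conventions:

```sql
-- Sketch: attribute warehouse spend to a cost center via object tagging.
CREATE TAG IF NOT EXISTS cost_center;

ALTER WAREHOUSE marketing_wh SET TAG cost_center = 'marketing';

-- Join tag references with metering history to group spend by cost center.
SELECT
    tr.tag_value          AS cost_center,
    SUM(wmh.credits_used) AS credits_used
FROM snowflake.account_usage.tag_references tr
JOIN snowflake.account_usage.warehouse_metering_history wmh
  ON wmh.warehouse_name = tr.object_name
WHERE tr.tag_name = 'COST_CENTER'
  AND tr.domain = 'WAREHOUSE'
GROUP BY tr.tag_value
ORDER BY credits_used DESC;
```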

This query shows the credit consumption of warehouses over time to help understand consumption trends.
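A minimal sketch of such a query, assuming access to the ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY view:

```sql
-- Sketch: weekly credit consumption per warehouse to surface consumption trends.
SELECT
    warehouse_name,
    DATE_TRUNC('week', start_time) AS usage_week,
    SUM(credits_used)              AS credits_used
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD('month', -3, CURRENT_TIMESTAMP())
GROUP BY warehouse_name, usage_week
ORDER BY usage_week, credits_used DESC;
```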

In addition, a UI-based cost exploration experience enables quick and easy visual examination of dashboards detailing usage, usage trends, and timeline-based drill-downs of this usage data.

UI-based cost explorer showing visualization of usage data, cost attribution via tagging, and usage trends.

Furthermore, the Snowflake Usage Insights Streamlit app can be used to visualize usage data in Snowflake and gain richer insights into spend via visualizations such as heatmaps and flame graphs.

Visualizations from the Snowflake Usage Insights Streamlit app showing warehouse usage heatmaps, scatter plots (# of queries vs execution time), and more to help explore spend.

2/ Proactive budgeting & alerting

We recommend setting up budgets at the level of an account and/or for custom groupings of resources, along with both proactive & reactive alerts on spending limits and guards against overspending.

Budgets can be used to define a spending limit over a time interval for an account or a custom grouping of credit-consuming objects.

Budgets UI experience.

Budgets help monitor warehouse and serverless usage (i.e., automatic clustering, materialized views, search optimization, Snowpipe, and replication). When the spending limit is projected to be exceeded, a daily notification is sent.

Budgets allow setting spending limits for both accounts and custom groupings of credit consuming resources.
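As a rough sketch (Budgets was in preview at the time of writing, so the method names below follow the preview documentation and may differ in your account):

```sql
-- Sketch: activate the account-level budget and set a monthly spending limit (in credits).
CALL snowflake.local.account_root_budget!ACTIVATE();
CALL snowflake.local.account_root_budget!SET_SPENDING_LIMIT(1000);

-- A custom budget for a specific grouping of credit-consuming objects (name is hypothetical).
CREATE SNOWFLAKE.CORE.BUDGET marketing_budget();
CALL marketing_budget!SET_SPENDING_LIMIT(250);
```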

In addition, custom alerts can be set up to monitor and programmatically notify based on changes at finer granularities.
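For example, a sketch using CREATE ALERT and SYSTEM$SEND_EMAIL; the warehouse, notification integration, recipient, and 100-credit threshold are hypothetical:

```sql
-- Sketch: notify when any single warehouse consumes more than 100 credits in the last day.
CREATE OR REPLACE ALERT daily_warehouse_spend_alert
  WAREHOUSE = monitoring_wh
  SCHEDULE = '1440 MINUTE'   -- evaluate once a day
  IF (EXISTS (
        SELECT warehouse_name
        FROM snowflake.account_usage.warehouse_metering_history
        WHERE start_time >= DATEADD('day', -1, CURRENT_TIMESTAMP())
        GROUP BY warehouse_name
        HAVING SUM(credits_used) > 100
  ))
  THEN CALL SYSTEM$SEND_EMAIL(
        'finops_email_integration',
        'finops-team@example.com',
        'Warehouse spend alert',
        'At least one warehouse consumed more than 100 credits in the last 24 hours.');

ALTER ALERT daily_warehouse_spend_alert RESUME;  -- alerts are created in a suspended state
```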

3/ Workload segmentation based on Return-on-Investment

We recommend understanding the cost and return-on-investment of workloads to make informed decisions on re-architecting, decreasing/increasing usage, or even retiring certain workloads, with the goal of ensuring the most return on spend.

Snowflake’s cost visibility features help quantify the level of investment in each workload; the return on that investment is subjective and depends on the business value of the use case.

Metrics exposed by Snowflake, such as warehouse load metrics, warehouse utilization metrics (in private preview), metrics on data spilling to disk/object storage, and warehouse events history (in public preview), can inform and guide optimizations to “right size” compute. Additionally, per-query costs can identify the most expensive queries to prioritize for rewriting/refactoring in the interest of cost efficiency.

Utilization data for each cluster of a virtual warehouse, expressed as a percentage (in private preview), can help identify idle capacity and inform rightsizing decisions when used together with warehouse load metrics.
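A minimal sketch using the generally available ACCOUNT_USAGE.WAREHOUSE_LOAD_HISTORY view to pair with those utilization metrics:

```sql
-- Sketch: average running vs. queued load per warehouse over the last week.
-- Low avg_running with little queuing may indicate an oversized warehouse;
-- persistent queuing may indicate an undersized one (or a multi-cluster candidate).
SELECT
    warehouse_name,
    AVG(avg_running)             AS avg_running,
    AVG(avg_queued_load)         AS avg_queued_load,
    AVG(avg_queued_provisioning) AS avg_queued_provisioning,
    AVG(avg_blocked)             AS avg_blocked
FROM snowflake.account_usage.warehouse_load_history
WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY avg_queued_load DESC;
```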

Query- and operator-level metrics, with accompanying visualizations in the query profile view, provide visibility into opportunities for performance improvement (e.g., inefficient pruning, “exploding” joins, and patterns in expensive queries).

Snowflake Query Profile UI

In addition to the query profile view in the Snowflake UI, the get_query_operator_stats function can also provide these metrics.
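A minimal usage sketch of this function:

```sql
-- Sketch: fetch per-operator statistics for the most recent query in the session.
-- Any query ID from QUERY_HISTORY can be passed instead of LAST_QUERY_ID().
SELECT operator_id, operator_type, operator_statistics, execution_time_breakdown
FROM TABLE(GET_QUERY_OPERATOR_STATS(LAST_QUERY_ID()));
```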

This query provides an ordered list of users whose queries scan the most data. This is a potential opportunity to train those users on authoring more efficient queries or to enable clustering.
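A minimal sketch of such a query against the ACCOUNT_USAGE.QUERY_HISTORY view:

```sql
-- Sketch: users ranked by total data scanned over the last 30 days.
SELECT
    user_name,
    COUNT(*)                            AS query_count,
    SUM(bytes_scanned) / POWER(1024, 4) AS tb_scanned
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
GROUP BY user_name
ORDER BY tb_scanned DESC
LIMIT 25;
```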

Virtual warehouse configurations such as size and number of clusters provide the ability to balance concurrency, throughput, and cost in a manner that is best suited for the workload in question.

Virtual Warehouse controls such as size and cluster count help balance latency, throughput, and cost.

Furthermore, in many cases, it is possible to unlock superior price-performance through the judicious use of serverless services (such as query acceleration, automatic clustering, materialized views, search optimization, etc.) and/or Snowpark-optimized warehouses.

Snowpark-optimized warehouses help unlock ML training and memory-intensive analytics use cases, especially when using Snowflake’s secure Python/Java sandbox. They provide 16x more memory and 10x more local SSD cache per node compared to standard warehouses. The larger memory speeds up computations, and the larger local storage provides a speedup when cached intermediate results and artifacts such as Python packages and JARs are reused on subsequent runs. Metrics exposed under the QUERY_HISTORY view, such as BYTES_SPILLED_TO_LOCAL_STORAGE and BYTES_SPILLED_TO_REMOTE_STORAGE, indicate the extent of memory pressure, which, in many cases, can be addressed in a cost-efficient manner by moving to a Snowpark-optimized warehouse of the same size.

Sample query to identify the top offending queries in terms of memory spilling. In such cases, moving to a Snowpark-optimized warehouse of the same size can result in superior price-performance over moving to the next size of standard warehouse (especially when the workload is bound by memory more than CPU).
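A minimal sketch of such a query, followed by the statement for creating a Snowpark-optimized warehouse (the warehouse name analytics_memory_wh is hypothetical):

```sql
-- Sketch: top queries by bytes spilled over the last 7 days.
SELECT
    query_id,
    warehouse_name,
    bytes_spilled_to_local_storage,
    bytes_spilled_to_remote_storage,
    total_elapsed_time / 1000 AS elapsed_seconds
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
  AND (bytes_spilled_to_local_storage > 0 OR bytes_spilled_to_remote_storage > 0)
ORDER BY bytes_spilled_to_remote_storage DESC, bytes_spilled_to_local_storage DESC
LIMIT 25;

-- Sketch: a Snowpark-optimized warehouse of the same size for memory-bound workloads.
CREATE OR REPLACE WAREHOUSE analytics_memory_wh
  WAREHOUSE_SIZE = 'MEDIUM'
  WAREHOUSE_TYPE = 'SNOWPARK-OPTIMIZED';
```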

Automatic Clustering reorganizes table data to align with query patterns so that only the relevant data from large tables is processed, thereby speeding up queries and consuming fewer compute credits. Pricing for Automatic Clustering is based on the credits consumed by background clustering maintenance.

Automatic clustering is performed in an incremental and non-blocking manner in the background without hindering foreground workloads.
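A minimal sketch of defining a clustering key; the table and columns are hypothetical:

```sql
-- Sketch: cluster a large table on the columns most often used in filters and joins.
ALTER TABLE sales.public.orders CLUSTER BY (order_date, region);

-- Inspect how well the table is clustered on those columns.
SELECT SYSTEM$CLUSTERING_INFORMATION('sales.public.orders', '(order_date, region)');
```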

Materialized Views store frequently used projections and aggregations to avoid expensive re-computation, resulting in faster queries and lower spend. The results from Materialized Views are guaranteed to be up to date via asynchronous, incremental maintenance. Pricing is based on refreshes and additional storage.

A materialized view is a pre-computed data set derived from a query specification and stored for later use. Querying a materialized view is faster than executing a query against the base table of the view.
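A minimal sketch; the table and columns are hypothetical:

```sql
-- Sketch: materialize a frequently used aggregation so repeated dashboard queries avoid re-computation.
CREATE OR REPLACE MATERIALIZED VIEW daily_revenue_mv AS
SELECT
    order_date,
    region,
    SUM(order_total) AS daily_revenue
FROM orders
GROUP BY order_date, region;
```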

Search Optimization Service makes it possible to quickly find “needles in a haystack,” returning a small number of rows from large tables. It accelerates point-lookup queries on all columns of supported types. This results in faster query performance on data exploration and filtering workloads, which can result in lower credit consumption.

In the example above, Search Optimization helps reduce the number of partitions to be scanned from 15 to 1, helping speed up a query to find all rows with a certain name.
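A minimal sketch of enabling search optimization and the kind of point lookup it accelerates; the table and column are hypothetical:

```sql
-- Sketch: enable search optimization on a large table (or on specific columns).
ALTER TABLE customers ADD SEARCH OPTIMIZATION;
-- ALTER TABLE customers ADD SEARCH OPTIMIZATION ON EQUALITY(name);  -- column-scoped variant

-- A point lookup that can now prune most micro-partitions.
SELECT * FROM customers WHERE name = 'Jane Doe';
```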

Query Acceleration Service (QAS) provides elastic scale-out of compute and reduces query runtime without changing warehouse size. It runs alongside the existing warehouse to accelerate the scan & aggregation portions of queries in near real time. QAS compute resources are billed per second. The SYSTEM$ESTIMATE_QUERY_ACCELERATION function and the QUERY_ACCELERATION_ELIGIBLE view help identify queries that might benefit from QAS.
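A minimal sketch of both identification paths; the query ID is a placeholder:

```sql
-- Sketch: estimate the potential QAS benefit for a specific query (placeholder query ID).
SELECT PARSE_JSON(SYSTEM$ESTIMATE_QUERY_ACCELERATION('<query_id>'));

-- Sketch: warehouses with the most QAS-eligible execution time over the last week.
SELECT
    warehouse_name,
    COUNT(query_id)                       AS eligible_queries,
    SUM(eligible_query_acceleration_time) AS total_eligible_time
FROM snowflake.account_usage.query_acceleration_eligible
WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY total_eligible_time DESC;
```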

When a warehouse has outlier queries, QAS can help improve the performance of the warehouse’s other queries by offloading the extra computing demands of the outlier queries to additional compute resources.

4/ Throttle/control resource allocation & scaling permissions

For sufficiently large and mature Snowflake use cases delivered via a central platform team, we recommend policies that restrict which users or teams can perform credit-consuming operations (such as creating warehouses, changing their size, or adjusting scaling limits) to control costs and reduce unpredictability from unintended spend.

Snowflake provides the ability to set permissions to use, create, and modify credit-consuming resources (such as warehouse creation, sizing, scaling limits, QAS scale factor, etc.) to control costs and reduce indeterminacy.

To limit the spend of workloads, virtual warehouse configurations such as size, number of clusters, scaling policy, auto-suspend timeout, and statement timeouts for running and queued statements directly place an upper bound on spend. When multi-cluster warehouses are used, the standard scaling policy optimizes for minimizing queuing over conserving credits: each successive cluster spins up only after waiting at least 20 seconds. Alternatively, the economy scaling policy optimizes for conserving credits by keeping running clusters fully loaded; new clusters are spun up only if the system estimates there will be enough work to keep the cluster busy for at least 6 minutes.
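A minimal sketch of a warehouse definition that bounds spend through these controls; all values are illustrative:

```sql
-- Sketch: a multi-cluster warehouse with cost guardrails; all values are illustrative.
CREATE OR REPLACE WAREHOUSE reporting_wh
  WAREHOUSE_SIZE                      = 'MEDIUM'
  MIN_CLUSTER_COUNT                   = 1
  MAX_CLUSTER_COUNT                   = 3          -- caps scale-out
  SCALING_POLICY                      = 'ECONOMY'  -- favor conserving credits over queuing
  AUTO_SUSPEND                        = 60         -- seconds of idle time before suspending
  AUTO_RESUME                         = TRUE
  STATEMENT_TIMEOUT_IN_SECONDS        = 3600       -- cancel runaway queries after 1 hour
  STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 600;       -- cancel statements queued over 10 minutes
```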

Pricing for virtual warehouses.

Snowflake enables a “scale to zero” model with auto-suspend and auto-resume capabilities, which are enabled by default. This turns warehouses off when they are idle, so customers do not pay for idle time. For use cases such as ELT and data loading, which are unlikely to use query caching, the recommendation is to suspend the warehouse immediately when it is idle. Use cases that take advantage of query caching (e.g., Business Intelligence and dashboarding) can benefit from longer suspend times.

This query identifies all warehouses that do not have auto-suspend enabled. By default, all warehouses have auto-suspend enabled to ensure they’re not causing credit spend when not processing queries.
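A minimal sketch using SHOW WAREHOUSES and RESULT_SCAN, plus an illustrative aggressive auto-suspend setting for an ELT warehouse:

```sql
-- Sketch: find warehouses with auto-suspend disabled (NULL or 0 in the SHOW output).
SHOW WAREHOUSES;
SELECT "name", "size", "auto_suspend"
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
WHERE "auto_suspend" IS NULL OR "auto_suspend" = 0;

-- Sketch: suspend an ELT warehouse aggressively, since it rarely benefits from query caching.
ALTER WAREHOUSE elt_wh SET AUTO_SUSPEND = 60;
```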

Resource Monitors provide alerting and hard limits to prevent overspend via credit quotas for individual warehouses. When a warehouse reaches a limit, a notification and/or suspension can be initiated. Resource monitors can also be set up on a schedule to track and control credit usage by virtual warehouses.
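A minimal sketch; the quota, thresholds, and warehouse are illustrative:

```sql
-- Sketch: cap a warehouse at 100 credits per month, warn at 75%, suspend at/above 100%.
CREATE OR REPLACE RESOURCE MONITOR analytics_rm
  WITH CREDIT_QUOTA = 100
       FREQUENCY = MONTHLY
       START_TIMESTAMP = IMMEDIATELY
  TRIGGERS
    ON 75  PERCENT DO NOTIFY
    ON 100 PERCENT DO SUSPEND            -- let running queries finish, then suspend
    ON 110 PERCENT DO SUSPEND_IMMEDIATE; -- cancel running queries and suspend

ALTER WAREHOUSE analytics_wh SET RESOURCE_MONITOR = analytics_rm;
```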

Resource Monitors help control costs and avoid unexpected credit usage from running warehouses.

Resource monitors can help prevent certain warehouses from unintentionally consuming more credits than expected.

This query identifies all warehouses without resource monitors, which could be prone to unanticipated spend.
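A minimal sketch, again using the SHOW WAREHOUSES output (the check against the string 'null' hedges against how an unset monitor is rendered):

```sql
-- Sketch: warehouses with no resource monitor attached.
SHOW WAREHOUSES;
SELECT "name", "size", "resource_monitor"
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
WHERE "resource_monitor" IS NULL OR "resource_monitor" = 'null';
```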

Query Acceleration scale factor sets an upper bound on the amount of compute resources a warehouse can lease for acceleration. The scale factor is a multiplier based on warehouse size and cost. For example, if the scale factor is 5 for a Medium-sized warehouse (4 credits/hour), the warehouse can lease compute up to 5 times its size (i.e., 4 * 5 = 20 credits/hour). By default, the scale factor is set to 8 when Query Acceleration Service is used.
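A minimal sketch of enabling QAS with an explicit scale-factor cap; the warehouse and factor are illustrative:

```sql
-- Sketch: enable QAS on a warehouse and cap leased acceleration compute at 5x its size.
ALTER WAREHOUSE reporting_wh SET
  ENABLE_QUERY_ACCELERATION           = TRUE
  QUERY_ACCELERATION_MAX_SCALE_FACTOR = 5;
```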

Conclusion and recap

To recap, our best practices for optimizing costs on Snowflake are: 1/ Obtain visibility into cost & implement chargeback, 2/ Proactive budgeting & alerting, 3/ ROI-based workload segmentation, and 4/ Throttle/control resource allocation & scaling permissions.

To this effect, we’ve organized our investments into the following areas:

Organization of Snowflake cost governance and workload optimization capabilities.

We are deeply committed to helping customers get the most return on their Snowflake credits and are working on an exciting roadmap of functionality to further this aim.
