Understanding Snowflake Optimization services

Thank you for reading my earlier blogs on Snowflake cost series. If you havent read the earlier blogs in the series, you can refer them here —

  1. Cost optimization series — https://medium.com/@poojakelgaonkar/snowflake-cost-optimization-series-7aac34697608
  2. Understanding Snowflake Cost — https://medium.com/snowflake/understanding-snowflake-cost-10165aea767f
  3. Understanding Warehouse Cost and Optimization — https://medium.com/snowflake/understanding-warehouse-cost-and-optimization-8fdffe2f68e6

You can also subscribe and read my earlier blogs on Snowflake and Data topics if not done already.

This is third blog in the series where you will learn more on the optimization services available in Snowflake.

As you know, Snowflake is a three-layered architecture and these are the major components in Snowflake cost computation as well. There are also bunch of serverless services that contributes to the Snowflake cost. There are various optimization techniques and services offered by Snowflake to improve the performance of your workloads, warehouse etc. This blog focuses more on optimization services - understanding these services and their usage to optimize the performance. There are bunch of other optimization techniques as well as recommendations on top of these services to save the cost. You will learn more about optimization techniques in the next blog.

What are optimization services?

Snowflake optimization services are the set of services that are designed and recommended to be used to optimize the performance. Search optimization, query acceleration services are the most commonly used services to optimize the performance. These are enterprise features and available to enterprise and above Snowflake editions.

What kind of optimization is achievable with these services?

There are various types of optimizations that can be implemented however the set of optimization services used to improve the query performance, warehouse performance and workload processings. This will help to improve the performance as well as reduce the Warehouse cost.

How optimization services are billed?

These optimization techniques are used as serverless services and billed separately. This is an additional component of serverless billing to the Snowflake’s billing components — warehouse + storage and cloud services. These are billed based on the usage.

What is Search Optimization Service?

The search optimization service can significantly improve the performance of certain types of lookup and analytical queries that use an extensive set of predicates for filtering. The search optimization service aims to significantly improve the performance of certain types of queries on tables, including:

  1. Selective point lookup queries on tables.
  2. Substring and regular expression searches (e.g. LIKE, ILIKE, RLIKE, etc.).
  3. Queries on fields in VARIANT, OBJECT, and ARRAY columns that use certain types of predicates like IN ,eqauality or contains
  4. Queries that use selected geospatial functions with GEOGRAPHY values.

How Search Optimization Service works?

To improve performance for point lookups, the search optimization service creates and maintains a persistent data structure called a search access path. The search access path keeps track of which values of the table’s columns might be found in each of its micro-partitions, allowing some to be skipped when scanning the table. A maintenance service is responsible for creating and maintaining the search access path.

How to identify cost of Search Optimization Service?

You can find the cost of the service and usage by visiting the billing page on Snowsight where this is added as an component of billing. In traditional UI, you can also get this listed with overall Snowflake billing and Cost.

What is Query Acceleration Service?

The query acceleration service can accelerate parts of the query workload in a warehouse. When it is enabled for a warehouse, it can improve overall warehouse performance by reducing the impact of outlier queries, which are queries that use more resources than the typical query. The query acceleration service does this by offloading portions of the query processing work to shared compute resources that are provided by the service.

When Query Acceleration Service can be used?

If you have queries that satisfies below type or pattern of queries —

  1. Queries with filters or aggregation. If the query includes LIMIT, the query must also include ORDER BY.
  2. The filters must be highly selective, and the ORDER BY clause must have a low cardinality.
  3. Query acceleration works well with ad-hoc analytics, queries with unpredictable data volume, and queries with large scans and selective filters.

How to identify if queries can be optimized using Query Acceleration?

To identify the queries that might benefit from the query acceleration service, you can use the SYSTEM$ESTIMATE_QUERY_ACCELERATION function or query the QUERY_ACCELERATION_ELIGIBLE View. The QUERY_ACCELERATION_ELIGIBLE view also identifies warehouses that might benefit from the query acceleration service.

You can also use, SYSTEM$ESTIMATE_QUERY_ACCELERATION function can help determine if a previously executed query might benefit from the query acceleration service. If the query is eligible for query acceleration, the function returns the estimated query execution time for different query acceleration scale factors.

What type of queries supported for Query Acceleration Service?

The query acceleration service supports the following SQL commands:

  1. SELECT
  2. INSERT (when the statement contains a SELECT statement) — CREATE TABLE AS SELECT (CTAS

A query, or a portion of a query (i.e. subquery or clause), with a supported SQL command might be accelerated by the query acceleration service if it is eligible for acceleration.

How to get details of Query acceleration usage?

To see the effects of query acceleration on a query, you can use the following columns in the QUERY_HISTORY View.

  • QUERY_ACCELERATION_BYTES_SCANNED
  • QUERY_ACCELERATION_PARTITIONS_SCANNED
  • QUERY_ACCELERATION_UPPER_LIMIT_SCALE_FACTOR

How to view Billing og Query Acceleration service?

You can view the billing from web UI- classic or snowsight. Refer to below image to get details on billing view —

What are the other techniques for query optimization?

Snowflake supports additional featues to improve the query performance. If you have other type of workloads where these services may not be applicable then you can plan on using materialized views or clustered tables.

Materialized Views — These can be used in case of queries that supports —

  • Equality searches.
  • Range searches.
  • Sort operations.

Note: Performance can be improved only for the subset of rows and columns included in the materialized view.

Clustered Table — All the tables data is stored in the form of micro partitions and data is pruned as and when queries are run to access the data. You can enable the auto-clustering to setup the clusters of the data, store clustered data to improve the performance. You can implement these for type of queries that includes —

  • Equality searches.
  • Range searches.

Note: A table can be clustered only on a single key, which can contain one or more columns or expressions.

Hope this blog helps you to understand the optimization services. You will learn the recommendations, techniques to improve the performance and optimize the overall cost and billing.

The overall cost series will help you to understand the Snowflake’s costing model, optimization techniques, services that can be used to improve the performance optimizations, costing of serverless services, designing and implementing appropriate cost monitoring and alerts to avoid any unprecdicted cost to the account.

Please follow my blog series on Snowflake topics to understand various aspects of data design, engineering, data governance, cost optimizations etc.

About Me :

I am one of the Snowflake Data Superheroes 2023. I am also one of the Snowflake SnowPro Core SME- Certification Program. I am a DWBI and Cloud Architect! I am currently working as Senior Data Architect — GCP, Snowflake. I have been working with various Legacy data warehouses, Bigdata Implementations, and Cloud platforms/Migrations. I am SnowPro Core certified Data Architect as well as Google certified Google Professional Cloud Architect. You can reach out to me LinkedIn if you need any further help on certification, Data Solutions, and Implementations!

--

--