Snowflake Search Optimization Service (SOS)

Fru
DemoHub Tutorials
Published in
5 min readJul 18, 2024

Note: This tutorial is self-contained, with sample data and scripts provided to ensure you can follow the exercise in your practice environment. For more tutorials, visit tutorials.demohub.dev

Imagine slicing through a massive haystack to find a single needle — that’s what it can feel like when you’re searching for specific records within vast tables in your Snowflake data warehouse. But what if you had a tool that could instantly narrow down your search, making it 60% faster? That’s precisely what Snowflake’s Search Optimization Service (SOS) delivers.

By creating a specialized index (search access path), SOS streamlines the retrieval of specific rows, leading to dramatic performance improvements, especially for point lookups and selective filtering queries.

Let’s explore how SOS works and how you can leverage it to supercharge your queries.

Why Search Optimization Matters

  • Pinpoint Precision: SOS shines when you need to quickly find specific rows in massive datasets based on exact matches or specific values.
  • Performance Boost: By creating optimized search access paths, SOS enables Snowflake to efficiently locate and retrieve relevant data, leading to dramatic query performance improvements.
  • Resource Efficiency: With faster query execution, you consume fewer compute resources, contributing to cost savings.
  • User Experience Enhancement: Rapid response times for queries translate to a more seamless and productive user experience.

How Search Optimization Works

SOS works behind the scenes, employing a persistent data structure called a search access path. This path acts as an index, keeping track of which values are present in each micro-partition of your table. When you run a query that benefits from SOS, the Snowflake optimizer consults the search access path to determine which micro-partitions to scan, significantly reducing the amount of data processed.

When to Use Search Optimization

Consider using SOS for queries that involve:

  • Equality Filters: WHERE column = value
  • IN Filters: WHERE column IN (value1, value2, …)
  • Range Filters: WHERE column BETWEEN value1 AND value2
  • Geography Filters: WHERE ST_DISTANCE(column, ST_MAKEPOINT(longitude, latitude)) < radius
  • Substring Filters: WHERE column LIKE ‘%value%’

Enabling Search Optimization

You can enable SOS for specific columns or for an entire table:

-- Enable for a whole table (all eligible columns)
ALTER TABLE my_large_table ADD SEARCH OPTIMIZATION;

-- Enable for specific columns with specific search methods
ALTER TABLE my_large_table ADD SEARCH OPTIMIZATION
ON (column1 USING EQUALITY) - Optimize for equality and IN filters
ON (column2 USING SUBSTRING); - Optimize for substring filters

Practical Hands-on Example

Let’s explore how Snowflake Search Optimization Service (SOS) can drastically improve query performance for point lookups and selective filtering. We’ll use your OrdersDB data model to demonstrate the impact.

⛁ Sample Data Model: ordersdb-data-model

Make sure you have the ordersdb sample data model loaded into Snowflake. Then, set the context to the ordersdb.public schema, and proceed with the rest of the exercises.

-- Assign the database schema
USE SCHEMA ordersdb.public;

1. Preparation: Create Test Tables

First, let’s create two identical copies of the sales_order_item table:

-- Create a copy of the original sales_order_item table without SOS
CREATE OR REPLACE TABLE without_sos_sales_order_item AS
SELECT * FROM OrdersDB.PUBLIC.SALES_ORDER_ITEM;

-- Create another copy of the table to enable SOS on
CREATE OR REPLACE TABLE with_sos_sales_order_item AS
SELECT * FROM OrdersDB.PUBLIC.SALES_ORDER_ITEM;

You can confirm that both tables have the same data using this query:

SHOW TABLES;

2. Enable Search Optimization (SOS)

Next, enable SOS on one of the tables.

-- Enable for a whole table (all eligible columns)
ALTER TABLE with_sos_sales_order_item ADD SEARCH OPTIMIZATION;

-- Enable for specific columns with specific search methods
ALTER TABLE with_sos_sales_order_item ADD SEARCH OPTIMIZATION ON EQUALITY(ORDER_ITEM_ID,order_id);

This enables SOS for the ORDER_ITEM_ID and ORDER_ID columns, optimizing them for equality-based queries.

Verify SOS enabled on Tables.

select $2 as table_name, $14 as search_optimization, $15 as search_optimization_progresss, $16 as search_optimization_bytes, *
from table(result_scan())
where $2 in ('WITHOUT_SOS_SALES_ORDER_ITEM','WITH_SOS_SALES_ORDER_ITEM');

3. Select a Test Order

--  Randomly select an order_id for testing 
SELECT order_id
FROM sales_order
SAMPLE (1 rows);

We will use the order_id from the query above to compare the performance between the tables with and without SOS.

4. Run Test Queries with and without SOS

Now, execute two identical queries — one on each table — to compare their performance — using the same warehouse and everything else being equal.

Without SOS:

SELECT * 
FROM without_sos_sales_order_item
WHERE order_id = 3885638; - Replace with your order_id

With SOS:

SELECT * 
FROM with_sos_sales_order_item
WHERE order_id = 3885638; - Replace with your order_id

5. Analyzing the Results

  • Snowsight Query Profile: The Query Profile in Snowsight will show you a breakdown of each query’s execution time. Look to see how SOS impacted the query with SOS enabled.
  • System View: You can also check the SEARCH_OPTIMIZATION_HISTORY view to see detailed statistics about search optimization usage

Comparing and visualizing the results; we see a significant performance improvement in the query where SOS was enabled. In this illustrative example, the query with “sos” being 60.42% faster.

This is because SOS dramatically reduces the amount of data that Snowflake needs to scan to find the matching rows.

Monitoring Costs and Optimizing Usage

SOS consumes credits, so it’s important to monitor its usage. You can find cost information in the SEARCH_OPTIMIZATION_HISTORY view.

select * from snowflake.account_usage.search_optimization_history;

Additional Tips:

  • Varying Data Sizes: Experiment with different table sizes to see how QAS performance scales.
  • Snowsight: Use Snowsight’s “Query Profile” to visualize the query execution plan and identify where QAS is being utilized.

Key Takeaways:

  • Search Optimization can greatly improve query performance for specific use cases.
  • You can selectively enable SOS on columns that are frequently used for point lookups or filtering.
  • Use Snowsight and system views to monitor the impact of SOS on your queries and overall workload.

Resources

Originally published at https://tutorials.demohub.dev.

--

--

Fru
DemoHub Tutorials

🎥 800K+ YouTube views 📺 6K+ Subscribers 🌐 10K+ DemoHub.dev 💡 Tech Simplifier 🚀 Innovation Catalyst 🤝 Data Analytics & AI Leader | 🔍 All views are my own