Snowflake Automatic Clustering Cost Estimation

Automatic Clustering is one of the most powerful ways to tune Snowflake performance. By setting a clustering key, you can optimize specific access patterns to dramatically improve query performance. The Automatic Clustering cost is dependent on a wide variety of factors such as clustering key cardinality, table size, and DML patterns. As such, it can be challenging to understand the cost impact of clustering key changes.

We’re excited to announce that you can use Automatic Clustering cost estimation, now in Public Preview, to estimate Automatic Clustering costs. You can run SYSTEM$ESTIMATE_AUTOMATIC_CLUSTERING_COSTS prior to making clustering key changes in order to forecast the resulting Automatic Clustering costs.

Function usage

You can run Automatic Clustering cost estimation for any tables for which you have permission to modify the clustering key. Below is an example SQL statement to run the function on a table named myTable with a potential clustering key of tenantId

SELECT SYSTEM$ESTIMATE_AUTOMATIC_CLUSTERING_COSTS(‘myTable’, ‘(tenantId)’);

In this example, the function would produce output showing a breakdown of the initial costs and maintenance costs.

There are two parts of the cost estimate:

These two costs are independent from each another and each assume that table size and DML patterns remain the same. For example, a new clustering key that only slightly deviates from the current clustering key may have a low initial cost. However, that same clustering key’s maintenance cost could be higher if the table has frequent updates. By breaking down the Automatic Clustering costs, you can understand the potential cost impact of experimenting with new clustering keys.

Next steps

Try the Automatic Clustering cost estimation function and use Automatic Clustering to improve query performance.

--

--