Snowflake Snowpro Core Preparation Part 11-Performance Optimization — Query Profile & Query History & Resource Monitors

Ganapathy Subramanian.N
9 min readJun 4, 2024

--

In this blog, we are going to discuss about Performance Optimization using Query Profile & Query History. Also, we are going to discuss about Resource Monitors.

Query Profile:

● Provides execution details for a query.

● Provides a graphical representation of the main components of the processing plan for the query.

● Powerful tool for understanding the mechanics of queries.

● Used to know more about the performance or behavior of a particular query.

● Helps to spot typical mistakes in SQL query expressions to identify potential performance bottlenecks.

● Elements of Query Profile — Steps, Operator tree, Node list & Overview.

Explain Plan/Operator Tree:

Understanding Explain Plan, Execution Time & Statistics:

Pruning & Spilling:

Pruning — The micro-partition metadata maintained by Snowflake enables precise pruning of columns in micro-partitions at query run-time, including columns containing semi-structured data. In other words, a query that specifies a filter predicate on a range of values that accesses 10% of the values in the range should ideally only scan 10% of the micro-partitions.

Spilling — Performance degrades drastically when a warehouse runs out of memory while executing a query because memory bytes must “spill” onto local disk storage. If the query requires even more memory, it spills onto remote cloud-provider storage, which results in even worse performance.

> Bytes spilled to local storage — Volume of data spilled to local disk.

> Bytes spilled to remote storage — Volume of data spilled to remote disk.

Common Query Problems Identification Using Query Profile:

Exploding Joins — Joining tables without providing a join condition (resulting in a “Cartesian product”) or providing a condition where records from one table match multiple records from another table.

UNION Without ALL — UNION ALL simply concatenates inputs, while UNION does the same, but also performs duplicate elimination. A common mistake is to use UNION when the UNION ALL semantics are sufficient. These queries show in Query Profile as a Union All operator with an extra Aggregate operator on top (which performs duplicate elimination).

Queries Too Large to Fit in Memory — Same Spilling concept — This spilling can have a profound effect on query performance (especially if remote disk is used for spilling) — To alleviate this:

> Using a larger warehouse and/or

> Processing data in smaller batches.

Inefficient Pruning — If the pruning statistics do not show data reduction, but there is a Filter operator above Table Scan which filters out a number of records, this might signal that a different data organization might be beneficial for this query.

Query History(UI):

● All these details are fetched from QUERY_HISTORY View only.

● Shows last 14 days of data.

● Users can view other user queries but not their Query results.

Query History View:

● This Account Usage view can be used to query Snowflake query history by various dimensions (time range, session, user, warehouse, etc.) within the last 365 days (1 year).

● ACCOUNT_USAGE views are not updated immediately after running a query or task. If you want to check the execution time of a query right after running it, use Snowsight to view its performance. The Information Schema is also updated quicker than the ACCOUNT_USAGE views.

● Information Schema Table Functions:

> QUERY_HISTORY returns queries within a specified time range.

> QUERY_HISTORY_BY_SESSION returns queries within a specified session and time range.

> QUERY_HISTORY_BY_USER returns queries submitted by a specified user within a specified time range.

> QUERY_HISTORY_BY_WAREHOUSE returns queries executed by a specified warehouse within a specified time range.

Finding Long Running Repeated Queries
Track the average performance of a query over time

Warehouse Load History View:

● This Account Usage view can be used to analyze the workload on your warehouse within a specified date range.

● Load history is shown in 5-minute intervals.

This query provides insight into the total load of a warehouse for executed and queued queries. These load values represent the ratio of the total execution time (in seconds) of all queries in a specific state in an interval by the total time (in seconds) for that interval. For example, if 276 seconds was the total time for 4 queries in a 5-minute (300 second) interval, then the query load value is 276 / 300 = 0.92.

All the details about Snowflake’s Query Profile and Query history are available in this Youtube Video:

Resource Monitors:

● A resource monitor can help control costs and avoid unexpected credit usage caused by running warehouses.

● Use a resource monitor to monitor credit usage by virtual warehouses and the cloud services needed to support those warehouses.

● Set up a resource monitor to suspend a user-managed virtual warehouse when it reaches a credit limit.

● Warehouse credits are calculated based on 3 factors:

> Number of Virtual warehouses in use.

> Size of the Warehouse.

> Length of Time warehouse runs.

● Credit usage limits can be set for a specified interval or date range.

● When a limit is reached and/or reaches a specified threshold, the resource monitor can trigger various actions, such as sending alert notifications and/or suspending user-managed warehouses.

● Only users with the ACCOUNTADMIN role can create a resource monitor, but an account administrator can grant privileges to other roles to allow other users to view and modify resource monitors.

Resource Monitor Properties:

Credit Quota — Specifies the number of Snowflake credits allocated to the monitor for the specified frequency interval. Any number can be specified.

> Credit quota accounts for credits consumed by both user-managed virtual warehouses and virtual warehouses used by cloud services.

> For example, your resource monitor limit is set at 1000 credits, if your warehouse consumes 700 credits, and cloud services consume 300 credits within a specified interval or date range, an alert will be triggered.

Monitor Type — This property specifies whether the resource monitor is used to monitor your account or a specific set of individual warehouses.

> An account monitor monitors the credit usage of all the warehouses in the account. An account can only have one account monitor.

> A warehouse monitor monitors the credit usage of the warehouses assigned to the resource monitor. An account can have multiple warehouse monitors. A warehouse monitor can have one or more warehouses assigned to it, but each warehouse can only be assigned to one resource monitor.

Schedule — The default schedule for a resource monitor specifies that it starts monitoring credit usage immediately and the used credits reset back to 0(Zero) at the beginning of each calendar month (i.e., the start of the standard Snowflake billing cycle).

> Frequency — The interval at which the used credits reset relative to the specified start date. Supported values: Daily, Weekly, Monthly, Yearly and Never(used credits never reset).

> Start — Date and time (i.e., timestamp) when the resource monitor starts monitoring the assigned warehouses. Supported values — Immediately (i.e., current timestamp) & Later (i.e., any future timestamp).

> End — Date and time (i.e., timestamp) when Snowflake suspends the warehouses associated with the resource monitor. Supported values: Any future timestamp. Note that this property is not commonly used.

Actions — Also referred to as triggers, each action specifies a threshold, as a percentage of the credit quota for the resource monitor, and the action to perform when the threshold is reached within the specified interval. Note that actions support thresholds greater than 100.

● Notifications are sent to all account administrators with notifications enabled. Email notifications for resource monitors that monitor warehouses are also sent to any non-administrator user that is enabled to receive those notifications.

● Non-administrator users can only receive email notifications for warehouse monitors.

● Each resource monitor can have the following actions:

> One Suspend action.

> One Suspend Immediate action.

> Up to five Notify actions.

● A resource monitor must have at least one action defined; if no actions have been defined, nothing happens when the used credits reach the threshold.

Resource Monitors Assignment:

● A single monitor can be set at the account level to control credit usage for all warehouses in your account.

● A one or more warehouses can be assigned to a resource monitor.

● However, that one warehouse can be assigned to only a single resource monitor below the account level.

● The credit quota for the entire account is 5000 for the interval (month, week, etc.), as controlled by Resource Monitor 1; if this quota is reached within the interval, the actions defined for the resource monitor (Suspend, Suspend Immediate, etc. ) are enforced for all five warehouses.

● Warehouse 3 can consume a maximum of 1000 credits within the interval.

● Warehouse 4 and 5 can consume a maximum combined total of 2500 credits within the interval.

Warehouse Suspension & Resumption:

● The used credits for a resource monitor reflects the sum of credits consumed by all assigned warehouses within the specified interval, along with the cloud services used to support those warehouses during the same interval.

● If a monitor has a Suspend or Suspend Immediately action defined and its used credits reach the threshold for the action, any warehouses assigned to the monitor are suspended and cannot be resumed until one of the following conditions is met:

> The next interval, if any, starts, as dictated by the start date for the monitor.

> The credit quota for the monitor is increased.

> The credit threshold for the suspend action is increased.

> The warehouses are no longer assigned to the monitor.

> The monitor is dropped.

● When credit quota thresholds are reached for a resource monitor, the assigned warehouses may take some time to suspend, even when the action is Suspend Immediate, thereby consuming additional credits.

Snowflake’s recommendation:

> Utilize buffers in the quota thresholds for actions (e.g., set a threshold to 90% instead of 100%) — This will help ensure that your credit usage doesn’t exceed the quota.

> To more strictly control credit usage for individual warehouses, assign only a single warehouse to each resource monitor — When multiple warehouses are assigned to the same resource monitor, they share the same quota thresholds, which may result in credit usage for one warehouse impacting the other assigned warehouses.

All the details about Snowflake’s Resource Monitors are available in this YouTube Video:

Photo Courtesy:

https://docs.snowflake.com

https://select.dev/posts/snowflake-query-profile

Reference Links:

Query profile:

https://docs.snowflake.com/en/user-guide/ui-query-profile

Nice Blog — How to use snowflake query profile:

https://select.dev/posts/snowflake-query-profile

Optimizing performance in Snowflake(All details):

https://docs.snowflake.com/en/guides-overview-performance

Query History(UI):

https://docs.snowflake.com/en/user-guide/ui-history

Query History view usage:

https://docs.snowflake.com/en/user-guide/performance-query-exploring

Resource Monitors:

https://docs.snowflake.com/en/user-guide/resource-monitors

Catch you in the next blog — Snowflake Snowpro Core Preparation Part 12 — Query Acceleration,Search Optimization & Caching — https://medium.com/@trustngs/snowflake-snowpro-core-preparation-part-12-query-acceleration-search-optimization-caching-3bdcbdde4ce2

--

--

Ganapathy Subramanian.N

Director - Data Engineering @ Tiger Analytics. 100+ Technical Certifications. AWS-CB, Cloud(Architect,Devops,Data Engg,DW&ML),YouTuber & Blogger