Machine Learning-Based Alerts for Snowflake FinOps


Overview

Cost management is a key component of every successful cloud strategy, and Snowflake provides a range of built-in tools to manage cost effectively.

This article will demonstrate how you can leverage Snowflake alerts to automatically receive email notifications for anomalies detected in virtual warehouse compute usage using Snowflake Cortex ML-Based Functions.

The cost of using the Snowflake platform can be broken down into:

  • Compute resources (virtual warehouse compute, serverless compute, cloud services compute)
  • Storage
  • Data transfer

Virtual warehouse compute usage can be monitored using the warehouse_metering_history view in the account_usage schema. It shows hourly credit usage for virtual warehouses in your account over the last 365 days.

Sample result from querying warehouse_metering_history
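For reference, the view can be queried directly; a simple example that aggregates daily credit usage per warehouse over the last 30 days:

```sql
-- Daily virtual warehouse credit usage over the last 30 days
select
    to_date(start_time) as usage_date,
    warehouse_name,
    sum(credits_used_compute) as credits_used
from snowflake.account_usage.warehouse_metering_history
where start_time >= dateadd(day, -30, current_date())
group by usage_date, warehouse_name
order by usage_date, warehouse_name;
```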

To automatically detect anomalies in virtual warehouse compute usage, Snowflake users can leverage machine learning. Snowflake Cortex ML-based functions simplify the complexities associated with using ML models and enable organizations to quickly gain insights into their data.

To train a machine learning model using Snowflake Cortex ML-based functions, start by splitting the data into a training dataset and a test dataset. Data from the last 12 months of virtual warehouse compute usage will be used, with the test dataset comprising the most recent 2 months and the remaining 10 months forming the training dataset.

Build ML model using Snowflake Cortex ML-based functions

Step 1: Create the training dataset:

create or replace view warehouse_compute_usage_train as
select
    to_timestamp_ntz(to_date(start_time)) as timestamp,
    sum(credits_used_compute) as credits_used
from snowflake.account_usage.warehouse_metering_history
where timestamp between dateadd(day, -365, current_date()) and dateadd(day, -61, current_date())
group by all;

Step 2: Create the test dataset:

create or replace view warehouse_compute_usage_test as
select
    to_timestamp_ntz(to_date(start_time)) as timestamp,
    sum(credits_used_compute) as credits_used
from snowflake.account_usage.warehouse_metering_history
where timestamp between dateadd(day, -60, current_date()) and current_date()
group by all;

Step 3: Train an anomaly detection model using Snowflake Cortex ML-based functions:

create or replace snowflake.ml.anomaly_detection warehouse_usage_analysis(
    input_data => system$reference('view', 'warehouse_compute_usage_train'),
    timestamp_colname => 'timestamp',
    target_colname => 'credits_used',
    label_colname => ''
);

Step 4: Run model inference using the trained model:

call warehouse_usage_analysis!detect_anomalies(
    input_data => system$reference('view', 'warehouse_compute_usage_test'),
    timestamp_colname => 'timestamp',
    target_colname => 'credits_used'
);
Results of model inference

Step 5: Visualize model results using Snowsight:

Visualization of ML model inference using selected parameters.

The forecasted values are indicated by the red line, while the lower bounds and upper bounds of the forecasts are respectively represented by light blue and yellow lines.

The actual (observed) values are displayed using a dark blue line. Every point on this line that is not within the lower and upper bounds of the forecasts is marked as an anomaly.
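Conceptually, the anomaly flag is just a bounds check of each observed value against the forecast interval. A minimal Python sketch of that logic (an illustration only, not the actual Cortex implementation):

```python
def flag_anomalies(observations):
    """Mark each point whose observed value `y` falls outside the
    [lower_bound, upper_bound] forecast interval as an anomaly."""
    return [
        {**point, "is_anomaly": not (point["lower_bound"] <= point["y"] <= point["upper_bound"])}
        for point in observations
    ]

# Hypothetical rows mimicking detect_anomalies output columns
rows = [
    {"y": 10.0, "lower_bound": 8.0, "upper_bound": 12.0},  # within bounds
    {"y": 25.0, "lower_bound": 8.0, "upper_bound": 12.0},  # above upper bound
]
flagged = flag_anomalies(rows)
```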

By running the SQL below, I can identify that there are 6 anomalies detected in the test dataset.

create table warehouse_usage_anomalies 
as select * from table(result_scan(last_query_id()));

select * from warehouse_usage_anomalies
where is_anomaly = true;

In the code used to call the model inference, the prediction_interval parameter was not specified, so the default value of 0.99 was used. To mark more observations as anomalies, reduce prediction_interval to, for instance, 0.9. Conversely, to mark fewer observations as anomalies, increase the value of the prediction_interval parameter.
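For example, the interval can be tightened at inference time by passing a configuration object to detect_anomalies:

```sql
call warehouse_usage_analysis!detect_anomalies(
    input_data => system$reference('view', 'warehouse_compute_usage_test'),
    timestamp_colname => 'timestamp',
    target_colname => 'credits_used',
    config_object => {'prediction_interval': 0.9}
);
```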

Create email notifications for newly detected anomalies using Snowflake Alerts and Tasks

To create automatic email notifications for new anomalies detected in virtual warehouse compute usage, Snowflake Tasks and Alerts can be used.

Step 1: Create a task to retrain the ML model on a weekly basis, at 5 AM every Sunday LA time:

create or replace task train_warehouse_usage_anomaly_task
    warehouse = demo_wh
    schedule = 'USING CRON 0 5 * * 0 America/Los_Angeles'
as
execute immediate
$$
begin
    create or replace snowflake.ml.anomaly_detection warehouse_usage_analysis(
        input_data => system$reference('view', 'warehouse_compute_usage_train'),
        timestamp_colname => 'timestamp',
        target_colname => 'credits_used',
        label_colname => ''
    );
end;
$$;

Step 2: Create a task to call the anomaly detection model on a daily basis at 7 AM LA time and insert the results into the warehouse_usage_anomalies table:

create or replace task inference_warehouse_usage_anomaly_task
    warehouse = demo_wh
    schedule = 'USING CRON 0 7 * * * America/Los_Angeles'
as
execute immediate
$$
begin
    call warehouse_usage_analysis!detect_anomalies(
        input_data => system$reference('view', 'warehouse_compute_usage_test'),
        timestamp_colname => 'timestamp',
        target_colname => 'credits_used'
    );
    insert into warehouse_usage_anomalies
    select * from table(result_scan(last_query_id()));
end;
$$;

Step 3: Set up an alert to check every day at 8 AM LA time if any new anomalies have been detected in warehouse compute usage:

create or replace alert warehouse_usage_anomaly_alert
    warehouse = demo_wh
    schedule = 'USING CRON 0 8 * * * America/Los_Angeles'
if (exists (
    select *
    from warehouse_usage_anomalies
    where is_anomaly = true
      and ts > dateadd('day', -1, current_timestamp())
))
then
    call system$send_email(
        'warehouse_email_alert',
        'test@domain.com',
        'Warehouse compute usage anomaly detected',
        concat(
            'Anomaly detected in the warehouse compute usage. ',
            'Value outside of confidence interval detected.'
        )
    );
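Note that system$send_email relies on an email notification integration. The warehouse_email_alert integration referenced above could be created along these lines (the recipient address is a placeholder):

```sql
create or replace notification integration warehouse_email_alert
    type = email
    enabled = true
    allowed_recipients = ('test@domain.com');
```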

Similarly to tasks, alerts are created in a suspended state and must be resumed in order to start running. After resuming the alert, this is the email I received some time later:

Sample email notification received
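For reference, the tasks and the alert created above can be resumed with:

```sql
alter task train_warehouse_usage_anomaly_task resume;
alter task inference_warehouse_usage_anomaly_task resume;
alter alert warehouse_usage_anomaly_alert resume;
```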

Final note

Anomaly detection using Snowflake Cortex ML-based functions works with both single-series and multi-series data. While this article focuses on single-series data, it’s important to note that it’s also possible to use anomaly detection functions with multi-series data to build separate, independent ML models, for example, for each virtual warehouse object.
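As a sketch of the multi-series case, a per-warehouse model could be trained by keeping warehouse_name as a series column in the training data and passing series_colname when creating the model. The view name warehouse_compute_usage_train_by_wh below is hypothetical, assuming a training view that groups by warehouse as well as by day:

```sql
create or replace snowflake.ml.anomaly_detection warehouse_usage_analysis_multi(
    input_data => system$reference('view', 'warehouse_compute_usage_train_by_wh'),
    series_colname => 'warehouse_name',
    timestamp_colname => 'timestamp',
    target_colname => 'credits_used',
    label_colname => ''
);
```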

There are various approaches to anomaly detection in Snowflake. The decision to implement ML-based anomaly detection should be assessed on a case-by-case basis. In certain scenarios, simpler approaches, such as threshold-based anomaly detection, might be more practical than implementing ML-based methods.
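As a point of comparison, a threshold-based check needs no model at all; a sketch with an arbitrary 100-credit daily threshold:

```sql
-- Flag days where total compute credits exceed a fixed threshold
select
    to_date(start_time) as usage_date,
    sum(credits_used_compute) as credits_used
from snowflake.account_usage.warehouse_metering_history
group by usage_date
having sum(credits_used_compute) > 100
order by usage_date;
```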

I am currently a Snowflake Solutions Consultant at Snowflake. Opinions expressed in this post are solely my own and do not represent the views or opinions of my employer.


Piotr Paczewski
Snowflake Builders Blog: Data Engineers, App Developers, AI/ML, & Data Science
