
How to Build an Interactive Scheduled Query Execution Report in Snowflake Notebooks

4 min read · Jan 28, 2025


Have you ever wondered how to monitor and analyze your scheduled query executions in Snowflake effectively? Managing and gaining insights from extensive query logs can be a daunting task. Without the right tools, identifying performance issues and failure patterns becomes challenging.

The solution lies in creating an interactive dashboard that visualizes your query execution data, making monitoring performance and troubleshooting issues easier.

In this blog, you’ll discover how to build an interactive scheduled query execution report using Snowflake Notebooks integrated with Streamlit.

By the end of this blog, you will learn to:

  • Retrieve and analyze scheduled query execution data
  • Create interactive visualizations with Streamlit in Snowflake Notebooks
  • Build heatmaps to visualize query execution patterns
  • Analyze query performance and failure metrics

Getting Started

To embark on this project, ensure you have access to a Snowflake account. Familiarity with SQL and Python, as well as an understanding of Snowflake tasks and scheduled queries, will be beneficial.

Before diving into the code, download the necessary notebook (Scheduled_Query_Execution_Report.ipynb) from GitHub and ensure you have the required Python packages.

Snowflake Notebooks come pre-installed with common libraries like numpy, pandas, and altair. For additional packages, use the Packages drop-down in the notebook interface.

Building the Interactive Query Execution Report

Let’s walk through building the interactive dashboard step by step.

Retrieving Query Execution Data

Start by writing a SQL query to fetch execution history from the task_history view. This query retrieves essential details about each task's executions over the past 90 days, which covers the widest time range offered by the slider you'll add later.

SELECT
    name,
    database_name,
    query_id,
    query_text,
    schema_name,
    scheduled_time,
    query_start_time,
    completed_time,
    DATEDIFF('second', query_start_time, completed_time) AS execution_time_seconds,
    state,
    error_code,
    error_message
FROM snowflake.account_usage.task_history
WHERE scheduled_time >= DATEADD('day', -90, CURRENT_TIMESTAMP())
ORDER BY scheduled_time DESC;

This query returns a comprehensive dataset of your task executions, which will serve as the foundation for your interactive report:

Converting Data to a DataFrame

Transform the SQL results into a Pandas DataFrame to facilitate further analysis and visualization within the report. Here, sql_data refers to the SQL cell above, and its output is stored in df so that later cells can reference it.

df = sql_data.to_pandas()
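
Alternatively, if you prefer to run the query from a Python cell instead of referencing a SQL cell, you can use the notebook's active Snowpark session. A minimal sketch, assuming the query text from above is stored in a string named query:

from snowflake.snowpark.context import get_active_session

# Get the session that backs the notebook and run the same query
session = get_active_session()
df = session.sql(query).to_pandas()

# Quick sanity check: row count and the most recent executions
print(df.shape)
df.head()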

Creating the Interactive Interface

Next, set up an interactive slider that allows users to select the number of days to analyze. This selection dynamically filters the data displayed in the dashboard.

import pandas as pd
import streamlit as st
import altair as alt

# Create date filter slider
st.subheader("Select time duration")
days = st.slider('Select number of days to analyze',
                 min_value=10,
                 max_value=90,
                 value=30,
                 step=10)

This slider widget enables users to adjust the time range for data analysis effortlessly:

Data Preparation

Reshape the data by counting executions by task name, hour of day, and state. This preparation is crucial for creating an informative heatmap that visualizes execution patterns.

# Filter data according to day duration
latest_date = pd.to_datetime(df['SCHEDULED_TIME']).max()
cutoff_date = latest_date - pd.Timedelta(days=days)
filtered_df = df[pd.to_datetime(df['SCHEDULED_TIME']) > cutoff_date].copy()

# Prepare data for heatmap
filtered_df['HOUR_OF_DAY'] = pd.to_datetime(filtered_df['SCHEDULED_TIME']).dt.hour
filtered_df['HOUR_DISPLAY'] = filtered_df['HOUR_OF_DAY'].apply(lambda x: f"{x:02d}:00")

# Calculate frequency count by hour and task name
agg_df = filtered_df.groupby(['NAME', 'HOUR_DISPLAY', 'STATE']).size().reset_index(name='COUNT')

st.warning(f"Analyzing data for the last {days} days!")

Data Visualization

Create a heatmap to gain insights into task execution patterns across states such as SUCCEEDED, FAILED, and SKIPPED; a summary statistics table follows in the next step.

chart = alt.Chart(agg_df).mark_rect(
    stroke='black',
    strokeWidth=1
).encode(
    x=alt.X('HOUR_DISPLAY:O',
            title='Hour of Day',
            axis=alt.Axis(
                labels=True,
                tickMinStep=1,
                labelOverlap=False
            )),
    y=alt.Y('NAME:N',
            title='',
            axis=alt.Axis(
                labels=True,
                labelLimit=200,
                tickMinStep=1,
                labelOverlap=False,
                labelPadding=10
            )),
    color=alt.Color('COUNT:Q',
                    title='Number of Executions'),
    row=alt.Row('STATE:N',
                title='Task State',
                header=alt.Header(labelAlign='left')),
    tooltip=[
        alt.Tooltip('NAME', title='Task Name'),
        alt.Tooltip('HOUR_DISPLAY', title='Hour'),
        alt.Tooltip('STATE', title='State'),
        alt.Tooltip('COUNT', title='Number of Executions')
    ]
).properties(
    height=100,
    width=450
).configure_view(
    stroke=None,
    continuousWidth=300
).configure_axis(
    labelFontSize=10
)

# Display the chart
st.subheader(f'Task Execution Frequency by State ({days} Days)')
st.altair_chart(chart)

Here’s the resulting heatmap:

Adding Summary Statistics

Finally, calculate and display summary statistics of the execution history. This table provides a clear overview of task states and their respective counts.

st.subheader("Summary Statistics")
summary_df = filtered_df.groupby('NAME').agg({
'STATE': lambda x: pd.Series(x).value_counts().to_dict()
}).reset_index()

# Format the state counts as separate columns
state_counts = pd.json_normalize(summary_df['STATE']).fillna(0).astype(int)
summary_df = pd.concat([summary_df['NAME'], state_counts], axis=1)
st.dataframe(summary_df)

And here’s the DataFrame:

Conclusion

Congratulations! You’ve successfully built an interactive dashboard for analyzing scheduled query executions in Snowflake. This tool empowers you to monitor query performance, visualize execution patterns, and identify potential issues in your scheduled tasks. By leveraging Snowflake Notebooks and Streamlit, you can customize and expand this dashboard to suit your specific monitoring needs.
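
For example, the query already returns an execution_time_seconds column that the dashboard doesn't yet use. As one possible extension (a sketch, not part of the original notebook), you could chart the average execution time per task from the same filtered_df:

# Possible extension: average execution time per task over the selected window
perf_df = (
    filtered_df
    .groupby('NAME', as_index=False)['EXECUTION_TIME_SECONDS']
    .mean()
    .rename(columns={'EXECUTION_TIME_SECONDS': 'AVG_SECONDS'})
)

perf_chart = alt.Chart(perf_df).mark_bar().encode(
    x=alt.X('AVG_SECONDS:Q', title='Average Execution Time (seconds)'),
    y=alt.Y('NAME:N', title='', sort='-x'),
    tooltip=[
        alt.Tooltip('NAME', title='Task Name'),
        alt.Tooltip('AVG_SECONDS', title='Avg Seconds', format='.1f')
    ]
).properties(height=300, width=450)

st.subheader(f'Average Execution Time per Task ({days} Days)')
st.altair_chart(perf_chart)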


Happy coding!


Written by Chanin Nantasenamat

Data Professor on YouTube | Sr Developer Advocate | ex-Professor of Bioinformatics | Join https://data-professor.medium.com/membership
