Snowsight Dashboard in Streamlit


In this post, we'll explore a dashboard built as a Streamlit application. Snowflake is simplifying processes for its customers and understands the importance of making data management easy and customer-oriented.

With the introduction of Snowsight, Snowflake empowers users with a seamless and intuitive data analytics experience. Snowsight also accelerates insights and decision-making through interactive visualizations called Dashboards.

These dashboards enable the creation of diverse visualizations directly within the Snowflake platform. Additionally, Streamlit makes it easy to explore data in Snowflake, creating an interactive experience for users to discover Snowflake's data features.

This is my first attempt at crafting code within the Streamlit framework. While the code may appear basic to some, it marks the beginning of my learning journey.

In our scenario, the Snowsight dashboard will be transformed into a Streamlit dashboard, harnessing the latest features of Streamlit integrated into the Snowflake ecosystem (Streamlit in Snowflake).

Below is the dashboard we have developed in Snowsight:

Snowsight Dashboard

We tried to replicate the same in a Streamlit application, and below is the outcome of my code:

Technical implementation:

1. First, import the Plotly package. No additional installation is required: just scroll through the packages list and search for the one you need.

Plotly package

Below is the code to implement the dashboard:

The final outcome is shown below:

Code Snippet:

# Import python packages
import streamlit as st
import pandas as pd
from snowflake.snowpark.context import get_active_session
from snowflake.snowpark.functions import col
import plotly.express as px

st.set_page_config(layout="wide",page_icon="chart_with_upwards_trend")

# Write directly to the app
st.title(":blue[Snowflake Dashboard App] :snowflake:")
st.subheader("This app recreates a Snowsight dashboard in a Streamlit application")
st.markdown("We have a dashboard in Snowsight, and as part of this use case we will replicate it in Streamlit")
#Info bar
st.info('Developed by cloudyard for Testing purpose', icon="ℹ️")
#url = "https://cloudyard.in"
#st.markdown("check out this [link](%s) for code and details" % url)

st.markdown("[![Foo](https://cloudyard.in/wp-content/uploads/2023/12/cloudlogo-e1704010546918.png)](https://cloudyard.in/)")


st.divider()

# Get the current credentials
session = get_active_session()

Fail_count = f"select user_name,sum(case when IS_SUCCESS = 'NO' then 1 else 0 end )as failed from snowflake.account_usage.login_history where EVENT_TIMESTAMP >= CURRENT_DATE - 180 group by user_name"
Fail_count_df = session.sql(Fail_count)
pandas_Fail_count_df = Fail_count_df.to_pandas()

fail_count_graph=px.bar(pandas_Fail_count_df,x='FAILED',y='USER_NAME',orientation='h',title="Failed Login Count")
fail_count_graph.update_layout(title='<b>Failed Login Count</b>')
fail_count_graph.update_yaxes(tickfont_family="Arial Black")
fail_count_graph.update_traces(marker_color='green')

User_Credit_Cost = f"WITH USER_HOUR_EXECUTION_CTE AS (SELECT USER_NAME,WAREHOUSE_NAME,DATE_TRUNC('hour',START_TIME) as START_TIME_HOUR,SUM(EXECUTION_TIME) as USER_HOUR_EXECUTION_TIME FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY WHERE WAREHOUSE_NAME IS NOT NULL AND EXECUTION_TIME > 0 AND START_TIME > DATEADD(Month,-1,CURRENT_TIMESTAMP()) group by 1,2,3 ) , HOUR_EXECUTION_CTE AS ( SELECT START_TIME_HOUR,WAREHOUSE_NAME,SUM(USER_HOUR_EXECUTION_TIME) AS HOUR_EXECUTION_TIME FROM USER_HOUR_EXECUTION_CTE group by 1,2 ), APPROXIMATE_CREDITS AS (SELECT A.USER_NAME,C.WAREHOUSE_NAME,(A.USER_HOUR_EXECUTION_TIME/B.HOUR_EXECUTION_TIME)*C.CREDITS_USED AS APPROXIMATE_CREDITS_USED FROM USER_HOUR_EXECUTION_CTE A JOIN HOUR_EXECUTION_CTE B ON A.START_TIME_HOUR = B.START_TIME_HOUR and B.WAREHOUSE_NAME = A.WAREHOUSE_NAME JOIN SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY C ON C.WAREHOUSE_NAME = A.WAREHOUSE_NAME AND C.START_TIME = A.START_TIME_HOUR ) SELECT USER_NAME,WAREHOUSE_NAME,round(SUM(APPROXIMATE_CREDITS_USED),2) AS APPROXIMATE_CREDITS_USED FROM APPROXIMATE_CREDITS GROUP BY 1,2 ORDER BY 3 DESC"
User_Credit_Cost_df = session.sql(User_Credit_Cost)
pandas_User_Credit_Cost_df = User_Credit_Cost_df.to_pandas()

User_Credit_Cost_graph=px.bar(pandas_User_Credit_Cost_df,x='APPROXIMATE_CREDITS_USED',y='USER_NAME',orientation='h',title="<b> Matrix of users utilizing approximate credit costs in a month")
User_Credit_Cost_graph.update_yaxes(tickfont_family="Arial Black")
User_Credit_Cost_graph.update_traces(marker_color='purple')

container1 = st.container()

with container1:
    plot1, plot2 = st.columns(2)
    with plot1:
        st.plotly_chart(fail_count_graph, use_container_width=True)
    with plot2:
        st.plotly_chart(User_Credit_Cost_graph, use_container_width=True)

credits_consumption_90days = f"WITH CTE_DATE_WH AS( SELECT TO_DATE(START_TIME) AS START_DATE ,WAREHOUSE_NAME ,SUM(CREDITS_USED) AS CREDITS_USED_DATE_WH FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY GROUP BY START_DATE ,WAREHOUSE_NAME ) SELECT START_DATE ,WAREHOUSE_NAME ,round(CREDITS_USED_DATE_WH) AS CREDITS_USED_DATE_WH FROM CTE_DATE_WH where CREDITS_USED_DATE_WH > 0 AND START_DATE > DATEADD(Month,-3,CURRENT_TIMESTAMP()) order by START_DATE;"
credits_consumption_90days_df = session.sql(credits_consumption_90days)
pandas_credits_consumption_90days_df = credits_consumption_90days_df.to_pandas()

#chart
pandas_credits_consumption_90days_graph=px.bar(pandas_credits_consumption_90days_df,x='START_DATE',y='CREDITS_USED_DATE_WH',color='WAREHOUSE_NAME',orientation='v',title="Warehouse Credit Consumption Report for the Last 90 Days")
#pandas_credits_consumption_90days_graph.update_traces(marker_color='red')


Query_Exec_cnt = f"SELECT substr(QUERY_TEXT,0,25) as partial_query_text,count(*) as number_of_queries,sum(TOTAL_ELAPSED_TIME)/(1000*60) as execution_minutes from SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY Q where 1=1 and TO_DATE(Q.START_TIME) > DATEADD(month,-1,TO_DATE(CURRENT_TIMESTAMP())) and TOTAL_ELAPSED_TIME > 0 group by 1 having count(*) >= 10 order by 2 desc limit 20 "
Query_Exec_cnt_df = session.sql(Query_Exec_cnt)
pandas_Query_Exec_cnt_df = Query_Exec_cnt_df.to_pandas()

#chart
pandas_Query_Exec_cnt_graph=px.bar(pandas_Query_Exec_cnt_df,y='NUMBER_OF_QUERIES',x='PARTIAL_QUERY_TEXT',color='EXECUTION_MINUTES',orientation='v',title="Queries by number of times Executed and Their Execution Time")

WH_Credits = f"select warehouse_name as WH_NAME,to_char(start_time,'YYYY-MM') as month,sum(credits_used) as TOTAL_CREDITS from snowflake.account_usage.warehouse_metering_history wmh where start_time >= CURRENT_DATE - 120 group by warehouse_name, to_char(start_time,'YYYY-MM') order by 1;"
WH_Credits_df = session.sql(WH_Credits)
pandas_WH_Credits_df = WH_Credits_df.to_pandas()

pandas_WH_Credits_graph=px.bar(pandas_WH_Credits_df,x='TOTAL_CREDITS',y='WH_NAME',orientation='h',title="Credits Burned per Warehouse (Last 120 Days)")
pandas_WH_Credits_graph.update_traces(marker_color='red')

with container1:
    plot1, plot2, plot3 = st.columns(3)
    with plot1:
        st.plotly_chart(pandas_credits_consumption_90days_graph, use_container_width=True)
    with plot2:
        st.plotly_chart(pandas_Query_Exec_cnt_graph, use_container_width=True)
    with plot3:
        st.plotly_chart(pandas_WH_Cost_per_role_graph if False else pandas_WH_Credits_graph, use_container_width=True)


st.subheader("Data storage by Month")
storage_cost = f"select to_char(usage_date,'YYYYMM') as sort_month ,to_char(usage_date,'Mon-YYYY') as month ,avg(storage_bytes) as storage,avg(stage_bytes) as stage,avg(failsafe_bytes) as failsafe from snowflake.account_usage.storage_usage group by month, sort_month order by sort_month;"
storage_cost_df = session.sql(storage_cost)
pandas_storage_cost_df = storage_cost_df.to_pandas()
st.dataframe(pandas_storage_cost_df.style.applymap( lambda _: "background-color: CornflowerBlue;"))

WH_Cost_per_role = f"select ROL_TOT_TIME.Warehouse_Name, ROL_TOT_TIME.Role_Name,(ROL_TOT_TIME.ROLE_ELAPSED / WH_TOT_TIME.WH_ELAPSED ) * Credits_Used as Credits_By_Role,Credits_By_Role *3 as Dollars_Used from (select sum(total_elapsed_time) WH_ELAPSED, WAREHOUSE_NAME from SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY WHERE START_TIME > DATE_TRUNC(month, CURRENT_DATE()) group by WAREHOUSE_NAME) WH_TOT_TIME,(select sum(total_elapsed_time) ROLE_ELAPSED, WAREHOUSE_NAME, ROLE_NAME from SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY WHERE START_TIME > DATE_TRUNC(month, CURRENT_DATE()-60) group by WAREHOUSE_NAME, ROLE_NAME) ROL_TOT_TIME,(select sum(Credits_used) Credits_Used, Date_Part(Year, Start_Time) || '-' || Date_Part(Month, Start_Time) as Period, WAREHOUSE_NAME from SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY WHERE START_TIME > DATE_TRUNC(month, CURRENT_DATE()-60) group by WAREHOUSE_NAME,Period) WH_CREDIT where WH_CREDIT.WAREHOUSE_NAME = ROL_TOT_TIME.Warehouse_Name(+) and ROL_TOT_TIME.Warehouse_Name = WH_TOT_TIME.Warehouse_Name;"
WH_Cost_per_role_df = session.sql(WH_Cost_per_role)
pandas_WH_Cost_per_role_df = WH_Cost_per_role_df.to_pandas()

pandas_WH_Cost_per_role_graph=px.bar(pandas_WH_Cost_per_role_df,x='CREDITS_BY_ROLE',y='ROLE_NAME',color='WAREHOUSE_NAME',orientation='h',title="Warehouse Cost per role")
#pandas_WH_Cost_per_role_graph.update_traces(marker_color='orange')
st.plotly_chart(pandas_WH_Cost_per_role_graph, use_container_width=True)
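The trickiest part of the code above is the `User_Credit_Cost` query: it attributes each warehouse-hour's metered credits to users in proportion to their share of execution time within that hour. The same arithmetic can be sanity-checked locally with plain pandas; the data below is made up, and the column names simply mirror the SQL aliases:

```python
import pandas as pd

# Hypothetical per-user execution time (ms) within one warehouse-hour
user_exec = pd.DataFrame({
    "USER_NAME": ["ALICE", "BOB"],
    "WAREHOUSE_NAME": ["WH1", "WH1"],
    "START_TIME_HOUR": ["2024-01-01 10:00", "2024-01-01 10:00"],
    "EXEC_MS": [3000, 1000],
})

# Credits metered for that same warehouse-hour
metering = pd.DataFrame({
    "WAREHOUSE_NAME": ["WH1"],
    "START_TIME_HOUR": ["2024-01-01 10:00"],
    "CREDITS_USED": [2.0],
})

# Each user's share of the hour's total execution time on that warehouse
hour_total = user_exec.groupby(["WAREHOUSE_NAME", "START_TIME_HOUR"])["EXEC_MS"].transform("sum")
user_exec["SHARE"] = user_exec["EXEC_MS"] / hour_total

# share * metered credits = approximate credits per user, as in the SQL CTEs
merged = user_exec.merge(metering, on=["WAREHOUSE_NAME", "START_TIME_HOUR"])
merged["APPROXIMATE_CREDITS_USED"] = (merged["SHARE"] * merged["CREDITS_USED"]).round(2)
```

Here ALICE ran 75% of the hour's execution time, so she is attributed 1.5 of the 2.0 credits metered for that hour, and BOB the remaining 0.5.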

This story was originally published here:
