Visualizing Task Hierarchies and Dependencies in Snowflake Snowsight

I often get asked for better ways to visualize or interact with task dependencies and hierarchies beyond show tasks. I was on a call with one of my favorite clients from Sunnyvale last Thursday, and we chatted through methods of using Snowsight to do this. With these SQL snippets and Snowsight charts, you, too, should be able to understand better your tasks, their run history, and dependencies. Just in case you do not have dependant tasks yet, I have included a brief sample set of dependent tasks.

Setting the Session Context
First, if you have not already done so, navigate to Snowsight to use Snowflake’s charting capabilities. How do you do this (https://docs.snowflake.com/en/user-guide/ui-snowsight-gs.html)?

Generic Dependent Tasks if you Don’t Got One (I told him we already got one…)
This block of SQL does many things, including creating a task warehouse, a dependent set of tasks, and starts them up. Be sure to stop them after getting your feet wet, as they will burn a few credits each day if left on. This generic task block will create a sample table, insert random records from Snowflake’s sample TPCDS data set, create an aggregate table, insert aggregated records from the sample run, and delete the sample run. Full SQL file here on GitHub (https://github.com/BigDataDave1/SnowflakeTaskVisualizer/blob/main/snowsight_tasks.sql).

We need to start unwinding and iterating on the ‘Show Tasks;’ command to pull out a proper tree of dependencies.

Task History
Snowflake has a packaged information_schema function to pull out task run history. We will come back to this to pull out run status and timing later in the article.

Visualizing Tasks Explicit Task Trees with CONNECT BY and SYS_CONNECT_BY_PATH()
I will use an old favorite function I built back in February that takes the SHOW command and creates and persists a JSON table from the output https://bigdatadave.com/2020/02/17/snowflake-update-select-show-object-to-json/.

Using this STORE_GET_SHOW2() and CONNECT BY with a SYS_CONNECT_BY_PATH we can create a tree for an explicit path. This works fine if you want to look at Tasks one at a time, we will build the dynamic function next.

Results in this clean hierarchy for the deepest nested task: ->DEMO.PUBLIC.T01_CTAS->DEMO.PUBLIC.T02_INSERT_SAMPLE->DEMO.PUBLIC.T03_CTAS_AGG->DEMO.PUBLIC.T04_INSERT_AGG->DEMO.PUBLIC.T05_DELETE_SAMPLE
We can even view it as a heat grid to see a visual of the dependencies by level.

Pfft…it’s Only Cool if it’s Dynamic
Alright, time to break out the big dog SQL; Recursive Common Table Expressions. We can create layer IDs, a top-level task, and task dependency trees for all database tasks. We can assume every task without a predecessor is a top-level task and recursively visit all the subtasks until we build the full tree. I gotta be honest when it worked the first time; I did a little SQL Shuffle at my desk.

Let’s Visualize Task Run History
We can use the information_schema.task_history function for each of the tasks in the tree. It’s hardcoded in this version. We could create a recursive CTE to grab these from the show tasks command, but you only get 1 per blog post. I have found using the Bar Chart to be the way to go in Snowsight unless you have Tableau and proper Gantt charts.

Example Bar Chart Visual in Snowsight
Example Gantt Visual in Tableau

Conclusion
Be a good data engineer and turn off your tasks to prevent unnecessary metering. Now we can see task dependencies more clearly, understand run times, and use these methods to debug dependent task code.

--Stop the Tasks if you want to Stop Hourly Billing
alter task demo.public.t01_ctas suspend;
alter task demo.public.t02_insert_sample suspend;
alter task demo.public.t03_ctas_agg suspend;
alter task demo.public.t04_insert_agg suspend;
alter task demo.public.t05_delete_sample suspend;

Go ahead and get started with Snowsight here: https://docs.snowflake.com/en/user-guide/ui-snowsight-gs.html

If you want to learn more you can follow tutorials at guides.snowflake.com

Please join me in the Snowflake community to watch more videos and training on Snowflake: https://docs.snowflake.com/en/user-guide/ui-snowsight-gs.html

Happy (Recursive Task Hierarchy) Querying!

Originally published at http://bigdatadave.com on November 16, 2020.

--

--

David A Spezia
Snowflake Builders Blog: Data Engineers, App Developers, AI/ML, & Data Science

David spends his time working with Web-Scale Data Applications built on Snowflake with the largest tenants on the Snowflake Data Cloud.