Snowflake Dynamic Table — Complete Guide — 1

In recent times, Snowflake has introduced Dynamic Tables as a preview feature, which is now available to all accounts. This has sparked significant interest among users who are reaching out to me for more details about Snowflake Dynamic Tables. In my upcoming Medium blog, I will delve into the concept of Dynamic Tables, discussing their use cases and the advantages they offer over other data pipelines. But before we dive into the specifics, let’s start by understanding what exactly Dynamic Tables are. Stay tuned for more in-depth insights in the following topics!

Dynamic Tables ?

Dynamic tables are tables that materialize the results of a specified query. Rather than creating a separate target table and writing code to modify and update the data in that table, dynamic tables allow you to designate the target table as dynamic and define an SQL statement to perform the transformation. These tables automatically update the materialized results through regular and often incremental refreshes, eliminating the need for manual updates. Dynamic tables provide a convenient and automated way to manage data transformations and keep the target table up-to-date with the latest query results.

How to create Dynamic Tables?

To create a dynamic table, use the CREATE DYNAMIC TABLE command, specifying the query to use, the target lag of the data, and the warehouse to use to perform the refreshes.

Syntax:

CREATE [ OR REPLACE ] DYNAMIC TABLE <name>
TARGET_LAG = { '<num> { seconds | minutes | hours | days }' | DOWNSTREAM }
WAREHOUSE = <warehouse_name>
AS <query>

TARGET_LAG = { num { seconds | minutes | hours | days } | DOWNSTREAM }

Specifies the lag for the dynamic table:

'num seconds | minutes | hours | days'

The TARGET_LAG parameter specifies the maximum allowed time lag between updates to the base tables and the content of the dynamic table. It can be specified in terms of seconds, minutes, hours, or days. For example, if the desired lag is 5 minutes or 5 hours, you would specify it accordingly. The minimum allowed value is 1 minute. It's important to note that if one dynamic table depends on another, the lag for the dependent table must be greater than or equal to the lag for the table it depends on.

Example:

CREATE OR REPLACE DATABASE DYNAMIC_TABLE_DB;

CREATE OR REPLACE SCHEMA DYNAMIC_TABLE_SCH;

CREATE OR REPLACE TABLE EMPLOYEE(EMP_ID INT, EMP_NAME VARCHAR,EMP_ADDRESS VARCHAR);

INSERT INTO EMPLOYEE VALUES(1,'AGAL','INDIA');
INSERT INTO EMPLOYEE VALUES(2,'KINNU','INDIA');
INSERT INTO EMPLOYEE VALUES(3,'SHUKESH','AUSTRALIA');
INSERT INTO EMPLOYEE VALUES(4,'SUPREET','UAE');

SELECT * FROM EMPLOYEE;

CREATE OR REPLACE TABLE EMPLOYEE_SKILL(
SKILL_ID NUMBER,
EMP_ID NUMBER,
SKILL_NAME VARCHAR(50),
SKILL_LEVEL VARCHAR(50)
);

INSERT INTO EMPLOYEE_SKILL VALUES(1,1,'SNOWFLAKE','ADVANCE');
INSERT INTO EMPLOYEE_SKILL VALUES(2,1,'PYTHON','BASIC');
INSERT INTO EMPLOYEE_SKILL VALUES(3,1,'SQL','INTERMEDIATE');
INSERT INTO EMPLOYEE_SKILL VALUES(1,2,'SNOWFLAKE','ADVANCE');
INSERT INTO EMPLOYEE_SKILL VALUES(1,4,'SNOWFLAKE','ADVANCE');

SELECT * FROM EMPLOYEE_SKILL;

The given script includes the creation and population of two tables: EMPLOYEE and EMPLOYEE_SKILL. Here’s a brief description of each table:

EMPLOYEE Table:

  • Columns: EMP_ID (integer), EMP_NAME (varchar), EMP_ADDRESS (varchar)
  • Purpose: This table stores information about employees, including their unique IDs, names, and addresses.

EMPLOYEE_SKILL Table:

  • Columns: SKILL_ID (number), EMP_ID (number), SKILL_NAME (varchar), SKILL_LEVEL (varchar)
  • Purpose: This table maintains the skills and skill levels of employees. It establishes a relationship with the EMPLOYEE table through the EMP_ID column, representing the employee’s ID. Each skill entry includes a skill ID, skill name, and skill level.

Points to remember:

Before proceeding with the creation of dynamic tables, it is essential to understand that enabling change tracking for the underlying objects is crucial. As dynamic tables rely on tracking changes in the underlying database objects, it becomes necessary to enable change tracking on all related objects.

When creating a dynamic table in Snowflake, the platform automatically attempts to enable change tracking on the underlying objects. However, it is important to note that the user creating the dynamic table might not have the necessary privileges to enable change tracking on all the required objects. Therefore, it is advisable to use commands such as SHOW VIEW, SHOW TABLE, or similar ones to inspect the CHANGE_TRACKING column. This will help determine if change tracking is enabled for specific database objects, ensuring smooth and error-free refreshes of dynamic tables.

Now, we will check change tracking for the table which we have created,

SHOW TABLES;

Output:

SHOW TABLES

Although change tracking is currently disabled for both the Employee and Employee_Skill tables, it’s important to note that when a dynamic table is created on top of these tables, change tracking will be automatically enabled. This ensures that the dynamic table captures and reflects any modifications made to the underlying data.

Dynamic Table:

CREATE OR REPLACE DYNAMIC TABLE EMPLOYEE_DET
TARGET_LAG = '1 MINUTE'
WAREHOUSE = COMPUTE_WH
AS
SELECT A.EMP_ID,A.EMP_NAME,A.EMP_ADDRESS, B.SKILL_ID,B.SKILL_NAME,B.SKILL_LEVEL
FROM EMPLOYEE A, EMPLOYEE_SKILL B
WHERE A.EMP_ID=B.EMP_ID
ORDER BY B.SKILL_ID ;

In this scenario:

  • The code snippet demonstrates the creation or replacement of a dynamic table named EMPLOYEE_DET. It utilizes the EMPLOYEE and EMPLOYEE_SKILL tables to populate the dynamic table.
  • The target lag for the dynamic table is set to 1 minute, indicating that the data in the dynamic table should ideally not be more than 1 minute behind the data in the source tables.
  • The dynamic table is refreshed automatically, leveraging the compute resources of the COMPUTE_WH warehouse.
  • The data in the dynamic table is derived by selecting relevant columns from the EMPLOYEE and EMPLOYEE_SKILL tables, performing a join based on the EMP_ID column, and ordering the result by the SKILL_ID column.

When querying the Dynamic Table EMPLOYEE_DET immediately after its creation, you may encounter an error stating, “Dynamic Table ‘DYNAMIC_TABLE_DB.DYNAMIC_TABLE_SCH.EMPLOYEE_DET’ is not initialized. Please run a manual refresh or wait for a scheduled refresh before querying.” This error occurs because the table requires a one-minute wait for the Target Lag to be completed. It is necessary to either manually refresh the table or wait until the scheduled refresh occurs before querying the data successfully.

After a one-minute duration following the execution of the Dynamic table creation process,

SELECT * FROM EMPLOYEE_DET;

Any Data Manipulation Language (DML) changes made to the base tables, such as EMPLOYEE or EMPLOYEE_SKILL, will be reflected in the Dynamic table within the specified latency period of 1 minute. This includes any modifications to the data itself, such as inserting, updating, or deleting records in the base tables. The Dynamic table automatically captures and reflects these changes, ensuring that it stays up-to-date with the latest data modifications. This real-time synchronization between the base tables and the Dynamic table allows for accurate and timely data analysis and reporting.

For Example:

UPDATE EMPLOYEE_SKILL
SET SKILL_LEVEL = 'ADVANCED'
WHERE EMP_ID = 1 AND SKILL_NAME = 'SNOWFLAKE';

DELETE FROM EMPLOYEE
WHERE EMP_ID = 4;

After executing the above statements and waiting for a one-minute lag period, the dynamic table will be automatically updated.

In above example EMP_ID — 4 got truncated and SKILL_LEVEL for EMP_ID updated from Advance to Advanced

In the subsequent blog, we will explore the following areas related to dynamic tables:

  1. Working with Dynamic Tables: (Alter / Describe / Drop / Show)
  2. Dynamic Tables vs. Streams and Tasks:
  • Comparing dynamic tables with streams and tasks
  • Understanding their respective use cases and advantages
  • Exploring the differences in functionality and behavior

3. Dynamic Tables vs. Materialized Views:

  • Contrasting dynamic tables with materialized views
  • Examining their distinct characteristics and purposes
  • Analyzing the benefits and trade-offs of using each approach

4. Managing Dynamic Tables:

  • Best practices for managing dynamic tables effectively
  • Optimizing performance and resource utilization
  • Handling dynamic table dependencies and refresh scheduling

5. Understanding Dynamic Table States:

  • Exploring the different states of dynamic tables
  • Interpreting their significance and implications
  • Managing and troubleshooting dynamic table states

Stay tuned for more comprehensive insights on these topics in the upcoming sections of the blog.

References:-

About me:

I am a Data Engineer and Cloud Architect with experience as a Senior Consultant at EY GDS. Throughout my career, I have worked on numerous projects involving legacy data warehouses, big data implementations, cloud platforms, and migrations. If you require assistance with certification, data solutions, or implementations, please feel free to connect with me on LinkedIn.

--

--