Photo by Aron Visuals on Unsplash

Snowflake — Time Travel

Amit Singh Rathore
Geek Culture
Published in
4 min readJun 6, 2022

--

Travel back in time to see the table state as it was at that time.

I want to go back in time. Not to change anything, but to feel something again. — Mercer

Snowflake Time Travel is an interesting tool that allows us to access historical data (data that has been modified/removed) from any point, within a defined period, in the past. We can access the historical versions of data until the retention period, after which the data is moved into Snowflake Fail-safe.

Snowflake Time Travel

This feature helps us with the following:

  • Debugging ETLs with data “as it looked then”
  • Restoring DB objects that might have been modified/deleted.
  • Duplicating or backing up data from key points in the past.
  • Analyzing data usage/manipulation over specified periods of time.

When an object’s retention period expires, the historical data is moved to Snowflake Fail-safe. During this period:

  • We can no longer query historical data
  • We can no longer clone objects from the past state
  • We can no longer restore dropped objects

Retention Period & Parameters

Based on the Snowflake edition and object type we have different retention periods for different database objects. The below table summarizes these values:

Object         | Retention period            | Fail-Safe days
---------------+-----------------------------+------------------Temporary | 0 or 1 (default - 1) | 0
Transient | 0 or 1 (default - 1) | 0
Permanent(Std) | 0 or 1 (default - 1) | 7
Permanent(Ent) | 0-90 (default configurable) | 7

We can use SHOW PARAMETERS to find the Time Travel retention period of Snowflake objects.

SHOW PARAMETERS in DATABASE db_name;

Setting Retention Periods

While creating an object we can specify the desired retention period

CREATE TABLE <TABLE_NAME>(...) data_retention_time_in_days=30;

For an existing object the retention can be changed using ALTER as follows:

ALTER TABLE <TABLE_NAME> set data_retention_time_in_days=15;
Inheritance of retention property

Changing the retention period for our account or individual objects changes the value for all lower-level objects that do not have a retention period explicitly set. E.g, if we change the retention period at the account level, all databases, schemas, and tables that do not have an explicit retention period automatically inherit the new retention period.

When a database is dropped, the data retention period for the child objects, if explicitly set to be different from the retention of the database, is not honored. The child objects are retained for the same period of time as the database. To honor the data retention period for these child objects, we need to drop them explicitly before we drop the database.

Storage Costs for Time Travel

The storage costs are calculated for each 24-hour period (i.e. 1 day) from the time the data changed. Also, Snowflake minimizes the amount of storage required for historical data by maintaining only the information required to restore the individual table rows that were updated or deleted. We can see the storage cost of time travel with the following query:

SELECT * FROM "snowflake"."account_usage"."table_storage_mterics" WHERE schema = <schema_name>;

For the above select statement following attributes are of interest.

ACTIVE_BYTES — Actual table cost
TIME_TRAVEL_BYTES — Time travel cost
FAIL_SAFE_BYTES — Fail Safe cost

Time Travel in Action

Snowflake Time Travel SQL extension offers two sub-clauses (AT & BEFORE) with the FROM clause.

  • AT — Inclusive of the timestamp/statement run time
  • BEFORE — point just before the specified DateTime or statement run time

The above sub-clause takes three different arguments

OFFSET — the difference in seconds from the present time
TIMESTAMP
STATEMENT — statement identifier (ID)

-- Query the table at a particular timestamp
SELECT *
FROM <table_name>
at(timestamp => 'Sat, 06 June 2022 01:00:00 +0200'::timestamp_tz);
-- Query the table at 10 minutes before
SELECT *
FROM <table_name>
at(offset => -60*10);
-- Query the table before a statement is executed
SELECT *
FROM <table_name>
before(statement => '8e5d0ca9-005e-44e6-b858-a8f5b37c5726');
SELECT *
FROM <table_name>
before(timestamp => '2022-06-05 13:31:56.786'::timestamp)
-- Create a clone from previous data using timestamp
CREATE TABLE <NEW_TABLE_NAME>
CLONE <CURRENT_TABLE_NAME>
AT(timestamp => 'Sat, 05 June 2022 14:00:00 +0200'::timestamp_tz);
-- Create a clone from previous data using statement
CREATE TABLE <NEW_TABLE_NAME>
CLONE <CURRENT_TABLE_NAME>
AT(statement => '8e5d0ca9-005e-44e6-b858-a8f5b37c5726');
-- Create a clone from previous data using offset
CREATE TABLE <NEW_TABLE_NAME>
CLONE <CURRENT_TABLE_NAME>
AT(offset=>-60*10);
-- Create a clone from previous data using query_id
CREATE TABLE <NEW_TABLE_NAME>
CLONE <CURRENT_TABLE_NAME>
BEFORE(statement => '8e5d0ca9-005e-44e6-b858-a8f5b37c5726');

Recover dropped objects

We can check the history table to find out the tables which are dropped. The active table’s row will have the column is_active set to TRUE. For dropped tables, we have a value set for the column dropped_on.

-- Check table history 
SHOW TABLES HISTORY;
-- is_current & dropped_on column
UNDROP TABLE <TABLE_NAME>;

Happy Data warehousing!!

--

--

Amit Singh Rathore
Geek Culture

Staff Data Engineer @ Visa — Writes about Cloud | Big Data | ML