All You Need To Know About Snowflake Database Replication & Fail-Over/Failback — Part 2

This blog is a continuation of All You Need To Know About Snowflake Database Replication & Fail-Over/Failback — Part 1. If you missed seeing the previous blog than I would recommend reading that first to understand this blog better.

Please read through the below admin series of blog which might also interest you.

Snowflake Account Replication

Will My Database Role also gets replicated?

If you are not sure what do I mean by Database role than please give a read to my recent blog on the same.

At the moment database role are not supported along with database replication, but it's already added in future roadmap. Let's see the same in action.

What all gets replicated along with my database?

Most of the objects inside the database are replicated to a secondary database either in total or with some consideration.

For more detail objectwise consideration please have a look at documentation for Database Replicated Objects.

All object level parameter explicitly set at schema or object level will gets replicated in all secondary database. Parameters explicitly set at the schema or object level on secondary databases are not overwritten.

Database level parameter set on primary database are not replicated.

Source: Snowflake

Pipe are not replicated but if the PIPE_EXECUTION_PAUSED parameter is set at the schema level in the primary database, it is replicated to the secondary database.

What are the limitation of Database Replication ?

Below are the limitation for database replication:

  1. External table are not supported for database replication at the moment.
  2. Databases created from shares cannot be replicated at the moment.

How does database replication work in Snowflake?

Below are high level steps you have to follow for database replication.

Who provide compute for database replication?

Database replication is done using snowflake provided warehouse which is known as REPLICATION.

Warehouse which is used in the task is only used to kick off the refresh task on schedule not for replication.

How to monitor database replication?

There are multiple metadata views and function available in Snowflake which can help you monitor different aspect of database replication.

To determine the current status of the initial database replication or a subsequent secondary database refresh, query the below information schema functions. These function only returns data for last 14 days only.

  1. DATABASE_REFRESH_PROGRESS INFORMATION SCHEMA FUNCTION
  2. DATABASE_REFRESH_PROGRESS_BY_JOB INFORMATION SCHEMA FUNCTION
--Query for replicated database

SELECT *
FROM TABLE(INFORMATION_SCHEMA.DATABASE_REFRESH_PROGRESS(DATABASE_FOR_REPLICATION))
ORDER BY START_TIME;

--Database name is option if context is already set with required database.

USE DATABASE DATABASE_FOR_REPLICATION;

SELECT *
FROM TABLE(INFORMATION_SCHEMA.DATABASE_REFRESH_PROGRESS())
ORDER BY START_TIME;

--Query using query id of refresh sql statement.
SELECT *
FROM TABLE(INFORMATION_SCHEMA.DATABASE_REFRESH_PROGRESS_BY_JOB( '01AKNS554–0000–0326–0000–000014FD33A1'))
ORDER BY START_TIME;

3. REPLICATION_USAGE_HISTORY INFORMATION SCHEMA FUNCTION

This table function can be used to query the replication history for a specified database within a specified date range.
The information returned by the function includes the database name, credits consumed and bytes transferred for replication. This function returns replication usage activity within the last 14 days.

-- Retrieve the replication history for a date range for your account

SELECT *
FROM TABLE(INFORMATION_SCHEMA.REPLICATION_USAGE_HISTORY(
DATE_RANGE_START=>'2022–12–01 12:00:00.000 +0000',
DATE_RANGE_END=>'2022–12–29 12:30:00.000 +0000'));

--Retrieve the history for the last 24 hours for your account

SELECT *
FROM TABLE(INFORMATION_SCHEMA.REPLICATION_USAGE_HISTORY(
DATE_RANGE_START=>DATEADD(H, -24, CURRENT_TIMESTAMP)));

--Retrieve the history for the past week for your account

SELECT *
FROM TABLE(INFORMATION_SCHEMA.REPLICATION_USAGE_HISTORY(
DATE_RANGE_START=>DATEADD(D, -7, CURRENT_DATE),
DATE_RANGE_END=>CURRENT_DATE));

--Retrieve the replication history for the past week for a specified
--database in your account

SELECT *
FROM TABLE(INFORMATION_SCHEMA.REPLICATION_USAGE_HISTORY(
DATE_RANGE_START=>DATEADD(D, -7, CURRENT_DATE),
DATE_RANGE_END=>CURRENT_DATE,
DATABASE_NAME=>'DATABASE_FOR_REPLICATION'));

4. REPLICATION_USAGE_HISTORY VIEW

This Account Usage view can be used to query the replication history for a specified database for the last 365 days (1 year). It provides the same details as REPLICATION_USAGE_HISTORY information schema provides but for 365 days (1 year).

--Retrieve the replication history for a date range for your account

SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.REPLICATION_USAGE_HISTORY
WHERE START_TIME>='2022–12–01 12:00:00.000 +0000'
AND END_TIME<='2022–12–29 12:00:00.000 +0000';

--Retrieve the history for a database in you account

SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.REPLICATION_USAGE_HISTORY
WHERE DATABASE_NAME='DATABASE_FOR_REPLICATION';

--Retrieve the history for the past week for your account

SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.REPLICATION_USAGE_HISTORY
WHERE START_TIME>=DATEADD(D, -7, CURRENT_DATE)
AND END_TIME<=CURRENT_DATE;

5. DATABASE_REFRESH_HISTORY INFORMATION SCHEMA FUNCTION

This function returns database refresh activity within the last 14 days. This function returns the Bytes & Object counts for each replication

SELECT *
FROM TABLE(INFORMATION_SCHEMA.DATABASE_REFRESH_HISTORY(DATABASE_FOR_REPLICATION));

What are the cost implication for database replication?

Snowflake charge below for Database replication in secondary account:

  1. Data Transfer charges for transferring data from source to target region of choice.
  2. Compute charges for transferring the data from source to target region of choice.
  3. Storage charges in secondary account for all replicated database objects.

Note: Even if your replication fails snowflake charge you for storage for 14 days as those data is kept to save time as you can reuse already copied data without starting from scratch.

Hope this blog helps you to get insight into the database replication and failover/fail back feature in Snowflake. Feel free to ask a question in the comment section if you have any doubts regarding this. Give a clap if you like the blog. Stay connected to see many more such cool stuff. Thanks for your support.

You Can Find Me:

Subscribe to my YouTube Channel: https://www.youtube.com/c/RajivGuptaEverydayLearning

Follow me on Medium: https://rajivgupta780184.medium.com/

Follow me on Twitter: https://twitter.com/RAJIVGUPTA780

Connect with me on LinkedIn: https://www.linkedin.com/in/rajiv-gupta-618b0228/

#Keep learning #Keep Sharing #RajivGuptaEverydayLearning #SnowflakeDataSuperhero #RajivGupta

--

--