Zero Copy Cloning: Data Warehousing Trends

chamathka maddugoda
Published in Quick Insight
2 min read · May 24, 2023

Hello folks! Interested in the latest data warehousing trends? Let's grasp them in brief with Quick Insight, my latest publication for short posts.

Zero copy cloning is a feature introduced by Snowflake for creating clones of tables, schemas and databases without replicating any data. It lets multiple user groups work with the same data at no additional storage cost at the time of cloning, because a clone does not create additional physical copies.

How it works

Snowflake’s cloud services layer records file information, including file locations and references to data versions, in a metadata repository. When data changes, the metadata repository is automatically updated with a pointer to the updated data. Because a clone only needs its own set of these pointers, a user can create one immediately with a single command such as the one below.

CREATE OR REPLACE TABLE MyTable_V2 CLONE MyTable;
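
To make the pointer idea concrete, here is a toy Python model of it. This is only a sketch and not how Snowflake is actually implemented: the `Warehouse` and `Table` classes and their methods are hypothetical names made up for illustration. It assumes data files are immutable and that a table is just a list of pointers to those files.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical table metadata: an ordered list of data-file IDs."""
    name: str
    file_ids: list = field(default_factory=list)


class Warehouse:
    """Toy model: immutable data files plus table metadata pointing at them."""

    def __init__(self):
        self.files = {}     # file_id -> rows (never modified once written)
        self.tables = {}    # table name -> Table metadata
        self._next_id = 0

    def _write_file(self, rows):
        file_id = self._next_id
        self._next_id += 1
        self.files[file_id] = tuple(rows)
        return file_id

    def create_table(self, name, rows):
        self.tables[name] = Table(name, [self._write_file(rows)])

    def clone(self, source, target):
        # Zero copy: only the pointer list is copied, no data file is written.
        self.tables[target] = Table(target, list(self.tables[source].file_ids))

    def insert(self, name, rows):
        # A change writes a new file that only the modified table points to.
        self.tables[name].file_ids.append(self._write_file(rows))

    def read(self, name):
        table = self.tables[name]
        return [row for fid in table.file_ids for row in self.files[fid]]


wh = Warehouse()
wh.create_table("MyTable", [1, 2, 3])
wh.clone("MyTable", "MyTable_V2")
files_after_clone = len(wh.files)   # cloning wrote no new data file
wh.insert("MyTable_V2", [4])        # the clone diverges, the source does not
```

After the clone, both tables point at the same single file. Only the later insert adds a second file, and it belongs to the clone alone, so reading `MyTable` still returns the original rows.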

Use of Zero copy cloning

Zero copy cloning comes in handy when data in a data warehouse needs to be copied for testing purposes. Traditionally, copying means physically moving data from the production database to a copy database, which is time consuming and expensive because the data is stored, and paid for, twice. Further, if the data changes frequently, the copy has to be refreshed each time. Zero copy cloning removes most of this overhead: a clone is a metadata operation that points at the existing data, so it can be created or recreated quickly, and extra storage is used only for data that later changes in the clone.
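
A rough back-of-the-envelope comparison shows why this matters for storage. The figures below are invented for illustration, and the function `storage_gb` is a hypothetical helper, not part of any Snowflake tooling. It assumes, as a simplification, that a clone stores only the share of data that has changed since cloning, while a physical copy always stores the full table.

```python
def storage_gb(source_gb, copies, changed_percent, zero_copy):
    """Total storage (GB) for a source table plus its test copies.

    Simplifying assumption: a zero-copy clone only stores the part of
    the data that changed after cloning; a physical copy stores it all.
    """
    if zero_copy:
        per_copy = source_gb * changed_percent // 100
    else:
        per_copy = source_gb
    return source_gb + copies * per_copy


# Hypothetical scenario: a 1000 GB table, three test environments,
# each of which modifies 5% of the data.
full_copies = storage_gb(1000, 3, 5, zero_copy=False)
clones = storage_gb(1000, 3, 5, zero_copy=True)
print(full_copies, clones)
```

In this made-up scenario the physical copies need 4000 GB in total, while the clones need 1150 GB. The gap grows with every extra environment.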

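Another key advantage comes from the separation of storage and compute. Querying a clone is charged as compute only; the clone itself adds no storage charge until its data diverges from the source. Snowflake’s data sharing follows the same idea: the shared data stays in the provider’s account and is not replicated to the consumer’s account, so the consumer pays for compute and not for storage.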

Why is it a trend?

As data warehousing moves towards real-time processing, physically copying data becomes tedious because the data changes so often. With large volumes of data, the traditional approach can also be extremely costly. As data volumes keep growing, it becomes essential that multiple user groups can clone data without paying for storage again. Snowflake is not alone here: other well-known platforms such as Databricks provide a comparable copy-free capability with Delta Sharing.

I hope you gained something from this quick insight. If you did, leave a response and stay connected for more tech stories in data science like this. Thank you!
