Snowflake Native Apps: The New Form of “Data Sharing”

A futuristic image of a cute polar bear using a network of interconnected databases. Generated by DALL·E.

Snowflake Native Apps represent Snowflake’s new approach to data sharing.

This article aims to deepen your understanding of Snowflake Native Apps by tracing the history of data sharing on Snowflake.

The Challenge of Sharing Data

In the modern era, data is often called the new “oil”. However, the value of data doesn’t come from possession alone. Data only generates value when it reaches the hands of those who truly need it and is properly analyzed. Before the advent of Snowflake, sharing data with other companies, or selling it to them, was extremely difficult.

Creating a World Where All Data Can Be Bought and Sold

Snowflake took on this data sharing challenge. With the vision of “creating a world where all data can be bought and sold,” Snowflake progressively added innovative features. The first addition was Direct Share.

Direct Share is a feature that allows sharing a database from one account to another specific account. It uses objects called share objects, enabling safe data sharing with third-party accounts without dumping or copying data.

Understanding Direct Share in 4 Steps

You can think of it as “packaging” data into a SHARE object in the provider account and “unpacking” it as a DB object in the consumer account.

1 Create a share object

CREATE SHARE TEST_SHARE;

2 Grant read permissions on database/table to the share object (packaging)

-- DB, SCHEMA, and TABLE are placeholders for your own object names
GRANT USAGE ON DATABASE DB TO SHARE TEST_SHARE;
GRANT USAGE ON SCHEMA DB.SCHEMA TO SHARE TEST_SHARE;
GRANT SELECT ON TABLE DB.SCHEMA.TABLE TO SHARE TEST_SHARE;

3 Specify the destination account for the share object

ALTER SHARE TEST_SHARE ADD ACCOUNTS=TEST_ORG.TEST_ACCOUNT;

4 Create a database from the share object in the consumer account (unpacking)

CREATE DATABASE TEST_SHARE FROM SHARE TEST_ORG.TEST_ACCOUNT.TEST_SHARE;
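Once the database exists in the consumer account, other roles there still need access before they can query it. The following is a minimal sketch, assuming a hypothetical consumer role named ANALYST and reusing the placeholder names from the steps above:

```sql
-- Run in the consumer account.

-- List the shares visible to this account (inbound and outbound).
SHOW SHARES;

-- Inspect which objects the provider packaged into the share.
DESC SHARE TEST_ORG.TEST_ACCOUNT.TEST_SHARE;

-- ANALYST is a hypothetical role. Access to a shared database is granted
-- as a whole through IMPORTED PRIVILEGES, not object by object.
GRANT IMPORTED PRIVILEGES ON DATABASE TEST_SHARE TO ROLE ANALYST;

-- Confirm which tables arrived; they are read-only for the consumer.
SHOW TABLES IN DATABASE TEST_SHARE;
```

No data is copied by any of these statements; the consumer reads the provider's data in place.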

The Problem with Direct Share

However, Direct Share had a significant limitation: it couldn’t share data across different cloud providers or regions. For instance, sharing was possible between two AWS Japan Region accounts, but not between an AWS Japan Region account and an AWS US Region account, or between an AWS Japan Region account and an Azure Japan Region account.

If sharing was absolutely necessary across different providers or regions, you had to open an account with the same cloud provider and region as the consumer, manually duplicate the share object and data, and then share it. This process was cumbersome and inefficient.
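The manual workaround looked roughly like the sketch below. It assumes database replication between two provider accounts in the same organization. PROVIDER_ORG, ACCOUNT_JP, ACCOUNT_US, and CONSUMER_ORG.CONSUMER_ACCOUNT are hypothetical names introduced for illustration only.

```sql
-- 1. In the original provider account (for example, AWS Japan):
--    allow DB to be replicated to a second provider account that sits
--    in the consumer's cloud provider and region.
ALTER DATABASE DB ENABLE REPLICATION TO ACCOUNTS PROVIDER_ORG.ACCOUNT_US;

-- 2. In the second provider account (for example, AWS US):
--    create a replica of the database and pull the data across.
CREATE DATABASE DB AS REPLICA OF PROVIDER_ORG.ACCOUNT_JP.DB;
ALTER DATABASE DB REFRESH;  -- has to be re-run or scheduled to stay current

-- 3. Still in the second provider account: rebuild the share by hand.
CREATE SHARE TEST_SHARE;
GRANT USAGE ON DATABASE DB TO SHARE TEST_SHARE;
-- ...repeat the schema and table grants from step 2 of the walkthrough...
ALTER SHARE TEST_SHARE ADD ACCOUNTS = CONSUMER_ORG.CONSUMER_ACCOUNT;
```

Every new consumer region meant another account, another replica, and another hand-built share, which is where the overhead came from.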

The Introduction of Listings

To solve this cross-provider and cross-region Direct Share problem, the listing feature was introduced. A listing wraps a share object and automatically replicates it to the regions where consumers request it. This eliminated the manual work that providers previously had to do to meet consumer demand.
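As a rough illustration, a listing can also be created in SQL by attaching a YAML manifest to an existing share. Treat this as a sketch under assumptions rather than a reference: TEST_LISTING and the target account are hypothetical, and the manifest fields shown (in particular the auto_fulfillment settings) should be verified against Snowflake's listing manifest reference before use.

```sql
-- Run in the provider account. TEST_SHARE is the share created earlier;
-- TEST_LISTING and CONSUMER_ORG.CONSUMER_ACCOUNT are hypothetical names.
CREATE EXTERNAL LISTING TEST_LISTING
  SHARE TEST_SHARE AS
$$
title: "Test listing"
description: "Sample data shared through a listing"
listing_terms:
  type: "OFFLINE"
targets:
  accounts: ["CONSUMER_ORG.CONSUMER_ACCOUNT"]
# Assumed fields: ask Snowflake to replicate the shared data to the
# consumer's region automatically instead of doing it by hand.
auto_fulfillment:
  refresh_type: "SUB_DATABASE"
  refresh_schedule: "1440 MINUTE"
$$
PUBLISH = FALSE   -- keep as a draft
REVIEW = FALSE;   -- do not submit for Marketplace review yet
```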

Listing Feature

The World Where All Data Can Be Bought and Sold

With Direct Share and listings combined, data providers no longer needed to perform manual operations for each consumer. This made it possible to sell data automatically in the marketplace.

The Problem with Share Objects

However, share objects had one disadvantage: they could not store “data processing processes”. For example, a process that imports data from GA4 using a Python stored procedure is likely to look the same at every company, but there was no way to share such data processing processes.

Towards a World Where All Data and Data Processing Processes Can Be Bought and Sold

The Emergence of Snowflake Native Apps

In June 2023, Snowflake Native Apps emerged to solve the problems with share objects. Snowflake Native Apps are essentially Application Package objects, which can be thought of as an evolved form of share objects. They are containers for “packaging” data processing processes in the provider account and “unpacking” them in the consumer account. By putting Application Package objects on listings instead of share objects, you can distribute Python stored procedures and other processes in the marketplace.

Comparison between Secure Data Sharing and Snowflake Native Apps

Secure Data Sharing and Snowflake Native Apps follow the same package-and-unpack model. They differ in what they can contain and in the commands used:

  • Sharing method: Secure Data Sharing uses share objects plus listings, while Snowflake Native Apps use Application Package objects plus listings.
  • Storable objects: Secure Data Sharing can store tables, views, and SQL UDFs, while Snowflake Native Apps can store tables, views, any UDFs, any files (on a stage), and any stored procedures.
  • Container: SHARE for Secure Data Sharing, APPLICATION PACKAGE for Snowflake Native Apps.
  • Content: DATABASE for Secure Data Sharing, APPLICATION for Snowflake Native Apps.
  • Packaging command: CREATE SHARE <name> for Secure Data Sharing, CREATE APPLICATION PACKAGE <name> for Snowflake Native Apps.
  • Unpacking command: CREATE DATABASE <name> FROM SHARE <provider_account>.<share_name> for Secure Data Sharing, CREATE APPLICATION <name> FROM APPLICATION PACKAGE <package_name> for Snowflake Native Apps.
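To make the parallel with the Direct Share walkthrough concrete, here is a minimal sketch of the same package-and-unpack flow for a Native App. TEST_PKG, TEST_APP, V1, and the stage APP_DB.APP_SCHEMA.APP_STAGE are hypothetical names, and the stage is assumed to already contain the application files.

```sql
-- Provider account: create the container (packaging).
CREATE APPLICATION PACKAGE TEST_PKG;

-- Register a version from a stage that already holds the application
-- files (hypothetical stage name).
ALTER APPLICATION PACKAGE TEST_PKG
  ADD VERSION V1 USING '@APP_DB.APP_SCHEMA.APP_STAGE';

-- Unpacking: create an Application object from the package.
-- Running this in the provider account is a common way to test the app
-- before it is attached to a listing.
CREATE APPLICATION TEST_APP
  FROM APPLICATION PACKAGE TEST_PKG
  USING VERSION V1;
```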

Differences between Application Package Objects and Share Objects

When an Application Package object is installed, it is unpacked into an Application object. Compared with a database object unpacked from a share object, an Application object differs in these main ways:

  1. It always has one stage, and that stage has a fixed structure: manifest.yml sits in the root and points to a single setup script (setup.sql).
  2. The setup script can be executed during unpacking.
  3. It can specify Streamlit objects to display in the App tab.
  4. It can define custom roles (application roles) within the Application.
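The four points above can be sketched in a small setup script. This is an illustrative example, not a complete app: the role APP_USER, the schema CORE, the procedure HELLO, and the Streamlit file names are all hypothetical, and the manifest.yml content is shown as comments so that the whole example stays in SQL.

```sql
-- manifest.yml (in the stage root) would contain something like:
--   manifest_version: 1
--   artifacts:
--     setup_script: setup.sql
--     default_streamlit: core.dashboard

-- setup.sql: executed when the Application object is created.

-- Point 4: a custom role that exists only inside the Application.
CREATE APPLICATION ROLE IF NOT EXISTS APP_USER;

CREATE SCHEMA IF NOT EXISTS CORE;
GRANT USAGE ON SCHEMA CORE TO APPLICATION ROLE APP_USER;

-- The shared "data processing process": here just a trivial procedure.
CREATE OR REPLACE PROCEDURE CORE.HELLO()
  RETURNS STRING
  LANGUAGE SQL
  EXECUTE AS OWNER
AS
$$
BEGIN
  RETURN 'Hello from the app';
END;
$$;
GRANT USAGE ON PROCEDURE CORE.HELLO() TO APPLICATION ROLE APP_USER;

-- Point 3: a Streamlit object built from files under /streamlit in the stage.
CREATE STREAMLIT IF NOT EXISTS CORE.DASHBOARD
  FROM '/streamlit'
  MAIN_FILE = '/dashboard.py';
GRANT USAGE ON STREAMLIT CORE.DASHBOARD TO APPLICATION ROLE APP_USER;
```

After installation, a consumer whose role has been granted APP_USER could run `CALL TEST_APP.CORE.HELLO();`, where TEST_APP stands for whatever name they chose for the Application.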

The Problem with Snowflake Native Apps

However, Snowflake Native Apps also had a limitation: they couldn’t reuse existing data processing assets. In Snowflake Native Apps, data processing processes could only be implemented in the languages Snowflake supports for UDFs and stored procedures, such as SQL, Python, or Java. In reality, existing data processing processes might be written in Rust, R, or other languages, so sharing and migrating them didn’t progress much.

Expanding Snowflake Native Apps

To address these issues, a set of features was announced at Data Cloud Summit 2024.

Integration with Snowpark Container Services (Public Preview)

Snowflake Native Apps have been integrated with Snowpark Container Services. This integration makes it possible to deploy data processing processes written in any language, along with any custom UI, to Snowflake as containers.
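To give a feel for what running a container looks like, here is a minimal standalone Snowpark Container Services sketch, outside of a Native App. All object names and the image path are hypothetical, and the image (for example, a job written in Rust) is assumed to have been pushed to the image repository beforehand.

```sql
-- A compute pool provides the nodes that containers run on.
CREATE COMPUTE POOL TEST_POOL
  MIN_NODES = 1
  MAX_NODES = 1
  INSTANCE_FAMILY = CPU_X64_XS;

-- An image repository stores container images inside Snowflake.
CREATE IMAGE REPOSITORY APP_DB.APP_SCHEMA.TEST_REPO;

-- A service runs the container described by an inline YAML specification.
CREATE SERVICE APP_DB.APP_SCHEMA.TEST_SERVICE
  IN COMPUTE POOL TEST_POOL
  FROM SPECIFICATION $$
spec:
  containers:
  - name: main
    image: /app_db/app_schema/test_repo/rust_job:latest
$$;
```

In a Native App, similar statements would be issued from the app's setup script, with the service specification shipped as a file in the application stage.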

Snowflake Trail (Public Preview)

As data processing processes within Snowflake become more complex, observability has become increasingly important, both for cost management and for explaining what a process did. Snowflake Trail aggregates logs using Event Tables and OpenTelemetry.
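The Event Table side of this can be sketched as follows. The database and schema names are hypothetical, and setting the account-level event table requires appropriate privileges.

```sql
-- Create an event table and make it the destination for telemetry.
CREATE EVENT TABLE OBS_DB.OBS_SCHEMA.EVENTS;
ALTER ACCOUNT SET EVENT_TABLE = OBS_DB.OBS_SCHEMA.EVENTS;

-- Ask Snowflake to capture log messages at INFO level and above
-- for code running in a (hypothetical) database.
ALTER DATABASE APP_DB SET LOG_LEVEL = INFO;

-- Read back the most recent log records.
SELECT
  TIMESTAMP,
  RESOURCE_ATTRIBUTES,                  -- which object emitted the record
  RECORD['severity_text'] AS SEVERITY,
  VALUE AS MESSAGE
FROM OBS_DB.OBS_SCHEMA.EVENTS
WHERE RECORD_TYPE = 'LOG'
ORDER BY TIMESTAMP DESC
LIMIT 20;
```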

Snowflake continues to evolve its vision every year, from sharing data to sharing data processing processes.
