Developing Snowflake Native Apps with Snowpark Container Services: Part I

The Snowflake Native Apps Framework enables application developers to create data applications that leverage core Snowflake functionality and run entirely within Snowflake’s single, global platform. A customer installing a Snowflake Native App, referred to as a consumer, has complete control over what the app can access in their environment, keeping all processing and data assets in the consumer’s account. A Snowflake Native App developer, referred to as a provider, has control over whether their code remains private to the consumer, which helps protect their intellectual property, i.e., the internals of their app.

Snowpark Container Services enables you to run custom workloads as services or jobs within the Snowflake ecosystem. Developers can choose to use existing containerized workloads for an accelerated dev cycle, or write app code in their preferred language and package it as a container. Snowpark Container Services also provides a range of configurable hardware options for execution, including GPUs.

Using Snowpark Container Services within a Snowflake Native App gives developers the flexibility to create apps for just about any use case, such as AI/ML workloads, sophisticated web experiences, data processing & analytics tools, etc., all with the safeguards provided by the Snowflake Native Apps Framework. This post will be your guide to the core concepts for developing such apps with containers, and will walk you through the development flow for Snowflake Native Apps and when to use its constructs. Where needed, the post will also draw your attention to differences from developing in a typical container environment. If you would like a quick overview of the architecture of Snowflake Native Apps with Snowpark Container Services, I recommend reading this blog post.

Getting started with Snowpark Container Services

Let’s start by briefly touching upon a standard development cycle using container images. The stages are: 1) Develop code locally, 2) Build and push an image, 3) Provision resources, and 4) Deploy & test a service. Once a container service is developed and tested, it can be wrapped in a Native App for delivery to consumers. We will first take a quick look at how each of these development stages works with Snowpark Container Services. Please note that Snowpark Container Services supports both creating long-running services and executing job services that terminate after the code exits. From here on, this blog post will use ‘service’ as a catch-all for services and jobs.

1. Develop code locally

For self-contained services (i.e., with no dependency on other Snowflake features), this stage is much the same as developing any container service. Developers have complete flexibility in the language they develop in, the service architecture, etc. The Snowflake-specific parts appear when these services begin interacting with Snowflake. If you are developing in Snowpark Container Services, or are developing for a Native App, chances are that your service or job interacts with data stored in Snowflake or with other Snowflake objects. To support this, when you start a service, Snowflake provides credentials to the running containers. These enable your container code to use drivers for connecting to Snowflake and executing SQL, much like any other code on your computer connecting to Snowflake.
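As a sketch of what this looks like from inside a container: Snowflake mounts an OAuth token into each running container and injects connection details as environment variables. The helper below reads that token and assembles driver connection parameters. The function names and the token-path default are illustrative, not part of any official API.

```python
import os

def login_token(path="/snowflake/session/token"):
    """Read the OAuth token that Snowflake mounts into the running container."""
    with open(path) as f:
        return f.read()

def connection_params(token_path="/snowflake/session/token"):
    """Assemble keyword arguments for a Snowflake driver connection.

    SNOWFLAKE_HOST and SNOWFLAKE_ACCOUNT are among the environment
    variables Snowflake injects into service containers.
    """
    return {
        "host": os.getenv("SNOWFLAKE_HOST"),
        "account": os.getenv("SNOWFLAKE_ACCOUNT"),
        "token": login_token(token_path),
        "authenticator": "oauth",
    }
```

Inside the container, these parameters can then be handed to a driver, e.g. `snowflake.connector.connect(**connection_params())`.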

2. Build and push an image

Snowflake offers an image registry, an OCIv2-compliant service, for storing your images. A Snowflake customer can create an image repository, then build and push images to it. You can use any OCI-compliant client to create the images; a handy tutorial on building images using Docker is available here. For developers who have previously worked with tools like Docker, there is little to no difference in this step.
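For instance, creating a repository and finding its URL takes only a couple of SQL statements (the object names below are illustrative):

```sql
-- Create an image repository in the provider account
CREATE IMAGE REPOSITORY IF NOT EXISTS provider_db.provider_schema.provider_repo;

-- The repository_url column in this output is what you tag and push against
SHOW IMAGE REPOSITORIES IN SCHEMA provider_db.provider_schema;
```

With Docker, you would then log in to the registry hostname using your Snowflake credentials, tag the local image with the repository URL, and push it.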

3. Provision resources

This is where we step fully into Snowflake’s ecosystem. Snowflake provides compute pools: collections of one or more virtual machine (VM) nodes on which Snowflake runs your Snowpark Container Services. Getting started with compute pools is as simple as choosing an instance family and configuring MIN_NODES and MAX_NODES. A developer can choose to run one or more services on a compute pool. Once a compute pool is created, Snowflake takes care of autoscaling: it launches the minimum number of nodes and automatically allocates additional nodes when the running nodes cannot take any additional workload. Similarly, if no services run on a node for a specific duration, Snowflake automatically removes it, while always keeping the compute pool at or above its minimum number of nodes.
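For example, a minimal pool for a small CPU workload can be created as follows (the pool name is illustrative, and instance family availability varies by cloud and region):

```sql
-- A small pool; Snowflake scales between MIN_NODES and MAX_NODES as needed
CREATE COMPUTE POOL backend_compute_pool
  MIN_NODES = 1
  MAX_NODES = 3
  INSTANCE_FAMILY = CPU_X64_XS;

-- Inspect the pool's state (e.g. IDLE, ACTIVE, SUSPENDED)
DESCRIBE COMPUTE POOL backend_compute_pool;
```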
Managing a compute pool details the various states of a compute pool, and how to operate & monitor them.

4. Deploy & test a service

This is another step where we truly see the power of Snowflake’s fully managed container offering. Snowflake abstracts away most of the intricacies of container management, and lets developers configure and run a service via the service specification. The specification is provided in the CREATE SERVICE command. The command also requires a compute pool, and accepts a QUERY_WAREHOUSE field for running queries in Snowflake, EXTERNAL_ACCESS_INTEGRATIONS if the service has egress requirements, etc. Once the service is created, you can use the SYSTEM$GET_SERVICE_STATUS function to check its status, and can test the service once it reaches the READY status.
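Putting these pieces together, service creation might look like the following sketch. The service, pool, warehouse, and image names are placeholders, and the inline specification could equally live in a file on a stage:

```sql
CREATE SERVICE my_service
  IN COMPUTE POOL backend_compute_pool
  FROM SPECIFICATION $$
spec:
  containers:
  - name: backend
    image: /provider_db/provider_schema/provider_repo/server:prod
  endpoints:
  - name: api
    port: 8080
$$
  QUERY_WAREHOUSE = my_query_wh;

-- Poll until the service reports READY
SELECT SYSTEM$GET_SERVICE_STATUS('my_service');
```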

Wrapping container services into a Snowflake Native App

The Native Apps Framework enables packaging Snowflake features as a standalone application that can be installed across multiple accounts. It now includes new features and behaviors to support resource provisioning and service deployment for Container Services. Let’s dive into the various Native App features that enable this.

Manifest file

Every application package must have a manifest file. The manifest file contains the configuration and setup properties required by the application such as the location of the setup script, version information, log configuration, etc. It also allows you to specify container images that you wish to use for containers within the app. Please note that you must specify all the images that you wish to use. Here’s how you can specify the container images under the images list in the container_services section.

manifest_version: 1

version:
  name: V1
  label: "Version One"
  comment: "The first version of my amazing Snowflake Native App"

artifacts:
  readme: readme.md
  setup_script: scripts/setup.sql
  container_services:
    images:
    - /provider_db/provider_schema/provider_repo/server:prod
    - /provider_db/provider_schema/provider_repo/web:1.0

/provider_db/provider_schema/provider_repo is the path to the image repository along with the repository name, and server:prod & web:1.0 are the image names along with their tags. The path specified here must point to an image repository in your account, i.e. the provider’s account.

Now, if you are using Snowpark Container Services that expose a web UI, you may want that UI to be the landing page of the application. You can do so by specifying a ‘default_web_endpoint’ field as follows. The ‘service’ field refers to the name you give the service when executing the CREATE SERVICE command, and the ‘endpoint’ field refers to the associated endpoint in the service’s specification file.

manifest_version: 1

version:
  name: V1
  label: "Version One"
  comment: "The first version of my amazing Snowflake Native App"

artifacts:
  readme: readme.md
  setup_script: scripts/setup.sql
  default_web_endpoint:
    service: ux_schema.ux_service
    endpoint: ui
  container_services:
    images:
    - /provider_db/provider_schema/provider_repo/server:prod
    - /provider_db/provider_schema/provider_repo/web:1.0

If you are considering other options for landing pages, or application UI, I recommend looking into Streamlit. You can specify a default_streamlit field in the artifacts, where the Streamlit app acts as the landing page for your application. Please note that the ‘default_web_endpoint’ and ‘default_streamlit’ fields are mutually exclusive. You can only have one landing UI.

Setup script

A Snowflake Native App’s setup script contains the logic for setting up the application. The setup script is executed at installation, and when upgrading the app from one version to another. It contains SQL statements that create objects within the application such as database objects, stored procedures, views, application roles, etc. These statements also create the functions & procedures that will be used post-install to provision resources for Snowpark Container Services, i.e., compute pools. Note that I said the procedures will be used post-install, not at install or upgrade time. To understand why, let’s take a deeper look at how resource provisioning works in the Snowflake Native Apps Framework.

Creating compute pools for an app
For a minute, I would like to circle back to one of the key principles of the Snowflake Native Apps Framework: a consumer maintains full control over what the application can access in their environment, and this includes compute resources. Snowflake manages access to its various capabilities via privileges, which are in turn managed using Role-Based Access Control (RBAC). The ability to create compute pools is governed by the ‘CREATE COMPUTE POOL’ privilege. Another privilege the application will need, in order to expose service endpoints in an account, is ‘BIND SERVICE ENDPOINT’. These privileges are not available to an application by default and must be requested. In fact, the user performing the installation may not hold these privileges themselves, and may need to ask another user in the organization with appropriate permissions to grant them. The procedures that create compute pools, and eventually create services in those compute pools, can only be executed successfully once these privileges are granted. We will refer to such procedures as ‘resource provisioning procedure(s)’. Please note that any resource provisioning procedure must be granted to an application role so that the consumer can execute it.

An application can request these privileges by specifying them in the manifest under the privileges key as shown below:

manifest_version: 1
...

privileges:
- CREATE COMPUTE POOL:
    description: "Enables application to create its own compute pool(s)"
- BIND SERVICE ENDPOINT:
    description: "Enables application to expose service endpoints"

Once an app is installed, a consumer can grant the necessary privileges to the app via one of the supported mechanisms described below.

  1. Snowsight UI Configuration: Snowflake Native Apps with Snowpark Container Services come with a custom configuration UI experience in Snowsight. When any privilege(s) are granted in this UI, the UI invokes the procedure named in the app’s grant_callback field in the manifest. This procedure receives the list of granted privileges as input, which makes it the perfect place to begin resource provisioning, such as creating compute pools. This is the recommended mechanism for apps using containers.
manifest_version: 1

...

configuration:
  log_level: debug
  trace_level: always
  grant_callback: setup.grant_callback

  2. Streamlit: Snowflake offers a Permissions SDK for Streamlit that enables a developer to create pop-ups for privilege grants. An app can begin resource provisioning once the consumer grants the privileges via the pop-up. You can specify a Streamlit app in the default_streamlit field of the manifest, and use it to call the resource provisioning procedures.

  3. SQL Worksheet: In case you prefer that the consumer interacts with your application via SQL, the consumer can grant privileges and call the resource provisioning procedure(s) directly in a worksheet.
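For the SQL route, a consumer session with sufficient privileges might run something along these lines (the application name is a placeholder; the procedure name follows the grant_callback example in this post):

```sql
-- Grant the requested account-level privileges to the installed app
GRANT CREATE COMPUTE POOL ON ACCOUNT TO APPLICATION my_app;
GRANT BIND SERVICE ENDPOINT ON ACCOUNT TO APPLICATION my_app;

-- Then invoke a resource provisioning procedure exposed by the app
CALL my_app.setup.grant_callback(
  ARRAY_CONSTRUCT('CREATE COMPUTE POOL', 'BIND SERVICE ENDPOINT'));
```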

Here’s a sample setup script excerpt with a grant_callback named setup.grant_callback.

create or replace procedure setup.grant_callback(privileges array)
returns string
as $$
begin
  CREATE COMPUTE POOL IF NOT EXISTS backend_compute_pool
    MIN_NODES = 1
    MAX_NODES = 1
    INSTANCE_FAMILY = CPU_X64_XS;

  CALL services.start_backend('backend_compute_pool');
  ...
  return 'Callback successful';
end;
$$;
grant usage on procedure setup.grant_callback(array) to application role app_public;

This procedure first creates a compute pool, and then calls another procedure to create a service in that compute pool.
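The post does not show services.start_backend itself; a minimal sketch, assuming a specification file service/backend.yaml is packaged with the app, could look like this (names are illustrative):

```sql
CREATE OR REPLACE PROCEDURE services.start_backend(pool_name STRING)
RETURNS STRING
LANGUAGE SQL
AS $$
BEGIN
  -- The compute pool name arrives as an argument, so build the DDL dynamically
  EXECUTE IMMEDIATE
    'CREATE SERVICE IF NOT EXISTS services.backend' ||
    ' IN COMPUTE POOL ' || pool_name ||
    ' FROM SPECIFICATION_FILE = ''service/backend.yaml''';
  RETURN 'Backend service created';
END;
$$;
GRANT USAGE ON PROCEDURE services.start_backend(STRING) TO APPLICATION ROLE app_public;
```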

Deploying services inside an app
Once compute pool(s) have been created, the application can create one or more service(s). There are two possibilities at this point.

  1. The application already has everything it needs to create its services: In this case, the application can simply execute CREATE SERVICE commands as part of grant_callback to set up its services. If a service needs a warehouse, you can also request the ‘CREATE WAREHOUSE’ privilege in the manifest, and then create warehouses as needed.
  2. The application needs additional objects from the consumer before it creates its services: In some cases, such as when the application needs to perform egress, it will need objects such as external access integrations or secrets from the consumer. A consumer can provide such objects to an application by binding them to a reference. References have a ‘register_callback’ field, a stored procedure that contains the logic to bind the reference. If a reference is required for creating a service, this procedure can be used to trigger service creation after the reference is bound. Here’s an example with a sample procedure from a setup script.
CREATE OR REPLACE PROCEDURE v1.register_single_callback(ref_name STRING, operation STRING, ref_or_alias STRING)
RETURNS STRING
LANGUAGE SQL
AS $$
BEGIN
  CASE (operation)
    WHEN 'ADD' THEN
      -- Standard logic for binding a reference
      SELECT system$set_reference(:ref_name, :ref_or_alias);
      -- Check if the reference we bound should trigger create service,
      -- and call the relevant procedures for service creation
      CASE (:ref_name)
        WHEN 'external_access_backend_reference' THEN
          CALL services.start_service_with_egress();
        WHEN 'secret_reference' THEN
          CALL services.start_service_using_secret();
      END CASE;
    WHEN 'REMOVE' THEN
      SELECT system$remove_reference(:ref_name);
    WHEN 'CLEAR' THEN
      SELECT system$remove_reference(:ref_name);
    ELSE
      RETURN 'Unknown operation: ' || operation;
  END CASE;
  RETURN 'Operation ' || operation || ' succeeds.';
END;
$$;

The procedure first checks the operation type for the reference, and if it is an ‘ADD’ operation, it checks if the reference binding should trigger a service creation. If so, it calls other procedures for service creation.

Now, another point to note is that CREATE SERVICE commands refer to service specification files for service creation. These specification files must therefore be packaged along with the application (typically alongside the manifest.yml).
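For completeness, here is what a small specification file for the UI service referenced earlier might look like. The image path and port are placeholders, while the endpoint name matches the ‘ui’ endpoint named in the manifest’s default_web_endpoint:

```yaml
spec:
  containers:
  - name: web
    image: /provider_db/provider_schema/provider_repo/web:1.0
  endpoints:
  - name: ui
    port: 8080
    public: true
```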

Publishing a Snowflake Native App

Packaging and publishing an app using containers follows the usual steps of packaging and publishing a Snowflake Native App.

With this, you are now ready to develop and publish your first Snowflake Native App with Snowpark Container Services. I recommend trying the ‘Create a Snowflake Native App with Snowpark Container Services’ tutorial for hands-on experience with an end-to-end example. Please keep a lookout for follow-ups to this blog post that will dive into some handy features, such as configuring external access for Snowflake Native Apps with Snowpark Container Services, working with service roles for service ingress, etc.
