Streamlit in Snowpark Container Services

Snowpark Container Services is now in Public Preview in AWS, and we’ve seen folks building and deploying lots of applications in it. Streamlit continues to be a popular tool, especially among Data Scientists and Data Engineers, for quickly building applications entirely in Python. Essentially, any Python developer can quickly become a front-end application developer, too, from the comfort of their favorite programming language.

Deploying Streamlit apps for others to access is a little trickier. Provisioning compute resources, standing up the assets, and securing the app can get a little complex… or you could use something as simple as Snowpark Container Services!

In this post, I will walk through how to take your Streamlit app and deploy it simply and securely to Snowpark Container Services.

Here is a link to the GitHub repo that shows the example that we will walk through: https://github.com/sfc-gh-bhess/st_spcs

Overview

The high-level steps that we need to follow to deploy our Streamlit app in Snowpark Container Services are as follows:

  • Prepare our Snowflake account by setting up various roles, permissions, and objects, including a COMPUTE POOL for our SERVICE, a WAREHOUSE to handle queries from our Streamlit, and an IMAGE REPOSITORY for our Docker image.
  • Build a Docker image from our Streamlit code and push it to the IMAGE REPOSITORY.
  • Create a SERVICE using the uploaded Docker image that will expose an ingress URL that we can use to access the Streamlit.
  • Share access to the SERVICE by granting permission to use the SERVICE.
  • Visit the ingress URL and enjoy!

To perform the steps, you will need:

  • A Snowflake account with ACCOUNTADMIN permission in a region that has Snowpark Container Services (see here; at the time of writing, this is all AWS commercial regions).
  • Docker Desktop installed.
  • (Optional) git installed. You can use git to clone the repository, or you can download the ZIP file of the repository if you do not have git.
  • The SNOWFLAKE_SAMPLE_DATA data share imported into your Snowflake account. The Streamlit will use this data for the example.

The Streamlit App

The Streamlit app we are working with is pretty simple. It uses the TPC-H data in the SNOWFLAKE_SAMPLE_DATA data share, lets users specify a date range, and shows the top clerks by sales in that date range. The goal is simply to show basic capabilities using data inside Snowflake.
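As a sketch of what the core of such an app looks like (the names here are illustrative, not the repo’s actual code), the app boils down to a date-range input driving a query against the TPC-H ORDERS table:

```python
import datetime

def top_clerks_query(start: datetime.date, end: datetime.date, limit: int = 10) -> str:
    """Build the SQL for the top clerks by sales in a date range,
    using the TPC-H ORDERS table in the SNOWFLAKE_SAMPLE_DATA share."""
    return f"""
        SELECT o_clerk, SUM(o_totalprice) AS total_sales
        FROM snowflake_sample_data.tpch_sf1.orders
        WHERE o_orderdate BETWEEN '{start}' AND '{end}'
        GROUP BY o_clerk
        ORDER BY total_sales DESC
        LIMIT {limit}
    """

# In the Streamlit app this would be wired to widgets, roughly:
#   start, end = st.date_input("Date range", (default_start, default_end))
#   df = session.sql(top_clerks_query(start, end)).to_pandas()
#   st.dataframe(df)
```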

The app was built to be run either locally (using environment variables to specify the connection details) or from within Snowpark Container Services. It uses a Python package in the source directory (spcs_helpers) to simplify making the connection in either scenario. This is the same approach as I blogged about here. I highly recommend using an approach like I wrote about to support both local development (directly and inside Docker) and in Snowpark Container Services.
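The essence of that helper (a hedged sketch; the real spcs_helpers package differs in its details) is to use the OAuth token that Snowpark Container Services mounts into the container when it is present, and fall back to environment variables for local development otherwise:

```python
import os

# Path where Snowpark Container Services mounts an OAuth token inside the container.
SPCS_TOKEN_PATH = "/snowflake/session/token"

def connection_params(token_path: str = SPCS_TOKEN_PATH) -> dict:
    """Return connection parameters suitable for snowflake.connector.connect()."""
    if os.path.exists(token_path):
        # Running inside Snowpark Container Services: use the injected OAuth token.
        with open(token_path) as f:
            token = f.read()
        return {
            "account": os.environ["SNOWFLAKE_ACCOUNT"],
            "authenticator": "oauth",
            "token": token,
            "warehouse": os.environ.get("SNOWFLAKE_WAREHOUSE"),
        }
    # Running locally (directly or in local Docker): use credentials from the env.
    return {
        "account": os.environ["SNOWFLAKE_ACCOUNT"],
        "user": os.environ["SNOWFLAKE_USER"],
        "password": os.environ["SNOWFLAKE_PASSWORD"],
        "warehouse": os.environ.get("SNOWFLAKE_WAREHOUSE"),
    }
```

With a helper like this, the same app code can build its connection the same way in both environments.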

Setup

The first thing to do is to get a copy of the GitHub repo. You can use git to clone the repo

git clone git@github.com:sfc-gh-bhess/st_spcs.git

or download the ZIP file.

The next thing we need to do is the one-time setup in our Snowflake account. If you have already done this in your account, you can move on to the next step. Follow the instructions here to:

  • Create the snowservices_ingress_oauth SECURITY INTEGRATION. This only needs to be done once per account.
  • Create a test_role to use for this example.
  • Create a tutorial_db database and a tutorial_warehouse warehouse, and grant permissions on them to the test_role.
  • Grant the test_role the permission to create ingress URLs, BIND SERVICE ENDPOINT.
  • Create a COMPUTE POOL, tutorial_compute_pool, and grant permission to use and monitor it to test_role.
  • Grant the test_role role to your user (and to ACCOUNTADMIN).
  • Use the test_role to create a schema, data_schema, and an IMAGE REPOSITORY in that schema.
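In SQL, those steps look roughly like the following. This is a condensed sketch, not the authoritative script: the compute pool sizing and the exact grants follow the tutorial defaults, so check the linked instructions for the canonical version.

```sql
USE ROLE ACCOUNTADMIN;
CREATE SECURITY INTEGRATION IF NOT EXISTS snowservices_ingress_oauth
  TYPE=oauth OAUTH_CLIENT=snowservices_ingress ENABLED=true;

CREATE ROLE test_role;
CREATE DATABASE tutorial_db;
GRANT OWNERSHIP ON DATABASE tutorial_db TO ROLE test_role;
CREATE WAREHOUSE tutorial_warehouse;
GRANT USAGE ON WAREHOUSE tutorial_warehouse TO ROLE test_role;
GRANT BIND SERVICE ENDPOINT ON ACCOUNT TO ROLE test_role;

CREATE COMPUTE POOL tutorial_compute_pool
  MIN_NODES = 1 MAX_NODES = 1 INSTANCE_FAMILY = CPU_X64_XS;
GRANT USAGE, MONITOR ON COMPUTE POOL tutorial_compute_pool TO ROLE test_role;
GRANT ROLE test_role TO USER my_user;  -- replace my_user with your username

USE ROLE test_role;
CREATE SCHEMA tutorial_db.data_schema;
CREATE IMAGE REPOSITORY tutorial_db.data_schema.tutorial_repository;
```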

Next we will use the included ./configure.sh script to create a Makefile and a streamlit.yaml file. In order to do that, we will need two things:

  • the URL of the IMAGE REPOSITORY, which you can get by executing SHOW IMAGE REPOSITORIES and getting the value of the repository_url for your tutorial_repository.
  • the name of the WAREHOUSE that we created above.

Then we can run the configure script:

./configure.sh
  • Enter the repository URL you got above
  • Enter the WAREHOUSE name

The Makefile will help guide you through the steps. To get help on the commands, run

make help

Build the Docker image locally

The Dockerfile for this example is already set up. It will copy the source files from the src/ directory, install the requirements from the src/requirements.txt file, and set the ENTRYPOINT to the src/entrypoint.sh script. That entrypoint script simply executes the following command and outputs stderr to stdout so that it is captured in the logs:

python3 -m streamlit run app.py --server.address=0.0.0.0

To build the Docker image for local development, run

make build_local

This will build the Docker image for your machine. You can test this container locally by first setting up some environment variables in your terminal:

  • SNOWFLAKE_ACCOUNT — the account locator for the Snowflake account
  • SNOWFLAKE_USER — the Snowflake username to use
  • SNOWFLAKE_PASSWORD — the password for the Snowflake user
  • SNOWFLAKE_WAREHOUSE — the warehouse to use
  • SNOWFLAKE_DATABASE — the database to set as the current database (for this example, the exact value does not matter much)
  • SNOWFLAKE_SCHEMA — the schema in the database to set as the current schema (likewise, the exact value does not matter much)
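For example (the values here are placeholders; substitute your own account details):

```shell
# Placeholder values -- replace with your own account details.
export SNOWFLAKE_ACCOUNT=xy12345
export SNOWFLAKE_USER=my_user
export SNOWFLAKE_PASSWORD='my_password'
export SNOWFLAKE_WAREHOUSE=tutorial_warehouse
export SNOWFLAKE_DATABASE=tutorial_db
export SNOWFLAKE_SCHEMA=data_schema
```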

Once you have done that, you can run the Streamlit container locally with

make run

and access the Streamlit at http://localhost:8501.

Streamlit in Snowpark Container Services

Now it’s time for the main event — getting this Streamlit set up in Snowpark Container Services.

The first step is to build the Docker image for the Snowpark Container Services environment (which may differ from your local environment) and push the image to your Snowpark Container Services IMAGE REGISTRY.

make all

Now that the image has been uploaded to your IMAGE REGISTRY, we can create the SERVICE in the COMPUTE POOL. To get the DDL, run

make ddl

which should result in output such as

CREATE SERVICE st_spcs
IN COMPUTE POOL tutorial_compute_pool
FROM SPECIFICATION $$
spec:
  containers:
    - name: streamlit
      image: ORGNAME-ACCTNAME.registry.snowflakecomputing.com/sandbox/idea/repo1/st_spcs
      env:
        SNOWFLAKE_WAREHOUSE: tutorial_warehouse
  endpoints:
    - name: streamlit
      port: 8501
      public: true
$$;

As the test_role, execute this SQL in a Worksheet.

You can see that the service has started by getting its status via

SELECT system$get_service_status('st_spcs');

You should get a result that says PENDING, and eventually READY when the service has started. You can also view the Docker logs/output by running

SELECT system$get_service_logs('st_spcs', 0, 'streamlit', 100); -- gets the last 100 lines of output

Accessing the Streamlit in Snowpark Container Services

Once the SERVICE has started, it’s time to go visit our Streamlit!

To get the ingress URL, run

SHOW ENDPOINTS IN SERVICE st_spcs;

You will see a table that shows the port that we exposed (8501) and the ingress_url that we can use to visit that endpoint. Copy the ingress_url and paste it into your browser. When you navigate to that URL, Snowflake will prompt you to authenticate. Authenticate using your credentials (username/password or, optionally, SSO) and you will be routed to the Streamlit container. Voila!

You can grant access to the Streamlit to other users and roles by granting the USAGE privilege on the SERVICE:

GRANT USAGE ON SERVICE st_spcs TO ROLE some_role; 

Now users with the some_role role can access the ingress_url for our SERVICE!

Summary

In this post, I showed you how to take a Streamlit app and deploy it to Snowpark Container Services. This Streamlit was simple, as I was focusing on the packaging and deployment steps, but the same approach can be taken for more complex Streamlit apps.

Also, please note the approach to connecting to Snowflake that I used (and blogged about here). This can greatly simplify development, testing, and deployment of your Streamlit apps in Snowpark Container Services.


Brian Hess
Snowflake Builders Blog: Data Engineers, App Developers, AI/ML, & Data Science

I’ve been in the data and analytics space for over 25 years in a variety of roles. Essentially, I like doing stuff with data and making data work.