Leveraging Snowpark Container Services for Advanced Q&A Retrieval: A Journey into GPU-Enhanced Semantic Search

At the end of 2023, Snowflake announced that Snowpark Container Services (SPCS) would enter Public Preview in a few AWS-based regions. This development presents a fantastic opportunity to test the service and experiment with GPUs. In this blog post, I recreate the SentenceTransformers semantic search example so that it runs exclusively on Snowpark Container Services.
The use case focuses on Question & Answer Retrieval, which is an asymmetric search task. It requires generating embeddings for the documents, and SPCS is a great fit for that work.

The steps to follow are relatively straightforward and are based on the “Intro to Snowpark Container Services” QuickStart. Thus, I’ll focus directly on the specifics of our use case. Initially, I’ll only mention some of the files, with the rest detailed towards the end of the post.

‘Create box with Snowflake logo. In that box put another box with docker and kubernetes logos. In that box put python and jupyter notebook logos and hugging face logo.’ Generated by DALL-E.

Folder structure

I am utilising Docker and Docker Compose to automate processes. My folder structure is as follows:

❯ tree -a
├── .env
├── docker-compose.yml
├── embed-base
│   ├── Dockerfile
│   └── requirements.txt
├── environment.yml
├── mounted_dirs
│   ├── data
│   ├── models
│   └── notebooks
│       └── semantic_search_wikipedia_qa.ipynb
├── setup
│   ├── 01_setup_database.sql
│   └── 02_setup_containers.sql
└── simple-wiki-jupyter
    ├── Dockerfile
    ├── requirements.txt
    ├── simple-wiki-jupyter-service_one-gpu.yml
    └── simple-wiki-jupyter.sql

Quick look at some interesting files

One interesting file might be the docker-compose.yml:

version: "3.7"

services:
embed-base:
platform: linux/amd64
build:
context: ./embed-base
dockerfile: Dockerfile
image: "embed-base:latest"

simple-wiki-jupyter:
platform: linux/amd64
depends_on:
- embed-base
build:
context: ./simple-wiki-jupyter
dockerfile: Dockerfile
image: "simple-wiki-jupyter:dev"
ports:
- "8888:8888"
volumes:
- "./mounted_dirs/models:/home/jupyter/models"
- "./mounted_dirs/notebooks:/home/jupyter/notebooks"
- "./mounted_dirs/data:/home/jupyter/data"

simple-wiki-jupyter-sf:
depends_on:
- simple-wiki-jupyter
extends: simple-wiki-jupyter
image: "${SF_REPO}/simple-wiki-jupyter:dev"

This file defines three services: embed-base, simple-wiki-jupyter, and simple-wiki-jupyter-sf, along with dependencies between them.

  • embed-base is the first image to be built and contains common dependencies. Building the common part only once saves time when using multiple images with the same libraries.
  • simple-wiki-jupyter is the second image to be built. It is based on embed-base and includes everything needed to run Jupyter in Docker.
  • simple-wiki-jupyter-sf is an image that extends simple-wiki-jupyter and will be built last. It is essentially a copy of simple-wiki-jupyter with a different tag, defined by the ${SF_REPO} variable (read from the .env file).

Another notable file is semantic_search_wikipedia_qa.ipynb. It is a modified version of a Colab notebook that uses the SentenceTransformers framework, with a few minor changes.
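
To give a sense of what the notebook does, here is a minimal sketch of its core logic, loosely following the original sbert.net example. The model name comes from that example; the local file paths are illustrative placeholders for the downloaded passages and pre-computed embeddings:

import json
import torch
from sentence_transformers import SentenceTransformer, util

# Bi-encoder used by the original Wikipedia Q&A example; device="cuda" targets the pool's GPU.
model = SentenceTransformer("nq-distilbert-base-v1", device="cuda")

# Illustrative paths: passages and their pre-computed embeddings, downloaded from
# sbert.net once and kept in the mounted data directory.
with open("/home/jupyter/data/simplewiki-passages.json", encoding="utf8") as f:
    passages = json.load(f)
corpus_embeddings = torch.load("/home/jupyter/data/simplewiki-embeddings.pt").to("cuda")

def search(query, top_k=5):
    # Asymmetric search: embed the short question, compare it against the passage embeddings.
    question_embedding = model.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(question_embedding, corpus_embeddings, top_k=top_k)[0]
    for hit in hits:
        print(round(hit["score"], 3), passages[hit["corpus_id"]])

search("What is the capital of Australia?")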

Exploration and control

You have probably noticed simple-wiki-jupyter.sql as well. It is used to create structures needed by our service, and to manage it.

First we create stages, if they do not exist yet.

USE DATABASE CONTAINERS_DB;
USE SCHEMA CONTAINERS_DB.SIMPLE_WIKI_SEARCH;
USE WAREHOUSE SIMPLE_WIKI_WH;

CREATE STAGE IF NOT EXISTS specs ENCRYPTION = (TYPE='SNOWFLAKE_SSE');
CREATE STAGE IF NOT EXISTS models ENCRYPTION = (TYPE='SNOWFLAKE_SSE');
CREATE STAGE IF NOT EXISTS notebooks ENCRYPTION = (TYPE='SNOWFLAKE_SSE');
CREATE STAGE IF NOT EXISTS data ENCRYPTION = (TYPE='SNOWFLAKE_SSE');

Then the image repository:

CREATE IMAGE REPOSITORY IF NOT EXISTS images;
SHOW IMAGE REPOSITORIES IN SCHEMA CONTAINERS_DB.SIMPLE_WIKI_SEARCH;

And to manage our compute pool, we can use the following commands:

DESCRIBE COMPUTE POOL SIMPLE_WIKI_POOL_GPU_S;
ALTER COMPUTE POOL SIMPLE_WIKI_POOL_GPU_S SUSPEND;
ALTER COMPUTE POOL SIMPLE_WIKI_POOL_GPU_S RESUME;

To create the service, the latest simple-wiki-jupyter-service_one-gpu.yml file must be uploaded to the specs stage. You can use the Snowflake command line (snow command), Snowflake web UI (browser), Visual Studio Code (VSCode), or custom code utilising the Snowpark API for this task. I chose to use VSCode, as this is a one-time task. For automation purposes, I would consider using a different tool.
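
If you would rather script that upload, a small Snowpark for Python sketch could look like the following; the connection parameters are placeholders, and session.file.put sends the local spec file to the specs stage without compressing it:

from snowflake.snowpark import Session

# Placeholder connection parameters; replace with your own account details.
session = Session.builder.configs({
    "account": "<account_identifier>",
    "user": "CONTAINERS_USER",
    "password": "<password>",
    "database": "CONTAINERS_DB",
    "schema": "SIMPLE_WIKI_SEARCH",
    "warehouse": "SIMPLE_WIKI_WH",
}).create()

# Upload the spec as plain YAML (no compression) so the service can read it from @specs.
session.file.put(
    "simple-wiki-jupyter/simple-wiki-jupyter-service_one-gpu.yml",
    "@specs",
    auto_compress=False,
    overwrite=True,
)

Either way, once the spec is on the stage, the service can be dropped and re-created: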

DROP SERVICE IF EXISTS
  CONTAINERS_DB.SIMPLE_WIKI_SEARCH.SIMPLE_WIKI_JUPYTER_SERVICE_ONE_GPU;
-- upload simple-wiki-jupyter-service_one-gpu.yml to @specs
CREATE SERVICE
  CONTAINERS_DB.SIMPLE_WIKI_SEARCH.SIMPLE_WIKI_JUPYTER_SERVICE_ONE_GPU
  IN COMPUTE POOL SIMPLE_WIKI_POOL_GPU_S
  FROM @specs
  SPEC = 'simple-wiki-jupyter-service_one-gpu.yml'
  EXTERNAL_ACCESS_INTEGRATIONS = (SIMPLE_WIKI_ACCESS_INTEGRATION);

To work with the service, the following commands are useful:

SHOW SERVICES IN SCHEMA CONTAINERS_DB.SIMPLE_WIKI_SEARCH;

DESCRIBE SERVICE
CONTAINERS_DB.SIMPLE_WIKI_SEARCH.SIMPLE_WIKI_JUPYTER_SERVICE_ONE_GPU;

CALL SYSTEM$GET_SERVICE_STATUS(
'CONTAINERS_DB.SIMPLE_WIKI_SEARCH.SIMPLE_WIKI_JUPYTER_SERVICE_ONE_GPU');

CALL SYSTEM$GET_SERVICE_LOGS(
'CONTAINERS_DB.SIMPLE_WIKI_SEARCH.SIMPLE_WIKI_JUPYTER_SERVICE_ONE_GPU',
'0',
'simple-wiki-jupyter-service-one-gpu',
100);

ALTER SERVICE
CONTAINERS_DB.SIMPLE_WIKI_SEARCH.SIMPLE_WIKI_JUPYTER_SERVICE_ONE_GPU SUSPEND;

ALTER SERVICE
CONTAINERS_DB.SIMPLE_WIKI_SEARCH.SIMPLE_WIKI_JUPYTER_SERVICE_ONE_GPU RESUME;

Now we can go ahead and deploy Jupyter and play with SentenceTransformers a bit!

Deployment

Before deploying anything (and after completing the setup, which is described at the end of the post), we need to build and tag our Jupyter Docker image. As for which tags to use? Choose whatever works best for you. However, to upload the image to the Snowflake repository, it must have the proper prefix. This can be determined by running a command included in simple-wiki-jupyter.sql:

SHOW IMAGE REPOSITORIES IN SCHEMA CONTAINERS_DB.SIMPLE_WIKI_SEARCH;

The column repository_url contains the text that you need to use as a prefix. It looks like this:

account.registry.snowflakecomputing.com/containers_db/simple_wiki_search/images

Copy your URL and place it in the .env file, like this:

SF_REPO=account.registry.snowflakecomputing.com/containers_db/simple_wiki_search/images

This defines the variable SF_REPO, which is used in the docker-compose.yml file to create image tags and prefixes. Simply by changing the value of this variable, you can work with different repositories. To log in to the repository, run the following in your terminal:

docker login \
account.registry.snowflakecomputing.com/containers_db/simple_wiki_search/images \
-u containers_user

Build the Docker image

Now the image can be built. Just run:

docker compose build

This command will build the embed-base, simple-wiki-jupyter, and simple-wiki-jupyter-sf images. The one with the -sf suffix is similar to simple-wiki-jupyter but with a different tag. It will be uploaded to the Snowflake repository. Once the building process is complete, you can start Jupyter locally by running:

docker compose up simple-wiki-jupyter

It will run the image locally, allowing you to check if it works as expected. It will display all messages on the screen and wait until you stop the process with Ctrl+C.

Once you are satisfied with the results of the local tests, the image can be pushed to the Snowflake repository by running the following command (the -sf suffix indicates the version to upload to Snowflake):

docker compose push simple-wiki-jupyter-sf

After the image is uploaded — the time taken will depend on its size (mine is 6.5 GB, which is due to the dependencies) — you can proceed to deploy the service.

Deploy the service

To deploy the service, upload the simple-wiki-jupyter-service_one-gpu.yml file to the specs stage. I used Visual Studio Code (VSCode) to do this. Then check that your compute pool is active and running with the following command:

DESCRIBE COMPUTE POOL SIMPLE_WIKI_POOL_GPU_S;

If the compute pool is not running, you will experience a longer wait time for your service to start. The service will initiate the start of the compute pool, but you will have to wait until it is fully operational. If necessary, you can manually resume the compute pool with the following command:

ALTER COMPUTE POOL SIMPLE_WIKI_POOL_GPU_S RESUME;

And create the service:

CREATE SERVICE 
CONTAINERS_DB.SIMPLE_WIKI_SEARCH.SIMPLE_WIKI_JUPYTER_SERVICE_ONE_GPU
IN COMPUTE POOL SIMPLE_WIKI_POOL_GPU_S
FROM @specs
SPEC='simple-wiki-jupyter-service_one-gpu.yml'
EXTERNAL_ACCESS_INTEGRATIONS = (SIMPLE_WIKI_ACCESS_INTEGRATION);

A crucial aspect of the simple-wiki-jupyter-service_one-gpu.yml file is the section where resources are requested. As we are utilising the SIMPLE_WIKI_POOL_GPU_S pool, which grants access to one Nvidia GPU, it is essential to request access to this resource. In this example, the number of requested GPUs is equal to the number provided by the pool. If you are working with an instance that has more GPUs — for example, four — you might consider running one service per GPU.

resources:
  requests:
    nvidia.com/gpu: 1
  limits:
    nvidia.com/gpu: 1

When you describe the service, you can find out its public endpoint from the public_endpoint column:

DESCRIBE SERVICE
CONTAINERS_DB.SIMPLE_WIKI_SEARCH.SIMPLE_WIKI_JUPYTER_SERVICE_ONE_GPU;

Copy the URL and paste it into your browser. You will be prompted to log in to Snowflake; use the CONTAINERS_USER credentials.

Login page for the service is managed by Snowflake. Image by author.

Next, upload the semantic_search_wikipedia_qa.ipynb file and you're set!

Jupyter Notebook running in Snowpark Container Services. Image by author.

Running Semantic Search

Now we can run semantic search directly in the Jupyter Notebook. Note that we are utilising pre-calculated embeddings downloaded from the sbert.net page, just like the example on which this demo is based. Additionally, because we are using a GPU-based Compute Pool (specifically GPU_NV_S, with one Nvidia A10G GPU), our code runs very quickly.

Search works just like in the example by sbert.net. Image by author.
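
Before running the search, you can confirm that the notebook really sees the GPU inside the container with a quick PyTorch check; on GPU_NV_S, the reported device should be the A10G:

import torch

# Should print True and the name of the GPU exposed by the compute pool node.
print(torch.cuda.is_available())
print(torch.cuda.get_device_name(0))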

What we did

Based on the example provided, we have successfully re-created semantic search in a Jupyter Notebook running on Snowpark Container Services with an Nvidia GPU:

  1. Database Setup by ACCOUNTADMIN: The usual process of creating a database, user, role, etc., when working in Snowflake. The code for this is at the end of the post.
  2. Security Integration and Network Rules: Essential for the Compute Pool created in the subsequent step, particularly if the service needs internet access. A container repository for Docker images was also created. CONTAINERS_USER owns all these resources.
  3. Docker Container Image Creation: A Docker container image with Jupyter Notebook was developed and tested locally before being uploaded to the Snowflake repository.
  4. Creation of Stages for Jupyter Notebook: Necessary for caching and saving state between sessions. This approach means new models don’t need to be repeatedly downloaded from the internet, especially with each restart of Jupyter (see the sketch after this list). It also allows running Jupyter without internet access, enhancing security.
  5. Service Creation in SIMPLE_WIKI_POOL_GPU_S: The service, along with mountpoints and resources, is based on a specification YAML file loaded from the specs stage, which was uploaded there initially. The service is set up with external access integration as defined in SIMPLE_WIKI_ACCESS_INTEGRATION.
  6. Service Start and Public Endpoint Access: After the service starts (verifiable directly from the SQL console), the public endpoints of the service are identified to open it in a browser. Accessing the service via this URL requires authentication using the same credentials as the Snowflake CONTAINERS_USER. Notably, no additional security mechanism at the level of Jupyter Notebook is necessary, thanks to Snowflake’s OAuth-based security integration.
  7. Uploading and Running the Jupyter Notebook: The Jupyter notebook was uploaded to the server, enabling us to replicate the functionality seen in sentence-transformers.
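
Regarding point 4, here is a minimal sketch of how the mounted stages can act as a persistent model cache. The paths mirror the mounts from docker-compose.yml and assume the service spec mounts the stages at the same locations; the environment variables are the standard Hugging Face and SentenceTransformers cache settings:

import os

# Point the caches at the stage-backed mount so models downloaded once
# survive service restarts and can later be used without internet access.
os.environ["HF_HOME"] = "/home/jupyter/models/hf"
os.environ["SENTENCE_TRANSFORMERS_HOME"] = "/home/jupyter/models"

from sentence_transformers import SentenceTransformer

# The first run downloads the model into the mounted stage; later runs load it from there.
model = SentenceTransformer("nq-distilbert-base-v1")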

What is next?

Now that the entire environment is set up, we can use it, for example, to create our own embeddings and run a REST API that returns matching descriptions.

This capability paves the way for creating your own search engine that operates on top of your documents or texts, which can be stored directly in Snowflake. Exciting, isn’t it? I certainly think so!
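
As a rough illustration of that REST API idea (purely a sketch, not part of this deployment), a minimal FastAPI service reusing the same model and pre-computed embeddings might look like this; the model name and file path are the same illustrative placeholders as before:

import torch
from fastapi import FastAPI
from sentence_transformers import SentenceTransformer, util

app = FastAPI()

# Reuse the model and corpus embeddings from the notebook example.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = SentenceTransformer("nq-distilbert-base-v1", device=device)
corpus_embeddings = torch.load("/home/jupyter/data/simplewiki-embeddings.pt").to(device)

@app.get("/search")
def search(q: str, top_k: int = 5):
    # Embed the question and return ids and scores of the best-matching passages.
    question_embedding = model.encode(q, convert_to_tensor=True)
    hits = util.semantic_search(question_embedding, corpus_embeddings, top_k=top_k)[0]
    return [{"corpus_id": h["corpus_id"], "score": float(h["score"])} for h in hits]

Such a service could be started with uvicorn inside the container and exposed as another public endpoint in the service specification.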

Code

You can find the whole code here.

Thanks for Reading!

If you like my work and want to support me…

  1. The BEST way to support me is by following me on Medium.
  2. Also, follow me on LinkedIn or Twitter/X.
  3. Feel free to give claps so I know how helpful this post was for you.

Note: These are my personal opinions and not of my current employer (Snowflake).
