Running RStudio in Snowpark Container Services

Gabriel Mullen
4 min read · Dec 15, 2023

Update Aug 7 2024: Posit now provides Workbench as a Native Application on the Snowflake Marketplace. That makes the deployment and orchestration steps in this article, and the maintenance that comes with them, moot.

Code example repo here.

I ran across this article on Running R in Snowpark Container Services. It’s great! But when I tried to reproduce it, I had some issues with the details, aside from not having a Posit Workspace Key. So I wanted to document the steps that I took to get RStudio running in SPCS!

First I used an image from The Rocker Project. One struggle I ran into on a Mac was that the image that got pulled was not compatible with SPCS, which needs images built for linux/amd64. So first I had to copy down this Dockerfile. I stored it in my Downloads folder, which is where I keep all my important stuff. Then I executed the following command in a terminal, from that folder, to build the image for the correct target architecture.

docker build --platform linux/amd64 --tag rocker_base .

The build command accesses the Dockerfile that’s in the same directory and builds the image. Running docker images shows the resulting image that we can use.
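Since the wrong architecture was the first thing that bit me, it can be worth confirming the platform of the built image before pushing it. Below is a minimal sketch: the helper name is_spcs_platform is made up for illustration, and the usage lines assume Docker is installed and the rocker_base image from the build above exists locally.

```shell
# Returns success (0) only when the given "os/arch" string is linux/amd64,
# the platform this article says SPCS needs.
# (is_spcs_platform is a hypothetical helper name, not a Docker command.)
is_spcs_platform() {
  [ "$1" = "linux/amd64" ]
}

# Usage (requires Docker and the rocker_base image built above):
#   platform="$(docker image inspect --format '{{.Os}}/{{.Architecture}}' rocker_base)"
#   is_spcs_platform "$platform" || echo "rebuild with --platform linux/amd64 (got $platform)"
```

On an Apple Silicon Mac, an image built without the --platform flag would typically report linux/arm64 here.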

Before we push the image to Snowflake, we need to set up the objects it depends on. Switch to a Snowflake Worksheet and run the following. I used SECURITYADMIN to create a new role because the SERVICE cannot run as a privileged role like ACCOUNTADMIN. I did use ACCOUNTADMIN to create the compute pool, warehouse, database, and image repository, and then granted the proper permissions to GMULLEN_RL.

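use role securityadmin;
create role gmullen_rl;
-- switch roles: the compute pool, warehouse, database, and repo are created as ACCOUNTADMIN
use role accountadmin;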
CREATE COMPUTE POOL gmullen_pool
MIN_NODES = 1
MAX_NODES = 1
INSTANCE_FAMILY = standard_1;
grant all on compute pool gmullen_pool to role gmullen_rl;

describe compute pool gmullen_pool;

CREATE OR REPLACE WAREHOUSE gmullen_vwh
WITH WAREHOUSE_SIZE='X-SMALL'
AUTO_SUSPEND = 180
AUTO_RESUME = true
INITIALLY_SUSPENDED=false;
grant all on warehouse gmullen_vwh to role gmullen_rl;

CREATE DATABASE gmullen_db;
grant ownership on database gmullen_db to role gmullen_rl;

CREATE OR REPLACE IMAGE REPOSITORY gmullen_img_repo;
grant ownership on IMAGE REPOSITORY gmullen_db.public.gmullen_img_repo to role gmullen_rl;

show image repositories;

The last statement, which lists the image repositories, is important because its output (the repository_url column) is where you will grab your account’s specific repo endpoint.

Go back to your terminal and log in to your repo with docker, using your repo endpoint. Then tag the image you built and push it into your repo in Snowflake. Here are the commands I used as an example:

docker login sfsenorthamerica-va-demo103.registry.snowflakecomputing.com/gmullen_db/public/gmullen_img_repo
docker tag 3f584b0854ef sfsenorthamerica-va-demo103.registry.snowflakecomputing.com/gmullen_db/public/gmullen_img_repo/rocker_base
docker push sfsenorthamerica-va-demo103.registry.snowflakecomputing.com/gmullen_db/public/gmullen_img_repo/rocker_base
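The commands above hard-code my endpoint and a local image ID, so here is a small sketch of how they could be parameterized from the repository URL instead. The helper names registry_host and image_ref and the SNOWFLAKE_USER variable are placeholders I made up; only docker login, docker tag, and docker push are real Docker commands. Tagging by image name (rocker_base) rather than by ID is a swap on my part that should give the same result.

```shell
# Registry host: everything before the first "/" in the repository URL.
# (registry_host and image_ref are hypothetical helper names.)
registry_host() {
  printf '%s\n' "${1%%/*}"
}

# Full image reference: <repository_url>/<image>:<tag>, tag defaults to "latest".
image_ref() {
  printf '%s/%s:%s\n' "$1" "$2" "${3:-latest}"
}

# Usage (requires Docker; SNOWFLAKE_USER is a placeholder for your login name):
#   repo_url="sfsenorthamerica-va-demo103.registry.snowflakecomputing.com/gmullen_db/public/gmullen_img_repo"
#   docker login "$(registry_host "$repo_url")" -u "$SNOWFLAKE_USER"
#   docker tag rocker_base "$(image_ref "$repo_url" rocker_base)"
#   docker push "$(image_ref "$repo_url" rocker_base)"
```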

Next, create the service yaml file that will be used to configure the image runtime. It’s probably not a best practice to set DISABLE_AUTH, but I felt it was ok in this case because I was testing. Further, you still need to log in to Snowflake to access the endpoints. So while the endpoint will be “public”, it will be so only for Snowflake-authenticated users with permissions to access the service. We’ll see this in action shortly. I saved the content below into a file called rocker_base.yaml.

spec:
  container:
  - name: rockerbase
    image: sfsenorthamerica-va-demo103.registry.snowflakecomputing.com/gmullen_db/public/gmullen_img_repo/rocker_base
    env:
      DISABLE_AUTH: true
  endpoint:
  - name: e1
    port: 8787
    public: true
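The port in the spec has to be the one the container actually listens on, which the Dockerfile declares with an EXPOSE line. If you are working from a different Dockerfile, a rough way to read the port out of it is sketched below. The function name exposed_ports is hypothetical, and the sketch assumes plain EXPOSE lines with literal port numbers (no build arguments or variables).

```shell
# Print every port declared by EXPOSE lines in a Dockerfile, one per line.
# A protocol suffix such as "/tcp" is stripped.
# (exposed_ports is a hypothetical helper name.)
exposed_ports() {
  awk 'toupper($1) == "EXPOSE" {
         for (i = 2; i <= NF; i++) { sub(/\/.*/, "", $i); print $i }
       }' "$1"
}

# Usage, from the folder that holds the Dockerfile:
#   exposed_ports Dockerfile
```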

I know what the endpoint port is because, in the Dockerfile that we used to build the image, it exposes 8787 as the port. Back to Snowflake to create an internal stage to hold my yaml file.

create or replace stage gmullen_db.public.config_files encryption = (type = 'SNOWFLAKE_SSE');
grant ownership on stage gmullen_db.public.config_files to role gmullen_rl;

Now upload rocker_base.yaml to the stage using SnowSQL:

put file:///Users/gmullen/Downloads/rocker_base.yaml @gmullen_db.public.config_files auto_compress=false overwrite=true;
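If you would rather run the upload straight from the terminal than from an interactive SnowSQL session, something like the sketch below could work. The put_statement helper and the my_connection connection name are assumptions for illustration; -c and -q are SnowSQL's options for a named connection and a single query. Keeping auto_compress=false matters because PUT gzips files by default, and the spec should stay as a plain yaml file on the stage.

```shell
# Build the PUT statement for an absolute local path and a stage name.
# (put_statement is a hypothetical helper name.)
put_statement() {
  printf 'put file://%s @%s auto_compress=false overwrite=true;\n' "$1" "$2"
}

# Usage (requires SnowSQL; my_connection is a placeholder connection name):
#   snowsql -c my_connection -q "$(put_statement /Users/gmullen/Downloads/rocker_base.yaml gmullen_db.public.config_files)"
#   snowsql -c my_connection -q "list @gmullen_db.public.config_files;"
```

The list statement at the end is just a quick check that the file landed on the stage under the name the service will reference.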

This should give us all the prerequisites. Now it’s time to create the SERVICE, which will instantiate the image.

CREATE SERVICE gmullen_db.public.rocker_base
MIN_INSTANCES=1
MAX_INSTANCES=3
COMPUTE_POOL=gmullen_pool
SPEC=@gmullen_db.public.config_files/rocker_base.yaml;

grant usage on service gmullen_db.public.rocker_base to role gmullen_rl;

Describe the service to get the endpoint. Also check the status and the logs to see if you have any problems.

desc service gmullen_db.public.rocker_base;
select system$get_service_status('rocker_base');
SELECT SYSTEM$GET_SERVICE_LOGS('gmullen_db.public.rocker_base', 0, 'rockerbase', 50);
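The service can take a little while to come up, so you may end up re-running the status query a few times. Here is a sketch of a check you could script around it. It assumes the status function returns a JSON array with one object per container, each carrying a "status" field that reads READY once the container is up; that shape is my assumption from what I saw, so verify it against your own output. The all_containers_ready name and the my_connection connection name are hypothetical.

```shell
# Succeeds (0) when the status JSON holds at least one "status" field
# and every one of them is READY.
# (all_containers_ready is a hypothetical helper name.)
all_containers_ready() {
  printf '%s\n' "$1" | awk '
    {
      # gsub returns the number of matches; "&" puts the matched text back unchanged
      total += gsub(/"status"[ \t]*:[ \t]*"[A-Z_]+"/, "&")
      ready += gsub(/"status"[ \t]*:[ \t]*"READY"/, "&")
    }
    END { exit !(total > 0 && total == ready) }'
}

# Usage (requires SnowSQL; my_connection is a placeholder connection name):
#   status="$(snowsql -c my_connection -q "select system\$get_service_status('gmullen_db.public.rocker_base');")"
#   all_containers_ready "$status" && echo "service is up" || echo "not ready yet, check the logs"
```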

The Describe results will show you the endpoints if the service comes up successfully.

Copy and paste the endpoint into your browser, and you will be required to log in with a Snowflake user that has USAGE rights on the service.

Then once you authenticate, you should be in! Happy RRRRing! And Super Duper thanks to Jim ONeill for his help on getting me going.

Gabriel Mullen

Sales Engineer at Snowflake. All content is solely that of Gabriel Mullen.