Snowflake Snowpark Container Services: Explain, please!
Everything you wanted to ask but couldn't!
Introduction
If you are like me, I am sure you are just as excited about the Snowpark Container Services (SPCS) announcement and demo at Snowflake Summit 2023.
In my opinion, SPCS is a massive game-changer. Marketing aside, the fact that you can now bring your application so close to your data will change a lot about how the Snowflake ecosystem works and how customers use Snowflake going forward.
But it is also a lot to take in at once (especially if you are from a ‘data’ team and have no visibility into or experience with app architecture, full-stack development, container services, Docker, orchestration, Kubernetes, blah, blah, etc.).
Keeping that in mind, below is a Quick 101 explaining everything you need to know about Snowpark Container Services. I hope this will help answer some basic questions. Please reply if you need anything further!
Caveats:
- This blog is not too technical and is meant to give just a high-level understanding of the feature. The typical audience is CDOs, CTOs, data leads, architects, etc. If you would like to read a deep dive, here is a good one.
- Since the feature is in Private Preview, please keep yourself updated using the official Snowflake resources.
- Needless to say, views are my own! I am a Data Superhero and Snowflake power user! I also worked on container services in my past life, so this feels like coming full circle!
What is Snowpark Container Services?
According to Snowflake’s official blog, Snowpark Container Services is a fully managed container offering that allows you to easily deploy, manage, and scale containerized services, jobs, and functions, all within the security and governance boundaries of Snowflake, and requiring zero data movement.
Here is a simple way to understand it. The Snowflake Data Cloud was mainly the place where you kept your data. You could write some SQL, create some stored procedures, UDFs, etc., but that was the limit of it. That is what I call ‘Snowflake Gen 1’. Then came Snowpark. You could now write a ‘data’ application within the Snowflake Snowpark framework. This was a massive move toward creating a simple place for all data and data apps! This was Snowflake Gen 2. Now, with container services, you can technically deploy any application (not just data applications) right where your data is in Snowflake. That’s Gen 3!
Is this a new Snowflake product?
Snowpark Container Services (SPCS) is a Snowflake feature under the Snowpark umbrella, not a separate product.
What is the cost associated with SPCS? What is the pricing model?
Just like any Snowflake feature, SPCS will be available for all accounts once it is in GA. It should follow the same ‘consumption-based’ pricing model we are all very familiar with: pay for what you use.
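For anyone who wants to see what "pay for what you use" means in numbers, here is a tiny back-of-the-envelope sketch. Everything in it is an assumption on my part: the function name consumption_cost, the idea of billing by hours of running compute, and both rates are made up purely for illustration. These are not Snowflake's published SPCS prices, so please check the official pricing once the feature is generally available.

```python
def consumption_cost(hours_used, credits_per_hour, price_per_credit):
    """Cost under a pay-for-what-you-use model: usage x rate x unit price."""
    if hours_used < 0 or credits_per_hour < 0 or price_per_credit < 0:
        raise ValueError("inputs must be non-negative")
    return hours_used * credits_per_hour * price_per_credit


# Made-up rates for illustration only; NOT Snowflake's actual pricing.
HYPOTHETICAL_CREDITS_PER_HOUR = 1.0
HYPOTHETICAL_PRICE_PER_CREDIT = 3.0

# Same service, two usage patterns over one month.
always_on = consumption_cost(
    24 * 30, HYPOTHETICAL_CREDITS_PER_HOUR, HYPOTHETICAL_PRICE_PER_CREDIT
)
business_hours_only = consumption_cost(
    8 * 22, HYPOTHETICAL_CREDITS_PER_HOUR, HYPOTHETICAL_PRICE_PER_CREDIT
)

if __name__ == "__main__":
    print(f"Always on:           {always_on:.2f}")
    print(f"Business hours only: {business_hours_only:.2f}")
```

The point is only the shape of the model: the bill follows how long the compute actually runs, so a service that is suspended outside working hours costs a fraction of one that runs around the clock.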
I don’t use Snowpark. Does that mean I can’t use SPCS?
No, you can start with SPCS right away. Just note that when you use SPCS, you are indirectly using Snowpark.
What part of my current Snowflake projects falls within the scope of SPCS?
Interesting question! This is where we think the data team should sit down with the application team and work out which parts of the application workload are worth moving into SPCS. Snowflake has the concept of workloads and workload segregation, which still holds its ground (kudos to the Snowflake CTO team)! Every new Snowflake feature aims to add a new experience rather than replace an existing one. So if you are doing data engineering, data warehousing, data lake work, data analysis, etc., all of that keeps working as is. SPCS can be a good fit for your data application needs (see figure below).
What does the Preview feature mean?
Previews are similar to beta versions: you get to try the feature early, but the product is still under ongoing development and enhancement. You cannot productionise it until it is GA. Please see the stages below from the Snowflake article.
Which services, applications, etc. that I am currently using are good candidates for SPCS?
Any application that can be containerized is a good use case for SPCS. This can be your front-end application, data app, data science models, or data engineering logic.
For example, say your application team created an in-house job-scheduling app that currently runs on some container service (on-prem, cloud Kubernetes cluster, Amazon EKS, Rancher, EC2, etc.). This can be easily moved to SPCS. Apart from the relief of not maintaining any other infrastructure or services, you get the usual Snowflake benefits: a fully managed, centrally governed, unified service experience.
To use SPCS, do I need to rewrite my code in Python or any other supported Snowpark language?
Na Na Na! With Snowpark Container Services, you can run an application written in any programming language or framework (C/C++, Node.js, Python, R, React, etc.). That has always been the USP of containers in general. You package your application, and the programming language is encapsulated inside the package. The package can then run on the configured hardware (CPUs or GPUs) and OS.
Can you explain to me what exactly containers are?
Container technology predates SPCS. Popularized by Docker starting around 2013, containers are now the de facto standard for deploying applications. A container is an isolated environment for your code, with all its dependencies in one bundle.
If you would like to learn Docker, I can’t recommend Nana’s videos on YouTube highly enough!
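If you are from the data side and have never seen what actually goes inside a container, here is a rough idea. It is a sketch of the kind of small service an application team might package into a container image, written with only the Python standard library. The names HealthHandler, start_server, and check_health and the /health path are my own illustrative choices, and nothing here is specific to SPCS or Docker; the container simply bundles code like this together with its runtime and dependencies.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with a small JSON document (illustrative only)."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404, "Not found")

    def log_message(self, format, *args):
        # Silence per-request logging to keep the example quiet.
        pass


def start_server(port=0):
    """Start the service on a background thread; port=0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def check_health(server):
    """Call the running service once and return (HTTP status, parsed JSON body)."""
    host, port = server.server_address[:2]
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/health")
        resp = conn.getresponse()
        return resp.status, json.loads(resp.read())
    finally:
        conn.close()


if __name__ == "__main__":
    srv = start_server()
    print(check_health(srv))
    srv.shutdown()
    srv.server_close()
```

Whether the service is written in Python, Node.js, or C++ makes no difference to whoever runs the container; they only see a package that listens on a port.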
What kind of applications can be deployed in a container?
Anything! Web applications, backend apps, ingestion pipelines, machine learning models, MLOps, decision training model, LLMs, etc. Containers are a deployment and orchestration technology, which means any application that you develop can be deployed in containers. And any application that runs in a container can be moved to SPCS.
That doesn’t mean you should move all your applications to SPCS. This is where the deep-dive discussions will happen and the decisions will be made! As a rule of thumb, any application that interfaces with your data can be a good candidate!
So now my Data Engineers, Data Analysts, and Data Architects should learn containers?
Not really (unless, of course, they want to, which I highly recommend). Containerization has been a pivotal technology for application deployment and orchestration. The beauty of a container is that it encapsulates all the complexity. With SPCS, most of the remaining complexity of running a container service is encapsulated further. The diagram below shows the components of container services and how Snowflake encapsulates them. All you do is push the Docker images; Snowflake takes care of everything else (services, endpoints, functions, registry, nodes, etc.).
OK but what are the benefits of moving my containerized application to SPCS?
There are quite a few. Firstly, if your containerized application interacts with your Snowflake data, you avoid the ingress and egress overhead, because your application sits right where your data is. This also gives you robust governance and security: all of Snowflake’s governance and security features apply to everything within Snowflake, including SPCS. Finally, you get to stop maintaining the container infrastructure yourself (setup, maintenance, scaling, patching, etc.).
Even now, I don’t have to sit and manage my Kubernetes cluster. There are already services that automatically patch, scale, etc. What is the value addition of SPCS here?
True, there are existing services and features that help you achieve a fully managed Kubernetes or container setup (examples of AWS services: ECS, EKS, Fargate, ECR). The added benefit of Snowflake SPCS is that the underlying containers are completely isolated and need zero maintenance. Plus, proximity to the data is a big deal!
I already have Snowflake and we are using it as the analytics platform. Whom should I talk to about SPCS in my organization?
Every organization is structured differently, but if you only deal with data analytics, you should discuss SPCS with your core application team, data science team, etc. Since you can deploy any application on SPCS, it might be worth bringing all the tech stakeholders into a room and brainstorming which of your container-based services have the potential to be moved to Snowflake. You can also discuss new use cases that can be deployed on SPCS.
Are there any organizations already using Snowpark Container Services?
The feature is in private preview, so not all Snowflake customers get to try it. That said, there are a few launch partners who have used SPCS and built some cool data apps on top of it.
Sounds exciting! Is it available for us to try?
As of this writing, SPCS is in Private Preview and is available on a request basis. You can sign up below to be notified once it’s available. You can also try contacting your Snowflake Account Executive (AE) if you would like to join the preview.
Summary
Snowflake’s Snowpark Container Services is a pivotal feature that I think will change the landscape of the Data Cloud in general. It will be interesting to see the use cases and adoption once the feature is publicly available!