Join the AI Elite: Master the Untapped Power for Generative AI Applications with Snowflake and Streamlit

evolv Consulting
Published Jan 10, 2024 · 8 min read

Learn best practices for building a generative AI application from scratch — brought to you by evolv Consulting, one of only 29 Elite Tier Snowflake partners in the US, known for its proven track record of delivering high-value Snowflake implementations and migrations.

The potential of Artificial Intelligence, especially Generative AI, has become a focal point for innovative companies. evolv Consulting, a leader in Strategy, Digital Transformation and Data & Analytics solutions, has embarked on a journey to demystify the process of building a Generative AI application. This article aims to share our expertise and insights, drawn from our extensive experience and Elite Tier Partnership with Snowflake.

Laying the Foundation

Why Snowflake for a Self-Contained Generative AI Application?

Snowflake’s architecture and suite of features enable the development of generative AI applications that can be mostly contained within the Snowflake environment. This is why Snowflake shines and why evolv Consulting, with numerous successful Snowflake implementations under its belt, recommends it as the go-to solution.

Key Features:

  • Data Management: Data is the lifeblood of AI. Efficient data management is not just about storage; it’s about accessibility, scalability, and security.
  • Snowpark: A developer framework that enables more sophisticated and streamlined data processing within Snowflake. It plays a crucial role in building generative AI applications by simplifying data engineering tasks.

Expert Tip: To optimize data processing in Snowpark, structure your data schemas to align with the specific AI model you’re implementing. This will reduce data preparation time significantly.

  • Container Services: Container Services within Snowflake refer to the ability to execute containerized applications and functions directly within the Snowflake environment. It provides a flexible, scalable, and efficient environment for data processing, model training, and deployment, aligning perfectly with the high demands of AI-driven solutions.

By choosing Snowflake, you can establish a solid foundation for your AI initiatives in the data cloud, enjoying the benefits of an accessible, fast, scalable, and secure architecture.

Streamlit and Snowpark Container Services

Streamlit, an open-source Python framework for machine learning and data science teams, combines well with Snowpark Container Services to create the ideal foundation for generative AI applications due to their complementary strengths in simplifying deployment and enhancing interactivity.

Streamlit is gaining traction as the preferred choice for constructing interactive and data-driven web applications, especially within the AI and data science landscape.

Snowpark Container Services offers a fully managed, Snowflake-integrated solution for deploying and managing containerized applications, without the need to move data outside of Snowflake. It simplifies application development by supporting a wide range of functionalities and languages, and ensures ease-of-use, security, and governance, with the added ability to share applications through the Snowflake marketplace.

Implementation

When integrating Streamlit with Snowflake, pay special attention to data handling. Streamlit excels in user interaction, but its full potential is unlocked when combined with efficient data processing and retrieval methods offered by Snowflake. For example: imagine a scenario in the healthcare sector where a Streamlit interface allows medical professionals to input patient data, which is then processed by Snowflake. This integration could power a generative AI model that predicts patient outcomes or suggests personalized treatment plans based on a vast dataset. This not only demonstrates efficient data handling but also showcases the real-world impact of generative AI in enhancing predictive healthcare analytics, providing quantifiable improvements in patient care and operational efficiency.

Learn about Streamlit’s newest developments, announced at Summit 2023 in Las Vegas.

Develop Your Streamlit Application

In this section, we’ll walk through the code necessary to integrate a simple Generative AI model into our Streamlit application. We’ll assume you’re using a text-based model like GPT-3, but the concepts can be adapted for other models.

1. Import Necessary Libraries

First, import Streamlit and any other libraries you need. If you’re using OpenAI’s GPT-3, you’ll also need to install and import the OpenAI library.

import streamlit as st
import openai

2. Set Up OpenAI API Key

To use GPT-3, you need an API key from OpenAI. Store this key securely and load it into your application.

openai.api_key = 'your-api-key-here'

Expert Tip: Store your API keys in Snowflake’s secure user-defined functions (UDFs) or external stages for enhanced security, rather than hardcoding them in your application.
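A lighter-weight complement to that tip is to read the key from an environment variable at startup rather than hardcoding it — the container spec shown later in this guide passes an `API_KEY` variable into the container this way. A minimal sketch (the helper name is ours, not part of any library):

```python
import os

def load_api_key(var_name="API_KEY"):
    """Read the API key from an environment variable (for example, one
    injected via the container spec's `environment` section) instead of
    hardcoding it in source."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"{var_name} environment variable is not set")
    return key
```

You would then set `openai.api_key = load_api_key()` instead of pasting the key into the script.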

3. Create Streamlit UI Elements

Create the user interface where users will input data for the AI to process.

st.title('Generative AI with Streamlit')
input_text = st.text_area("Enter your text here", height=150)
submit_button = st.button('Generate')

4. Define the Generative AI Function

Define a function to take user input and use the AI model to generate a response.

def generate_response(input_text):
    response = openai.Completion.create(
        engine="davinci",
        prompt=input_text,
        max_tokens=50
    )
    return response.choices[0].text.strip()

In this function, openai.Completion.create calls the GPT-3 model, with "davinci" as the engine. prompt is the user input, and max_tokens limits the length of the generated response.

5. Handle User Interaction

Use the Streamlit button to trigger AI response generation.

if submit_button:
    if input_text:
        with st.spinner('Generating response...'):
            generated_text = generate_response(input_text)
            st.write(generated_text)
    else:
        st.warning('Please enter some text to generate from.')

This code checks if the button is pressed and if there is input text. If so, it calls generate_response and displays the result. If no input is given, it prompts the user to enter some text.
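One refinement worth considering: pulling the validation and error handling out of the Streamlit widgets makes the logic testable without a browser or a live API. A hedged sketch — the `generate_fn` parameter is our own addition so the wrapper can be exercised with any callable:

```python
def safe_generate(input_text, generate_fn):
    """Validate input and run the model call, returning (result, error).

    Exactly one of the two return values is None: on success the generated
    text comes back with no error; on empty input or an API failure the
    result is None and a user-facing message is returned instead.
    """
    if not input_text or not input_text.strip():
        return None, "Please enter some text to generate from."
    try:
        return generate_fn(input_text), None
    except Exception as exc:  # surface API failures as a readable message
        return None, f"Generation failed: {exc}"
```

In the handler above you could then write `result, error = safe_generate(input_text, generate_response)` and call `st.write(result)` or `st.warning(error)` accordingly.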

6. Run the Streamlit App

Finally, to run your Streamlit app, simply use the following command in your terminal:

streamlit run streamlit_app.py

Replace streamlit_app.py with the name of your Python script.

We will eventually load this file into our Snowpark Container Services Repository.

Create a Dockerfile and Spec File

A Dockerfile is a script that outlines the steps to create a Docker container image, containing commands to assemble everything needed for an application to run — code, runtime, and dependencies. It ensures consistent, reproducible application deployment by automating the container build process, streamlining development and operations workflows.

Sample Dockerfile:

# Use an official Python runtime as a parent image
FROM python:3.8-slim

# Set the working directory in the container
WORKDIR /usr/src/app

# Copy the current directory contents into the container at /usr/src/app
COPY . .

# Install any needed packages
RUN pip install --no-cache-dir \
streamlit==1.12.2 \
openai==0.10.2 \
pandas==1.4.3 \
numpy==1.23.1 \
requests==2.28.1 \
matplotlib==3.5

# Make port 8501 available to the world outside this container
EXPOSE 8501

# Define environment variable
ENV NAME World

# Run app.py when the container launches
CMD ["streamlit", "run", "app.py"]

Snowpark Container Services allows users to execute custom code in a secure and scalable environment. The service relies on a specification file, commonly known as a “spec file”, to define the environment in which your code runs. This spec file is essential as it describes the container image, system resources, and environment variables necessary for your application. It ensures that the code executes consistently by providing the required runtime environment, dependencies, and settings, similar to how a Dockerfile works for Docker containers.

Sample Spec File:

version: 1.0
container:
  image: /tutorial_db/data_schema/tutorial_repository/my_image:tutorial
  memory: 4GB
  vCPU: 2
  environment:
    - name: API_KEY
      value: "your_api_key_here"
    - name: OTHER_VAR
      value: "some_value"
endpoints:
  - name: myendpoint
    port: 8000
    public: true

Expert Tip: When defining your spec file, consider Snowflake’s compute resource offerings to balance cost and performance.

So far, you should have:

  • Dockerfile (no extension)
  • streamlit_app.py (.py extension)
  • spec_file.yaml or spec_file.yml (either .yaml or .yml extension)

Create a Snowpark Container Service

Before continuing, ensure you have access to Snowpark Container Services (currently in public preview in select AWS regions).
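The CREATE SERVICE statement later in this section assumes a compute pool named tutorial_compute_pool already exists. If you have not created one, a minimal sketch looks like the following — the instance family shown is an assumption; choose one sized for your workload:

```sql
CREATE COMPUTE POOL tutorial_compute_pool
  MIN_NODES = 1
  MAX_NODES = 1
  INSTANCE_FAMILY = CPU_X64_XS;
```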

  1. Create a Snowflake image repository and stage in the Snowflake UI.
  • Snowflake image repository: Create the repository you will eventually upload Docker images to.

CREATE OR REPLACE IMAGE REPOSITORY tutorial_repository;

  • Snowflake stage: Each service or job image includes a specification file that gives Snowflake the information it needs to run the service or job. Upload your specification files to this stage.

CREATE STAGE tutorial_stage DIRECTORY = ( ENABLE = true );

2. Build an image and upload:

Using the Docker CLI, build an image for the linux/amd64 platform, which Snowpark Container Services supports. Execute this command from the working directory that contains your Streamlit application:

docker build --rm --platform linux/amd64 -t my_image:tutorial .

Tag the image with the image URL (<repository_url>/<image_name>):

docker tag my_image:tutorial <repository_url>/<image_name>

Authenticate Docker with the Snowflake registry:

docker login <registry_hostname> -u <username>

Upload the image to the repository:

docker push <repository_url>/<image_name>

To upload your service specification file to the Snowflake stage, use SnowSQL CLI:

PUT file://<absolute-path-to-spec.yaml> @tutorial_stage 
AUTO_COMPRESS=FALSE
OVERWRITE=TRUE;

To create the service, execute the following command in the Snowsight web interface:

CREATE SERVICE my_service
IN COMPUTE POOL tutorial_compute_pool
FROM @tutorial_stage
SPECIFICATION_FILE='spec.yaml'
MIN_INSTANCES=1
MAX_INSTANCES=1;

To find the URL of the public endpoint the service exposes:

DESCRIBE SERVICE my_service;

Example Output:

({"myendpoint":"asdfg-myorg-myacct.snowflakecomputing.app"})

View Application in your Browser

Append /ui to the endpoint URL, and paste it in the web browser to access your generative AI application.

Example URL:
asdfg-myorg-myacct.snowflakecomputing.app/ui
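If you are scripting against the service rather than pasting URLs by hand, the same URL can be assembled programmatically; a small sketch using the example endpoint above:

```python
def app_url(endpoint, path="/ui"):
    """Build the browser URL for a public service endpoint."""
    return f"https://{endpoint}{path}"
```

For example, `app_url("asdfg-myorg-myacct.snowflakecomputing.app")` yields the URL shown above.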

Conclusion

The integration of Snowflake’s Container Services with Streamlit opens a new frontier for developing generative AI applications. This powerful combination, highlighted through evolv Consulting’s expert insights, offers a unique blend of robust data management, efficient processing capabilities, and dynamic user interfaces. Key to success in this innovative domain is a structured approach to development, emphasizing meticulous data handling, security, and performance optimization.

As the AI landscape evolves, the synergy of Snowflake and Streamlit stands out as a pivotal framework for deploying sophisticated AI models swiftly and securely, empowering developers to explore the full potential of AI technologies in an accessible and manageable way. This guide not only demonstrates the technical feasibility of such integrations but also paves the way for future innovations in AI application development.

In our next article, we will dive deeper into generative AI, exploring expert strategies and cutting-edge techniques for implementing and maximizing the potential of this transformative technology. We will feature implementing RAG on Snowflake’s vector database, enabling users to integrate large language models (LLMs) for more sophisticated data querying and interpretation all within Snowflake’s ecosystem.

evolv aims to demystify the power of emerging technology for all businesses and aims to be as helpful and relevant as possible for you, our readers. What are your burning questions about generative AI? Are there specific challenges you’re facing, or particular aspects of AI you’re curious about? Drop us a comment or send in your queries and suggestions. Your input will help shape our content, ensuring it addresses your needs and sparks even more innovative ideas!

If you missed it, be sure to check out our previous article, “Boosting AI’s Power: How Retrieval-Augmented Generation and LlamaIndex Are Enriching LLM Responses.”

Join us at our celebratory event on Jan 25, 2024, from 4–6 PM.

Written by Rose Ellison, Senior Consultant at evolv Consulting.

evolv Consulting

We are cloud-native business consultants who bring a fresh perspective to help clients overcome #management and #technology challenges.