Run 3 useful LLM inference jobs in minutes with Snowflake Cortex

Overview

Getting started with AI on enterprise data can seem overwhelming: getting familiar with LLMs, performing custom prompt engineering, and deploying and integrating a wide range of LLMs to run multiple tests, all while keeping that valuable enterprise data secure. Fortunately, Snowflake Cortex — currently in Public Preview — abstracts away much of this complexity for you.

Quick Demo

Let’s get started!

In this blog, I will go through two flows. For the first three examples, I won't have to worry about prompt engineering at all; as a bonus, there's a fourth example where I build a prompt for a custom task.

Dataset

Prior to GenAI, a lot of this information was buried in text format and went underutilized for root cause analysis because natural language processing was complex to implement. With Snowflake Cortex, it's as easy as writing a SQL statement!

In this demo, we'll utilize synthetic call transcript data mimicking text sources commonly overlooked by organizations: customer calls/chats, surveys, interviews, and other text data generated by marketing and sales teams.

Let’s create the table and load the data.

Create Table and Load Data

CREATE OR REPLACE FILE FORMAT csvformat
  SKIP_HEADER = 1
  FIELD_OPTIONALLY_ENCLOSED_BY = '"'
  TYPE = 'CSV';

CREATE OR REPLACE STAGE call_transcripts_data_stage
  FILE_FORMAT = csvformat
  URL = 's3://sfquickstarts/misc/call_transcripts/';

CREATE OR REPLACE TABLE CALL_TRANSCRIPTS (
  date_created date,
  language varchar(60),
  country varchar(60),
  product varchar(60),
  category varchar(60),
  damage_type varchar(90),
  transcript varchar
);

COPY INTO CALL_TRANSCRIPTS
  FROM @call_transcripts_data_stage;
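As a quick optional sanity check (my addition, not part of the quickstart script), you can confirm the rows landed before moving on:

-- Optional: verify the load and preview a few records
select count(*) as num_transcripts from call_transcripts;
select * from call_transcripts limit 5;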

Here’s what the call transcripts table and data should look like:

Snowflake Cortex

Given the dataset above, let's see how we can use Snowflake Cortex. It offers access to industry-leading AI models without requiring any knowledge of how the models work, how to deploy LLMs, or how to manage GPU infrastructure. (Check model availability in your region.)

Translate

Using the Snowflake Cortex function snowflake.cortex.translate, we can easily translate text from one language to another. Let's see how easy it is to use this function:

select snowflake.cortex.translate('wie geht es dir heute?','de','en');

Executing the above SQL should generate “How are you today?”

Now let’s see how you can translate call transcripts from German to English in batch mode using just SQL.

select transcript, snowflake.cortex.translate(transcript,'de','en')
from call_transcripts
where language = 'German';
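Since batch translation re-invokes the model on every run, you may want to persist the results. Here's a minimal sketch, assuming you're okay materializing them into a hypothetical CALL_TRANSCRIPTS_ENGLISH table:

-- Hypothetical helper table that persists the English translations
create or replace table call_transcripts_english as
select date_created, country, product, category, damage_type,
       snowflake.cortex.translate(transcript,'de','en') as transcript
from call_transcripts
where language = 'German';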

Sentiment Score

Now let's see how we can use the snowflake.cortex.sentiment function to generate sentiment scores on call transcripts.

Note: The score is between -1 and 1, where -1 is the most negative, 1 is the most positive, and 0 is neutral.

select transcript, snowflake.cortex.sentiment(transcript) 
from call_transcripts
where language = 'English';
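Because the score is just a number, it aggregates like any other column. For example, this sketch (using the columns from the table we created above) surfaces the products and damage types with the most negative calls:

-- Average sentiment per product and damage type, most negative first
select product, damage_type,
       avg(snowflake.cortex.sentiment(transcript)) as avg_sentiment,
       count(*) as num_calls
from call_transcripts
where language = 'English'
group by product, damage_type
order by avg_sentiment;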

Summarize

Now that we know how to translate call transcripts into English, it would be great to have the model pull out the most important details from each transcript so we don't have to read the whole thing. Let's see how the snowflake.cortex.summarize function handles this by trying it on one record.

select transcript,snowflake.cortex.summarize(transcript) 
from call_transcripts
where language = 'English' limit 1;
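These functions also compose. As a hedged sketch, you could translate the German transcripts and summarize the results in a single statement:

-- Translate German transcripts to English, then summarize, in one pass
select transcript,
       snowflake.cortex.summarize(snowflake.cortex.translate(transcript,'de','en')) as english_summary
from call_transcripts
where language = 'German' limit 1;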

Bonus section!

Prompt Engineering

Being able to pull out a summary is good, but it would be even better if we could specifically pull out the product name and the defective part, and limit the summary to 200 words. Let's see how we can accomplish this using the snowflake.cortex.complete function.

SET prompt = 
'###
Summarize this transcript in less than 200 words.
Put the product name, defect and summary in JSON format.
###';

select snowflake.cortex.complete('llama2-70b-chat',concat('[INST]',$prompt,transcript,'[/INST]')) as summary
from call_transcripts where language = 'English' limit 1;

Here we're selecting the Llama 2 70-billion-parameter model and giving it a prompt telling it how to customize the output. Here's a sample output for one of the transcripts:

{
  "product": "XtremeX helmets",
  "defect": "broken buckles",
  "summary": "Mountain Ski Adventures received a batch of XtremeX helmets with broken buckles. The agent apologized and offered a replacement or refund. The customer preferred a replacement, and the agent expedited a new shipment of ten helmets with functioning buckles to arrive within 3-5 business days."
}
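Since the response is a JSON-formatted string, you can go one step further and lift the fields into columns with TRY_PARSE_JSON. This is a sketch under one assumption: the model returns only the JSON object (if it wraps the JSON in extra prose, TRY_PARSE_JSON returns NULL):

-- Extract structured fields from the model's JSON response
select try_parse_json(summary):product::string as product,
       try_parse_json(summary):defect::string as defect,
       try_parse_json(summary):summary::string as summary
from (
  select snowflake.cortex.complete('llama2-70b-chat',concat('[INST]',$prompt,transcript,'[/INST]')) as summary
  from call_transcripts where language = 'English' limit 10
);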

NOTE: You can also use other models like Mistral and Gemma using the same syntax.

select snowflake.cortex.complete('mixtral-8x7b',concat('[INST]',$prompt,transcript,'[/INST]')) as summary
from call_transcripts where language = 'English' limit 1;

select snowflake.cortex.complete('mistral-7b',concat('[INST]',$prompt,transcript,'[/INST]')) as summary
from call_transcripts where language = 'English' limit 1;

select snowflake.cortex.complete('gemma-7b',concat('[INST]',$prompt,transcript,'[/INST]')) as summary
from call_transcripts where language = 'English' limit 1;

Learn more and check out the latest set of models:
https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions

Streamlit Application

To put it all together, I’ve also created a Streamlit application in Snowflake. If you’d like to replicate it, here’s the source code that you can simply copy-paste.

import streamlit as st
from snowflake.snowpark.context import get_active_session

st.set_page_config(layout='wide')
session = get_active_session()

supported_languages = {'German':'de','French':'fr','Korean':'ko','Portuguese':'pt','English':'en','Italian':'it','Russian':'ru','Swedish':'sv','Spanish':'es','Japanese':'ja','Polish':'pl'}
supported_llms = ['mixtral-8x7b','mistral-7b','llama2-70b-chat','gemma-7b']

def translate():
    with st.container():
        st.header("Translate With Snowflake Cortex")
        col1,col2 = st.columns(2)
        with col1:
            from_language = st.selectbox('From',dict(sorted(supported_languages.items())))
        with col2:
            to_language = st.selectbox('To',dict(sorted(supported_languages.items())))
        entered_text = st.text_area("Enter text",label_visibility="hidden",height=300,placeholder='For example: call transcript')
        if entered_text:
            # Escape single quotes so the text can be embedded in the SQL literal
            entered_text = entered_text.replace("'", "\\'")
            cortex_response = session.sql(f"select snowflake.cortex.translate('{entered_text}','{supported_languages[from_language]}','{supported_languages[to_language]}') as response").to_pandas().iloc[0]['RESPONSE']
            st.write(cortex_response)

def sentiment_analysis():
    with st.container():
        st.header("Sentiment Analysis With Snowflake Cortex")
        entered_text = st.text_area("Enter text",label_visibility="hidden",height=400,placeholder='For example: call transcript')
        if entered_text:
            entered_text = entered_text.replace("'", "\\'")
            cortex_response = session.sql(f"select snowflake.cortex.sentiment('{entered_text}') as sentiment").to_pandas()
            st.caption("Score is between -1 and 1; -1 = Most negative, 1 = Positive, 0 = Neutral")
            st.write(cortex_response)

def summarize():
    with st.container():
        st.header("JSON Summary With Snowflake Cortex")
        selected_llm = st.selectbox('Select Model',supported_llms)
        entered_text = st.text_area("Enter text",label_visibility="hidden",height=400,placeholder='For example: call transcript')
        if entered_text:
            entered_text = entered_text.replace("'", "\\'")
            prompt = f"Summarize this transcript in less than 200 words. Put the product name, defect if any, and summary in JSON format: {entered_text}"
            cortex_prompt = "'[INST] " + prompt + " [/INST]'"
            cortex_response = session.sql(f"select snowflake.cortex.complete('{selected_llm}', {cortex_prompt}) as response").to_pandas().iloc[0]['RESPONSE']
            if selected_llm != 'gemma-7b':
                st.json(cortex_response)
            else:
                # gemma-7b may not return strict JSON, so display it as plain text
                st.write(cortex_response)

# Simple sidebar navigation between the three demos
page_names_to_funcs = {
    "Translate": translate,
    "Sentiment Analysis": sentiment_analysis,
    "JSON Summary": summarize
}

selected_page = st.sidebar.selectbox("Select", page_names_to_funcs.keys())
page_names_to_funcs[selected_page]()

If all goes well, you should see the app running:

Note: The Snowflake Cortex SQL and Streamlit code referenced in this blog is also available here: https://github.com/Snowflake-Labs/sf-samples/blob/main/samples/snowfake-cortex/llm-inference-blog

Conclusion

In minutes, we went from unstructured call transcript text to insights using Snowflake Cortex as the LLM service and Streamlit in Snowflake as the frontend application. Everything we did happened within Snowflake, so the data remained protected and secure throughout the full stack: the data, the LLMs, and the Streamlit app. And to leverage the latest innovation, I didn't have to manage infrastructure or have in-depth knowledge of how the models work.

Learn more about Snowflake Cortex for Generative AI.

Thanks for your time! Hope you found this blog educational and inspiring. Connect with me on Twitter and LinkedIn where I share demos, code snippets, QuickStart Guides, and other interesting technical artifacts. Be sure to also check out Snowflake For Developers.
