Snowflake Cortex-RAG from External API ❄️

External API Integration + Vector Type + LLM

Create a quick RAG (Retrieval-Augmented Generation) workflow by reading data from an external web service (API) over a secure integration inside Snowflake. We store the learning corpus as raw data, convert it into embeddings using the Snowflake Arctic model via Cortex, and then search by similarity to "talk" with your own data.
We will connect to the API Ninjas service via an access token to get information from its Historical Figures endpoint.

Create Snowflake Objects

use role accountadmin;

create database HISPERSON;

create or replace warehouse python_wh
warehouse_size = 'X-SMALL'
auto_suspend = 60
auto_resume = true
initially_suspended = true;

Create External Access Integration

We will use the API access token from our API provider (API Ninjas), store it as "secret" data, and create a network rule that tells Snowflake which egress endpoint is allowed (in this case api.api-ninjas.com).

CREATE NETWORK RULE ninja_rule
TYPE = HOST_PORT
MODE = EGRESS
VALUE_LIST = ('api.api-ninjas.com');


CREATE or REPLACE SECRET ninja_api_key
TYPE = password
USERNAME = 'NOT_USED'
PASSWORD = '<Your Access Token>';



CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION ninja_integration
ALLOWED_NETWORK_RULES = (ninja_rule)
ALLOWED_AUTHENTICATION_SECRETS = (ninja_api_key)
ENABLED = true;

Create a UDF (User-Defined Function)

The historical_figure function receives the character's name as a parameter and performs a request to the API endpoint using the secret and the external access integration created above; as a response it returns the JSON data structure.

CREATE OR REPLACE FUNCTION historical_figure(person STRING)
RETURNS STRING
LANGUAGE PYTHON
RUNTIME_VERSION = '3.10'
HANDLER = 'get_history'
EXTERNAL_ACCESS_INTEGRATIONS = (ninja_integration)
PACKAGES = ('requests')
SECRETS = ('api_key' = HISPERSON.public.ninja_api_key)
AS
$$
import requests
import _snowflake

def get_history(person):
    api_url = 'https://api.api-ninjas.com/v1/historicalfigures?name={}'.format(person)
    api_key = _snowflake.get_username_password('api_key').password
    response = requests.get(api_url, headers={'X-Api-Key': api_key})
    if response.status_code == requests.codes.ok:
        return response.text
    else:
        # status_code is an int, so cast it before concatenating
        return "Error: " + str(response.status_code) + ": " + response.text
$$;

Checking the UDF

From here you can use the historical_figure UDF to get data from the external API service.
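
For example, a quick check from a worksheet (the JSON comes straight from API Ninjas, so the exact content depends on the service's response):

```sql
-- Call the UDF with a character name; the result is the raw JSON text
SELECT historical_figure('Napoleon Bonaparte') AS raw_response;
```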

Snowflake Notebooks

From here we can use a Snowflake Notebook; if additional packages are required, you can add them (1) and select the version (2).

Add the Snowpark functions (lit, parse_json, col, and concat) and the get_active_session method in order to process the transformation and obtain the session object, respectively.

# Import Python packages
import streamlit as st
from snowflake.snowpark.functions import lit,parse_json,col, concat
from snowflake.snowpark.context import get_active_session

session = get_active_session()

Then take the response (JSON) from the API call, perform a transformation to extract specific key values, and finally store the result in a RAW_DATA table.

query = "create or replace table RAW_DATA (texto STRING)"
session.sql(query).collect()

query = "select HISTORICAL_FIGURE('Napoleon Bonaparte') as RAW"
df = session.sql(query)
json_data = df.select(parse_json(df['RAW']).alias("JSON_DATA"))

nombre_tabla = 'RAW_DATA'
database = session.get_current_database()
schema = session.get_current_schema()
contexto = [database, schema, nombre_tabla]  # fully qualified table name

text_df = json_data.select(
    concat(
        col("JSON_DATA")[0]["name"], lit(" "),
        col("JSON_DATA")[0]["title"], lit(" "),
        col("JSON_DATA")[0]["info"]["born"]
    ).alias("TEXTO")  # match the RAW_DATA column name for the append
)

text_df.show()

text_df.write.mode("append").save_as_table(contexto)
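
Outside Snowflake, the same flattening can be sketched in plain Python; the sample record below is illustrative only, shaped like (but not taken from) a real API Ninjas response:

```python
import json

# Illustrative sample shaped like the historicalfigures response
# (the field values are made up for this example).
raw = json.dumps([{
    "name": "Napoleon Bonaparte",
    "title": "Emperor of the French",
    "info": {"born": "15 August 1769"},
}])

records = json.loads(raw)
first = records[0]

# Same concatenation as the Snowpark concat(...) above: name, title, born
texto = " ".join([first["name"], first["title"], first["info"]["born"]])
print(texto)
```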

Create an Embedding Store (Vector Type)

It is time to convert the raw text into a numerical vector type. First we create vector-type storage, using the Snowflake Arctic model via Cortex to embed (EMBED_TEXT_768) the text into numerical vector information. Then we can perform a similarity search by taking an input (the question) and checking the distance (VECTOR_L2_DISTANCE).

create or replace table VECTOR_STORE as
select
texto as input,
SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m',input) as embeddings from raw_data;

select input from VECTOR_STORE
order by VECTOR_L2_DISTANCE(
SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m','Where Napoleon was born?'), embeddings) limit 1;
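
VECTOR_L2_DISTANCE is simply the Euclidean distance between two embedding vectors. A minimal sketch of what the ORDER BY above computes, with tiny toy vectors standing in for the 768-dimensional Arctic embeddings:

```python
import math

def l2_distance(a, b):
    # Euclidean distance: sqrt(sum((a_i - b_i)^2))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy 3-dimensional stand-ins for the 768-dim embeddings
question = [0.1, 0.9, 0.0]
store = {
    "Napoleon was born in Ajaccio, Corsica": [0.2, 0.8, 0.1],
    "The warehouse auto-suspends after 60s": [0.9, 0.1, 0.7],
}

# ORDER BY distance ... LIMIT 1: pick the closest stored text
best = min(store, key=lambda text: l2_distance(question, store[text]))
print(best)
```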

And now we can add the Snowflake Arctic LLM via Cortex to complete a response. Notice how the model fixes the response (Napoleón vs Pierre Napoleón), but it is using the vector-type data to build the response (RAG).

SELECT snowflake.cortex.complete(
'mistral-7b', CONCAT(
'Answer the question based on the context. Be concise.','Context: ',
(
SELECT input FROM VECTOR_STORE
ORDER BY vector_l2_distance(
snowflake.cortex.embed_text_768('snowflake-arctic-embed-m',
'Where Napoleon was born?'
), embeddings
) LIMIT 1
),
'Question: ',
'Where Napoleon was born?',
'Answer: '
)
) as response;

Streamlit App

You can build a Streamlit app to provide a UI for end users.

import streamlit as st 
from snowflake.snowpark.context import get_active_session

session = get_active_session()

st.title("Ask Your Data Anything :snowflake:")
st.write("""Built using end-to-end RAG in Snowflake with Cortex functions.""")

model = st.selectbox('Select your model:',('mistral-7b','llama3-8b','snowflake-arctic'))
prompt = st.text_input("Enter prompt", placeholder="What makes time perceived to be slower?", label_visibility="collapsed")

quest_q = f'''
SELECT snowflake.cortex.complete(
'{model}', CONCAT(
'Answer the question based on the context. Fix any error.','Context: ',
(
SELECT input FROM VECTOR_STORE
ORDER BY vector_l2_distance(
snowflake.cortex.embed_text_768('snowflake-arctic-embed-m',
'{prompt}'
), embeddings
) LIMIT 1
),
'Question: ',
'{prompt}',
'Answer: '
)
) as response;
'''

if prompt:
    df_query = session.sql(quest_q).to_pandas()
    st.write(df_query['RESPONSE'][0])

Written by Carlos Suarez

Software Engineer, Illustrator, Developer, Data Cloud Architect