Building a Snowpark-ready Python UDF code generator, for Snowflake customers

Ideation

I knew I wanted to build something with Open AI and Streamlit to help snowflake customers. I couldn’t think of what would be useful. I was stuck. Then it hit me, as I continue to demonstrate the power of Snowpark for Python using Astrato, the first thing I need (and customers too!), is to create Python UDFs/UDTFs. I thought to myself — I’ll make a generator!

Using the code generator

It is so easy to use, simply pick a task (you can select custom to write your own) and generate code!

Each Generated function comes with a sample data input (a python data frame). If you want to run the code in python first, you can use python locally, if you are new to python or don’t have it installed — don’t panic, use deepnote.com. To run directly in Snowflake, leave the output destination set to Snowflake. Sample code is below.

Try it here: Snowpark for Python UDF Generator

create or replace function piers_from_astrato__simpleaddition(x int ,y int) /* YOU MAY WISH TO CHANGE THE INPUT FIELD DATA TYPE, TO SUIT THE OUTPUT */
returns string
language python
runtime_version = 3.8
handler = 'func'
packages = ('pandas','pandas')
as
$$

#import pandas to use dataframe
import pandas as pd

#create dataframe
df = pd.DataFrame({'A': [1,2,3], 'B': [4,5,6]})

#define function
def func(x,y):
return x+y

#apply function
df['C'] = df.apply(lambda x: func(x['A'], x['B']), axis=1)

#run function
print(func(df['A'], df['B']))
$$;

What do I do with the code?

I’m glad you asked! You can run the code directly in the Snowflake UI to create the UDF.

I want to run the code in real-time — how do I do that?

When I use Snowpark for Python, I add my call my UDF (User-defined Function) in a view. This was I can easily make use of it in my tool of choice. My tool of choice is Astrato, with full Snowpark support.

Create or replace view vw_snowpark as
select
field_x,
field_y
piers_from_astrato__simpleaddition(field_x ,field_y ) -- Calling the UDF
from TABLE;

Embedding usage analytics

Each code generation in my app, adds to the “hit counter” and chart. The data is queried live from Snowflake, using Astrato.io, presented using an Astrato embed using Streamlit iframe component.

import streamlit as st
st.components.v1.iframe("https://app.astrato.io/embed/my-embed-id", width=None, height=500, scrolling=False)

Build your own “Chat GPT” code generator

It is really simple to build! Copy paste the code below and add to a new streamlit project. Have fun! Don’t forget your OpenAI API key 🔑

# FUNCTION TO CALL: OPEN AI "CHATGPT" API
def response1(task):
openai.api_key = "https://platform.openai.com/account/api-keys"

response = openai.Completion.create(
model="text-davinci-003",
prompt=f""""
ENTER PROMPT HERE AND INCLUDE {task}
""",
temperature=0.7,
max_tokens=2000,
top_p=1,
frequency_penalty=0,
presence_penalty=0
)
return response.choices[0].text

# SELECT TASK TO RUN INSIDE PROMPT, ADD OPTION FOR CUSTOM TASK:
request=st.selectbox('Task', ('Task1', 'Task2', 'Custom'))
if request == 'Custom':
task=st.text_input('Task', '')
else:
task=request

# DEFINE BUTTON
button = st.button("Generate code")

# BUTTON-CLICK EVENT
if button:
response = response1(task)

''' ------------------
# Requirements.txt
openai
streamlit
pandas
'''

--

--