Data Engineering Blogpost
10 min readOct 12, 2023

Sproc: A Powerful Tool for Creating Python Stored Procedures in Snowflake

Sproc: A Powerful Tool for Creating Python Stored Procedures in Snowflake

Stored procedures in Snowflake are like handy shortcuts that help you automate repetitive tasks. They are essentially a set of SQL statements that are grouped together and saved as a single unit. Once you’ve created these stored procedures, you can call these stored procedures repeatedly within Snowflake queries, or they can be scheduled to run automatically.

We can write stored procedures using various programming languages, such as SQL, Python, and JavaScript. In this blog, we will focus on writing stored procedures using Python. Then, we will deep dive into the sproc decorator, which is a tool that can be used to simplify the creation of Python stored procedures.

In the following section, we will explore what a Python stored procedure is and the different ways to write it in Python.

Python stored procedures & Ways to Create Python Stored Procedures

Python stored procedures is a feature in Snowflake that allows you to write stored procedures in Python. Using Python stored procedures gives you the ability to use all of the power and flexibility of Python, such as its data manipulation and analysis capabilities, to automate your Snowflake workloads.

There are several advantages to using Python stored procedures over traditional SQL stored procedures: -

  1. Python is a more versatile language than SQL, so you can use it to perform a wider range of tasks.
  2. Python is more expressive than SQL, so your stored procedures will be easy to read and understand.
  3. Python is more powerful than SQL, so you can use it to write more complex and efficient stored procedures.

There are three ways to create Python stored procedures in Snowflake:

  1. Creating Python Stored Procedures Using SQL in Snowsight

Creating Python stored procedures using SQL in Snowsight is the simplest way to create a Python stored procedure in Snowflake. You can simply write inline Python code within an SQL worksheet to create the procedure.

To Create a Python stored procedure using SQL, you can use the following template:

Python stored procedure template in SQL.
create or replace PROCEDURE Procedure_Name(input_paramaters datatype)
returns String // return type
language python // choosing the programming language in our case its Python
runtime_version = '3.8' // Python version
packages = ('snowflake-snowpark-python') // packages that we wanna import
handler = 'python_function_name' // python function name
as

$$
# sess :- session name
def python_function_name(ses,input_parameters):
#function body
#return
$$;
call function_name(input_parameters);

Note: You have to open an SQL Worksheet to write your code.

The above code is a template that will help you in writing your code. Next, we will take a look at some small examples on Python stored procedures.

Example:

A) The following code creates a Python stored procedure called multiply_together that multiplies two numbers together:

Multiply two variables with Python
//Create a basic program 
create or replace PROCEDURE Multiply_together(input_number_1 int , input_number_2 int)
returns String
language python
runtime_version = '3.8'
packages = ('snowflake-snowpark-python')
handler = 'multipler'
as

$$
def multipler(sess,a,b):
return a*b
$$;

call Multiply_together(3,5);// Call the procedure name

B) In this example, we created a Python stored procedure using the Pandas module to filter customer data from theSNOWFLAKE_SAMPLE_DATA.tpch_sf1.customer table based on the C_MKTSEGMENT column.

Python stored procedure using the Pandas
CREATE OR REPLACE PROCEDURE filterByRole(tableName VARCHAR, role VARCHAR)
RETURNS TABLE()
LANGUAGE PYTHON
RUNTIME_VERSION = '3.8'
PACKAGES = ('snowflake-snowpark-python','pandas')
HANDLER = 'filter_by_role'
AS
$$
import pandas as pd
from snowflake.snowpark.functions import col

def filter_by_role(session, table_name, role):
df = session.table(table_name)
pd_df = df.to_pandas()
data_df =pd_df[ pd_df["C_MKTSEGMENT"]==role]
data_df = session.create_dataframe(data_df)
return data_df
$$;

call filterByRole('SNOWFLAKE_SAMPLE_DATA.tpch_sf1.customer','HOUSEHOLD')

In this subsection, we discussed how to create Python stored procedures using SQL in Snowsight. We provided a template that you can use to create your own Python stored procedures. We also presented examples of Python stored procedures that can be used to multiply two numbers together and filter customer data from a Snowflake table.

Learn More: -

2. Creating Python Stored Procedures by Storing the Code in a Local Environment

In this section, we will discuss how to create a Python stored procedure by storing the Python code in a .py file in your local environment. This is a more flexible way to create a Python stored procedure, as it allows you to separate the Python code.

Example:

Create Python stored procedures in your local environment
from snowflake.snowpark import Session
# Snowpark configuration
sf_options = {
"account":"your account name",
"user": "Username",
"password": "Password",
"warehouse": "warehouse name",
"database": "database name",
"schema": "schema name",

}
sess = Session.builder.configs(sf_options).create()
#Writing sql code for the Python Stored procedure
sql =(
"create or replace PROCEDURE Multiply_together_1(input_number_1 int , input_number_2 int) \n"
"returns String \n "
"language python \n "
"runtime_version = '3.8' \n"
"packages = ('snowflake-snowpark-python') \n "
"handler = 'multipler' \n "
"as \n "

"$$ \n"
"def multipler(sess,a,b):\n"
" Z=a*b \n"
" return Z \n "
"$$;")

res=sess.sql(sql)
res.show()

sql_1 = "call Multiply_together_1(5,8);"
res1 = sess.sql(sql_1)
res1.show()
print("complete")

Storing Python stored procedure code locally can make it easier to manage and reuse the code. However, it can be tedious to write the code in this manner. The sproc decorator can simplify the process.

Learn More: -

3. Using the sproc() Function in the Snowpark Python Client API.

The sproc decorator is a Python decorator that can be used to create stored procedures in Snowflake. The sproc decorator makes it easy to create and manage stored procedures in Python by allowing you to write your Python code directly in the function definition, rather than having to write it inside a SQL template.

Here is an example of how to create a stored procedure using the sproc decorator:

Sproc Stored Procedure code
from snowflake.snowpark import Session
import snowflake.snowpark as snowpark
from snowflake.snowpark.functions import sproc

sf_options = {
"account":"your account name",
"user": "Username",
"password": "Password",
"warehouse": "warehouse name",
"database": "database name",
"schema": "schema name",

}
sess = Session.builder.configs(sf_options).create()
sess.add_packages("snowflake-snowpark-python")


@sproc(name="my_prc", is_permanent=True, stage_location="@procedure",replace=True,packages=["snowflake-snowpark-python"])
def my_proc(Session: snowpark.Session, x:int, y:int)->int:
return int(x) + int(y)

The sproc() decorator takes the following arguments:

  • name: The name of the stored procedure.
  • is_permanent: A boolean value that specifies whether the stored procedure should be stored permanent or not.
  • stage_location: The stage location of the stored procedure.

Below code is an demonstration of how to create a stage to store our sproc code.

Note: Without a stage location, our code won’t be stored properly. I have created the stage in Snowsight, but you can also create it in your local Python code itself.

create or replace stage procedure;
  • replace: A boolean value that specifies whether the stored procedure should be replaced if it already exists.
  • packages: A list of packages that are required to run the stored procedure Example: - Pandas, snowflake-snowpark-python etc.

In this subsection, we discussed what a Python stored procedure is and the different ways to create them. We discussed how to create Python stored procedures using SQL in Snowsight, how to create them by storing the Python code in a .py file in your local environment, and how to use the sproc decorator to simplify the process.

In the next section, we will dive deeper into the sproc decorator. We will learn about another way to write a stored procedure using the register function. We will also learn how to run multiple functions using a stored procedure. Finally, we will create a small program that calls a table from the Snowflake database, converts the Snowflake table into a Pandas Data Frame, performs some basic transformations, and stores it back to a Snowflake table.

Exploring the Sproc() function

Why Sproc ()

  • More concise and easier to write.
  • More flexible, as it allows you to write your Python code directly in the function definition, rather than having to write it inside a SQL template.
  • The sproc allows you to specify the name, location, and other properties of the stored procedure in a single place.
  • The sproc automatically registers the stored procedure with Snowflake.

Writing a stored procedure using the register method

The register method is a way to create a stored procedure in Snowflake without using the sproc decorator.The register method arguments are similar to the sproc decorator.

Here is an example of how to use the register method to create a stored procedure:

Register method in Stored Procedure
from snowflake.snowpark import Session
import snowflake.snowpark as snowpark
from snowflake.snowpark.functions import sproc
sf_options = {
"account":"your account name",
"user": "Username",
"password": "Password",
"warehouse": "warehouse name",
"database": "database name",
"schema": "schema name",

}
sess = Session.builder.configs(sf_options).create()

from snowflake.snowpark.session import Session

sess.add_packages(["snowflake-snowpark-python"])


def add_two_number(sp_session : Session, input_1 : int , input_2 : int) -> str:

return input_1 + input_2

sess.sproc.register(func=add_two_number, is_permanent=True, stage_location="@procedure",name="add_two_number", replace=True)

The register function is a less used way to create a stored procedure than the sproc decorator. However, it is still a valid way to create stored procedures in Snowflake.

Run multiple functions using a stored procedure.

In this subsection, we will learn how to run multiple functions using a stored procedure. We will use the sproc decorator to create a stored procedure that calls two Python functions.

Here is an example of how to run multiple functions using a stored procedure:

Run multiple functions using a stored procedure:
from snowflake.snowpark import Session
import snowflake.snowpark as snowpark
from snowflake.snowpark.functions import sproc
sf_options = {
"account":"your account name",
"user": "Username",
"password": "Password",
"warehouse": "warehouse name",
"database": "database name",
"schema": "schema name",

}
sess = Session.builder.configs(sf_options).create()

from snowflake.snowpark.session import Session

def add_two(input_1 : int , input_2 : int)-> str:
return input_1 + input_2

sess.add_packages(["snowflake-snowpark-python"])

def add_two_number(sp_session : Session, input_1 : int , input_2 : int) -> str:
t= add_two(input_1,input_2)
return str(t)+" addition"

sess.sproc.register(func=add_two_number, is_permanent=True, stage_location="@procedure",name="add_two_number", replace=True)

In the above the code The first function, add_two, takes two integers as input and returns the sum of the two integers. The second function, add_two_number, takes two integers as input and calls the add_two function to calculate the sum of the two integers. The add_two_number function then returns a string that includes the sum of the two integers and the word "addition".

Creating a small program to call a table, convert to Pandas, transform, and store back.

In this subsection, we will create a small program that calls a table from the Snowflake database, converts the Snowflake table into a Pandas Data Frame, performs some basic transformations, and stores it back to a Snowflake table.

Code for creating a small program to call a table, convert to Pandas, transform, and store back.
from snowflake.snowpark import Session
import snowflake.snowpark as snowpark
from snowflake.snowpark.functions import sproc

sf_options = {
"account":"your account name",
"user": "Username",
"password": "Password",
"warehouse": "warehouse name",
"database": "database name",
"schema": "schema name",

}
sess = Session.builder.configs(sf_options).create()

from snowflake.snowpark.session import Session

sess.add_packages(['pandas','numpy', 'snowflake-snowpark-python'])

def do_data_trasform(sp_session : Session, tablename: str,stored_table : str) -> str:
import pandas as pd
from snowflake.connector.pandas_tools import write_pandas

data = sp_session.table(tablename)
data_df = data.to_pandas()

data_df = data_df.iloc[1:7,:]

data_df = sp_session.create_dataframe(data_df)
data_df.write.mode('overwrite').save_as_table(stored_table)
return "Sucess"

sess.sproc.register(func=do_data_trasform, is_permanent=True, stage_location="@procedure",name="do_data_trasform", replace=True)

The code first creates a session and adds the required packages. Then, it defines a function called do_data_trasform. This function takes two arguments: the name of the Snowflake table and the name of the Snowflake table where the transformed data will be stored.

The function first calls the table function to create a Snowpark DataFrame from the Snowflake table. The Snowpark DataFrame is then converted to a Pandas DataFrame.

The Pandas DataFrame is then transformed by slicing it to the first 7 rows. The transformed Pandas DataFrame is then converted back to a Snowpark DataFrame.

Finally, the Snowpark DataFrame is written back to the Snowflake table.

Note: -

Sure, here is the formatted text about the disadvantage of sproc, with bullet points and a call to action:

One disadvantage of sproc is that it can only return a string or integer. This means that if you want to return a table from a sproc, you will need to store the table in a Snowflake table. Here are some possible solutions to this problem:

  • Store the data into a Snowflake temporary table: This is the most efficient solution, as it does not require you to use a third-party library.
  • Use a cursor to iterate over the rows of the table and insert them into the Snowflake table one by one: This is a manual process, but it is the most reliable way to return a table from a sproc.
  • Use a third-party library to serialize the table data into a string or integer that can be returned by the sproc: This is the least efficient solution, but it does not require you to store the table in a Snowflake table.

The best solution for you will depend on your specific needs.

Learn more: — Exploring the Sproc() function

Conclusion

In this blog, we explored Python stored procedures and the sproc decorator. We discussed the different ways to create Python stored procedures and the benefits of using the sproc decorator. We also presented an example of how to use the sproc decorator to create a stored procedure that calls a table from the Snowflake database, converts the Snowflake table into a Pandas Data Frame, performs some basic transformations, and stores it back to a Snowflake table.

Data Engineering Blogpost

Data engineering enthusiast ️ Building pipelines, wrangling data, & unlocking insights. AWS, Snowflake, Python . Join me on the journey!