Getting your AWS Lambda Functions to work with Snowflake

A recent addition to the Snowflake feature set is External Functions, which greatly improve the extensibility of what you can do with Snowflake from a data engineering and data science perspective (more examples on these to follow…). Alternatively, you may have a scenario where you want to talk to Snowflake from an AWS Lambda function using the Snowflake Python Connector.

When creating AWS Lambda Functions, you may need to import an external module beyond what is available by default — an example might be the requests module or the Snowflake Python Connector module mentioned above.
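To see the problem concretely, a small smoke-test handler can report which modules the Lambda runtime is actually able to import. This is only a minimal sketch and not part of the original setup: the `modules` event key and its default value are assumptions made for illustration.

```python
import importlib.util


def lambda_handler(event, context):
    """Report whether each requested top-level module can be imported."""
    # "modules" is a hypothetical event key chosen for this sketch.
    modules = event.get("modules", ["requests"])
    # find_spec returns None when a top-level module is not installed.
    return {name: importlib.util.find_spec(name) is not None for name in modules}
```

Invoking this before and after packaging a dependency should show the module flip from False to True.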

One approach to handling this is the Serverless Application Framework, a tool for deploying your Lambda Functions to AWS that takes care of all the plumbing needed to get your function going. There are some great articles that explain the benefits of using this framework, such as this one. There may be times, however, when you want to create your Lambda Function manually and therefore need to package your dependencies yourself; this article focuses on that pattern.

Note: this requirement is not exclusive to Lambda Functions used with Snowflake; it applies to any Lambda that imports an external module.

We have 2 options to do this:

  1. Creating a package with the dependencies, and uploading this as a self-contained zip file, or;
  2. Using Lambda Layers (my personal favorite).

In this post, I will provide instructions for both. Either works fine, but the benefit of Lambda Layers is that they are reusable by multiple Lambda Functions.

Option 1 — Creating a combined package:

In this example we’ll be importing the requests module.

$ mkdir my-lambda-function
$ cd my-lambda-function
$ python3 -m venv venv
$ source venv/bin/activate

At this point you are now in your new virtual environment. We can now install the required dependency:

$ pip3 install requests

Once this is installed, add your Lambda function file (in this example I’ll use snowflake_lambda.py as my filename). The next step is to bundle the dependencies and your Python handler file together:

$ cd venv/lib/python3.8/site-packages
$ zip -r9 ${OLDPWD}/function.zip .
$ cd $OLDPWD
$ zip -g function.zip snowflake_lambda.py

The output is function.zip, which you can upload to your Lambda function using the AWS CLI or the Console.
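If you would rather script the bundling step, the same archive layout can be produced with Python's standard zipfile module. This is a rough sketch under my own assumptions: the function name `build_function_zip` and its parameters are hypothetical, and unlike `zip -r9` it stores files only, not directory entries.

```python
import zipfile
from pathlib import Path
from typing import List


def build_function_zip(site_packages: Path, handler: Path, output: Path) -> List[str]:
    """Bundle installed dependencies and one handler file into a single zip."""
    with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED, compresslevel=9) as archive:
        # Dependencies go at the root of the archive, as with `zip -r9 function.zip .`
        for path in sorted(site_packages.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(site_packages).as_posix())
        # The handler also sits at the root, as with `zip -g function.zip <handler>`
        archive.write(handler, handler.name)
        return archive.namelist()
```

The key point is that both the dependencies and the handler end up at the root of the zip, which is where Lambda looks for them.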

Option 2 — Using Lambda Layers:

By using Lambda Layers we can upload modules that can then be imported into the Lambda function. The benefits of this are:

  1. Creating reusable dependencies that can be used across multiple functions (i.e. once you have created a Lambda layer for the requests module you can re-use this across multiple Lambda Functions).
  2. Reducing the footprint of your function code, which makes the function easier to manage and lets you view the code in the Lambda function screen.

Note: using Lambda Layers does not overcome the overall size limit that AWS imposes on a Lambda. The function and all of its layers together must still fit within 250 MB unzipped.
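A quick way to check this before uploading is to add up the uncompressed sizes recorded in your zip files. The sketch below assumes the 250 MB unzipped quota that AWS documents for a function plus its layers; the helper names are my own and not part of any AWS tooling.

```python
import zipfile

# Lambda's documented quota for a function plus all of its layers, unzipped.
MAX_UNZIPPED_BYTES = 250 * 1024 * 1024


def unzipped_size(zip_paths):
    """Return the total uncompressed size, in bytes, of the given zip files."""
    total = 0
    for zip_path in zip_paths:
        with zipfile.ZipFile(zip_path) as archive:
            total += sum(info.file_size for info in archive.infolist())
    return total


def fits_in_lambda(zip_paths):
    """Return True if the combined archives stay within the unzipped quota."""
    return unzipped_size(zip_paths) <= MAX_UNZIPPED_BYTES
```

Pass in the function zip together with every layer zip it will use, for example `fits_in_lambda(["function.zip", "requests.zip"])`.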

In a similar way to Option 1, we need to create a package to upload to AWS, but this time we upload only the module, without the Lambda function itself.

To get started, we need to create our zipped package. In this example, I’ll use the requests module, but the same applies to any module (e.g. snowflake-connector-python).

$ mkdir -p temp/python
$ cd temp/python
$ pip3 install requests -t .
$ cd ..
$ zip -r9 ../requests.zip .

In the directory above your temp folder you will now have a requests.zip file that contains all the files that make up the requests module, under a top-level python folder.
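The top-level python folder matters: for Python runtimes, Lambda extracts layers under /opt and adds /opt/python to the import path, so modules placed anywhere else in the zip will not be found. As a hedged sanity check (the function name `layer_layout_ok` is my own), you could verify the layout before uploading:

```python
import zipfile


def layer_layout_ok(zip_path):
    """Return True if every entry in the layer zip lives under python/."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    # An empty archive is treated as invalid, since it contains no modules.
    return bool(names) and all(name.startswith("python/") for name in names)
```

Running `layer_layout_ok("requests.zip")` on the file built above should return True.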

To upload to AWS, using the console do the following:

  1. Within AWS Lambda select “Layers” in the left menu, and click on the “Create Layer” button in the right-side pane.
  2. Give your Layer a name (I use the package name), select the runtimes that apply and upload the zip file we created (requests.zip, in the directory above your temp folder).
  3. The Lambda Layer is now created and available to use in your Lambda Function.
  4. When you create your Lambda you will now be able to add the Layer to it.

Alternatively you could use the AWS CLI:

$ aws lambda publish-layer-version --layer-name requests \
--zip-file fileb://requests.zip \
--compatible-runtimes python3.8
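The same two steps (publishing the layer, then attaching it to a function) can also be scripted with boto3. The sketch below takes the Lambda client as a parameter so it stays easy to test; the helper names and the function name `my-lambda-function` in the usage comment are assumptions for illustration.

```python
def publish_layer(lambda_client, layer_name, zip_path, runtimes):
    """Publish a new layer version and return its ARN."""
    with open(zip_path, "rb") as handle:
        response = lambda_client.publish_layer_version(
            LayerName=layer_name,
            Content={"ZipFile": handle.read()},
            CompatibleRuntimes=list(runtimes),
        )
    return response["LayerVersionArn"]


def attach_layer(lambda_client, function_name, layer_version_arn):
    """Set the function's layers to the single given layer version."""
    # Layers replaces the whole list, so include any existing layers you want to keep.
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        Layers=[layer_version_arn],
    )


# Usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   client = boto3.client("lambda")
#   arn = publish_layer(client, "requests", "requests.zip", ["python3.8"])
#   attach_layer(client, "my-lambda-function", arn)
```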

Using Docker to build your package

If you are building your package on macOS, you may hit an error that manifests as an invalid ELF header message. This occurs because the Lambda Function runs on Amazon Linux, but you built the package in a non-Linux environment (i.e. macOS), so any compiled extensions it contains are in the wrong binary format.
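You can spot this before deploying by checking the compiled extension files in your package. Linux binaries start with the four-byte ELF magic number, so any .so file that starts with something else was built for another platform. This is a minimal sketch; the helper name `non_elf_extensions` is my own.

```python
from pathlib import Path

ELF_MAGIC = b"\x7fELF"  # first four bytes of every Linux ELF binary


def non_elf_extensions(package_dir):
    """List compiled extension files (*.so) that are not Linux ELF binaries."""
    offenders = []
    for path in sorted(Path(package_dir).rglob("*.so")):
        with open(path, "rb") as handle:
            if handle.read(4) != ELF_MAGIC:
                offenders.append(path.relative_to(package_dir).as_posix())
    return offenders
```

An empty list means every compiled file in the folder is in ELF format. Note that pure-Python packages such as requests contain no .so files at all, so they are not affected by this error.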

In order to overcome this, you need to build your package within a Linux environment. The quickest way to do this is to use Docker to spin up a container locally and perform the same steps as in Option 1 or Option 2 above. Once the container is running, we need to install a few tools, and then the same commands will work.

To do this, in the terminal perform the following:

$ docker run -v <directory with snowflake_lambda.py>:/lambda -it --rm ubuntu

Notes:

  • Clearly, having Docker installed is a pre-requisite to this.
  • Before you do this, make sure the only file in the “directory with snowflake_lambda.py” is the lambda function. Remove any other previous dependency installations.
  • Using the --rm flag means Docker will remove the container when you are finished with it; if you want it to persist, remove this flag.
  • Using the -v flag when creating the container gives the container access to the folder containing the Lambda function, mounted as a directory called “lambda”.

Once this has executed (Docker may download the ubuntu image if needed, so may not be instant), you will be in a terminal session within the container:

root@7b6ea21bd7a5:/#

Next we need to install some packages:

$ apt-get update
$ apt-get install python3-pip
$ apt-get install zip
$ apt-get install python3-venv

Note: you don’t need the last command (installing python3-venv) if you are taking the Lambda Layer approach.

Change into the /lambda directory (cd /lambda) to reach your working folder. From there you can execute Option 1 or Option 2, and you will end up with a zip file, built for Linux, ready to be uploaded to AWS.

Example — Bundling a single deployment package with the Snowflake Connector

$ docker run -v <your dev directory>:/lambda -it --rm ubuntu
$ apt-get update
$ apt-get install python3-pip
$ apt-get install zip
$ apt-get install python3-venv
$ cd lambda
$ mkdir my-lambda-function
$ cd my-lambda-function
$ python3 -m venv venv
$ source venv/bin/activate
$ pip3 install snowflake-connector-python
$ cd venv/lib/python3.8/site-packages
$ zip -r9 ${OLDPWD}/function.zip .
$ cd $OLDPWD
$ zip -g function.zip snowflake_lambda.py
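For reference, a snowflake_lambda.py that exercises the bundled connector might look something like the sketch below. It is an illustration under several assumptions: the environment variable names are hypothetical, password authentication is used only for brevity, and the optional `connect` argument exists purely so the handler can be tested without the connector installed.

```python
import os


def lambda_handler(event, context, connect=None):
    """Return the Snowflake version to prove the bundled connector works."""
    if connect is None:
        # Imported lazily so the module still loads where the connector is absent.
        import snowflake.connector
        connect = snowflake.connector.connect

    # Hypothetical environment variable names; set them on the Lambda function.
    connection = connect(
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        account=os.environ["SNOWFLAKE_ACCOUNT"],
    )
    try:
        cursor = connection.cursor()
        cursor.execute("SELECT CURRENT_VERSION()")
        row = cursor.fetchone()
        return {"snowflake_version": row[0]}
    finally:
        # Always release the connection, even if the query fails.
        connection.close()
```

With this file name, the handler setting on the Lambda function would be snowflake_lambda.lambda_handler.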

Example — Building a Lambda Layer with the Snowflake Connector

$ docker run -v <your dev directory>:/lambda -it --rm ubuntu
$ apt-get update
$ apt-get install python3-pip
$ apt-get install zip
$ cd lambda
$ mkdir -p temp/python
$ cd temp/python
$ pip3 install snowflake-connector-python -t .
$ cd ..
$ zip -r9 ../snowflake-connector-python.zip .
