How to Install Python Packages for AWS Lambda Layers

Jeno Yamma
Nov 8 · 4 min read

Lambda and its Layers

When I first heard about AWS Lambda I was quite confused about what it was. I tried to use it to train a simple ML model but was hit with a hard 5-minute execution limit. Fast-forward a few years, and I believe Lambda has evolved a lot, and so has people’s understanding of event-driven systems and serverless compute. It has become part of many modern application and data architectures.

In late 2018, Lambda was heavily buffed: the execution time limit was raised to 15 minutes, and re:Invent 2018 brought custom runtimes. Lambda Layers was also released, which allows you to share common dependencies across functions to reduce deployment size and simplify updates. However, AWS still hasn’t offered a friendly way to bring in third-party Python packages such as Pandas.

The troublesome approaches to bringing in external packages…

Currently, you either have to zip up your Lambda function together with its Linux-compatible dependencies, or upload your dependencies as a Lambda Layer. If you’ve played around with Google Cloud Functions or Azure Functions before, then you would know that it can be as easy as writing a wish list in requirements.txt.

To add extra complexity, some Python packages (such as Numpy and Pandas) need to compile C or C++ extensions. This can be a bit of a hassle if you want to use your macOS or Windows machine to pip install -t . pandas and then zip the result up for Lambda Layers, because Lambda runs in an Amazon Linux environment and binaries built for another operating system will not load there.
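One way to see the problem is to look at the wheel file that pip picked: its file name ends in a platform tag. The sketch below is a rough check only, and it assumes an x86_64 Lambda function. The helper names and the wheel file names in the usage note are made up for illustration.

```python
def wheel_platform(filename: str) -> str:
    """Return the platform tag, the last dash-separated field of a wheel name."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    stem = filename[: -len(".whl")]
    # Wheel names look like: name-version[-build]-python-abi-platform
    return stem.split("-")[-1]


def runs_on_lambda(filename: str) -> bool:
    """Rough check: accept pure-Python ("any") or Linux x86_64 wheels only."""
    # A platform tag may hold several dot-separated alternatives.
    for tag in wheel_platform(filename).split("."):
        if tag == "any":
            return True
        if tag.startswith(("manylinux", "linux")) and tag.endswith("x86_64"):
            return True
    return False
```

For example, runs_on_lambda("pandas-0.25.3-cp37-cp37m-manylinux1_x86_64.whl") is True, while the same call with a name ending in macosx_10_9_x86_64.whl or win_amd64.whl is False.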

There are a few ways of bringing in Linux-compatible dependencies, whether it’s through Serverless or using an EC2 instance. Now, if you’ve read some of my blogs before, you’ll know I really enjoy using Lambdas in hackathons, and since time is of the essence, I want to show you the easiest and quickest way I know to bring Python dependencies across as a Lambda Layer using Docker.

5 Easy Steps:

1. Build

Hopefully, you have Docker set up but if you don’t then be sure to do that first. The first thing you will do is write a Dockerfile that uses Amazon Linux as the base image and adds a few utilities, Python 3.7 and Virtualenv.

FROM amazonlinux:2.0.20191016.0

RUN yum install -y python37 && \
    yum install -y python3-pip && \
    yum install -y tar && \
    yum install -y gzip && \
    yum install -y zip && \
    yum clean all

RUN python3.7 -m pip install --upgrade pip && \
    python3.7 -m pip install virtualenv

Optional: I installed tar and gzip as well because I am fancy and I like to remote into my container using the Visual Studio Code Remote-Containers extension.

Run the command below to build an image from your Dockerfile and tag it.

usr> docker build -f "<filename>.Dockerfile" -t lambdalayer:latest .

2. Run

You should be able to see the image by running docker images. After that, you want to start a named container from it and open a bash shell inside.

usr> docker run -it --name lambdalayer lambdalayer:latest bash
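If you rebuild layers often, steps 1 and 2 can be scripted. The sketch below only assembles the two Docker command lines as argument lists and hands them to subprocess. The function names are my own, and the defaults simply reuse the image tag and container name from this post.

```python
import subprocess


def build_command(dockerfile: str, tag: str = "lambdalayer:latest") -> list:
    """Arguments for building and tagging the image from the given Dockerfile."""
    return ["docker", "build", "-f", dockerfile, "-t", tag, "."]


def run_command(name: str = "lambdalayer", tag: str = "lambdalayer:latest") -> list:
    """Arguments for starting a named, interactive container with a bash shell."""
    return ["docker", "run", "-it", "--name", name, tag, "bash"]


def execute(argv: list) -> None:
    """Run one command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(argv, check=True)
```

Calling execute(build_command("<filename>.Dockerfile")) followed by execute(run_command()) should behave like typing the two commands by hand, assuming Docker is installed and on your PATH.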

3. Install

Create a new virtual environment within your container to isolate your Python environment, so you can reuse the same container without worrying about global installations breaking things. You could create a container per dependency but time is of the essence here…

bash> python3.7 -m venv pandas

I named it pandas, but you can call it whatever you want. Just be sure to activate it, install your package(s) into a specific folder, then deactivate it afterwards.

bash> source pandas/bin/activate
(pandas) bash> pip install pandas -t ./python
(pandas) bash> deactivate

4. Package

The packages should have been installed along with their dependencies in the python folder (or your specific folder). You can now zip up that folder as python.zip and exit the container. You will need to copy the zipped folder into your local environment so you can upload it to the Lambda Layer or S3.

bash> zip -r python.zip ./python/
usr> docker cp lambdalayer:python.zip ./Desktop/
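The folder name matters here: for Python runtimes, Lambda looks for layer packages under a top-level python/ folder inside the zip. If you prefer to do the zipping from Python, a minimal sketch with the standard zipfile module could look like this. The function name zip_layer is my own, and it assumes the zip file is written somewhere outside the folder being zipped.

```python
import zipfile
from pathlib import Path


def zip_layer(source_dir: str, zip_path: str, prefix: str = "python") -> int:
    """Zip every file in source_dir under prefix/ and return the file count."""
    source = Path(source_dir)
    count = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store each file as prefix/<path relative to source_dir>.
                arcname = (Path(prefix) / path.relative_to(source)).as_posix()
                archive.write(path, arcname)
                count += 1
    return count
```

Calling zip_layer("./python", "python.zip") from the folder where you ran pip install should produce an archive whose entries all start with python/, which you can confirm with zipfile.ZipFile("python.zip").namelist().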

5. Upload

If your zipped file is larger than 50 MB then you will need to upload it to S3 as opposed to uploading the zip directly in Lambda Layers. Be sure to record the S3 Object URL.

That’s it, you now have Pandas (and Numpy) available as a Layer for your Python Lambda. If you want to create a deployment package instead, be sure to add your Lambda function code to the zipped folder as a .py file.
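The same choice can be made in code if you publish layers with boto3 (a third-party package you would need to install) instead of the console. The sketch below is one possible shape, not an official recipe: the function names, bucket, key and layer name are placeholders, and the 50 MB threshold is simply the figure quoted above.

```python
import os

DIRECT_UPLOAD_LIMIT = 50 * 1024 * 1024  # 50 MB, the figure quoted above


def layer_content(zip_path, bucket, key, limit=DIRECT_UPLOAD_LIMIT):
    """Build the Content argument: inline bytes if small, else an S3 pointer."""
    if os.path.getsize(zip_path) > limit:
        return {"S3Bucket": bucket, "S3Key": key}
    with open(zip_path, "rb") as handle:
        return {"ZipFile": handle.read()}


def publish_layer(zip_path, layer_name, bucket, key):
    """Publish the zip as a new layer version and return its ARN."""
    import boto3  # imported here so layer_content works without boto3

    content = layer_content(zip_path, bucket, key)
    if "S3Bucket" in content:
        # Too big for a direct upload, so stage the file in S3 first.
        boto3.client("s3").upload_file(zip_path, bucket, key)
    response = boto3.client("lambda").publish_layer_version(
        LayerName=layer_name,
        Content=content,
        CompatibleRuntimes=["python3.7"],
    )
    return response["LayerVersionArn"]
```

A call such as publish_layer("python.zip", "pandas", "my-bucket", "layers/python.zip") assumes your AWS credentials and region are already configured and that the bucket exists.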

Limitations of Lambda Layers

There are a few limitations that you need to be aware of, including:

  1. You can only use up to 5 layers per Lambda function.
  2. The total unzipped size of your function and all its layers cannot exceed 250 MB.
  3. Layers are extracted into the /opt directory in the function’s execution environment in the order you list them, and a file in a later layer replaces a file with the same path from an earlier one, so be sure to order your layers properly if you are going to have more than one.
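The size limit is easy to check before you upload anything. Here is a small sketch using the standard zipfile module that adds up the uncompressed sizes recorded in one or more zips. The function names are my own, and it treats 250 MB as 250 * 1024 * 1024 bytes.

```python
import zipfile

MAX_UNZIPPED_BYTES = 250 * 1024 * 1024  # the 250 MB limit from the list above


def unzipped_size(zip_path) -> int:
    """Sum the uncompressed size of every entry in the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(info.file_size for info in archive.infolist())


def fits_in_lambda(zip_paths, limit=MAX_UNZIPPED_BYTES) -> bool:
    """True if all the given zips together stay within the unzipped limit."""
    return sum(unzipped_size(path) for path in zip_paths) <= limit
```

Pass every zip that will end up in the function, for example fits_in_lambda(["python.zip", "function.zip"]), where function.zip is a hypothetical deployment package.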

About Me

As a Consultant at Servian, I have been helping businesses build scalable cloud and data solutions in the area of Data Warehousing, AI/ML and Personalisation, and have a growing interest in DevOps and Micro-services.

I love data and cloud so feel free to reach out to me on LinkedIn for a casual chat or to find out more on how we Servianites can help you with your cloud and data solutions.
