Python dependencies and AWS Lambda

Johnny Opao
Dec 18, 2017 · 6 min read

TL;DR: Handling dependencies in Lambda can be tricky. Using Docker, you can create containers that replicate the Amazon Linux environment. You can use these containers to build, package, and verify the required Python and C dependencies.


My preference has been to use JavaScript for AWS Lambda. However, a recent data-related project required exploring Python, as it has a number of handy data libraries.

Turns out, things can get a little tricky in Lambda when handling select dependencies. It’s not as simple as calling ‘pip install’ then zipping it all up.

Here are a few issues I ran into:

  • Some of the Python libraries had C library dependencies
  • These C libraries needed to be built in an environment similar to the one AWS Lambda runs on (Amazon Linux)
  • Knowing which C libraries are required takes a time-consuming amount of trial and error, repeating the following steps:
    1. Upload lambda code
    2. Invoke a test event
    3. See an error specifying what C library I’m missing
    4. Copy missing C files
    5. Repackage Lambda
    6. Repeat steps until all errors disappear

Using Docker, I was able to consistently replicate the build and reduce the time spent fixing errors by testing the Lambda locally. Hopefully this serves as a helpful guide to someone else who might be stumbling over the same issues I was.


For this project I created a Lambda in Python that uses the numpy and PyJAGS libraries. PyJAGS is an interface to JAGS, which is a program for statistical analysis.

Starting off

I’ve provided a few files you’ll need to kick things off. Unless otherwise stated, add the file to the root of your project.

  • A Python script that imports and executes the PyJAGS and numpy libraries. Add this to a new directory named /package. The first 5 lines are extremely important: they are required to resolve dependencies from a subfolder called /lib. You’ll create this directory later and copy all required libraries into it.
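The original script is not reproduced here, so what follows is only a minimal sketch of the idea behind those first lines, not the author's code. It assumes the vendored packages sit in a lib folder next to the handler. The helper name add_lib_to_path and the body of main are illustrative, and the PyJAGS call is left out because it needs the compiled JAGS libraries.

```python
import os
import sys


def add_lib_to_path(base_dir):
    """Put <base_dir>/lib at the front of sys.path so vendored packages are found."""
    lib_dir = os.path.join(base_dir, "lib")
    if lib_dir not in sys.path:
        sys.path.insert(0, lib_dir)
    return lib_dir


# Assumption: on Lambda, LAMBDA_TASK_ROOT points at the unzipped code;
# elsewhere we fall back to the current working directory.
TASK_ROOT = os.environ.get("LAMBDA_TASK_ROOT", os.getcwd())

# This has to run before importing anything that lives in lib/.
add_lib_to_path(TASK_ROOT)


def main(event, context):
    # Imported inside the handler so the path tweak above has already happened.
    import numpy as np

    return {"mean": float(np.mean([1.0, 2.0, 3.0]))}
```

The point is only the ordering: the path is adjusted first, and the imports that depend on lib come afterwards.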

Step 1 — Create a Dockerfile

I needed to create a docker container that emulates the Amazon Linux environment. Amazon maintains a docker base image for purposes like this.

Create a new file named Dockerfile that uses amazonlinux as the base image and installs pip for Python.

FROM amazonlinux:latest
WORKDIR /app

# install pip
RUN curl -s https://bootstrap.pypa.io/get-pip.py | python

Step 2 — JAGS

The PyJAGS library is an interface to JAGS rather than a native rewrite in Python, so JAGS itself has to be present. Modify your Dockerfile to compile and install JAGS. Don’t worry too much about what’s going on here. It’s just installing the required compiler tools, then compiling as per the JAGS documentation.

Add the following to the end of your Dockerfile

# requirements for compiling JAGS
RUN yum install -y \
    gcc \
    gcc-gfortran \
    lapack-devel \
    gcc-c++ \
    findutils \
    python27-devel

# unpack the JAGS source (the tarball must sit next to the Dockerfile)
COPY JAGS-4.3.0.tar.gz .
RUN tar xf JAGS-4.3.0.tar.gz

# compile JAGS
WORKDIR JAGS-4.3.0/
RUN F77=gfortran ./configure --libdir=/usr/local/lib64
RUN make
RUN make install

Step 3 — Install python libraries

Now that you’ve included JAGS, you’re all set to include PyJAGS.

Add the following to the end of your Dockerfile:

# install python deps
WORKDIR /app
COPY requirements.txt .
ENV PKG_CONFIG_PATH=/usr/local/lib64/pkgconfig/:$PKG_CONFIG_PATH

# numpy needs to be installed globally first as pyjags
# checks the regular path for numpy as a requirement
RUN pip install numpy
RUN pip install -t ./lib -r requirements.txt

Step 4 — Build the docker image

Build a docker image of what you’ve done so far. From the root of your project, run the following:

docker build -t python-jags .

This builds a docker image named ‘python-jags’. The build might take some time so grab a snack or take a little break.

Step 5 — Run a container

If all went well with the build, you should be able to call the handler.py file with no problem. Using docker run, you can mount the handler.py file into the Docker container and invoke the function:

docker run --rm -v `pwd`/package/handler.py:/app/handler.py python-jags python handler.py

After some calculations, it should print the message: ‘Did stuff with pyjags!’

Success!

Or is it?

Hate to say it, but if you zipped up the contents of the Docker container now and invoked it on AWS, you’d see a number of errors about missing files. While this Docker container contains all you need for your Lambda project, these dependencies are not installed or included in the AWS Lambda environment where your code would be invoked.

What you need to do now is run the code in a fresh Amazon Linux environment and inspect the error messages to discover which libraries are missing. You could do so by uploading the Lambda to AWS and invoking a test event, but that takes a ton of time. Luckily, there’s a handy Docker image that simulates an AWS Lambda environment: docker-lambda (the lambci/lambda image).

You’ll use this to emulate AWS Lambda so you can quickly run the code and evaluate the errors.

Step 6 — Copy dependencies

You’ll be running a new Docker container, so you need to pull your existing dependencies out of the container’s lib directory and into your project’s package directory.

docker run --name=pythonJags python-jags
docker cp pythonJags:/app/lib ./package
docker rm pythonJags

This runs and persists a container named ‘pythonJags’, copies the lib directory into the /package directory, then removes the container.

Step 7 — Trial and error

Now for the tedious part. You’ll run the code using docker-lambda. It should return an error immediately as it’s missing a number of libraries. You’ll need to copy the missing file from the container, then once again invoke the function on docker-lambda to find the next missing library.

(If you want to skip ahead, I’ve added the final Dockerfile containing all dependencies to the end of this article.)

Run the following command to invoke your function on docker-lambda:

docker run -v "$PWD"/package:/var/task lambci/lambda:python2.7 handler.main

It should output an error similar to this:

Unable to import module '/package/handler': libjags.so.4: cannot open shared object file: No such file or directory

You’ll need to find this file inside a container built from the python-jags image. You can browse around the container filesystem with the following command:

docker run --rm -it -v `pwd`/package/handler.py:/app/handler.py python-jags

These dependencies will usually be found in one of two places:

/usr/local/lib64/

or

/usr/lib64/
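Hunting for each file by hand gets old quickly. As a rough sketch (my own helper names, not part of any tool mentioned here), the two functions below pull the library name out of an error message like the one above and walk the two directories looking for it. The script is meant to be run inside the python-jags container, and it assumes the error text keeps the "cannot open shared object file" wording.

```python
import os
import re


def missing_library_name(error_message):
    """Return the .so name from a 'cannot open shared object file' error, or None."""
    match = re.search(
        r"([\w.+-]+\.so[\w.]*): cannot open shared object file", error_message
    )
    return match.group(1) if match else None


def find_library(name, search_dirs=("/usr/local/lib64", "/usr/lib64")):
    """Walk each directory and return every path whose file name equals `name`."""
    hits = []
    for root_dir in search_dirs:
        # os.walk silently yields nothing for directories that do not exist.
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            if name in filenames:
                hits.append(os.path.join(dirpath, name))
    return hits
```

Feeding it the error text and then calling find_library on the result gives you the path to copy, assuming the library lives under one of those two directories.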

Once you’ve identified the location of the file you can either docker cp like you did earlier or add it to the build steps of the Dockerfile and rebuild the image.

To save time, I’ve put together a list of the missing dependencies. Add the following to the end of the Dockerfile:

RUN cp -a /usr/local/lib64/JAGS lib
RUN cp /usr/local/lib64/libjags.so.4 lib
RUN cp /usr/local/lib64/libjrmath.so.0 lib
RUN cp /usr/lib64/libgfortran.so.3 lib
RUN cp /usr/lib64/libblas.so.3 lib
RUN cp /usr/lib64/liblapack.so.3 lib
RUN cp /usr/lib64/libquadmath.so.0 lib

Rebuild the image:

docker build -t python-jags .

Then copy the lib directory into your project’s package directory with the earlier commands:

docker run --name=pythonJags python-jags
docker cp pythonJags:/app/lib ./package
docker rm pythonJags
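Before invoking again, it can be worth confirming that the copied lib directory really contains everything from the cp commands. The check below is a small sketch of that idea. The function name is mine, and the list is simply the file names from the Dockerfile lines in this step, so adjust it if your project needs different libraries.

```python
import os

# Names taken from the cp commands in this step; "JAGS" is a directory.
EXPECTED_ENTRIES = [
    "JAGS",
    "libjags.so.4",
    "libjrmath.so.0",
    "libgfortran.so.3",
    "libblas.so.3",
    "liblapack.so.3",
    "libquadmath.so.0",
]


def missing_from_lib(lib_dir, expected=EXPECTED_ENTRIES):
    """Return the expected entries that are not present in lib_dir."""
    return [
        name for name in expected
        if not os.path.exists(os.path.join(lib_dir, name))
    ]
```

Calling missing_from_lib("package/lib") from the root of the project should return an empty list if the copy worked.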

Invoke the code one last time on docker-lambda:

docker run -v "$PWD"/package:/var/task lambci/lambda:python2.7 handler.main

Success! But for real this time!

Step 8 — Deploy

Once you’ve got your code running on docker-lambda, zip up the contents of your /package directory and upload it to AWS. You can either copy the lib directory out of your container and bundle it on your host machine, or zip it all up directly from within your container.
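If you’d rather script the zipping step, here is one possible sketch using Python’s standard zipfile module. The function name and the paths are placeholders of mine. The one detail that matters is that files are stored relative to the package directory, so handler.py ends up at the root of the archive rather than under a package/ prefix.

```python
import os
import zipfile


def zip_package(package_dir, zip_path):
    """Zip the contents of package_dir with paths relative to package_dir."""
    # Keep zip_path outside package_dir, or the archive would try to include itself.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for dirpath, _dirnames, filenames in os.walk(package_dir):
            for filename in filenames:
                full_path = os.path.join(dirpath, filename)
                archive.write(full_path, os.path.relpath(full_path, package_dir))
    return zip_path
```

For example, zip_package("package", "lambda.zip") run from the project root would produce an archive you can upload.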

Alternatively, you could use something like https://serverless.com/ to handle deployment.

Summary

There you have it! While the process still takes a bit of trial and error, using Docker should give you consistent builds and save time testing and evaluating errors. Keep in mind docker-lambda is more of a simulation, so be sure to invoke the Lambda on AWS with test events before hooking it up to production.

Here’s what the final state of your Dockerfile should look like:

Hope someone finds this helpful. Feel free to leave a comment below if you run into any issues.

Thanks to Slan Dizier.
