Setting up Python 3.6 AWS Lambda deployment package with numpy, scipy, pillow and scikit-image
Use an Amazon Linux docker container to build your own Python 3.6 lambda deployment package in 8 steps.
AWS Lambda with Python 3.6 is a great new tool in the serverless computing scene. However, the vanilla Python 3.6 environment provided by AWS is not well suited for developing lambdas for image processing, numerical calculations and scientific analysis. This is because it does not provide an easy and straightforward way of installing the Python packages that these types of applications usually require, e.g. numpy, scipy, pillow and scikit-image.
This issue can be overcome by creating your own custom lambda deployment package containing the aforementioned Python libraries. This article provides, in 8 steps, the instructions I personally used to create such a lambda package for my own use.
A basic understanding of Linux and Docker is assumed, as the following steps were executed in a Linux operating system (the specific distribution should not matter) and an Amazon Linux docker container. It is also assumed that you have Docker installed.
1. Setting up Amazon Linux docker container
We are going to build the lambda deployment package inside the official Amazon Linux docker container. However, we cannot use the latest release of Amazon Linux, since AWS lambdas are executed inside an older version. Specifically, the Amazon Linux version used to execute lambda functions is (at the time of writing) amzn-ami-hvm-2017.03.1.20170812-x86_64-gp2. For this reason we are going to use the same version inside our docker container.
# create the mylambdapackage folder in your home directory
mkdir -p ~/mylambdapackage
# start the docker container and share the folder created
docker run -ti -v ~/mylambdapackage:/mylambdapackage amazonlinux:2017.03.1.20170812
The above docker command will download the Amazon Linux image with the given version, start it in interactive mode, and mount the ~/mylambdapackage folder of your host operating system at /mylambdapackage inside the docker container. Once the command completes, you should already be inside the container.
To verify that you are using the correct version of the Amazon Linux, the
cat /etc/os-release command can be used. It should produce the following output:
NAME="Amazon Linux AMI"
PRETTY_NAME="Amazon Linux AMI 2017.03"
After that we are going to update the Amazon Linux and install required dependencies for building the four Python packages later.
yum update -y && yum install -y gcc48 gcc48-c++ python36 python36-devel atlas-devel atlas-sse3-devel blas-devel lapack-devel zlib-devel libpng-devel libjpeg-turbo-devel zip freetype-devel findutils libtiff libtiff-devel
2. Creating Python 3.6 virtual environment
The lambda deployment package will be created in a Python 3.6 virtual environment inside the docker container’s /mylambdapackage folder.
# go into /mylambdapackage folder
cd /mylambdapackage
# create Python environment in mylambda folder and activate it
python36 -m venv --copies mylambda && source mylambda/bin/activate
The --copies parameter will “try to use copies rather than symlinks, even when symlinks are the default for the platform”. Note: to deactivate the Python 3.6 environment just use the deactivate command.
The last thing to do before we move on to installing the Python packages is to upgrade pip:
pip3 install -U pip
3. Installing numpy, scipy and pillow
Assuming you are still in the activated mylambda virtual environment, the three packages can be installed as follows:
pip3 install --no-binary :all: numpy scipy pillow
Since we are using the --no-binary :all: option, binary versions of these packages will not be used. Instead, they will all be compiled from source.
4. Installing scikit-image
The scikit-image package depends on cython, so cython needs to be installed first. For some reason, installing both cython and scikit-image with a single pip3 command was failing. However, installing them separately worked.
pip3 install --no-binary :all: cython
And now we can install scikit-image:
pip3 install --no-binary :all: scikit-image
Note: This will also install matplotlib, which is a dependency of scikit-image. So we should have matplotlib also available in our custom lambda package.
5. Copying required shared libraries
The compiled Python packages will depend on a number of system libraries (e.g. for BLAS, LAPACK). We have them in our own Amazon Linux docker container, but they will not be available in the AWS environment in which our lambdas are going to be executed. Thus we are going to provide these libraries with our lambda deployment package. Based on the instructions here, this can be done as follows:
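# specify where the shared libraries will be stored (assumed here: a lib folder inside site-packages)
libdir="$VIRTUAL_ENV/lib/python3.6/site-packages/lib/"
mkdir -p $libdir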
# copy the libraries
cp -v /usr/lib64/atlas/*.so.3 $libdir
cp -v /usr/lib64/libquadmath.so.0 $libdir
cp -v /usr/lib64/libgfortran.so.3 $libdir
cp -v /usr/lib64/libpng.so.3 $libdir
cp -v /usr/lib64/libjpeg.so.62 $libdir
cp -v /usr/lib64/libtiff.so.5 $libdir
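If you want to check which system libraries the compiled extension modules actually link against, one option is to run ldd over the shared objects in site-packages and collect the results. The following is only a sketch of that idea and not part of the original procedure: the helper names parse_ldd and libraries_needed are hypothetical, and it assumes ldd is available in the container and prints lines in its usual "name => path (address)" form.

```python
import re
import subprocess
from pathlib import Path

# Matches ldd lines such as:
#   libgfortran.so.3 => /usr/lib64/libgfortran.so.3 (0x00007f1a2b000000)
# Lines without a resolved path (e.g. linux-vdso, "not found") are skipped.
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")


def parse_ldd(output):
    """Return {library name: resolved path} from the text printed by ldd."""
    libs = {}
    for line in output.splitlines():
        match = LDD_LINE.match(line)
        if match:
            libs[match.group(1)] = match.group(2)
    return libs


def libraries_needed(site_packages):
    """Run ldd on every *.so* file below site_packages and merge the results.

    Hypothetical helper: assumes the ldd command exists in the container.
    """
    needed = {}
    for so_file in Path(site_packages).rglob("*.so*"):
        result = subprocess.run(
            ["ldd", str(so_file)],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
        needed.update(parse_ldd(result.stdout))
    return needed
```

Inside the container this could be called as libraries_needed(os.environ["VIRTUAL_ENV"] + "/lib/python3.6/site-packages"). Not every library it lists has to be copied, since basic ones such as libc or libm should already exist in the Lambda environment.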
These libraries and those that were compiled with the Python packages can be large. A good way to reduce their size is the strip command, which “discards symbols from object files”.
find $VIRTUAL_ENV/lib/python3.6/site-packages/ -name "*.so*" | xargs strip -v
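To see how much strip actually saves, you can measure the total size of the shared objects before and after running it. The snippet below is a minimal sketch for that purpose; the function name shared_object_size is hypothetical, and it simply uses the same "*.so*" pattern as the find command above.

```python
from pathlib import Path


def shared_object_size(root):
    """Return (file count, total bytes) of all regular *.so* files below root."""
    count = 0
    total = 0
    for path in Path(root).rglob("*.so*"):
        # Skip symlinks and directories so each file is counted only once.
        if path.is_file() and not path.is_symlink():
            count += 1
            total += path.stat().st_size
    return count, total
```

Calling it on the site-packages folder once before and once after the strip command gives the two totals to compare.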
6. Compressing the site-packages into lambda deployment package
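Once all our packages are installed in our Python virtual environment, we can package them as a compressed zip file named mylambda.zip.
# go into the site-packages folder of the virtual environment
cd $VIRTUAL_ENV/lib/python3.6/site-packages
# compress site-packages content into mylambda.zip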
zip -r -9 /mylambdapackage/mylambda.zip *
The created zip file should be located in /mylambdapackage inside the docker container, i.e. in ~/mylambdapackage on the host.
7. Adding your lambda function to the zipped package
Assuming your lambda function is located in ~/mylambdapackage (and is therefore available in /mylambdapackage in the docker container), you can add it to mylambda.zip from within the docker container, working in /mylambdapackage, as follows:
zip -ur mylambda.zip lambda_function.py
If you don’t have any specific lambda_function.py yet, you can use the one I quickly prepared (shared as a paste on paste.fedoraproject.org) just to check whether the installed packages work on AWS.
This assumes the default names for the lambda file and function, i.e. lambda_function.lambda_handler.
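A minimal stand-in for such a check could look like the sketch below. It is not the author's original file: it only tries to import each of the four packages and reports the version found or the error raised, which is usually enough to spot a missing shared library. The PACKAGES list and the report format are my own assumptions.

```python
import importlib

# Import names of the packages built above; "PIL" and "skimage" are the
# import names of pillow and scikit-image respectively.
PACKAGES = ["numpy", "scipy", "PIL", "skimage"]


def lambda_handler(event, context):
    """Try to import each package and report its version or the import error."""
    report = {}
    for name in PACKAGES:
        try:
            module = importlib.import_module(name)
            report[name] = str(getattr(module, "__version__", "unknown"))
        except Exception as exc:
            # Typically an ImportError, e.g. when a shared library is missing.
            report[name] = "FAILED: {}".format(exc)
    return report
```

Saved as lambda_function.py and added to the zip, it can be invoked with any test event. Every entry in the returned dictionary should show a version number instead of a FAILED message.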
8. Deploying the lambda package to S3
Due to the size of mylambda.zip (~76MB) you will not be able to upload it directly through the lambda creation website on AWS. Instead, you need to upload it to an S3 bucket, and then in the lambda creation website just provide the link to the uploaded mylambda.zip.
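The size check and the upload can also be scripted. The sketch below rests on two assumptions that should be verified against the current AWS documentation: that the limits are roughly 50 MB for a zip uploaded directly and 250 MB for the unzipped package, and that boto3 is installed and configured with credentials on the machine doing the upload. The function names, bucket and key are hypothetical.

```python
import os
import zipfile

# Assumed Lambda limits (check the current AWS documentation):
# about 50 MB for a directly uploaded zip, about 250 MB unzipped.
DIRECT_UPLOAD_LIMIT = 50 * 1024 * 1024
UNZIPPED_LIMIT = 250 * 1024 * 1024


def package_sizes(zip_path):
    """Return (zipped bytes, unzipped bytes) for the deployment package."""
    zipped = os.path.getsize(zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        unzipped = sum(info.file_size for info in archive.infolist())
    return zipped, unzipped


def needs_s3(zip_path):
    """Return True if the zip is too large for a direct upload."""
    zipped, unzipped = package_sizes(zip_path)
    if unzipped > UNZIPPED_LIMIT:
        raise ValueError("unzipped package exceeds the assumed 250 MB limit")
    return zipped > DIRECT_UPLOAD_LIMIT


def upload_package(zip_path, bucket, key):
    """Upload the zip to S3; bucket and key are placeholders chosen by you."""
    import boto3  # imported lazily so the size checks work without boto3

    boto3.client("s3").upload_file(zip_path, bucket, key)
```

For example, needs_s3("mylambda.zip") should return True for a ~76MB package, after which upload_package("mylambda.zip", "my-bucket", "mylambda.zip") would place it in a bucket of your own.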
This article provided detailed steps for building a custom Python 3.6 lambda deployment package with numpy, scipy, pillow and scikit-image for AWS. Hopefully the article will be useful to you. If not, I hope that it will at least get you closer to building your own custom lambda packages.
If you have any questions or issues regarding the steps described, please feel free to ask in the comments. I will do my best to help.
Appendix 1: pandas
pandas is also a very popular package that many would like to run in their Python lambdas. Installing it, in the context of the steps provided, is as easy as executing the following command after step 3 or 4:
pip3 install --no-binary :all: pandas
Appendix 2: requests-toolbelt
requests-toolbelt is very handy for processing binary data submitted to a lambda through an API Gateway. Its installation (with the requests package itself for completeness) can be performed in the following way:
pip3 install --no-binary :all: requests requests-toolbelt