Publishing a TensorFlow model on AWS Lambda to make a sell-able API

In Data Science, a production-ready model is a real achievement. Suppose you want to sell your model to a financial business or a promising startup through an API subscription. AWS Lambda and API Gateway are a strong choice for this. On a serverless architecture, deploying your model to production and scaling it out to many thousands of users is almost done for you. Unlike a deployment on EC2, it costs you nothing when there is no usage.

Aug 26, 2020 · 5 min read

There are three popular Python frameworks: scikit-learn, TensorFlow, and PyTorch. Today, TensorFlow and PyTorch are widely used for Deep Learning, while scikit-learn is mostly used for classical Machine Learning. In March 2019, Google released TensorFlow Lite (TFLite). The lite version is intended for smartphones and embedded applications. This is very good news for Data Scientists.

It’s important to reduce the footprint of a Machine Learning or Neural Network model that runs on devices such as a Raspberry Pi. But when you distribute a model to client devices, it isn’t easy to protect your intellectual property from jailbreaking. To protect it, keep your model on the server side and expose an API for any usage.

Is the full version of a Deep Learning framework a good fit for a modern software architecture such as serverless Lambda? No. It is designed to support end-to-end Data Science solutions, so it is too large to fit in the 250 MB limit of a Lambda function. Installing the TensorFlow package alone takes nearly 1 GB of space!
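To see how far a package folder is from that limit, a rough check like the one below can help. It is only a sketch: the 250 MB figure is the limit quoted above, and the function names are illustrative, not part of any AWS tool.

```python
import os

# Size limit for a Lambda function's unpacked code, as quoted in this article.
LAMBDA_LIMIT_BYTES = 250 * 1024 * 1024


def directory_size_bytes(root):
    """Sum the sizes of all regular files under `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so a linked file is not counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_in_lambda(root, limit=LAMBDA_LIMIT_BYTES):
    """Return True when the directory is within the size limit."""
    return directory_size_bytes(root) <= limit
```

Pointing `fits_in_lambda` at the folder where pip installed your dependencies gives a quick yes or no before you try to deploy.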

AWS has just released the Lambda EFS feature, which lets you attach a disk-like file system to your Lambda. This solves the problem of a very large model file (over 250 MB). However, you must go through a lot of configuration steps before you can use EFS with Lambda.

Why can’t I use TFLite with AWS Lambda?

TFLite has a small footprint, so what makes it troublesome to use with Python on a Lambda function? It’s all about native builds for different processors, OS platforms, and customized kernels. AWS Lambda runs on its own customized Linux, so the pre-built TFLite packages from the community don’t work.
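Part of this platform dependence is visible in the name of a Python wheel. As an illustration, the sketch below splits a wheel filename such as tflite_runtime-2.3.0-cp37-cp37m-linux_x86_64.whl into its tags and compares them with the machine it runs on. The helper names are my own, and the check is deliberately rough: matching tags are necessary but not sufficient, because the native code inside the wheel must also work with the system libraries of the target platform.

```python
import platform
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel filename: " + filename)
    # The last three fields are always python tag, ABI tag, platform tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def current_python_tag():
    """Return the CPython-style tag of the running interpreter, e.g. cp37."""
    return "cp{}{}".format(sys.version_info.major, sys.version_info.minor)


def looks_compatible(filename):
    """Rough check: same Python tag and same CPU architecture as this machine."""
    tags = parse_wheel_tags(filename)
    same_python = tags["python"] == current_python_tag()
    same_cpu = tags["platform"].endswith(platform.machine().lower())
    return same_python and same_cpu
```

Here cp37 names the Python version, cp37m the ABI, and linux_x86_64 the OS and processor the native code was compiled for.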

What can I do to make one of them work?

Let’s take TensorFlow as the example. Cook it yourself to fit your needs! That’s all you have to do. Building TensorFlow Lite from source code on the Amazon Linux platform produces a binary that is compatible with the AWS Lambda runtime. Building a cross-platform native library from source code on my PC was a nightmare back in 2000. Now, with Docker and the community, you are very lucky: it is done easily within 15 minutes. Let’s create a Dockerfile:

FROM amazonlinux
WORKDIR /tflite
RUN yum groupinstall -y development
RUN yum install -y python3.7
RUN yum install -y python3-devel
RUN pip3 install numpy wheel pybind11
RUN git clone --branch v2.3.0
RUN sh ./tensorflow/tensorflow/lite/tools/make/download_dependencies.sh
RUN sh ./tensorflow/tensorflow/lite/tools/pip_package/build_pip_package.sh
RUN pip3 install tensorflow/tensorflow/lite/tools/pip_package/gen/tflite_pip/python3/dist/tflite_runtime-2.3.0-cp37-cp37m-linux_x86_64.whl
CMD tail -f /dev/null

All you need to do is build a Docker image that compiles the TensorFlow Lite library inside the amazonlinux image (you can skip this step if you don’t have Docker on your PC or don’t want to build this!):

docker build -t tflite_amazonlinux .

This process takes a couple of dozen minutes, depending on the speed of your machine. If you use simplify-cli to generate your Serverless project, pre-built tflite_runtime and numpy libraries are ready to use.

A Pre-built TFLite library for AWS Lambda:

Simplify CLI offers you a tool to create a Serverless project and to manage its deployment and layers gracefully. Now, let’s create a Lambda function with “simplify-cli”.

npm install -g simplify-cli         # install serverless framework
mkdir tensorflow-lite               # create a project folder
cd tensorflow-lite                  # enter this project folder
simplify-cli init -t python         # generate a python project

In this default project, the file will use the tflite_runtime library to load a pre-built model named detect_object.tflite that was generated before.
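The generated file is not reproduced here, so the following is only a sketch of how Python code can load and run a model such as detect_object.tflite with tflite_runtime. The function names are hypothetical. The Interpreter class and the methods called on it belong to the tflite_runtime Python API.

```python
def load_interpreter(model_path):
    """Create a TFLite interpreter for the model file and allocate its tensors."""
    # Imported lazily so this module can be read without tflite_runtime installed.
    from tflite_runtime.interpreter import Interpreter

    interpreter = Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    return interpreter


def run_inference(interpreter, input_data):
    """Feed one input tensor to the model and return all output tensors."""
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    # `input_data` must match the shape and dtype listed in input_details[0].
    interpreter.set_tensor(input_details[0]["index"], input_data)
    interpreter.invoke()
    return [interpreter.get_tensor(d["index"]) for d in output_details]
```

In practice `input_data` is a numpy array, which is why numpy ships in the same layer as tflite_runtime.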

Check out the “tflite-python-layer” repository into the layer folder:

git clone layer

In the tensorflow-lite/layer/python folder, there are two pre-built libraries for running on AWS Lambda:

  • tflite_runtime (2.3.0)
  • numpy (1.19.1)

All you need to do to run your project is set up two variables inside the .env file. You also need an AWS account with credentials set up as a profile; leave the profile blank if you use the default one.

### - Application Deployment
DEPLOYMENT_ENV=demo
### - Application StackName
### - Backend Serverless Lambda

Change the “37216” number so that your bucket name does not conflict with the bucket of someone who picked this number before you.

Publish your Python code to the AWS Lambda service with this command:

simplify-cli deploy

Then, deploy its layer that contains your TFLite library and numpy:

simplify-cli deploy --layer --source layer

Finally, go to the AWS Console, look for your Lambda function named detectObject-demo, and test your code. If you don’t know how to run a test for your Lambda, you can see this link: You just need to do “Step 4: Invoke Lambda Function and Verify Results”.
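If you prefer to test from your own machine instead of the console, the same invocation can be scripted. This is a sketch under assumptions: it expects a boto3 Lambda client to be passed in, and the empty test event is a placeholder, because the event shape the generated function expects is not shown in this article.

```python
import json


def invoke_function(client, function_name, event):
    """Invoke a Lambda synchronously and return (status code, decoded result)."""
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=json.dumps(event).encode("utf-8"),
    )
    # The returned payload is a stream holding the function's JSON result.
    result = json.loads(response["Payload"].read())
    return response["StatusCode"], result


# Example usage (needs boto3 and AWS credentials, so it is left commented out):
# import boto3
# client = boto3.client("lambda")
# print(invoke_function(client, "detectObject-demo", {}))
```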

(If you went with the Docker build, there is a script for your last step. That is the layer/ It copies the result of the Docker build into the layer folder.)

Testing the library loading time: the Lambda cold start took about 8 seconds. After the first request, each invocation takes only around 25 milliseconds.
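A likely reason for this gap is that Lambda keeps module-level state alive between invocations that reuse the same container, so anything loaded once does not have to be loaded again. The sketch below shows that caching pattern in isolation. The names `_CACHE` and `get_cached` are hypothetical, and `loader` stands for whatever expensive setup you do, such as creating the TFLite interpreter.

```python
import time

# Module-level state survives between invocations in a warm container.
_CACHE = {}


def get_cached(key, loader):
    """Return the cached object for `key`, calling `loader()` only on first use."""
    if key not in _CACHE:
        started = time.perf_counter()
        _CACHE[key] = loader()
        elapsed_ms = (time.perf_counter() - started) * 1000
        print("loaded {} in {:.1f} ms".format(key, elapsed_ms))
    return _CACHE[key]
```

Calling `get_cached` from inside the handler means only the first request in each container pays the loading cost.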

Put your Lambda behind an API Gateway and list it on AWS Marketplace. After you finish this setup, you will be ready to sell your API to the world. Then, let’s start finding your customers.
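When API Gateway calls a Lambda through its proxy integration, the function receives the HTTP request as an event and must return a dictionary with a status code and a string body. Here is a minimal sketch of such a handler. The “input” and “result” field names and the `predict` parameter are assumptions for illustration, with `predict` standing in for the real model call.

```python
import base64
import json


def make_response(status_code, payload):
    """Build a response in the shape the Lambda proxy integration expects."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def parse_body(event):
    """Decode the JSON request body from a proxy-integration event."""
    body = event.get("body") or "{}"
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    return json.loads(body)


def handler(event, context, predict=None):
    """Lambda entry point: parse the request, run the model, return JSON."""
    try:
        request = parse_body(event)
    except (ValueError, UnicodeDecodeError):
        return make_response(400, {"error": "request body must be valid JSON"})
    if not isinstance(request, dict) or "input" not in request:
        return make_response(400, {"error": "missing 'input' field"})
    # `predict` stands in for the real model call; it echoes the input by default.
    predict = predict or (lambda value: value)
    return make_response(200, {"result": predict(request["input"])})
```

Returning a 400 for malformed input keeps client mistakes separate from failures of the model itself.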

Leave a comment if you need a version of “tflite_runtime” for NodeJS on AWS Lambda.

Follow my articles

A very simple Framework to manage your Infrastructure as code
Publishing a TensorFlow Model on AWS Lambda to make a sell-able API
DIY — Build Yourself a Serverless Framework with 152 Lines of Code
The On-Demand Wakeup Pattern to Overcome AWS Lambda Cold Start

Problem Solving Blog

Wicked problems have no stopping rule, as in there’s no way to know your solution is final.


Written by


(MSc) Cloud Security | Simplify Framework Creator | FinTech CTO | Technical Startup Booster (Consultant).

Problem Solving Blog

Albert Einstein has solved the problem of “time”, “light” and “space” by his curious thinking then answered by himself. In daily life, problems come every time and each person has their solution. By sharing your solution to people, you are contributing to creating the best idea.

