Running TensorFlow on AWS Lambda using Serverless

Mike Moritz
14 min read · May 4, 2019

For smaller workloads, serverless platforms such as AWS Lambda can be a fast and low-cost option for deploying machine learning models. As the application grows, pieces can be moved to dedicated servers, or to PaaS options such as Amazon SageMaker, if necessary.

Although it is instructive to first use Lambda by uploading code directly, it is best to select a framework so that you can leverage additional integrations, such as API Gateway on AWS. For this example we will use TensorFlow as the machine learning library, so we will look for frameworks that can deploy Python applications.

Zappa is well known for being able to easily deploy existing Flask or Django apps. However, since we are creating this with serverless in mind from the start, we will select the ubiquitous and powerful Serverless framework.

When treating infrastructure configuration as a first-class citizen, it is advisable to first create a shell of the application and deploy it, and only then write the actual code. This allows for rapid iterations that are close to the end state, and avoids costly surprises down the road.
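As a rough illustration of what such a shell might look like, here is a minimal sketch of a placeholder Lambda function in Python. It assumes the function is invoked through API Gateway's Lambda proxy integration; the function name `handler` and the response fields `message` and `input` are hypothetical choices for illustration, not names taken from this project. No TensorFlow code is involved yet: the point is only to have something deployable that echoes its input.

```python
import json


def handler(event, context):
    """Placeholder entry point (hypothetical name); no model is loaded yet."""
    # With the Lambda proxy integration, the request body arrives as a
    # JSON string, or is missing/None when the request has no body.
    body = json.loads(event.get("body") or "{}")

    # Echo the parsed input so the deployed endpoint can be tested end to end.
    payload = {"message": "placeholder response", "input": body}

    # The proxy integration expects a status code and a string body.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }
```

Once a stub like this deploys and responds, the real inference code can replace the echo logic without changing the surrounding infrastructure.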

Structuring the project

For machine learning, most of the work can be categorized into three critical steps:

  • Retrieving, cleaning, and…

