How to deploy a Serverless Machine Learning Microservice with AWS Lambda, AWS API Gateway and scikit-learn

In this tutorial, we deploy a machine learning microservice using AWS Lambda, AWS API Gateway and scikit-learn. The accompanying code repository can be found on GitHub.

Before you begin, make sure you are running Python 2.7 or Python 3.6, have a valid AWS account, and have your AWS credentials file properly configured.

Step 1: Train a basic model

First, we train a 3-class classifier on the iris data set, using the scikit-learn tutorial as a guide (a gradient boosted decision tree or a logistic regression model both work here). Then we pickle the model as model.pkl.

For the purposes of this hack, it doesn’t matter how good this model is; it just needs to make predictions. Here’s my full model training and serialization script.
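The original script is not reproduced here. A minimal sketch of what such a script might look like, assuming a logistic regression classifier (a gradient boosted model would be trained and pickled the same way). The function names train_model, save_model and load_model are illustrative and not taken from the repository:

```python
import pickle


def train_model():
    """Train a small 3-class classifier on the iris data set."""
    # scikit-learn is imported here so the pickle helpers below
    # stay usable on machines where it is not installed.
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    iris = load_iris()
    model = LogisticRegression(max_iter=1000)
    model.fit(iris.data, iris.target)
    return model


def save_model(model, path="model.pkl"):
    """Serialize the fitted model to disk with pickle."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path="model.pkl"):
    """Read a pickled model back from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Calling save_model(train_model()) would write model.pkl into the current directory. Keep in mind that a pickle should be loaded with the same scikit-learn version that produced it.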

Step 2: Upload your model to AWS S3

First, we upload our model to an S3 bucket.
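One way to do the upload from Python is boto3's upload_file call. The sketch below assumes the bucket already exists; the bucket name, the key and the helper name upload_model are placeholders chosen for illustration:

```python
def upload_model(bucket, key="model.pkl", path="model.pkl", client=None):
    """Upload the pickled model file to s3://<bucket>/<key>."""
    if client is None:
        # Created lazily; boto3 picks up credentials from the
        # AWS credentials file mentioned at the start.
        import boto3
        client = boto3.client("s3")
    # upload_file(Filename, Bucket, Key) is the boto3 S3 client call.
    client.upload_file(path, bucket, key)
    return "s3://{}/{}".format(bucket, key)
```

The optional client argument is there only so the helper can be exercised with a stand-in object, without touching AWS.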

Step 3: Creating a Flask API

Let’s create a project directory.

We wrap all our code and dependencies into a virtual environment.

Now, we’re ready to build our API. Create a directory api with a file called and write this into it:

The code is largely self-explanatory. We create a Flask object, use the route decorator to define our paths, and call the run function when running locally (which you can confirm by calling python api/ and visiting localhost:5000 in your browser).
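For readers without the repository at hand, here is a rough sketch of an app with that shape. The route, the JSON payload format ({"data": [[...]]}) and the names predict_from_payload and create_app are assumptions made for this example, not the original listing:

```python
def predict_from_payload(model, payload):
    """Map a payload like {"data": [[5.1, 3.5, 1.4, 0.2]]} to predictions."""
    rows = payload["data"]
    predictions = model.predict(rows)
    # Convert NumPy integers to plain ints so the result is JSON-serializable.
    return {"prediction": [int(p) for p in predictions]}


def create_app(load_model):
    """Build the Flask app; load_model is a zero-argument function
    that returns a fitted model."""
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/", methods=["POST"])
    def index():
        payload = request.get_json(force=True)
        return jsonify(predict_from_payload(load_model(), payload))

    return app
```

Running locally would then look like create_app(loader).run(). For Zappa, the resulting app object has to be bound to a module-level name so that the settings file can point at it.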

To load the model from S3 we use the following helper function:

Note: The code in the GitHub repository uses a memoization decorator to cache the model after it is pulled from S3, so subsequent predictions need no additional S3 data transfer and are significantly faster.
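A sketch of what such a helper could look like, using boto3's get_object call and a plain module-level dictionary as the cache. This stands in for the memoization used in the repository rather than reproducing it; the function name and the cache layout are assumptions:

```python
import pickle

# Cache of already-loaded models, keyed by (bucket, key). It lives for
# as long as the Python process (for example a warm Lambda container).
_MODEL_CACHE = {}


def load_model_from_s3(bucket, key, client=None):
    """Fetch and unpickle a model from S3, hitting S3 only on first use."""
    cache_key = (bucket, key)
    if cache_key not in _MODEL_CACHE:
        if client is None:
            import boto3
            client = boto3.client("s3")
        response = client.get_object(Bucket=bucket, Key=key)
        # response["Body"] is a file-like stream with the object's bytes.
        _MODEL_CACHE[cache_key] = pickle.loads(response["Body"].read())
    return _MODEL_CACHE[cache_key]
```

Only unpickle objects from buckets you control, since loading a pickle can execute arbitrary code.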

Step 4: Configure AWS Lambda & API Gateway

We use a framework called Zappa to create and configure both AWS Lambda and the API Gateway automatically. Think of it as “serverless” web hosting for your Python apps.

That means automatic scaling, no servers to maintain, and pay-per-request pricing, often at a fraction of the cost of an always-on deployment!

Let’s start. First, we install the required packages into our virtual environment.

Next, we initialize Zappa by running zappa init.

Zappa has automatically created a zappa_settings.json configuration file:

This defines an environment called ‘dev’ (later, you may want to add ‘staging’ and ‘production’ environments as well), defines the name of the S3 bucket we’ll be deploying to, and points Zappa to a WSGI-compatible function, in this case, our Flask app object.

Setting the configuration parameter slim_handler to true allows Zappa to load code from Amazon S3 in case our deployment package exceeds the maximum size of 50 MB.
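To make the shape of that file concrete, the sketch below builds an equivalent settings structure in Python and prints it as JSON. The keys app_function, s3_bucket and slim_handler are Zappa settings; the values "api.app.app" and "my-zappa-deployments" are hypothetical placeholders, and the generated file will typically contain further keys:

```python
import json


def build_zappa_settings(app_function, s3_bucket, stage="dev"):
    """Return a dict with one stage, shaped like zappa_settings.json."""
    return {
        stage: {
            "app_function": app_function,  # dotted path to the WSGI app object
            "s3_bucket": s3_bucket,        # bucket Zappa uploads the package to
            "slim_handler": True,          # load oversized packages from S3
        }
    }


# Hypothetical values, for illustration only.
settings = build_zappa_settings("api.app.app", "my-zappa-deployments")
print(json.dumps(settings, indent=4, sort_keys=True))
```

Additional environments such as staging or production would be further top-level keys next to dev.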

Step 5: Testing the API locally

The API can be tested locally like a regular Flask application.

First, run the Flask app as usual:

Second, make a test API call:
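The original request is not shown here. As a rough sketch, a call could be made from Python 3 with the standard library alone, assuming the API accepts a POST with a JSON body of the form {"data": [[...]]} at the root path. Both the route and the payload format are assumptions, so adjust them to match your app:

```python
import json
from urllib.request import Request, urlopen


def build_request(url, rows):
    """Build a JSON POST request carrying the feature rows."""
    body = json.dumps({"data": rows}).encode("utf-8")
    return Request(url, data=body, headers={"Content-Type": "application/json"})


def predict(url, rows):
    """Send the request and decode the JSON response."""
    with urlopen(build_request(url, rows)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the Flask app running locally, predict("http://localhost:5000/", [[5.1, 3.5, 1.4, 0.2]]) would send one iris sample to it. The same function works against the deployed URL once the service is on AWS.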

The response should be

Step 6: Deploying to AWS Lambda

Now we’re ready to deploy to AWS. With Zappa, it’s a single command: zappa deploy dev.

And our serverless Machine Learning microservice is alive!

Congratulations, you have finished all the steps required to deploy a serverless machine learning microservice. I hope you enjoyed the project.

GitHub repository:

If you run into issues getting the application working, feel free to DM me.