A production-grade Machine Learning API using Flask, Gunicorn, Nginx, and Docker — Part 2

Aditya Chinchure
Published in Technonerds · Oct 25, 2019

In Part 1, we learned about the components of our API and wrote a very simple Flask API for our text classification model. In this part, we are ready to expand on our project: we will build this API in a Docker container and expose it using Nginx and Gunicorn (read Part 1 to understand what these do)!

For Part 2, our final file structure will look as follows:

flask-ml-api
|- api
|  |- __init__.py
|  |- models
|  |  |- model.pkl
|  |- app.py
|  |- wsgi.py
|  |- requirements.txt
|  |- Dockerfile
|- nginx
|  |- nginx.conf
|  |- Dockerfile
|- docker-compose.yml

Step 1: Using Gunicorn WSGI

WSGI stands for Web Server Gateway Interface. A WSGI server is the middleman between your Flask application and your web server (which we will configure with Nginx).

Flask's built-in development server is not built for production. Gunicorn is one of the few WSGI servers we can easily set up and use with Flask.
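To see what the interface actually is: a WSGI application is just a callable with a fixed signature, and Flask's app object implements it, which is why Gunicorn (or any WSGI server) can run it. A minimal bare-bones sketch, independent of Flask:

```python
# A WSGI application is any callable taking (environ, start_response).
# `environ` is a dict of request data; `start_response` sends status/headers.
def application(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    # The return value is an iterable of bytes: the response body.
    return [b"Hello from a bare WSGI app\n"]
```

Gunicorn could serve even this toy callable directly; with Flask, the framework builds this callable for you.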

Start by installing Gunicorn:

pip install gunicorn

Next, create a wsgi.py file and add the following code (we are setting this up in Debug mode for now, but you can change that later for production environments):

from .app import app

if __name__ == "__main__":
    app.run(use_reloader=True, debug=True)

Now, instead of starting our API by running python3 app.py, we will use this:

$ gunicorn -w 3 -b :5000 -t 30 --reload wsgi:app

When you run this command, you are instructing Gunicorn to serve the app object exposed by wsgi.py. You are also asking Gunicorn to spin up 3 worker processes, bind to port 5000, set the timeout to 30 seconds, and enable hot-reloading, so that the app restarts immediately if the code in app.py changes.
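The same flags can also live in a Gunicorn config file, which keeps the command short. A minimal sketch, assuming you pass it with -c gunicorn.conf.py (recent Gunicorn versions also pick up a file with this exact name automatically):

```python
# gunicorn.conf.py -- mirrors `gunicorn -w 3 -b :5000 -t 30 --reload wsgi:app`
workers = 3        # -w 3: three worker processes
bind = ":5000"     # -b :5000: listen on port 5000
timeout = 30       # -t 30: restart workers that stall for 30 seconds
reload = True      # --reload: restart workers when the code changes
```

With this file in place, the command shrinks to gunicorn -c gunicorn.conf.py wsgi:app.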

Step 2: Setting up the API in Docker

Create two folders: api and nginx. These will be the two Docker containers we require. We will deal with the api folder first.

Our api folder structure looks like this:

flask-ml-api
|- api
|  |- __init__.py
|  |- models
|  |  |- model.pkl
|  |- app.py
|  |- wsgi.py
|  |- requirements.txt
|  |- Dockerfile
...

Create an empty __init__.py file in this folder. This marks the api folder as a Python package, which is what lets us reference api.wsgi:app later.

Next, create a requirements.txt file with a list of packages we require Docker to install (these are the packages we want to pip install). In my case, the requirements file looks like this:

flask
gunicorn
fastai
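As a side note, pinning exact versions in requirements.txt makes Docker builds reproducible, since an unpinned pip install can pull a different release each time the image is rebuilt. A sketch (the version numbers below are illustrative, not the ones this project was tested with):

```
flask==1.1.1
gunicorn==19.9.0
fastai==1.0.57
```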

We can finally create our first Dockerfile! Create Dockerfile (note that this file has no file extension) and paste the following code:

FROM python:3.6

#update
RUN apt-get update

#install requirements
COPY ./requirements.txt /tmp/requirements.txt
WORKDIR /tmp
RUN pip3 install -r requirements.txt

#copy app
COPY . /api
WORKDIR /

CMD ["gunicorn", "-w", "3", "-b", ":5000", "-t", "360", "--reload", "api.wsgi:app"]

So… what does a Dockerfile do? It contains the instruction set Docker follows to set up a container and run our program.

Here's a brief overview of each step of this process:

  1. FROM python:3.6 will start with a Docker container that already has Python 3.6 installed. Specifically, we are using the official python image from Docker Hub. This image runs Linux.
  2. RUN apt-get update will run the apt-get update command in our Docker container.
  3. COPY ./requirements.txt /tmp/requirements.txt will copy your requirements.txt file from your local disk to the Docker container, in the /tmp folder.
  4. WORKDIR /tmp changes our working directory to /tmp, which is equivalent to running cd /tmp on a Linux system.
  5. RUN pip3 install -r requirements.txt does what you would expect: run pip install on the packages listed in the requirements file. This installs flask, fastai and gunicorn packages in our Docker container environment.
  6. COPY . /api will copy our local api folder (which is located at .) to the container at /api
  7. WORKDIR / will change our working directory to /
  8. CMD ["gunicorn", "-w", "3", "-b", ":5000", "-t", "360", "--reload", "api.wsgi:app"] runs our Gunicorn command to start the API in Docker!
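One caveat with COPY . /api: it copies everything in the folder, including local caches. A .dockerignore file placed next to the Dockerfile keeps that clutter out of the image; a minimal sketch (the entries are illustrative, adjust to your project):

```
__pycache__/
*.pyc
.git/
```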

Step 3: Add the Nginx container

flask-ml-api
|- api
|  ...
|- nginx
|  |- nginx.conf
|  |- Dockerfile
...

In our nginx folder, create two files. First, we have nginx.conf file:

worker_processes  3;

events { }

http {

    keepalive_timeout 360s;

    server {

        listen 8080;
        server_name api;
        charset utf-8;

        location / {
            proxy_pass http://api:5000;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}

This file instructs Nginx to listen on port 8080 and forward each request to our api container on port 5000. Things will get a lot clearer when we write the docker-compose file.

We also create a Dockerfile in the nginx folder with the following contents:

FROM nginx:1.15.2

RUN rm /etc/nginx/nginx.conf
COPY nginx.conf /etc/nginx/

This creates a Docker container with the nginx image, then copies the nginx config to it.

Step 4: Assembling our building blocks: Docker Compose

Finally, we are ready to create the most important file that will manage our two Docker containers.

flask-ml-api
|- api
|  ...
|- nginx
|  ...
|- docker-compose.yml

In the root folder of the project, add docker-compose.yml:

version: '3'

services:

  api:
    container_name: flask_api
    restart: always
    build: ./api
    volumes: ['./api:/api']
    networks:
      - apinetwork
    expose:
      - "5000"
    ports:
      - "5000:5000"

  nginx:
    container_name: nginx
    restart: always
    build: ./nginx
    networks:
      - apinetwork
    expose:
      - "8080"
    ports:
      - "80:8080"

networks:
  apinetwork:

Let’s uncover what all this means:

  1. We have two services (containers), one for our api folder and another for our nginx folder.
  2. We also have a network, which is arbitrarily named apinetwork.
  3. Under api:
    - we name our container flask_api
    - restart: always tells Docker to automatically restart the container if it exits or crashes (hot reloading of code is handled by Gunicorn's --reload flag, not by this setting)
    - we build the ./api folder
    - volumes: ['./api:/api'] keeps our local api folder and the /api folder inside the container in sync. This is also a necessary step for hot-reloading when we update our app.py file.
    - we mark this container as part of the apinetwork, and expose port 5000, where 5000:5000 maps port 5000 inside the Docker container to port 5000 on our local machine.
  4. For nginx we do the same setup as our api, but we expose 8080, where 80:8080 means that port 8080 inside the Docker container is accessed on port 80 locally.

Our connection flow looks as follows:

  1. We query port 80 on our localhost, which Docker maps to port 8080 on the nginx container (remember, nginx is listening on port 8080)
  2. nginx forwards this request to port 5000 on the apinetwork (which is where Gunicorn will receive the request)
  3. Gunicorn passes this request to Flask

Phew! That’s a long process, but it allows for expansion. For example, you can attach multiple api containers to one nginx web server and configure it as a load balancer to handle a large number of API requests.
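To make that concrete, the server block from our nginx.conf could be extended into a load balancer. A hedged sketch, assuming two hypothetical api replicas named api1 and api2 on the apinetwork (these replicas are not part of this project's compose file):

```
upstream flask_api {
    server api1:5000;   # hypothetical replica 1
    server api2:5000;   # hypothetical replica 2
}

server {
    listen 8080;

    location / {
        proxy_pass http://flask_api;   # requests rotate across the replicas
    }
}
```

By default, Nginx distributes requests across the servers in an upstream block in round-robin fashion.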

Step 5: Let’s Query!

Now, on your local machine, make sure you have Docker and docker-compose installed (the official Docker documentation has installation steps for both).

Run the following in your terminal to start docker-compose, set up the containers, and start the API:

$ docker-compose build
$ docker-compose up

This process may take a long time if this is the first time you are building and running the containers. If everything goes well, you should be able to access your api on localhost (at port 80).

You can use Postman to query the API with GET http://localhost/classification

Conclusion to Part 2

And there you have it, a production-grade API deployed with Docker. When you are done testing it, you can set the debug and auto-reload related flags to false. You can also easily deploy this to your preferred cloud service and make it publicly accessible, but that is out of the scope of this project.

You can find all the code I wrote on GitHub.

If you have any questions or feedback, feel free to drop them in the comments.

In this series:

Part 1: Setting up our API
Part 2: Integrating Gunicorn, Nginx and Docker
Part 3: Flask Blueprints — managing multiple endpoints
Part 4: Testing your ML API

Hello there! Thanks for reading. Here’s a tad bit about me. I am a Computer Science student at the University of British Columbia, Canada. I primarily work on machine learning projects, mostly NLP. I also do photography as a hobby. You can follow me on Instagram and LinkedIn, or visit my website. Always open to opportunities 🚀
