Deploying machine learning models using Docker

Guru Prasad Natarajan · Mindboard · May 24, 2019

This section explains how to productionize your Flask API and get it ready for deployment using Docker.

In this approach, we will use nginx, gunicorn, and Docker Compose to create a scalable template for deploying machine learning models.

Why not use Flask’s built-in server?

· Flask’s built-in server is a development server and is not suitable for production traffic

· Docker allows for smoother deployments and more reliability than attempting to run Flask on a standard Virtual Machine.

The folder structure might look something like this:

Project Structure
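A sketch of that layout, with illustrative file names (the exact names are assumptions):

```
project/
├── docker-compose.yml
├── api/
│   ├── app.py
│   ├── requirements.txt
│   └── Dockerfile
└── nginx/
    ├── Dockerfile
    └── nginx.conf
```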

In this layout, our original Flask application lives in the api/ folder, and there is a separate nginx/ folder which houses our nginx Docker container and its configuration.

nginx and gunicorn work together as follows:

1. The client navigates to your URL, example.com

2. nginx handles this HTTP request and passes it to gunicorn

3. gunicorn receives the request from nginx and serves the relevant content (gunicorn runs your Flask application and handles requests to it).

This is a straightforward approach for deploying most machine learning models.

Steps:

1. The first step is to develop a Flask application that can be served by gunicorn, and to create a Docker container for the application. Please ensure that debug mode is set to False, so that stack traces are not exposed to clients in case of an error, and that all the required packages are captured in the requirements.txt file.
Removing debug flag
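A minimal sketch of such an application with the debug flag off; the /predict route and the file name api/app.py are illustrative assumptions:

```python
# api/app.py -- a minimal, illustrative Flask app
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    # In a real deployment, load the model once at startup and run
    # inference here; this stub simply echoes the incoming payload.
    payload = request.get_json()
    return jsonify({"input": payload, "prediction": None})

if __name__ == "__main__":
    # debug=False keeps the interactive debugger and stack traces
    # from being exposed to clients.
    app.run(host="0.0.0.0", port=8000, debug=False)
```

Both flask and gunicorn should appear in requirements.txt so they are installed during the container build.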

To write the Dockerfile, we pull an existing Python base image and copy the application files into the container. A sample Dockerfile looks like the one below.

Sample docker file for flask app
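A sketch along those lines, assuming the api/ layout above (the base image tag is an assumption):

```dockerfile
# api/Dockerfile -- builds the Flask application image
FROM python:3.7-slim

WORKDIR /app

# Copy and install dependencies first so this layer is cached
# across rebuilds when only application code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application code.
COPY . .

# Deliberately no ENTRYPOINT or CMD: gunicorn will be invoked
# through docker-compose in a later step.
```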

In the file above, there is no entry point or command to run the Flask application. As discussed earlier, we will do that using gunicorn at a later stage.

2. The second step is to set up nginx, which serves as the web server. The reason not to use Flask’s default web server is its inability to scale.

nginx Dockerfile
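A sketch of that image, assuming the stock nginx base image and the proxy configuration shown below:

```dockerfile
# nginx/Dockerfile -- builds the reverse-proxy image
FROM nginx:1.17

# Replace the default site configuration with our own.
RUN rm /etc/nginx/conf.d/default.conf
COPY nginx.conf /etc/nginx/conf.d/
```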

Beyond a minimal proxy configuration that forwards requests to gunicorn, no changes are needed for nginx and its defaults can be used as-is.
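A sketch of that configuration; the upstream name api and port 8000 are assumptions that must match the docker-compose service defined below:

```nginx
# nginx/nginx.conf -- forwards all traffic to the gunicorn service
server {
    listen 80;

    location / {
        proxy_pass http://api:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```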

3. The final step is to find a way to run both Docker containers together. Docker Compose serves exactly this purpose.

A docker-compose.yml file is required to bring both containers together. A sample file looks like the one below:

Sample docker-compose.yml
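A sketch of such a file; the service names, ports, and gunicorn invocation are illustrative (app:app assumes api/app.py defines a Flask object named app):

```yaml
# docker-compose.yml -- wires the api and nginx containers together
version: "3"

services:
  api:
    build: ./api
    # gunicorn serves the Flask app; the entry point lives here
    # rather than in the api Dockerfile.
    command: gunicorn --bind 0.0.0.0:8000 --workers 4 app:app
    expose:
      - "8000"

  nginx:
    build: ./nginx
    ports:
      - "80:80"
    depends_on:
      - api
```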

The services should start by simply running docker-compose up. In the next article, we shall see how machine learning projects can be deployed using TensorFlow Serving, TensorRT, and AWS Lambda.

