Machine Learning Deployment: A Storm in a Teacup

Deploy a machine learning model in 10 minutes with Flask, Docker, and Jenkins.

Authored by Ekramul Hoque, Pavan Kosaraju, Rohith Sooram, Chirag Ahuja


Deploying machine learning models in production environments is an often overlooked area of data engineering. Most tutorials and blogs on the web focus on building, training, and tuning machine learning models. But what use is a model if it can’t be used to make real-world predictions? So, let’s take a look at a few of the deployment options we have at our disposal.


When it comes to deploying machine learning models in production, there is a variety of options. One popular method is to design and train the model using cloud services like Azure Machine Learning Studio and Amazon SageMaker. These services come with the ability to build and train models using drag-and-drop tools. Moreover, publishing these models as web services is just a matter of a few clicks. The added advantage of such a setup is that the deployment scales automatically with spikes in usage of the application. Cool, eh?

Convenient though this may seem in the short run, this setup can potentially be problematic in the long run. The difficulty starts when we want to migrate the application away from those third-party cloud platforms and deploy it on our servers. Since these tools are tightly integrated with their respective cloud platforms, such a setup is not portable. Additionally, the cost of cloud computing as the application scales can be a prohibitive factor.


These problems can be avoided if we build custom REST APIs as endpoints to our machine learning models. In particular, this tutorial will use the Python-based Flask web framework to build an API for our machine learning models and then containerize this Flask application neatly inside a Docker image for deployment. Docker lends itself naturally to this problem, as all the dependencies of the application can be packaged inside a container, and scalability can be achieved by simply deploying more containers when the situation demands it. Such a deployment architecture is scalable, cost-efficient, and portable.


An overview of the deployment architecture
  • Docker: Docker is an open-source containerization technology that allows developers to package an application along with its dependent libraries and isolate it from the underlying operating system. Unlike VMs, Docker does not require a guest OS for every application and thus maintains a lightweight resource-management system. Virtual machines are heavyweight compared to containers, so containers can be spun up relatively quickly, all while having a lower memory footprint. This helps with the scalability of our application and model in the future.
[Image: containers vs. virtual machines. Source: Roderick Bauer]
  • Jenkins: Jenkins is perhaps the most popular continuous-integration and continuous-delivery tool, with about 1,400 plugins to automate the build and deployment of projects. Jenkins lets you add a GitHub webhook to its pipeline so that every time a developer pushes a change to the GitHub repository, it automatically runs validation tests for the modified model and builds a docker image for deployment.
  • ngrok: ngrok is a free tool that tunnels traffic from a public URL to an application running locally. It generates a URL that can be used in the GitHub webhook to trigger push events.
  • Flask: Flask is an open-source web framework written in Python with a built-in development server and a debugger. Although there are many alternative web frameworks to create a REST API, Flask is preferred because of its simplicity.



You are probably wondering ‘what sort of loopy land have I entered?’ but we promise the next steps are going to be simple and practical to understand.

So far, we have looked at the different components in the deployment architecture and what each one does. In this section, we will walk through the detailed steps to deploy our model.


The deployment process can tentatively be divided into four sections: training and saving the model, exposing the model via a REST API, packaging the model inside a container, and configuring continuous-integration tools.

Before moving on to the next steps, we recommend cloning this GitHub repository to your local machine using the below command. This repository contains all the code files which can be used as a reference for deploying your custom models.

git clone

Note: Although the steps mentioned in this tutorial are for the Windows operating system, it should be fairly trivial to modify these commands to work on Mac or Unix systems.

  • Training and saving the model

In this example, we use the iris dataset from scikit-learn to build our machine learning model.

After loading the dataset, extract the features (x) and targets (y) used for model training. In order to test the predictions, create a dictionary named “labels” which maps the targets to their label names. Here, a Decision Tree Classifier is used as the model; feel free to try out other classifiers from sklearn. Prediction labels for the test data are generated by calling the predict method on the model.

We export our model as a pickle file using the pickle library and persist it to disk. After loading the model back from the file, we give sample data as input and predict its target variable.
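The training and persistence steps above can be sketched as follows. This is a minimal illustration, not the repository’s exact script: the variable names, the file name ‘model.pkl’, and the train/test split are assumptions.

```python
# Minimal sketch of the training step: load iris, train a decision tree,
# pickle the model, reload it, and predict on a sample feature vector.
import pickle

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
x, y = iris.data, iris.target  # features (x) and targets (y)

# Label names for the targets, used to make predictions human-readable.
labels = {0: "setosa", 1: "versicolor", 2: "virginica"}

x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=42)

clf = DecisionTreeClassifier()
clf.fit(x_train, y_train)
predictions = clf.predict(x_test)  # prediction labels for the test data

# Persist the trained model to disk as a pickle file.
with open("model.pkl", "wb") as f:
    pickle.dump(clf, f)

# Reload the model from the file and predict on sample data.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)
sample = [[5.1, 3.5, 1.4, 0.2]]
print(labels[model.predict(sample)[0]])  # prints "setosa"
```

Any serialization format would work here; pickle is simply the most direct choice for scikit-learn estimators.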

  • Build a REST API:

The Flask web framework helps us create the HTTP endpoints required to communicate with our model.

We read the saved model from the disk using the pickle.load() method.

Flask provides a route() decorator, which tells the application which URL should call the associated function. It accepts two arguments, namely ‘rule’ and ‘options’. The ‘rule’ argument is the URL bound to the function, and ‘options’ is a list of parameters to be forwarded to the underlying Rule object.

In the example, the ‘/api’ URL is bound to the predict() function. Hence, when we make a POST request, it calls the function which receives the feature vector in JSON format. The ‘feature’ vector is then passed into the model which makes predictions and then returns the labels in the JSON format.

Notice that the run() method of the Flask class runs the application on the local development server. Here, we pass the host as ‘0.0.0.0’ in order to expose it from the docker container. You can see more about this in the docker configuration setup.
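A minimal sketch of such an endpoint is shown below. The file name ‘model.pkl’ and the JSON key ‘feature’ are assumptions, not taken from the repository, and the fallback training branch exists only to keep the sketch runnable on its own.

```python
# Minimal Flask API sketch: load the pickled model and expose a
# POST /api endpoint that returns the predicted label as JSON.
import os
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)
labels = {0: "setosa", 1: "versicolor", 2: "virginica"}

# Load the trained model from disk once, at startup. The fallback keeps
# this sketch self-contained if the training step has not been run yet.
if os.path.exists("model.pkl"):
    with open("model.pkl", "rb") as f:
        model = pickle.load(f)
else:
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier
    iris = load_iris()
    model = DecisionTreeClassifier().fit(iris.data, iris.target)

@app.route("/api", methods=["POST"])
def predict():
    # Expect a JSON body such as {"feature": [5.1, 3.5, 1.4, 0.2]}.
    feature = request.get_json()["feature"]
    prediction = int(model.predict([feature])[0])
    return jsonify({"prediction": labels[prediction]})

# To serve the API: app.run(host="0.0.0.0", port=5000)
# host="0.0.0.0" is what exposes the server outside the Docker container.
```

You could then exercise the endpoint with, for example, curl -X POST -H "Content-Type: application/json" -d '{"feature": [5.1, 3.5, 1.4, 0.2]}' against the running server.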

  • Packaging

In order to allow Docker to host our API, we need to specify a set of instructions that allow Docker to build the image. This set of instructions can be stored inside a Dockerfile. This file contains all the commands that can be called on the command line to create a Docker image.

So let’s create our Dockerfile. Open a text editor and save the file with the exact name ‘Dockerfile’, with no file extension.
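A plausible Dockerfile for this setup might look like the sketch below. The base image, port, and file names (app.py, requirements.txt) are assumptions for illustration, not the repository’s exact contents.

```dockerfile
# Sketch: package the Flask API and its dependencies into an image.
FROM python:3.7-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 5000
CMD ["python", "app.py"]
```

The image would then be built with `docker build -t iris-api .` and run with `docker run -p 5000:5000 iris-api`, mapping the container’s port 5000 to the host.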

Our working directory will now have the following files:
  • a Python script that trains the model and saves it to disk
  • a Python script that manages requests and the server
  • Dockerfile, containing the instructions for the docker image
  • requirements.txt, containing the required libraries for the API

  • Continuous Integration

So far, we have created our Flask API, composed a Dockerfile, and pushed the project to our git repository. As a prerequisite, we need to install three applications: Docker, ngrok, and Jenkins. Going forward, the video in this section demonstrates the entire process described in our architecture diagram.
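To give a sense of what the Jenkins side can look like, here is a hedged sketch of a declarative pipeline triggered by the GitHub webhook. The stage layout, image name, and test path are assumptions for illustration only, and on a Windows agent the sh steps would become bat steps.

```groovy
// Sketch of a declarative Jenkins pipeline: check out the repo,
// validate the model, and build the docker image for deployment.
pipeline {
    agent any
    stages {
        stage('Checkout') {
            steps { checkout scm }
        }
        stage('Validate model') {
            steps { sh 'python -m pytest tests/' }
        }
        stage('Build Docker image') {
            steps { sh 'docker build -t iris-api:latest .' }
        }
    }
}
```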

Video Tutorial depicting the deployment process


In this blog, we have dived into the process of deploying machine learning models using Docker, Flask, and Jenkins. We hope you find this information useful while deploying your own machine learning models in production. A GitHub repository for the code presented in this article can be found here.

If you found this information helpful, please feel free to comment and don’t forget to give us some claps.

For more Data Science awesomeness, follow SFU Big Data Science

