Serving ML with Flask, TensorFlow Serving and Docker Compose

Usen Osasu · Analytics Vidhya · Jun 17, 2020

A short guide on how to serve your deep learning models in production using Flask, Docker Compose and TensorFlow Serving.

Introduction

In the first part, we built a neural network classifier to predict if a given image is of “Good” or “Bad” quality using transfer learning. In this tutorial, we will serve this model as an endpoint using TensorFlow Serving and Flask.

TensorFlow Serving is a serving system designed by Google for using machine learning models in production. It makes it easier to deploy your trained model and provides an API (Application Programming Interface) endpoint for interacting with the model.

Docker

Docker is an open platform for developing, shipping, and running applications. Docker enables you to separate your applications from your infrastructure so you can deliver software quickly.

Install docker

If you do not have Docker installed, click here to install it. Also install docker-compose from here.

docker-compose

Docker Compose is a tool for defining and running multi-container Docker applications. Using Docker Compose, you can configure your application's services and parameters. With a single command, you create and start all the services from your configuration.

Before building the Flask server, we first have to export the model built in the previous tutorial to the format required by TensorFlow Serving. The following code snippet loads a saved model from disk and exports it as a TensorFlow SavedModel.

import tensorflow as tf

model_path = "path to saved model"
export_path = "serving/"

new_model = tf.keras.models.load_model(model_path)

# Check its architecture
new_model.summary()

# Export the model in the SavedModel format expected by TensorFlow Serving
tf.saved_model.save(new_model, export_path)
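Note that TensorFlow Serving looks for numbered version subdirectories under the model's base path, so the exported SavedModel needs to end up inside a directory such as 1/. Here is a minimal sketch, assuming the ./serving/model-data folder that the docker-compose file below mounts into the container (adjust the paths to your own project layout):

import tensorflow as tf

model_path = "path to saved model"
# "1" is the model version that TensorFlow Serving will discover and serve
export_path = "serving/model-data/1/"

new_model = tf.keras.models.load_model(model_path)
tf.saved_model.save(new_model, export_path)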

Flask

Flask is a micro-framework written in Python that offers a powerful way to build REST APIs. In this tutorial, we use Flask as the client-facing API and as the preprocessing step before querying TensorFlow Serving's API.

Install Flask

pip install Flask

Flask Application

Next we build the endpoint that receives an image from the client application and preprocesses it for prediction. First we load the image using the Keras preprocessing module, resize it to 150x150, scale it, and cast it to a “float16” array. We then query the TensorFlow Serving REST API that serves our model and get the prediction.

import numpy as np
import json, requests
from io import BytesIO
from tensorflow.keras.preprocessing import image
from flask import Flask, request, jsonify


app = Flask(__name__)


@app.route('/')
def hello_world():
    return 'Hello World!'


@app.route('/image-quality', methods=['POST'])
def image_quality():
    data = {}

    if "image" not in request.files:
        return jsonify({"status": 400, "message": 'No image passed'})

    # Decoding and pre-processing the uploaded image
    img = image.img_to_array(image.load_img(BytesIO(request.files["image"].read()),
                                            target_size=(150, 150))) / 255.

    # this line is added because of a bug in tf_serving < 1.11
    img = img.astype('float16')

    # Creating payload for TensorFlow Serving request
    payload = {
        "instances": [{'input_image': img.tolist()}]
    }

    # Making POST request
    r = requests.post('http://localhost:8501/v1/models/qualitynet:predict', json=payload)

    # Decoding results from TensorFlow Serving server
    pred = json.loads(r.content.decode('utf-8'))

    # Threshold the model output at 0.4: 0 -> "Bad", 1 -> "Good"
    pred = (np.array(pred['predictions'])[0] > 0.4).astype(int)
    if pred == 0:
        prediction = 'Bad'
    else:
        prediction = 'Good'

    data["prediction"] = prediction

    # Returning JSON response
    return jsonify({"status": 200, "message": 'Success', "data": data})
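Once both containers are running (see the docker-compose setup below), you can test the endpoint by posting an image to it. A quick sketch using the requests library, where sample.jpg is just a placeholder file name:

import requests

# Post a local image to the Flask endpoint and print the JSON response
with open("sample.jpg", "rb") as f:
    response = requests.post("http://localhost:5000/image-quality", files={"image": f})

print(response.json())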

For the next step, we want to start our TensorFlow Serving container. To do that, we will be using Bitnami's Docker TensorFlow Serving image. Click here for more details.

With the Bitnami Docker TensorFlow Serving image, it is easy to serve models like ResNet or MNIST.

Bitnami's Docker TensorFlow Serving image provides the ability to configure TensorFlow Serving using docker-compose. Below is the docker-compose configuration file for both the Flask server and TensorFlow Serving.

version: "3.7"
services:
flask_server:
container_name: flask_server
restart: always
build:
context: ./flask_server
dockerfile: Dockerfile
environment:
- FLASK_ENV=dev
- FLASK_APP=app.py
- FLASK_RUN_HOST=0.0.0.0
ports:
- 5000:5000
volumes:
- .:/flask_server
depends_on:
- image-serving
networks:
ml-network:
aliases:
- flask_server

image-serving:
image: docker.io/bitnami/tensorflow-serving:2-debian-10
container_name: image-serving
ports:
- 8500:8500
- 8501:8501
volumes:
- image-serving_data:/bitnami
- ./serving/conf:/bitnami/tensorflow-serving/conf/
- ./serving/model-data:/bitnami/model-data
networks:
- ml-network

volumes:
ml-db:
name: ml-db
image-serving_data:
driver: local

networks:
ml-network:

Here we define the configuration for the Flask server, named “flask_server”. We point the build context to the flask_server folder and its corresponding Dockerfile, define a few environment variables, and expose port 5000. We also add a volume and a Docker network named ml-network. The Dockerfile is shown below:

FROM python:3.7.1

ADD . /flask_server
WORKDIR /flask_server

RUN pip install --upgrade pip && \
    pip install --no-cache-dir -r requirements.txt

CMD [ "flask", "run" ]

For the TensorFlow Serving container, named “image-serving”, we pull the Bitnami TensorFlow Serving image from Docker Hub and expose the two ports required by TensorFlow Serving: 8500 (gRPC) and 8501 (REST). Then we add a volume and attach the container to the created Docker network. Next, we mount the Bitnami TensorFlow Serving configuration file and the exported model data into the container. The TensorFlow Serving config file is shown below:

model_config_list: {
  config: {
    name: "qualitynet",
    base_path: "/bitnami/model-data",
    model_platform: "tensorflow",
  }
}
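TensorFlow Serving scans base_path for numeric version subdirectories and serves the latest one it finds. Assuming the export step from earlier, the mounted ./serving/model-data folder should therefore look roughly like this inside the container:

/bitnami/model-data/
    1/
        saved_model.pb
        variables/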

One other advantage of Docker containers is the ability to connect them together. You can create Docker networks by defining them under “networks”. Docker networking lets one container reference other containers on the same network, using the container name as a hostname.

The line in the Flask server where we query the TensorFlow Serving REST API is therefore changed to:

r = requests.post('http://image-serving:8501/v1/models/qualitynet:predict', json=payload) # was "http://localhost:8501"

To start up your containers, type “docker-compose up” in your terminal.
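Once the containers are up, you can confirm that TensorFlow Serving has loaded the model before calling the Flask endpoint. A quick check against the model status endpoint exposed on port 8501:

import requests

# Ask TensorFlow Serving for the status of the "qualitynet" model
status = requests.get("http://localhost:8501/v1/models/qualitynet")
print(status.json())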

Conclusion

In this tutorial, we have successfully deployed a deep learning model for production using Docker, Flask, and TensorFlow Serving. We looked at using docker-compose to define the configuration for both the Flask server and TensorFlow Serving. Thank you for your time.

GitHub repo with source code
