TensorFlow Object Detection Model Serving Tutorial

Alexey Korotkov
4 min read · May 4, 2020


Editorial

Previously, we mostly showed you how to train a model and ship it to your device. This approach has some benefits: the model runs on a mobile device without an internet connection, and there is no server infrastructure to maintain and no GPU-server bills to pay. But it also has drawbacks: mobile devices are limited in computational power, and on-device inference drains the battery.

For some tasks, it’s better to run the neural network inference in the cloud. It lets you support all kinds of devices your users have (not only the latest generation), and it can also give you a performance benefit, because a model running on a powerful cloud instance will work much faster. Some tasks require cascades of several huge models, so running model pipelines on a server is the preferable option in that case as well. That’s why we want to show you how to deploy your trained model on a server, send input data to it, and receive results back.

Today we will deploy a server with a TensorFlow object detection model.

Creating a Model

Let’s start by creating an object detection model. You can use an already pretrained model or create the custom object detection model you need with the MakeML app.

To create a model with MakeML, create a project using the Object Detection dataset type and the TensorFlow training configuration. Import and mark up images and press the Start Training button. If you don’t know how to do this, take a look at our other tutorials, for example, the Soccer Ball Tutorial.

When your model has been trained, download it and go to the frozen/saved folder. There is a saved_model.pb file there. It’s all that we need for this demo.
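TensorFlow Serving expects each model to live in a numbered version directory (e.g. object-detect/1/saved_model.pb). The Dockerfile in the next section arranges this inside the container, but here is a minimal sketch of the same layout done locally in Python; the function name and paths are our own, not part of the MakeML tooling.

```python
# Illustrative sketch: copy a downloaded saved_model.pb into the versioned
# directory layout that TensorFlow Serving expects:
# <base_dir>/<model_name>/<version>/saved_model.pb
import shutil
from pathlib import Path


def stage_model(saved_model_pb, base_dir, model_name="object-detect", version=1):
    """Copy saved_model.pb into <base_dir>/<model_name>/<version>/ and return its new path."""
    target = Path(base_dir) / model_name / str(version)
    target.mkdir(parents=True, exist_ok=True)
    # Serving looks for a variables/ directory next to saved_model.pb,
    # even when it is empty (as with this frozen model).
    (target / "variables").mkdir(exist_ok=True)
    shutil.copy(saved_model_pb, target / "saved_model.pb")
    return target / "saved_model.pb"
```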

Configure Server

I will use a server running Ubuntu 16.04 on AWS EC2. For this demo, we will not use GPUs for object detection; let’s start with the CPU. You also need to install Docker on your server.

First, create a folder on the server named, for example, TensorflowDocker. Go inside it and create a new model folder. Copy the saved_model.pb file from your local machine into this folder. Move back to TensorflowDocker and create a Dockerfile with the command nano Dockerfile. Paste the following code:

FROM tensorflow/serving

# Define metadata
LABEL author="MakeML"
LABEL version="1.0"
LABEL description="Deploy tensorflow object detection model with MakeML"

# Copy model
WORKDIR /models
RUN mkdir -p object-detect/1
RUN mkdir -p object-detect/1/variables
ADD model/saved_model.pb object-detect/1/

EXPOSE 80
ENTRYPOINT ["tensorflow_model_server", "--model_base_path=/models/object-detect"]
CMD ["--rest_api_port=80", "--port=81"]

So, we are ready to build a Docker image from your model and Dockerfile. Let’s name it makeml_model.

docker build -t makeml_model .

The next command will run your Docker image:

docker run -p 80:80 -p 81:81 makeml_model

That’s all; you are ready to receive requests.
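Before sending images, it can be handy to check that the container is up. TensorFlow Serving exposes a REST model-status endpoint at /v1/models/&lt;name&gt;; the model name "default" below matches tensorflow_model_server’s default --model_name, and the host is a placeholder you must replace. This sketch uses only the standard library:

```python
# Sanity check: query TensorFlow Serving's REST model-status endpoint.
import json
import urllib.request


def model_status_url(host, port=80, model="default"):
    """Build the TensorFlow Serving REST model-status URL."""
    return f"http://{host}:{port}/v1/models/{model}"


if __name__ == "__main__":
    url = model_status_url("<Public IP>")  # replace with your server's public IP
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            # A healthy server reports a version with state "AVAILABLE".
            print(json.load(resp))
    except Exception as exc:
        print(f"Server not reachable: {exc}")
```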

Client Part

We will write a simple Python script that makes a POST request with one image and prints the answer in JSON format. Let’s create a file detection.py and paste the following code into it:

import sys
import time
from pprint import pprint

import numpy
import PIL.Image
import requests

# Load the image given on the command line and convert it to a nested list
image = PIL.Image.open(sys.argv[1])
image_np = numpy.array(image)
payload = {"instances": [image_np.tolist()]}

start = time.perf_counter()
# Change <Public IP> to your server's public IP
res = requests.post("http://<Public IP>:80/v1/models/default:predict", json=payload)
print(f"Took {time.perf_counter()-start:.2f}s")
pprint(res.json())

Just replace <Public IP> with your server’s IP. To use the script, run python detection.py <image path>.jpg. You will get JSON in the following format:

Here detection_boxes is an array of detected objects, where each object is an array of four numbers: index 0 is min Y, 1 is min X, 2 is max Y, and 3 is max X. The detection_classes array holds the class of each detected object, and detection_scores holds the confidence scores.
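To make those indices concrete, here is a small post-processing sketch that converts the normalized [ymin, xmin, ymax, xmax] boxes into pixel coordinates and keeps only confident detections. The function name and the sample dict are made up for illustration, not real server output:

```python
# Convert normalized detection boxes to pixel coordinates and filter by score.
def extract_detections(prediction, width, height, min_score=0.5):
    """Return (class, score, (left, top, right, bottom)) tuples above min_score."""
    results = []
    for box, cls, score in zip(prediction["detection_boxes"],
                               prediction["detection_classes"],
                               prediction["detection_scores"]):
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = box  # normalized, per the index meaning above
        pixel_box = (int(xmin * width), int(ymin * height),
                     int(xmax * width), int(ymax * height))
        results.append((cls, score, pixel_box))
    return results


# Hypothetical single prediction, shaped like the fields described above
sample = {
    "detection_boxes": [[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 0.1, 0.1]],
    "detection_classes": [1, 2],
    "detection_scores": [0.9, 0.2],
}
print(extract_detections(sample, width=640, height=480))
# → [(1, 0.9, (128, 48, 384, 240))]
```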

So, I hope this simple demo has shown you how to make your own object detection server in 5 minutes. By the way, you can find all this code in our GitHub repository. Good coding!

Originally published at https://makeml.app.
