Deploying Object Detection Model with TensorFlow Serving — Part 2

In Part 1 of this series, I wrote about how we can create a production-ready model in TensorFlow that is compatible with TensorFlow serving. In this part, we will see how we can create a TF-serving environment using Docker.

About Docker

Docker is a software tool that lets you package software into standardised units for development, shipment and deployment. A Docker container image is a lightweight, stand-alone, executable package of a piece of software that includes everything needed to run it: code, runtime, system tools, system libraries and settings.

In short, Docker lets you isolate your application and its dependencies in a stand-alone package that can be used anywhere and anytime without having to worry about installing code and system dependencies. Our motivation for using Docker for TensorFlow serving is that we can ship our container to run on the cloud and easily scale our service without having to install any dependencies again.

The official TensorFlow serving documentation describes how to build it from source. It’s good, but I (and a lot of the community) had problems compiling it in the Docker container, so we will go over the steps one by one here.

  1. Build the container using the official docker image

Assuming you have cloned the official TensorFlow serving repo as described in the last part, you can build the docker image by,

# Move to the directory of the docker files
cd ./serving/tensorflow_serving/tools/docker/
# Build the image (CPU)
docker build --pull -t $USER/tensorflow-serving-devel-cpu -f Dockerfile.devel .
or [FOR GPU]
# Build the image (GPU)
docker build --pull -t $USER/tensorflow-serving-devel-gpu -f Dockerfile.devel-gpu .

Before starting the docker container, increase the memory (to 10–12 GB) and CPUs (to 4–6) available to the container in the preferences section of the docker app. Building TensorFlow serving is a memory-intensive process and the default parameters might not work. Once done, you can start the container by,
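Before starting a long build, it can help to confirm that the new limits have actually taken effect. The snippet below is a minimal sketch of such a check: the helper names `bytes_to_gib` and `enough_resources` are hypothetical, and the thresholds are simply the lower bounds suggested above (4 CPUs, 10 GB). It reads the daemon's CPU count and total memory from `docker info` and only queries Docker when the daemon is reachable.

```shell
# Convert a byte count to whole GiB (integer division, rounds down).
bytes_to_gib() {
  echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Return 0 if the CPU count ($1) and memory in bytes ($2) meet the
# lower bounds suggested in the text (4 CPUs, 10 GB). Hypothetical helper.
enough_resources() {
  [ "$1" -ge 4 ] && [ "$(bytes_to_gib "$2")" -ge 10 ]
}

# Only query Docker if it is installed and the daemon is reachable.
if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  cpus=$(docker info --format '{{.NCPU}}')
  mem=$(docker info --format '{{.MemTotal}}')
  if enough_resources "$cpus" "$mem"; then
    echo "OK: $cpus CPUs, $(bytes_to_gib "$mem") GiB available to Docker"
  else
    echo "Possibly too low: $cpus CPUs, $(bytes_to_gib "$mem") GiB available to Docker"
  fi
fi
```

Docker may report slightly less memory than the value set in the preferences, so treat a "possibly too low" result near the threshold as a hint rather than a hard failure.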

docker run -it -p 9000:9000 $USER/tensorflow-serving-devel-cpu /bin/bash
or [FOR GPU]
docker run -it -p 9000:9000 $USER/tensorflow-serving-devel-gpu /bin/bash

In the container,

# Clone the TensorFlow serving Github repo in the container
git clone --recurse-submodules
cd serving/tensorflow
# Configure TensorFlow
./configure
cd ..
# Build TensorFlow serving
bazel build -c opt --copt=-msse4.1 --copt=-msse4.2 tensorflow_serving/...
or [FOR GPU]
# The TensorFlow serving GitHub repo is already present in the container,
# so there is no need to clone it again
# Configure TensorFlow with CUDA by answering yes (y) to the
# CUDA support option
cd serving/tensorflow
./configure
cd ..
# Build TensorFlow serving with CUDA
bazel build -c opt --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-O3 --copt=/usr/local/cuda tensorflow_serving/...

The build process can take up to 1 hour depending on the host system and docker configuration. Once the build is finished without any errors, you can test if the model server is running by,
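A minimal sketch of that check, assuming Bazel placed the server binary at its usual output path inside the serving directory (the path below is an assumption; adjust it if your build output differs). Running the binary without arguments should print its list of flags.

```shell
# Assumed location of the model server binary after a successful Bazel build.
SERVER_BIN="bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server"

if [ -x "$SERVER_BIN" ]; then
  # Run with no arguments so the server prints its usage and flags.
  "$SERVER_BIN"
else
  echo "Model server binary not found at $SERVER_BIN (run this from the serving directory)" >&2
fi
```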


The output should look something like,

Flags:
    --port=8500                       int32 port to listen on
    --enable_batching=false           bool enable batching
    --batching_parameters_file=""     string If non-empty, read an ascii BatchingParameters protobuf from the supplied file name and use the contained values instead of the defaults.
    --model_config_file=""            string If non-empty, read an ascii ModelServerConfig protobuf from the supplied file name, and serve the models in that file. This config file can be used to specify multiple models to serve and other advanced parameters including non-default version policy. (If used, --model_name, --model_base_path are ignored.)
    --model_name="default"            string name of model (ignored if --model_config_file flag is set
    --model_base_path=""              string path to export (ignored if --model_config_file flag is set, otherwise required)
    --file_system_poll_wait_seconds=1 int32 interval in seconds between each poll of the file system for new model version
    --tensorflow_session_parallelism=0 int64 Number of threads to use for running a Tensorflow session. Auto-configured by default.Note that this option is ignored if --platform_config_file is non-empty.
    --platform_config_file=""         string If non-empty, read an ascii PlatformConfigMap protobuf from the supplied file name, and use that platform config instead of the Tensorflow platform. (If used, --enable_batching is ignored.)

Your serving environment is now ready to be used. Exit the container and commit the changes in the container to an image. You can do this by,

  • Press [Ctrl-p] + [Ctrl-q] to detach from the container and leave it running
  • Find the container Id,
# Find the container Id
docker ps
  • Commit the changes,
# Commit the changes
docker commit ${CONTAINER_ID} $USER/tensorflow-serving-devel-cpu
or [FOR GPU]
docker commit ${CONTAINER_ID} $USER/tensorflow-serving-devel-gpu
  • Re-enter the container,
docker exec -it ${CONTAINER_ID} /bin/bash
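The find-and-commit steps above can also be scripted. The following is a sketch under stated assumptions: `commit_serving_container` is a hypothetical helper, and it assumes exactly one running container was started from the image being committed. It looks up the container ID with the `ancestor` filter of `docker ps` and commits that container back to the same image name.

```shell
# Hypothetical helper: commit the running container that was started
# from the given image back to that image name.
commit_serving_container() {
  image=$1
  # -q prints only container IDs; take the first match.
  id=$(docker ps -q --filter "ancestor=$image" | head -n 1)
  if [ -z "$id" ]; then
    echo "No running container found for image $image" >&2
    return 1
  fi
  docker commit "$id" "$image"
}

# Example usage (CPU image):
# commit_serving_container "$USER/tensorflow-serving-devel-cpu"
```

Once the tag has been moved to the newly committed image, the `ancestor` lookup may no longer match the original container, so for later commits it is safer to read the ID from `docker ps` directly.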

Note: For the TensorFlow serving container to access the GPUs on your host system, you need to install nvidia-docker on your system and run the container by,

nvidia-docker run -it -p 9000:9000 $USER/tensorflow-serving-devel-gpu /bin/bash

You can then check your GPU usage inside the container by using the nvidia-smi command.
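As a rough sketch of that check, the hypothetical helper `gpu_usage` below prints a compact per-GPU summary using the query options of nvidia-smi, and reports a hint if the tool is not visible inside the container (which usually means the container was not started through nvidia-docker).

```shell
# Hypothetical helper: print name, utilisation and memory use for each GPU.
gpu_usage() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,utilization.gpu,memory.used,memory.total --format=csv
  else
    echo "nvidia-smi not found; was the container started with nvidia-docker?" >&2
    return 1
  fi
}

# Example usage inside the GPU container:
# gpu_usage
```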

Pre-built Docker images

As a number of GitHub issues show (see resources), many people are unable to compile TensorFlow serving on Docker, so I have pre-built Docker images with both CPU and GPU support.

You can find them at my Docker Hub page or pull the images down by,

docker pull gauravkaila/tf_serving_cpu
or [FOR GPU]
docker pull gauravkaila/tf_serving_gpu

In the next part, I will describe how/where to store our model created in part 1 and create a client that can request the TensorFlow serving service created in this part. At the end of the next part, we will be able to run inference on a test image using the model being served on the docker container.


Github issues:

About the author: Gaurav is a data science manager at EY’s Innovation Advisory in Dublin, Ireland. His interests include building scalable machine learning systems for computer vision applications. Find more at