How to Set Up TensorFlow Serving for Production

Brian Schardt
Jun 22, 2018 · 4 min read


Google is once again leading the space and defining how we build and develop technology. The rush to be the leader in AI is no joke, and they are not waiting for anybody. They recently announced TensorFlow Serving, which lets data scientists launch a pre-built, super fast RESTful API to serve their models in a production-ready environment. It can serve different models, AND different versions of the same model, at the same time. Too good to be true? It's Google, so no.

Getting Started

Google has made it super easy to get started by providing us with a Dockerfile.

Install Docker

For those who, like me, were not familiar with Docker before this: let me tell you, it simplifies your life tremendously, especially for things like this. So if you do not have Docker installed, click here to install it.

Begin Installation of Tensorflow Serving

Start by cloning the TensorFlow Serving repo:

git clone https://github.com/tensorflow/serving
cd serving/tensorflow_serving/tools/docker

In that docker directory you will see 3 files.

  • Dockerfile, which is a minimal VM with TensorFlow Serving Pre-Installed and Compiled.
  • Dockerfile.devel, which is a minimal VM with all of the dependencies needed to build TensorFlow Serving.
  • Dockerfile.devel-gpu, which is a minimal VM with all of the dependencies needed to build TensorFlow Serving with GPU support.

I wanted you to clone this repo so you can get familiar with the code and examples. However, I have made my own Dockerfile, which is better suited for learning and testing quickly. Click here for the repo link. That repo also contains pre-built models, which we will build ourselves later in this series.

Clone the repo into your desktop or home directory.

git clone https://github.com/brianalois/tensorflow_serving_tutorial.git

Do a docker build using the Dockerfile in my repo. This will take about 3–5 minutes.

cd tensorflow_serving_tutorial
docker build --pull -t test-tensorflow-serving .
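Once the build finishes, you can sanity-check that the image exists (this just filters your local image list for the tag used above):

docker images | grep test-tensorflow-serving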

Then you can run the newly built Docker image and get a shell inside the container, as described in the "Run the Docker Image" section below.

Or (if you want to compile it yourself)

I suggest most people just build from my docker image and skip this part.

If you want to compile TensorFlow Serving from source, you can. In the original TF Serving repo, where they keep the Docker files, run this command (it will take 30 minutes to 2 hours, depending on your computer):

cd serving/tensorflow_serving/tools/docker
docker build --pull -t test-tensorflow-serving -f Dockerfile.devel .

However, depending on your computer, that command might fail. It did for me, and I currently have a top-of-the-line Mac, so you will probably need to run this command instead, which limits the resources the build is allowed to use:

# depending on the RAM of your computer, you may have to run this instead
docker build --pull --build-arg TF_SERVING_BUILD_OPTIONS="--copt=-mavx \
--cxxopt=-D_GLIBCXX_USE_CXX11_ABI=0 --local_resources 2048,.5,1.0" -t \
test-tensorflow-serving -f Dockerfile.devel .

Run the Docker Image

Now that it is built, regardless of which way you did it, we can run it. This next command is not a direct copy and paste: you must use the absolute path to the directory you want to mount as a volume into the Docker container. We will be using the model_volume directory in the repo you just cloned. My path is "/Users/Brian/Desktop/tensorflow_serving_tutorial/model_volume/"; replace it with whatever your path is.

Tip: you can run this command in your directory to find the path:

pwd

Then run this command with your absolute path:

docker run -it -p 8500:8500 -p 8501:8501 -v /Users/Brian/Desktop/tensorflow_serving_tutorial/model_volume/:/home/ test-tensorflow-serving
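If you run the command from the repo root, you can let the shell fill the path in for you instead of hard-coding it (a sketch, assuming your working directory is tensorflow_serving_tutorial):

docker run -it -p 8500:8500 -p 8501:8501 -v "$(pwd)/model_volume/:/home/" test-tensorflow-serving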

Now, if everything worked, you should be inside your Docker container, and your shell prompt should look something like this:

root@15954c4d0666:/#

If that is correct YOU ARE IN!

Now we can get to the fun part and run the TensorFlow Serving API:

tensorflow_model_server --port=8500 --rest_api_port=8501 --model_config_file=/home/configs/models.conf

This starts the server, with gRPC on port 8500 and the REST API on port 8501. It reads its configuration from /home/configs/models.conf inside the container, which, thanks to the volume mount, is actually this file in the repo:

cat model_volume/configs/models.conf

That file contains the configuration the server needs to serve the models in the repo correctly. Let's take a quick look at it:

model_config_list: {
  config: {
    name: "xor",
    base_path: "/home/models/xor",
    model_platform: "tensorflow",
    model_version_policy: {
      all: {}
    }
  },
  config: {
    name: "iris",
    base_path: "/home/models/iris",
    model_platform: "tensorflow",
    model_version_policy: {
      all: {}
    }
  }
}

This is pretty self-explanatory. The important part: if you want to serve all versions of your model, this block is necessary; if you leave it out, only the latest version of the model will be served:

model_version_policy: {
  all: {}
}
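For reference, the version policy can also pin specific versions instead of serving all of them. A sketch of that variant (not used in this tutorial's config):

model_version_policy: {
  specific: {
    versions: 2
  }
}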

The model versions are determined by the names of the numbered subdirectories inside each model's directory, as shown in the sketch below.
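For example, a layout along these lines (a sketch; the per-version files follow TensorFlow's standard SavedModel export format) would serve iris versions 1 and 2 and xor version 1:

model_volume/
  configs/
    models.conf
  models/
    iris/
      1/
        saved_model.pb
        variables/
      2/
        saved_model.pb
        variables/
    xor/
      1/
        saved_model.pb
        variables/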

Now Let's Test It!

Open up another terminal to make some curl requests. You could use Postman instead if you prefer.
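Before sending any inference requests, you can check that the models loaded by hitting TensorFlow Serving's model status endpoint, for example:

curl http://localhost:8501/v1/models/iris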

Iris Classify

curl -X POST \
http://localhost:8501/v1/models/iris:classify \
-H 'cache-control: no-cache' \
-H 'postman-token: f7fb6e3f-26ba-a742-4ab3-03c953cefaf5' \
-d '{
"examples":[
{"x": [5.1, 3.5, 1.4, 0.2]}
]
}'

Response:

{
"results": [
[
[
"Iris-setosa",
0.872397
],
[
"Iris-versicolor",
0.108623
],
[
"Iris-virginica",
0.0189799
]
]
]
}

Iris Predict

curl -X POST \
http://localhost:8501/v1/models/iris/versions/2:predict \
-d '{
"signature_name": "predict-iris",
"instances":[
[5.1, 3.5, 1.4, 0.2]
]
}'

Response:

{
"predictions": [
[
0.872397,
0.108623,
0.0189799
]
]
}
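Notice that this request targets version 2 explicitly and names a serving signature. If you are not sure which signature names a model exposes, TensorFlow Serving's metadata endpoint will list them (output omitted here):

curl http://localhost:8501/v1/models/iris/versions/2/metadata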

There you go! We're all set. Now you must be wondering how to build, train, and save your own models in a format that this API will recognize. Click here for part 2 of this series, which answers those questions.

Best of Luck

— — Brian Alois Schardt
