How to deploy Machine Learning models with TensorFlow. Part 3— into the Cloud!

Vitaly Bezgachev
Towards Data Science
9 min read · Jul 1, 2017


In Part 1 and Part 2 we created a GAN model to predict the Street View House Numbers and hosted it with TensorFlow Serving locally in a Docker container. Now it is time to bring it into the Cloud!

Motivation

When you implement a web service (such as a prediction service) and want other people to use it, you publish it with a cloud hosting provider. Usually you do not want to take care of things like the availability and scalability of your service; you want to concentrate on developing your models and algorithms.

Nowadays you have a huge selection of cloud platform providers. The most prominent are Amazon AWS, Microsoft Azure, Google Cloud Platform and IBM Bluemix. They provide the resources and take care of the availability of a service once it is deployed on their servers.

The second important task is automation. You want to automate the deployment, scaling and management of your service. Personally, I want to deploy with a mouse click or by running a simple script; I want my service to scale automatically when it receives more requests; and I want it to recover from a crash without manual intervention. This is where a tool called Kubernetes comes into play.

Kubernetes is an open source system for automating deployment, scaling, and management of containerized applications.

Kubernetes is developed and supported by Google, so you can definitely rely on it. The cloud providers listed above have built-in support for it, so we can set up our deployment in the Cloud relatively easily.

Cloud provider selection

For this article I use Microsoft Azure, mainly for one reason: they provide a 30-day trial with $200 of credit.

I also looked at other providers. At AWS I did not find a way to get a small Kubernetes cluster for free. At Google you can test their Cloud Platform for free, but they scared me off with “Account Type = Business” (I wanted just a small test environment for myself). IBM Bluemix was not very intuitive.

So I went with Microsoft Azure, and they have a nice user interface too :-)

Prepare the Docker image

Before we step into Kubernetes, we need to save our Docker container, with all the changes we made, as an image. First we need the container ID:

docker ps --all

You’ll get a list of all Docker containers; select the one you created. For example:

CONTAINER ID   IMAGE                                       COMMAND       CREATED      STATUS                  PORTS   NAMES
35027f92f2f8   <your user name>/tensorflow-serving-devel   "/bin/bash"   4 days ago   Exited (0) 4 days ago           tf_container_cpu

The first column contains the ID we need.

HINT: Since I do not want an expensive GPU-powered virtual machine in the cloud, I went with the TensorFlow Serving CPU Docker container.

Now we can create a Docker image from our container:

docker commit 35027f92f2f8 $USER/tensorflow-serving-gan:v1.0

If you look at the image list (docker images command), you should find a newly created image <your user name>/tensorflow-serving-gan tagged with v1.0. The image is approximately 5 times bigger than the original one, since we downloaded and built a lot of new things.

Kubernetes: key concepts in brief

I encourage you to check the Kubernetes documentation for its capabilities, concepts and tutorials. Here I give just a very rough idea of how to work with it.

We deploy our Docker image into a Kubernetes cluster. A Kubernetes cluster consists of at least one master and one or more worker Nodes.

Node

A Node is a worker machine in the Kubernetes cluster (a virtual machine or bare metal). It is managed by the master component and runs all the services needed to host Pods. Those services include, for example, Docker, which is important for us.

We will create one master and one worker Node. The master Node does not run any Pods; its responsibility is cluster management.

Pod

A Pod is a group of one or more containers, the shared storage for those containers, and options for how to run the containers. So a Pod is a logical host for logically dependent containers.

In our Pod we have only one Docker container, namely our TensorFlow Serving image with the GAN model. We will create two Pods, though they will run on the same Node.

Service

Pods are mortal. They are born and die; replication controllers in particular create and destroy Pods dynamically. Each Pod gets its own IP address, but even those IP addresses are not reliable. So if we have Pods that need to talk to each other (e.g. a frontend to a backend) or we want to access some Pods externally (as in our case), we have a problem.

A Kubernetes Service solves this. It is an abstraction that defines a logical set of Pods and a policy for accessing them. In our case we will create one Service that abstracts the two Pods.
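To make this concrete, here is a minimal, purely illustrative sketch of a Service definition; the names and the label are hypothetical, not the ones from our deployment:

```yaml
# Minimal, illustrative Service sketch: a Service finds its Pods by label,
# not by IP address, so it survives Pods being destroyed and recreated.
kind: Service
apiVersion: v1
metadata:
  name: example-service
spec:
  selector:
    app: example        # traffic goes to any Pod carrying this label
  ports:
  - port: 9000          # port the Service listens on
    targetPort: 9000    # container port inside the Pod
  type: LoadBalancer    # expose the Service externally
```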

Setup Kubernetes cluster at Microsoft Azure

Microsoft Azure provides a convenient way to set up and operate a Kubernetes cluster. First of all, you need an Azure account.

Get into Microsoft Azure

  • Get an Azure free trial account at https://azure.microsoft.com/free/. If you already have a Microsoft account, you can use it.
  • Microsoft provides $200 of credit to start with Azure. It is more than enough for our purpose.
  • Go to the Azure Portal and check that you have full access to it.
  • Install the Azure Command Line Interface locally on your PC. On Ubuntu 64-bit you can do that from the Terminal:
echo "deb [arch=amd64] https://packages.microsoft.com/repos/azure-cli/ wheezy main" | \
sudo tee /etc/apt/sources.list.d/azure-cli.list
sudo apt-key adv --keyserver packages.microsoft.com --recv-keys 417A0893
sudo apt-get install apt-transport-https
sudo apt-get update && sudo apt-get install azure-cli

Then check the installed version with:

az --version

It should be equal to or greater than 2.0.x.

Create Kubernetes cluster

You need to go through a few simple steps to create a Kubernetes cluster.

Log in to Azure

az login

and follow the instructions in the Terminal and in the browser. At the end you should see a JSON document with your cloud information.

Create resource group

A resource group is a logical group in which you deploy your resources (e.g. virtual machines) and manage them.

az group create --name ganRG --location eastus

You should see a JSON document with the resource group information.

Create cluster

Now it is time to create the Kubernetes cluster. In the Azure free trial we can use at most 4 cores, so we can create only one master and one worker, each with 2 cores. The master Node is created automatically; agent-count specifies the number of worker (or agent) Nodes.

az acs create --orchestrator-type=kubernetes \
--resource-group ganRG \
--name=ganK8sCluster \
--agent-count=1 \
--generate-ssh-keys

Hint: do not use very long names. I got an error during the creation of the Kubernetes Services that complained about names being too long (Azure adds pretty long suffixes to the cluster name).

SSH keys will also be generated in the default location if they do not already exist. Wait a couple of minutes… If everything went right, you should see a pretty long and detailed JSON document with the cluster information. Convince yourself that the cluster has been created in your resource group:

{
.........
"resourceGroup": "ganRG"
}

You can also check in the Azure Portal that we have 2 virtual machines, k8s-master-… and k8s-agent-…, in the resource group ganRG.

CAUTION

Remember to stop or deallocate the virtual machines when you are not using them, to avoid extra costs:

az vm [stop|deallocate] --resource-group=ganRG --name=k8s-agent-…
az vm [stop|deallocate] --resource-group=ganRG --name=k8s-master-…

You can start them again with:

az vm start --resource-group=ganRG --name=k8s-agent-…
az vm start --resource-group=ganRG --name=k8s-master-…

Upload Docker image into Azure

Now let's take a short break from Kubernetes and upload our Docker image to the Azure Container Registry. This is our private Docker registry in the Cloud. We need this registry to pull the Docker image into our Kubernetes cluster.

Create container registry

az acr create --name=ganEcr --resource-group=ganRG --sku=Basic

creates a private registry named ganEcr. In response you should get a JSON document with the registry information. The interesting fields are:

{
  "adminUserEnabled": false,
  .........
  "loginServer": "ganecr.azurecr.io",
  .........
}

The first field tells you that there is no administrator for the registry; the second gives you the name of the registry server in your resource group. To upload a Docker image we need an administrator, so we must enable one:

az acr update -n ganEcr --admin-enabled true

Upload Docker image

First, we must get the credentials for the upload:

az acr credential show --name=ganEcr

You should receive a response similar to:

{
  "passwords": [
    {
      "name": "password",
      "value": "=bh5wXWOUSrJtKPHReTAgi/bijQCkjsq"
    },
    {
      "name": "password2",
      "value": "OV0Va1QXv=GPL+sGm9ZossmvgIoYBdif"
    }
  ],
  "username": "ganEcr"
}

Use the username and one of the passwords to log in to the registry with Docker:

docker login ganecr.azurecr.io -u=ganEcr -p=<password value from credentials>

Then tag the GAN Docker image with the registry's login server name:

docker tag $USER/tensorflow-serving-gan:v1.0 ganecr.azurecr.io/tensorflow-serving-gan

And now you can push it into Azure Container Registry!

docker push ganecr.azurecr.io/tensorflow-serving-gan

Be patient, it takes a while :-) Remember, this operation uploads the Docker image from your PC to Microsoft servers.
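Why the ganecr.azurecr.io/ prefix in the tag? docker push sends the image to the registry named before the first slash of the image reference. The tiny parser below is purely illustrative (it is not part of Docker and ignores Docker Hub short names), but it shows how a fully qualified reference decomposes:

```python
# Illustrative only: decompose <login server>/<repository>[:<tag>],
# the fully qualified image reference format used above.
def parse_image_ref(ref):
    registry, _, remainder = ref.partition("/")
    repository, _, tag = remainder.partition(":")
    return registry, repository, tag or "latest"

# Docker pushes this image to the registry ganecr.azurecr.io;
# with no explicit tag, Docker assumes "latest".
print(parse_image_ref("ganecr.azurecr.io/tensorflow-serving-gan"))
```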

Operate on Kubernetes cluster

Back to Kubernetes. To operate on a cluster we need a tool named kubectl. It is a command line interface for running commands against Kubernetes clusters.

I encourage you to check the kubectl documentation for a list of available commands. Here I use:

kubectl get [nodes|pods|services] 

to get the information about Kubernetes Nodes, Pods and Services respectively and

kubectl create ...

to create Pods and Services from a configuration file.

Connect with kubectl to Azure

First get credentials from Azure:

az acs kubernetes get-credentials --resource-group=ganRG --name=ganK8sCluster

This command gets the credentials from Azure and saves them locally in ~/.kube/config, so you do not need to request them again later.

Now verify the connection to the cluster with kubectl:

kubectl get nodes

You should see the master and worker Nodes:

NAME                    STATUS                     AGE       VERSION
k8s-agent-bb8987c3-0    Ready                      7m        v1.6.6
k8s-master-bb8987c3-0   Ready,SchedulingDisabled   7m        v1.6.6

Deployment configuration

I have created a configuration file (in YAML format) for the deployment into the Kubernetes cluster. In it I define a Deployment controller and a Service. I encourage you to read the Kubernetes documentation for details.

Here is an explanation of what I did. You could deploy Pods and Services directly with kubectl commands, but it is much more convenient to write the desired deployment configuration into such a YAML file once and reuse it later.

In our case it defines a deployment and a service:

......
kind: Deployment
metadata:
  name: gan-deployment
......
---
kind: Service
metadata:
  labels:
    run: gan-service
  name: gan-service

Deployment

I want to deploy my Docker image into 2 Pods:

spec:
  replicas: 2

And pull it from my Azure Container Registry:

spec:
  containers:
  - name: gan-container
    image: ganecr.azurecr.io/tensorflow-serving-gan

After deployment, each Pod should start a shell in the Docker container and launch the TensorFlow model server, serving the GAN model:

command:
- /bin/sh
- -c
args:
- /serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_name=gan --model_base_path=/serving/gan-export

On port 9000:

ports:
- containerPort: 9000

Service

The Service must accept external requests on port 9000 and forward them to container port 9000 in a Pod:

ports:
- port: 9000
  targetPort: 9000

And it must provide load balancing between the 2 underlying Pods:

type: LoadBalancer
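Putting the fragments together, the complete gan_k8s.yaml looks roughly like the sketch below. The apiVersion values and the selector/template labels are my assumptions for the Kubernetes version used here; your actual file may differ in details:

```yaml
# Rough sketch of the full gan_k8s.yaml; the apiVersion values and the
# run: gan-service labels on the Pod template/selector are assumptions.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: gan-deployment
spec:
  replicas: 2
  template:
    metadata:
      labels:
        run: gan-service
    spec:
      containers:
      - name: gan-container
        image: ganecr.azurecr.io/tensorflow-serving-gan
        command:
        - /bin/sh
        - -c
        args:
        - /serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_name=gan --model_base_path=/serving/gan-export
        ports:
        - containerPort: 9000
---
apiVersion: v1
kind: Service
metadata:
  labels:
    run: gan-service
  name: gan-service
spec:
  ports:
  - port: 9000
    targetPort: 9000
  selector:
    run: gan-service
  type: LoadBalancer
```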

Deploy into Kubernetes cluster

Now deploy the Docker image and let Kubernetes manage it!

cd <path to GAN project>
kubectl create -f gan_k8s.yaml

You should see:

deployment "gan-deployment" created
service "gan-service" created

Now check that Pods are running:

kubectl get pods

NAME                              READY     STATUS    RESTARTS   AGE
gan-deployment-3500298660-3gmkj   1/1       Running   0          24m
gan-deployment-3500298660-h9g3q   1/1       Running   0          24m

Hint: it can take some time before the status changes to Running. Only then can you be sure that the Pods are ready to serve.

And check that the Service is ready:

kubectl get services

NAME          CLUSTER-IP     EXTERNAL-IP    PORT(S)          AGE
gan-service   10.0.134.234   40.87.62.198   9000:30694/TCP   24m
kubernetes    10.0.0.1       <none>         443/TCP          7h

Hint: the EXTERNAL-IP for gan-service must be a valid IP address (not <pending> or <node>); otherwise the service is not operational.

Check that it works

So now we can check that our efforts were worthwhile!

cd <path to GAN project>
python svnh_semi_supervised_client.py --server=40.87.62.198:9000 --image=./svnh_test_images/image_3.jpg

You should see:

outputs {
  key: "scores"
  value {
    dtype: DT_FLOAT
    tensor_shape {
      dim {
        size: 1
      }
      dim {
        size: 10
      }
    }
    float_val: 8.630897802584857e-17
    float_val: 1.219293777054986e-09
    float_val: 6.613714575998131e-10
    float_val: 1.5203355241411032e-09
    float_val: 0.9999998807907104
    float_val: 9.070973139291283e-12
    float_val: 1.5690838628401593e-09
    float_val: 9.12262028080068e-17
    float_val: 1.0587883991775016e-07
    float_val: 1.0302327879685436e-08
  }
}
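The scores are the model's probabilities for the ten classes, and here almost all of the probability mass sits on class index 4. As a sketch of possible post-processing (this snippet is not part of the tutorial's client script), you can extract the winning class in a few lines of Python:

```python
# The float_val scores from the response above, in order (classes 0..9).
scores = [
    8.630897802584857e-17, 1.219293777054986e-09, 6.613714575998131e-10,
    1.5203355241411032e-09, 0.9999998807907104, 9.070973139291283e-12,
    1.5690838628401593e-09, 9.12262028080068e-17, 1.0587883991775016e-07,
    1.0302327879685436e-08,
]

# Pick the class with the highest probability.
predicted = max(range(len(scores)), key=scores.__getitem__)
print("Predicted class:", predicted)  # → Predicted class: 4
```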

If you got this result, congratulations! You have deployed the GAN model into a Kubernetes cluster in the Cloud, and Kubernetes scales, load-balances and manages the GAN model in a reliable way.

Conclusion

In Part 1 and Part 2 we created a GAN model, prepared it for serving with TensorFlow and put it into a Docker container. That was a lot of work, but we did everything locally.

Now we have made it publicly available, in a reliable and scalable way, using state-of-the-art technologies.

Admittedly, the model is quite simple and returns its results in a user-unfriendly form. There is also a lot of automation potential: I did many steps manually to explain the mechanics, but all of them could be packed into scripts for “one-click” deployment or become part of a Continuous Integration/Continuous Deployment pipeline.

I hope you enjoyed this tutorial and found it useful. In case of any questions or problems, do not hesitate to contact me.

Update 12. Nov. 2017

I have extended the tutorial with instructions on how to create a REST API for models hosted by TensorFlow Serving.
