Deploying a TensorFlow Model to Kubernetes

A versatile approach to using AI in a microservice architecture

The TensorFlow model

# Importing TensorFlow
import tensorflow as tf

# Loading the data
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Data preprocessing (here, normalization)
x_train, x_test = x_train / 255.0, x_test / 255.0

# Building the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])

# Loss function declaration
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Model compilation
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

# Training
model.fit(x_train, y_train, epochs=5)

# Quick inference on two test images to check the trained model
model(x_test[:2])

# Export the model in SavedModel format; the '1' subdirectory is the
# model version that TensorFlow Serving expects
model.save('./mymodel/1/')
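
As an optional sanity check (not part of the original snippet), the exported SavedModel can be reloaded before containerizing it, to make sure the files under ./mymodel/1/ are complete. A minimal sketch, reusing x_test from above:

# Reload the exported SavedModel from the version directory
reloaded = tf.keras.models.load_model('./mymodel/1/')
# Run a quick inference to confirm the export is usable
print(reloaded(x_test[:2]))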

The TensorFlow Serving Docker container

docker run -d --name serving_base tensorflow/serving
docker cp ./mymodel serving_base:/models/mymodel
docker commit --change "ENV MODEL_NAME mymodel" serving_base my-registry/mymodel-serving
docker kill serving_base
docker rm serving_base
docker run -d -p 8501:8501 my-registry/mymodel-serving
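
With the container running (assuming it was started on the local machine, so the API is reachable on localhost:8501), the model status endpoint of the TensorFlow Serving REST API can be queried to confirm that the model loaded correctly. A minimal check using the requests module:

import requests
# Query the status endpoint of the locally running Serving container
status = requests.get('http://localhost:8501/v1/models/mymodel')
print(status.json())

The response should report version 1 of the model as available: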
{
  "model_version_status": [
    {
      "version": "1",
      "state": "AVAILABLE",
      "status": {
        "error_code": "OK",
        "error_message": ""
      }
    }
  ]
}

docker push my-registry/mymodel-serving

Deploying the container to Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mymodel-serving
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mymodel-serving
  template:
    metadata:
      labels:
        app: mymodel-serving
    spec:
      containers:
        - name: mymodel-serving
          image: my-registry/mymodel-serving
          ports:
            - containerPort: 8501
---
apiVersion: v1
kind: Service
metadata:
  name: mymodel-serving
spec:
  ports:
    - port: 8501
      nodePort: 30111
  selector:
    app: mymodel-serving
  type: NodePort

kubectl apply -f kubernetes_manifest.yml
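
Once the Deployment's pod is running, the same status check can be repeated against the cluster, this time through the NodePort (30111) exposed by the Service; replace <cluster IP> with the address of a cluster node, as in the prediction example below:

import requests
# Query the model status through the Kubernetes NodePort
status = requests.get('http://<cluster IP>:30111/v1/models/mymodel')
print(status.json())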

Using the TensorFlow Serving API

The AI model deployed in Kubernetes can now be used for prediction. This is done by sending a POST request to the prediction API of the TensorFlow Serving container. The body of the request is a JSON document containing the input data to be fed to the model; the model replies with its prediction, also in JSON format. Here is an example of how this can be implemented in Python, using the requests module:

# Import the necessary modules
import requests
import numpy as np
import json
import tensorflow as tf
# Loading data
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Data preprocessing (here, normalization)
x_train, x_test = x_train / 255.0, x_test / 255.0
# Format the image data so as to be sent as JSON
payload = json.dumps( { 'instances': x_test[:2].tolist() } )
# URL of the TensorFlow Serving container's API
url = 'http://<cluster IP>:30111/v1/models/mymodel:predict'
# Send the request
response = requests.post(url, data=payload)
# Parse the response
prediction = response.json()["predictions"]
# Print the result
print(prediction)
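
Note that the last Dense layer has no softmax activation (the loss was created with from_logits=True), so the returned predictions are raw logits over the ten digit classes. The predicted digit for each image is the index of the largest logit:

# The predictions are logits; argmax gives the predicted digit for each image
predicted_digits = np.argmax(np.array(prediction), axis=1)
print(predicted_digits)
# Compare against the ground-truth labels of the first two test images
print(y_test[:2])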

Conclusion

Thanks to TensorFlow Serving, AI models can easily be containerized, turning them into standalone applications that can be interacted with through HTTP calls. This provides increased separation of concerns and modularity, especially compared to embedding the model directly in the source code of the application that relies on it.
