Manage ML Deployments Like A Boss: Deploy Your First AB Test With Sklearn, Kubernetes and Seldon-core using Only Your Web Browser & Google Cloud

Gus Cavanaugh
Oct 17, 2018 · 16 min read
#ModelBoss: https://github.com/SeldonIO/seldon-core

Read This First — Summary

If you’re interested in deploying machine learning models as REST APIs but simply serving up endpoints isn’t good enough any more, you might be ready for model management.

Seldon-core is a lot nicer than Mick and you won't have to drink raw eggs. Here's what this tutorial covers:
  • Build a basic ML model with sklearn
  • Serve and route that model as an AB test with seldon-core
  • Deploy the whole kit and caboodle on kubernetes
  • Good attitude

Quick Introduction To Our Topic

To borrow from someone smarter than myself:

If a machine learning model is trained in a Jupyter notebook but never deployed, did it ever really exist?

And while that sinks in, let me drop another bomb on you:

If you don’t manage your models, your models will manage you…

I know how asinine that sounds and I’m (mostly) kidding. But there is some truth amidst my nonsense. Deploying machine learning models is not only difficult but woefully insufficient. And while managing deployed models is even harder, it is absolutely required. So what’s a girl to do?

  • Start and connect to a Kubernetes cluster on GKE
  • Install seldon-core using helm
  • Use s2i to build a Docker image of a basic ML model
  • Define and deploy our AB test as a model graph using seldon-core

1. Start & Connect To Kubernetes Cluster On GKE

Kubernetes (https://kubernetes.io/) is the thing we use to move Docker containers around, and Google Kubernetes Engine (GKE) will run it for us. Get ready for a wild ride: head to the GKE page in the Google Cloud console and create a cluster with the defaults. We could ssh in from our local machine, but I'm too lazy for that; once the cluster is up, click Connect and launch Google Cloud Shell right in your browser. Thanks, Google. Now let's get this show on the road.

2. Install Helm And Seldon-core On GKE

How we install stuff on kubernetes: https://helm.sh/

In Cloud Shell, create a directory and a script for the helm install:

mkdir install-helm
cd install-helm
vim install-helm.sh
#Enter insert mode in vim with "i"
#Paste the script below
#Exit vim with ":x"

Here's install-helm.sh:

#!/usr/bin/env bash

echo "install helm"
# install helm with the official install script
curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get | bash
# add a service account within the kube-system namespace for tiller
kubectl --namespace kube-system create sa tiller
# create a cluster role binding for tiller
kubectl create clusterrolebinding tiller \
  --clusterrole cluster-admin \
  --serviceaccount=kube-system:tiller

echo "initialize helm"
# initialize helm with the tiller service account
helm init --service-account tiller
# update the helm chart repos
helm repo update

echo "verify helm"
# verify that tiller is deployed in the cluster
kubectl get deploy,svc tiller-deploy -n kube-system

Run it:

bash install-helm.sh

With helm in place, install the seldon-core custom resource definition and then seldon-core itself:

helm install seldon-core-crd --name seldon-core-crd \
  --repo https://storage.googleapis.com/seldon-charts \
  --set usage_metrics.enabled=true
helm install seldon-core --name seldon-core \
  --repo https://storage.googleapis.com/seldon-charts

You are very welcome, seldon-core

3. Use s2i To Build A Docker Image Of A Basic ML Model

Now that we have seldon-core installed, it's time to build our model as a Docker image and pass it to seldon-core for deployment and routing.

# Download the s2i installer
wget https://github.com/openshift/source-to-image/releases/download/v1.1.12/source-to-image-v1.1.12-2a783420-linux-amd64.tar.gz
# Unpack the installer
tar -xvf source-to-image-v1.1.12-2a783420-linux-amd64.tar.gz
# Add the executable - s2i - to your path
cp s2i ../.local/bin/
# Install the python dependencies
sudo pip install sklearn grpcio-tools
# Clone the repo with the example code
git clone https://github.com/SeldonIO/seldon-core-launcher.git
# cd into the example directory we want to run
cd seldon-core-launcher/seldon-core/getting_started/wrap-model
First up is the training script:

#train_model.py
import numpy as np
import os
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.externals import joblib
from sklearn import datasets

def main():
    clf = LogisticRegression()
    p = Pipeline([('clf', clf)])
    print('Training model...')
    p.fit(X, y)
    print('Model trained!')

    filename_p = 'IrisClassifier.sav'
    print('Saving model in %s' % filename_p)
    joblib.dump(p, filename_p)
    print('Model saved!')

if __name__ == "__main__":
    print('Loading iris data set...')
    iris = datasets.load_iris()
    X, y = iris.data, iris.target
    print('Dataset loaded!')
    main()

Train the model:

python train_model.py
Next is the wrapper class that seldon-core will serve:

#IrisClassifier.py
from sklearn.externals import joblib

class IrisClassifier(object):

    def __init__(self):
        self.model = joblib.load('IrisClassifier.sav')
        self.class_names = ["iris-setosa", "iris-vericolor", "iris-virginica"]

    # feature_names aren't needed
    def predict(self, X, features_names):
        return self.model.predict_proba(X)
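Before wrapping anything in Docker, it's worth sanity-checking this roundtrip locally. A minimal sketch, assuming scikit-learn is installed (note: newer sklearn versions removed sklearn.externals.joblib, so this imports joblib directly):

```python
import joblib
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Train and save the pipeline, as train_model.py does
iris = datasets.load_iris()
p = Pipeline([('clf', LogisticRegression())])
p.fit(iris.data, iris.target)
joblib.dump(p, 'IrisClassifier.sav')

# Load it back the way the seldon wrapper will, and predict
model = joblib.load('IrisClassifier.sav')
probs = model.predict_proba([[5.1, 3.5, 1.4, 0.2]])
print(probs.shape)  # (1, 3): one row, three class probabilities
```

If the probabilities come back as a (1, 3) array that sums to one, the wrapper's predict method will behave the same way inside the container.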
The repo also ships the s2i environment file and requirements:

$ cat .s2i/environment
MODEL_NAME=IrisClassifier
API_TYPE=REST
SERVICE_TYPE=MODEL
PERSISTENCE=0

$ cat requirements.txt
scikit-learn==0.19.0
scipy==0.18.1

Now build and push the image:

export DOCKER_REPO=gcav66 #Replace "gcav66" with your Docker Hub username
s2i build . seldonio/seldon-core-s2i-python3 ${DOCKER_REPO}/sklearn-iris:0.1
docker push gcav66/sklearn-iris:0.1 #replace "gcav66" with your Docker Hub username

4. Define Our AB Test As A Model Graph Using Seldon-core

This is where the rubber meets the road.

In Cloud Shell, click the + button to open a new shell tab, then port-forward the seldon API server so we can reach it on localhost:

kubectl port-forward $(kubectl get pods -l app=seldon-apiserver-container-app -o jsonpath='{.items[0].metadata.name}') 8002:8080

You should see:

Forwarding from 127.0.0.1:8002 -> 8080
Forwarding from [::1]:8002 -> 8080
Back in the first shell tab, start a python session and define a small client for the seldon API:

import requests
from requests.auth import HTTPBasicAuth
from proto import prediction_pb2
from proto import prediction_pb2_grpc
import grpc
try:
    from commands import getoutput # python 2
except ImportError:
    from subprocess import getoutput # python 3

API_HTTP="localhost:8002"
API_GRPC="localhost:8003"

def get_token():
    payload = {'grant_type': 'client_credentials'}
    response = requests.post(
        "http://"+API_HTTP+"/oauth/token",
        auth=HTTPBasicAuth('oauth-key', 'oauth-secret'),
        data=payload)
    print(response.text)
    token = response.json()["access_token"]
    return token

def rest_request():
    token = get_token()
    headers = {'Authorization': 'Bearer '+token}
    payload = {"data":{"names":["sepallengthcm","sepalwidthcm","petallengthcm","petalwidthcm"],"tensor":{"shape":[1,4],"values":[5.1,3.5,1.4,0.2]}}}
    response = requests.post(
        "http://"+API_HTTP+"/api/v0.1/predictions",
        headers=headers,
        json=payload)
    print(response.text)
Take a closer look at the payload we send:

payload = {"data":{"names":["sepallengthcm","sepalwidthcm","petallengthcm","petalwidthcm"],"tensor":{"shape":[1,4],"values":[5.1,3.5,1.4,0.2]}}}

  • The names of our features, e.g., ["sepallengthcm", "sepalwidthcm", "petallengthcm", "petalwidthcm"]
  • The shape of our features, e.g., [1,4] (one row, four columns)
  • The actual values, e.g., [5.1, 3.5, 1.4, 0.2]
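To make that structure concrete, here's a small pure-python helper that builds the same request body from a list of feature rows. The name build_payload is my own invention for illustration, not part of seldon-core:

```python
import json

def build_payload(names, rows):
    # rows: list of feature-value lists, one per observation
    values = [v for row in rows for v in row]  # flatten row-major
    return {"data": {"names": names,
                     "tensor": {"shape": [len(rows), len(names)],
                                "values": values}}}

names = ["sepallengthcm", "sepalwidthcm", "petallengthcm", "petalwidthcm"]
payload = build_payload(names, [[5.1, 3.5, 1.4, 0.2]])
print(json.dumps(payload))
```

The shape is [number of rows, number of features], and the values are flattened row by row, so the same helper works for batches of multiple observations.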
Save the following as TMPL_deployment.json:

{
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "kind": "SeldonDeployment",
    "metadata": {
        "labels": {
            "app": "seldon"
        },
        "name": "sklearn-iris-example"
    },
    "spec": {
        "name": "sklearn-iris-deployment",
        "oauth_key": "oauth-key",
        "oauth_secret": "oauth-secret",
        "predictors": [
            {
                "componentSpecs": [{
                    "spec": {
                        "containers": [
                            {
                                "image": "gcav66/sklearn-iris:0.1",
                                "imagePullPolicy": "IfNotPresent",
                                "name": "sklearn-iris-classifier",
                                "resources": {
                                    "requests": {
                                        "memory": "1Mi"
                                    }
                                }
                            }
                        ],
                        "terminationGracePeriodSeconds": 20
                    }
                }],
                "graph": {
                    "children": [],
                    "name": "sklearn-iris-classifier",
                    "endpoint": {
                        "type": "REST"
                    },
                    "type": "MODEL"
                },
                "name": "classifier",
                "replicas": 1,
                "annotations": {
                    "predictor_version": "0.1"
                }
            }
        ]
    }
}
Remember to swap "gcav66" for your Docker Hub username in the image field:

"image": "gcav66/sklearn-iris:0.1",
Deploy it and check the status:

kubectl apply -f TMPL_deployment.json
kubectl get seldondeployments sklearn-iris-example -o jsonpath='{.status}'
Now call rest_request() from the python session:

In [3]: rest_request()
{"access_token":"b01d867a-ebf1-4d7f-8764-c8b11ae43461","token_type":"bearer","expires_in":43199,"scope":"read write"}
{"meta":{"puid":"j64h9tqf404rv2j3ikc8r98sdf","tags":{},"routing":{}},"data":{"names":["iris-setosa","iris-vericolor","iris-virginica"],"tensor":{"shape":[1,3],"values":[0.9974160323001712,0.002583770255316237,1.9744451239167056E-7]}}}

The first line is the OAuth token response; the second is the prediction, with one probability per class.
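A quick pure-python sketch (values copied from the response above) of turning that probability tensor into a class label, just to show how a caller might consume the response:

```python
# Class names and probabilities, in the order the deployment returns them
class_names = ["iris-setosa", "iris-vericolor", "iris-virginica"]
values = [0.9974160323001712, 0.002583770255316237, 1.9744451239167056e-07]

# Pick the class with the highest probability (argmax over the values)
predicted = class_names[max(range(len(values)), key=values.__getitem__)]
print(predicted)  # iris-setosa
```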
Now delete this deployment so we can move on to the AB test:

kubectl delete -f TMPL_deployment.json
Next, save the AB test graph as ab_test.json. Instead of a single model node, the graph's root is a RANDOM_ABTEST router with two model children, classifier-1 and classifier-2, splitting traffic 50/50 via the ratioA parameter:

{
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "kind": "SeldonDeployment",
    "metadata": {
        "labels": {
            "app": "seldon"
        },
        "name": "sklearn-iris-example"
    },
    "spec": {
        "name": "sklearn-iris-deployment",
        "oauth_key": "oauth-key",
        "oauth_secret": "oauth-secret",
        "predictors": [
            {
                "componentSpecs": [{
                    "spec": {
                        "containers": [
                            {
                                "image": "gcav66/sklearn-iris:0.1",
                                "imagePullPolicy": "IfNotPresent",
                                "name": "classifier-1",
                                "resources": {
                                    "requests": {
                                        "memory": "1Mi"
                                    }
                                }
                            }
                        ],
                        "terminationGracePeriodSeconds": 20
                    }
                },
                {
                    "metadata": {
                        "labels": {
                            "version": "v2"
                        }
                    },
                    "spec": {
                        "containers": [
                            {
                                "image": "gcav66/sklearn-iris:0.1",
                                "imagePullPolicy": "IfNotPresent",
                                "name": "classifier-2",
                                "resources": {
                                    "requests": {
                                        "memory": "1Mi"
                                    }
                                }
                            }
                        ],
                        "terminationGracePeriodSeconds": 20
                    }
                }],
                "name": "classifier",
                "replicas": 1,
                "annotations": {
                    "predictor_version": "v1"
                },
                "graph": {
                    "name": "random-ab-test",
                    "endpoint": {},
                    "implementation": "RANDOM_ABTEST",
                    "parameters": [
                        {
                            "name": "ratioA",
                            "value": "0.5",
                            "type": "FLOAT"
                        }
                    ],
                    "children": [
                        {
                            "name": "classifier-1",
                            "endpoint": {
                                "type": "REST"
                            },
                            "type": "MODEL",
                            "children": []
                        },
                        {
                            "name": "classifier-2",
                            "endpoint": {
                                "type": "REST"
                            },
                            "type": "MODEL",
                            "children": []
                        }
                    ]
                }
            }
        ]
    }
}
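Conceptually, the RANDOM_ABTEST router with ratioA=0.5 flips a coin per request and sends it to one of the two children. Here's my own pure-python illustration of that routing idea (not seldon-core's actual implementation):

```python
import random

def route(ratio_a, rng=random.random):
    # Send the request to child A with probability ratio_a, else child B
    return "classifier-1" if rng() < ratio_a else "classifier-2"

random.seed(0)  # seeded so the simulation is repeatable
counts = {"classifier-1": 0, "classifier-2": 0}
for _ in range(10000):
    counts[route(0.5)] += 1
print(counts)  # roughly a 5000/5000 split
```

Over many requests the split converges on the configured ratio, which is what lets you compare the two models' live behavior fairly.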
Deploy the AB test and check its status:

kubectl apply -f ab_test.json
kubectl get seldondeployments sklearn-iris-example -o jsonpath='{.status}'

Isn't that beautiful!? Call rest_request() again and seldon-core will route each request randomly between classifier-1 and classifier-2. When you're finished:

kubectl delete -f ab_test.json #To say goodbye for now

Next Steps

Admittedly, this is a very basic example. But if you've made it this far, perhaps you're willing to go a little further. I plan to build a variety of different models and route traffic to them using not just AB tests, but multi-armed bandits as well. I hope you'll join me: the model management world is now your oyster!

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com

Gus Cavanaugh

Written by

I write about using Python for data analysis in Enterprise settings when IT challenges get in the way https://www.linkedin.com/in/gustafrcavanaugh/
