DataOps part 2: Sail the Titanic to API

Calvin Canh Tran
Aug 14 · 2 min read

Before we get to deployment, let’s prepare our model. If you don’t have a running Kubernetes environment yet, refer to Part 1.

In short, this part covers:

  • Training the model and persisting it with pickle.
  • Creating a RESTful API to serve the model file.

I won’t go into the details of the EDA or model training in this article; however, I made a notebook using the Titanic dataset (can be found here). I trained the model with several machine learning methods, and here are the results.

Model Comparison

Random Forest and Decision Tree tie for first place with an accuracy of 93.60%. I exported the random forest model to a pickle file, model.pkl, which we will use from now on.

import pickle
with open('model.pkl', 'wb') as handle:  # the file is closed once the dump finishes
    pickle.dump(random_forest, handle)

Flask framework

A single app.py file serves the Titanic model.

app.py

Let’s break it down

with open('model.pkl', 'rb') as handle:
    app.model = pickle.load(handle)

I load the model into the app.model variable when the Flask server starts. The model stays in memory, so it isn’t reloaded on every request.

I also created a GET endpoint named heartbeat, which returns ‘OK’ (200 response code). Kubernetes uses it to check the liveness of the API.
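To make the two ideas above concrete, here is a minimal sketch of loading the model once at startup and exposing a liveness endpoint. It is not the app.py from this article: the create_app and load_model helpers and the /heartbeat path are assumptions made for illustration.

```python
import pickle

from flask import Flask


def load_model(path="model.pkl"):
    """Read the pickled model from disk (hypothetical helper)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def create_app(model):
    """Build a Flask app around an already-loaded model (hypothetical factory)."""
    app = Flask(__name__)
    # Stored once on the app object, so request handlers reuse the same
    # in-memory model instead of unpickling it on every call.
    app.model = model

    @app.route("/heartbeat", methods=["GET"])
    def heartbeat():
        # A plain 200 response is enough for a liveness check.
        return "OK", 200

    return app
```

With model.pkl in the working directory, the server could be started with create_app(load_model()).run(). Passing the model in as an argument also makes it easy to swap in a stub when testing the endpoints.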

The main part of the service starts at line 21. It reads the parameters from the POST request and converts them to a 2D NumPy array:

np_array = np.expand_dims(list(payload.values()), 0)

This step is required because the model expects a 2D array as input (one row per sample), while a single request carries only one flat list of values.
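One thing to keep in mind is that list(payload.values()) follows the order of the keys in the request, so a client that sends the fields in a different order would feed the model shuffled features. The sketch below shows one possible way to guard against that by picking values in a fixed order. The FEATURE_ORDER list is copied from the example request used later in this article, and whether it matches the order the model was trained on is an assumption; payload_to_row is a hypothetical helper.

```python
import numpy as np

# Assumed feature order, taken from the example request in this article.
FEATURE_ORDER = [
    "Pclass", "Sex", "Age", "Fare", "Embarked",
    "Title", "FamilySize", "IsAlone", "Age_Class", "FarePerPerson",
]


def payload_to_row(payload, feature_order=FEATURE_ORDER):
    """Turn one JSON payload into an array of shape (1, n_features)."""
    # Look each feature up by name so the key order in the request is irrelevant.
    values = [payload[name] for name in feature_order]
    # Axis 0 becomes the "sample" axis: a batch containing a single row.
    return np.expand_dims(values, 0)


# Same values as the example request, deliberately sent in a different order.
payload = {
    "FarePerPerson": 3, "Age_Class": 6, "IsAlone": 0, "FamilySize": 2,
    "Title": 1, "Embarked": 0, "Fare": 7, "Age": 2, "Sex": 1, "Pclass": 3,
}
row = payload_to_row(payload)
print(row.shape)  # (1, 10)
```

A missing field raises a KeyError here, which an API would typically turn into a 400 response rather than a prediction on incomplete data.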

Let’s run a quick test of our API.

# Run the server
$ python app.py
* Serving Flask app "app" (lazy loading)
* Environment: development
* Debug mode: on
* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

Then run curl in another terminal.

$ curl -i -X POST -H "Content-Type:application/json" http://localhost:5000/predict -d '{"Pclass":3,"Sex":1,"Age":2,"Fare":7,"Embarked":0,"Title":1,"FamilySize":2,"IsAlone":0,"Age_Class":6, "FarePerPerson":3}'
HTTP/1.0 200 OK
Content-Type: application/json
Content-Length: 18
Server: Werkzeug/0.14.1 Python/3.6.2
Date: Sun, 14 Jul 2019 16:48:57 GMT
{
"result": 1
}
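If you would rather test from Python than from curl, the same request can be built with the standard library. This is only a sketch: build_predict_request is a hypothetical helper, and the base URL assumes the local development server shown above.

```python
import json
import urllib.request


def build_predict_request(payload, base_url="http://localhost:5000"):
    """Build the same POST request as the curl command, without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


passenger = {
    "Pclass": 3, "Sex": 1, "Age": 2, "Fare": 7, "Embarked": 0,
    "Title": 1, "FamilySize": 2, "IsAlone": 0, "Age_Class": 6,
    "FarePerPerson": 3,
}
request = build_predict_request(passenger)
```

With the server running, urllib.request.urlopen(request) sends the request, and json.load on the response object returns the parsed body.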

That’s awesome, our API is working as expected. One final step: dockerize our application and push it to Docker Hub.

Here is my Dockerfile

A few commands to push our image to Docker Hub:

$ docker build -t titanic-api .
$ docker tag titanic-api canhtran/titanic-app-demo
$ docker push canhtran/titanic-app-demo

The image will then be available on https://hub.docker.com. Alternatively, you can use my image.

$ docker pull canhtran/titanic-app-demo

All right! Our preparation is complete, and we are ready to deploy to Kubernetes.

Part 3: https://medium.com/@canhtran/dataops-part-3-deploy-titanic-to-kubernetes-fe22c96ef87f

Calvin Canh Tran

Written by

Machine learning engineer (http://tranduycanh.com)

