Package your ML Model as a REST Service

Manav
Published in Aruva.io Tech · Mar 3, 2021 · 4 min read

In this blog post, we will package our previously created Titanic survival model and build a REST-based prediction service on top of it. The process is fairly generic and can be adapted to any scikit-learn model.

Starting from where we left off, we have so far been able to fit a model as below:

# Identify the model to define and initialize
from sklearn.tree import DecisionTreeRegressor

# define the model
survivorModel = DecisionTreeRegressor()

# train the model
survivorModel.fit(training_set, training_survived)
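
Here, training_set and training_survived come from the earlier post. For context, a minimal sketch of what they might look like, assuming the four feature columns used later in this post and purely made-up values:

import pandas as pd

# Illustrative only: two made-up passenger rows with the four features used in this post
training_set = pd.DataFrame(
    [[3, 4, 1, 16.7],
     [1, 0, 0, 80.0]],
    columns=['PassengerClass', 'PassengerKidsCount', 'PassengerSiblings', 'FarePaid']
)
# Corresponding survival labels (0 = did not survive, 1 = survived)
training_survived = pd.Series([0, 1])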

Our next step will be to expose the model’s predict method as a REST service.
There are numerous tools available to do this quickly. I tried BentoML recently and would say it works amazingly well: it is simple to implement and can be dockerized for deployment in a Kubernetes cluster.

Let’s get started

Prerequisites

Install BentoML in your Python environment using
pip install bentoml

and validate from the command line
bentoml --version

The output should show the installed BentoML version:
bentoml, version 0.11.0

FYI: BentoML does not currently offer official support for Python 3.9; I am using Python 3.8 in my environment.
I did face an issue, unable to execute ‘x86_64-linux-gnu-gcc’: No such file or directory, but the fix was easy:
sudo apt-get install gcc

Step 1. Create a Service Class, i.e. the REST definition, and expose the appropriate REST method

For our example, we will create a TitanicSurvivorPredictor service that accepts a Pandas DataFrame as the request body, runs predictions on each row, and returns an array of the corresponding prediction values.

A minimal prediction service class looks like below:
name: titanic_survivor.py

import pandas as pd
from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import DataframeInput
from bentoml.frameworks.sklearn import SklearnModelArtifact

@env(infer_pip_packages=True)
@artifacts([SklearnModelArtifact('model')])
class TitanicSurvivorPredictor(BentoService):

    @api(input=DataframeInput(), batch=True)
    def predict(self, df: pd.DataFrame):
        return self.artifacts.model.predict(df)

Description:
The @env decorator specifies the service’s environment dependencies; with infer_pip_packages=True, BentoML automatically infers the required pip packages from the service’s imports.

@artifacts defines the packaged model key ('model') and the type of the model, which in our case is SklearnModelArtifact.

BentoML supports other model frameworks as well, including PyTorch, Keras, and XGBoost. More details are available in the BentoML documentation.

The @api decorator defines the entry point, i.e. the predict endpoint in this case. input defines the input type, i.e. a DataFrame, and batch=True indicates that the input DataFrame contains multiple prediction requests as a list, with the response returned as a corresponding list.
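
To illustrate batch=True: a body containing two rows would return two predictions, one per row. For example (with purely illustrative feature values and outputs, using the same four features described in Step 4), a request of

[[3,4,1,16.7],[1,0,0,80.0]]

would return a two-element list such as

[0.0, 1.0]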

Step 2. Save Prediction Service

Next, we will package our trained model with our newly created prediction service and save the service to disk.

You can do this in a separate script or in the model’s Jupyter notebook. Add the following lines of code:

from titanic_survivor import TitanicSurvivorPredictor

predictor_service = TitanicSurvivorPredictor()
predictor_service.pack('model', survivorModel)

saved_path = predictor_service.save()

Here:

  • we import the prediction REST service that we created in Step 1
  • we pack the trained model (i.e. survivorModel) under the 'model' key and save the packaged service using the save() method

This will return a path on the local file system where the packaged service is now compiled and saved.
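
As a quick sanity check before serving, you can load the packaged service back from this path and call predict on it directly. A minimal sketch, assuming the bentoml.load API from BentoML 0.x and a hypothetical one-row sample_df with the same four feature columns:

import pandas as pd
import bentoml

# load the packaged service from the path returned by save()
loaded_service = bentoml.load(saved_path)

# hypothetical single-row DataFrame matching the model's feature columns
sample_df = pd.DataFrame(
    [[3, 4, 1, 16.7]],
    columns=['PassengerClass', 'PassengerKidsCount', 'PassengerSiblings', 'FarePaid']
)

print(loaded_service.predict(sample_df))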


Step 3. Running the Model as REST API

According to BentoML:
The BentoML packaged model format contains all the code, files, and configs required to run and deploy the model.

BentoML also offers a model management service, Yatai, which I have yet to explore.

For now, we will just use BentoML’s local API server to host the service:

bentoml serve TitanicSurvivorPredictor:latest

This will essentially start the model server and provide a localhost URL. In my case, it was:

Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

Step 4. Validating using Postman

So far, we have converted the model’s predict function into a REST service using BentoML, and we have deployed it locally using the BentoML model server.

Let’s fire up Postman and make a POST call.
For my simple example, the input DataFrame has the following columns:

`PassengerClass, PassengerKidsCount, PassengerSiblings, FarePaid`

So, an example request body would look like this:

[[3,4,1,16.7]]

Note the nested (double) array; this is because we set batch=True in Step 1, so the body is a list of prediction requests.

Enter the URL as:
http://127.0.0.1:5000/predict

with the body type set to raw and the content as above. Send the request, and you should see a response with the predictions:

[1.0]
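
If you prefer to validate from code instead of Postman, a minimal sketch using Python’s requests library (an assumption on my part, with the server running locally on port 5000 as above):

import requests

# same single-row batch payload as the Postman example
payload = [[3, 4, 1, 16.7]]

response = requests.post('http://127.0.0.1:5000/predict', json=payload)
print(response.json())  # e.g. [1.0]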

Congratulations!! You have successfully converted your ML model to a REST service
