Serve any XGBoost model with FastAPI in less than 40 lines

Simon Lind · Published in Predictly on Tech · 4 min read · Apr 2, 2022

While staying PEP 8 compliant

As I iterate on my models, their feature set tends to vary. While testing, I like to serve my models in a microservice for easy integration with my other services. Keeping the model server up to date with the ever-changing model has proven quite time consuming because of the changes in the features. I found myself spending more and more time tinkering with the model server than with the actual model. Something needed to change.

FastAPI is a web framework for building APIs in Python. It has become my go-to framework for serving my models. It requires a minimal amount of code, which in turn decreases the time required for development and testing and also reduces the risk of errors.

Using FastAPI on uvicorn, I wrote a model-serving service which loads my XGBoost model, creates a class for a simple REST endpoint which validates input data defined by the model itself, performs the prediction, and returns the result.

All this with automatic OpenAPI documentation and argument validation (courtesy of FastAPI) in less than 40 lines of Python, while also staying PEP 8 compliant.

To allow FastAPI to generate the OpenAPI doc and model validation, we must supply a type hint for the input argument of the function. On line 32 I declare that the input argument must be a List of InputFeatures.

InputFeatures is a class which I create dynamically at runtime based on my model's feature set. In this example I use an XGBoost model, but any model can be used, as long as the get_features_dict function is adjusted accordingly.

To create a new class I use the built-in type function, which, depending on the arguments passed, either returns the type of an object or creates a new type object.

On line 24 I pass in three arguments: first the name of the class, in this case “InputFeatures”. Second, I supply a tuple containing the base classes for my new class; here I supply the pydantic BaseModel, which will be used by FastAPI for argument validation and the OpenAPI docs. Lastly, I supply a dict which contains the body definition of the class. This is created using the get_features_dict function.
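A minimal sketch of that call (assuming pydantic is installed; the two placeholder fields stand in for the real get_features_dict output, and the explicit __annotations__ keep the pattern working on pydantic v2 as well as v1):

```python
from pydantic import BaseModel

# Placeholder defaults; in the service these come from get_features_dict().
defaults = {"feat_0": 0.0, "feat_1": 0.0}

# type(name, bases, namespace) builds a class at runtime. Annotations are
# derived from each default's type so pydantic treats every entry as a field.
namespace = {"__annotations__": {k: type(v) for k, v in defaults.items()},
             **defaults}
InputFeatures = type("InputFeatures", (BaseModel,), namespace)
```

Because the annotations are derived from each default's type, pydantic validates every entry in the dict as a typed field.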

get_features_dict

This function will vary depending on your model; in this example I'm using an XGBoost model, which I load into memory using pickle on line 10.
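In sketch form (model.dat follows the article's filename; a stand-in object is pickled first so the snippet runs without a trained booster):

```python
import pickle

class DummyModel:
    """Stand-in for the trained XGBoost booster."""

# Write a stand-in artifact so the load step below has something to read.
with open("model.dat", "wb") as f:
    pickle.dump(DummyModel(), f)

# The service itself only does this part, once at startup.
with open("model.dat", "rb") as f:
    model = pickle.load(f)
```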

XGBoost allows me to extract both the feature names and a string representation of each feature's type.

I can take this string value, for example "float", and pass it to the locate function from the pydoc module to get the class matching the string. Calling the resulting class creates a default instance of the type (0.0 for float). I then zip the names and default values together to create a dict.
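A minimal sketch of such a function (the name get_features_dict is the article's; the hard-coded lists stand in for what the booster's feature names and feature types would provide):

```python
from pydoc import locate

def get_features_dict(feature_names, feature_types):
    # locate("float") returns the built-in float class; calling the class
    # produces a default instance (0.0) to serve as the field's default.
    return {name: locate(type_str)()
            for name, type_str in zip(feature_names, feature_types)}

# Stand-ins for the model's feature names and type strings.
names = [f"feat_{i}" for i in range(8)]
types = ["float"] * 8
features = get_features_dict(names, types)
```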

For a model with 8 numerical features, the resulting dict should look something like this:

{
'feat_0': 0.0,
'feat_1': 0.0,
'feat_2': 0.0,
'feat_3': 0.0,
'feat_4': 0.0,
'feat_5': 0.0,
'feat_6': 0.0,
'feat_7': 0.0
}

This allows the type function to correctly type the fields in our new class, which in turn allows FastAPI to generate argument validation and documentation!

Before the model can make a prediction, it must first get the data in a format it can handle. On line 33 I simply use the built-in __dict__ attribute to convert each InputFeatures instance to a dict. I then call values() to get the dict_values, which is turned into a plain list that can be passed to numpy's asarray function, converting our List of InputFeatures into a numpy ndarray the model can handle.
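Sketched with a plain class standing in for the dynamically created InputFeatures (assuming numpy is installed; the two fields are placeholders):

```python
import numpy as np

class InputFeatures:
    """Stand-in for the dynamically created pydantic model."""
    def __init__(self, feat_0=0.0, feat_1=0.0):
        self.feat_0 = feat_0
        self.feat_1 = feat_1

batch = [InputFeatures(1.0, 2.0), InputFeatures(3.0, 4.0)]

# One row per instance; __dict__ preserves the order the fields were set in.
data = np.asarray([list(item.__dict__.values()) for item in batch])
```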

To expose the model I define a REST endpoint on lines 31–32 with the function predict_post. Using FastAPI decorators, I can define it as a POST endpoint with a path and a response type. The async keyword turns the function into a coroutine, letting the event loop serve other requests while this one is waiting, so multiple requests to the service can be processed concurrently.

Lastly, on line 36, I define the service's main entry point, which just runs the FastAPI app on uvicorn on port 8080.

Run the application and navigate to localhost:8080/docs to view the OpenAPI UI and test your brand new model server!

You can view the schema for InputFeatures and even try out the predict endpoint. If you supply broken or invalid data, you will get a pretty 422 response with a detailed description of what went wrong.

Now imagine you do some more EDA and make some new discoveries, so you change the input feature set. Simply swap out the model.dat file and restart the model server!

And just like that your model server is up to date with the new version of your model!

Summary

FastAPI enables you to easily and rapidly expose any machine learning model for prediction in a scalable manner, in less than 40 lines of code. This reduces development time, cost, and the risk of errors, allowing you to focus on model development rather than model serving.

You are still stuck with the issue of changing input features for the caller of the predict endpoint, but I’ll leave that problem for the next article!
