Tutorial: Serving Machine Learning Models with FastAPI in Python

Jan Forster
Feb 15, 2020


In this tutorial I’ll show how to use FastAPI to quickly and easily deploy and serve machine learning models in Python as a RESTful API.

The code covered in this tutorial can be found here: https://github.com/eightBEC/fastapi-ml-skeleton

Scenario

Imagine that you’ve just trained a new regression model on a nice data set to predict house prices using a little Python script. Now you want to make the model easy for others to use, or offer it as an API. The question then arises which framework should provide the REST API — and there are many options to choose from: build one yourself from scratch on top of a low-level HTTP server library, use Django, use Flask, and so on, just to name a few. Your API should be fast, reliable, quick to develop, and ready to be used in production.
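That starting point might look something like the sketch below: a toy linear model trained on synthetic data and saved with joblib, in the same spirit as the sample model shipped with the repository. The feature names and numbers here are made up purely for illustration:

```python
import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative synthetic data standing in for a real housing data set:
# two made-up features (say, rooms and a location score).
rng = np.random.default_rng(42)
X = rng.uniform(1, 10, size=(100, 2))
y = 50_000 * X[:, 0] + 10_000 * X[:, 1]  # toy price relationship

model = LinearRegression().fit(X, y)
joblib.dump(model, "house_price_model.joblib")  # serialize for serving

# Later (e.g. at server startup) the model can be loaded back and queried.
loaded = joblib.load("house_price_model.joblib")
print(round(loaded.predict([[5.0, 3.0]])[0]))  # → 280000
```

Serializing with joblib is what makes the model servable at all: the API process only needs the `.joblib` file, not the training script.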

Let’s have a look at FastAPI.

What’s FastAPI?

FastAPI is a framework created by Sebastián Ramírez for building APIs with Python ≥ 3.6. It’s fast, quick to learn and implement, and production-ready. If you want to learn about all its great features, have a look at the documentation. After going through this tutorial you’ll probably understand why so many people and companies use and like it.

Walkthrough

Structure

The repository is organized based on best practices from various sources (e.g. The Hitchhiker’s Guide to Python and the FastAPI full stack app). I’ve tried to structure the code as intuitively as possible; you’ll find a commented tree of the project structure below:

.
├── LICENSE
├── MANIFEST
├── README.md
├── docs
│   ├── DOCS.md
│   ├── authorize.png
│   ├── sample_payload.json                     # Sample payload to test ML model
│   └── sample_payload.png
├── fastapi_skeleton                            # Skeleton module
│   ├── __init__.py
│   ├── api                                     # API-related code
│   │   ├── __init__.py
│   │   └── routes                              # All routes provided by the API
│   │       ├── __init__.py
│   │       ├── heartbeat.py                    # Route to check if the server is up
│   │       ├── prediction.py                   # Route to predict using the ML model
│   │       └── router.py                       # Main router that serves the routes
│   ├── core
│   │   ├── __init__.py
│   │   ├── config.py                           # Server configuration helper
│   │   ├── event_handlers.py                   # Handle server start/stop
│   │   ├── messages.py                         # Shared messages/resources
│   │   └── security.py                         # Common security helpers
│   ├── main.py                                 # App entrypoint
│   ├── models
│   │   ├── __init__.py
│   │   ├── heartbeat.py                        # Data model for heartbeat response
│   │   ├── payload.py                          # Data model for ML model payload
│   │   └── prediction.py                       # Data model for prediction result
│   └── services
│       ├── __init__.py
│       └── models.py                           # Services to provide the ML model
├── requirements.txt                            # Project requirements
├── sample_model                                # Sample ML model folder
│   ├── lin_reg_california_housing_model.joblib # Sample ML model
│   └── model_description.md                    # Description of ML model params
├── setup.py                                    # Python setup script for dist
├── tests                                       # Test folder
│   ├── __init__.py
│   ├── conftest.py                             # Test configuration / bootstrap
│   ├── test_api                                # API tests
│   │   ├── __init__.py
│   │   ├── test_api_auth.py                    # API authentication test cases
│   │   ├── test_heartbeat.py                   # API heartbeat test cases
│   │   └── test_prediction.py                  # API ML model prediction test cases
│   └── test_service
│       ├── __init__.py
│       └── test_models.py                      # ML model test cases
└── tox.ini                                     # Tox configuration

Install

Use your preferred Python virtual environment (e.g. virtualenv or conda) and install the required packages into your local environment with:

pip install -r requirements.txt

Setup

  1. Duplicate the .env.example file and rename it to .env
  2. In the .env file, configure the API_KEY entry. This key is used to authenticate requests to our API.
    A sample API key can be generated using Python REPL:
import uuid
print(str(uuid.uuid4()))

Run It

  1. Start your app with:
uvicorn fastapi_skeleton.main:app

  2. Go to http://localhost:8000/docs.

  3. Click Authorize and enter the API key created in the Setup step.

  4. You can use the sample payload from the docs/sample_payload.json file when trying out the house price prediction model via the API.

Summary

In this tutorial we looked at how easy it is to deploy machine learning models using FastAPI. Feel free to reuse the skeleton code provided to get started. It’s fully tested and can be easily extended.

Next:

If you’re familiar with container technologies, you’ve probably noticed that a Dockerfile is missing from this tutorial. No worries: in the next post, I’m going to refine the skeleton code so that it can be easily deployed in a containerised environment and eventually run on Kubernetes.

I hope this tutorial and code helps you to get started using the great FastAPI library!

If you enjoyed this post, please click 👏 so that others can read it as well. Follow me on Twitter @8B_EC for the latest updates or just to say hi :)


Jan Forster

I am an AI Engineer on the IBM Watson team in Europe. I come from a professional services background and have a strong interest in Natural Language Processing!