[Tutorial]: Serve an ML Model in Production using FastAPI

Ashmi Banerjee
5 min read · Jun 27, 2022


A step-by-step tutorial to serve a (pre-trained) image classifier model from TensorFlow Hub using FastAPI.

Our tech stack for the tutorial

FastAPI is a popular, high-performance Python backend framework used in web development.

In this tutorial, I will show you how to serve an image classifier model using Python and FastAPI.

The high-level architecture of the app can be summarised as follows:

  1. We have a front end (out of our current scope) which takes in an image URL as input.
  2. The front end makes API calls to the FastAPI backend.
  3. The FastAPI backend does the computation and sends back the predicted image along with the probability.
  4. The front end displays the results from the backend.
The architecture of our image classifier app

Step: 0. Prerequisites

  1. Make sure you’re using Python 3.6+
  2. Create a virtual environment and activate it
    virtualenv venv
    source venv/bin/activate
  3. Install dependencies
    — Create a requirements.txt file with the following contents
    — Then install the requirements with pip3 install -r requirements.txt
fastapi~=0.75.0
uvicorn==0.17.6
numpy==1.22.4
pydantic==1.9.1
Pillow==9.1.1
tensorflow

4. Create the project structure as follows

fastapi-backend
├── src
│   ├── app
│   │   └── app.py
│   ├── pred
│   │   ├── models
│   │   │   └── tf_pred.py
│   │   └── image_classifier.py
│   ├── utils
│   │   └── utilities.py
│   └── main.py
└── requirements.txt

Step: 1. Create API endpoint(s)

In the app.py file, implement the /predict/tf/ endpoint.

First, after importing the required packages, we initialise the FastAPI app with the name of our API (here, the Image Classifier API, though you can use any name) as its title.

Since we are sending data to the backend, we implement the endpoint as a POST request.

Before implementing the endpoint, we define the data model as a class that inherits from the BaseModel we imported from Pydantic.
Our data model has just one attribute: img_url, a str storing the URL of the image to be classified.

The function predict_tf takes as input a request (of type Img, the data model defined above) and returns a JSON response containing the status code (HTTP 200 in case of a valid prediction), the predicted label, and the prediction probability.

To do so, it calls the run_classifier function, where all the image classification happens.

In case of a null prediction, it raises an HTTPException with status code 404, hinting that there is some problem with the image.

Step: 2. Implement prediction algorithm

Next, we need to implement our image classification algorithm, which should classify the image into the correct class.

However, since building the best possible model is not the focus of this tutorial, I have used a pre-trained MobileNet_V2 model from TensorFlow Hub here.

I will just highlight the outline of the steps here.

  1. Load the image
  2. Run predictor on the loaded image
  3. Return the results

A more detailed implementation can be accessed from the GitHub repository here.
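The three steps above could be sketched roughly as follows. Note this is an illustrative outline, not the repository's exact code: it assumes the tensorflow_hub package is installed, the hub URL points to one MobileNet_V2 variant, and the helper names are my own; run_classifier here returns the class index rather than a human-readable label.

```python
from io import BytesIO
from urllib.request import urlopen

import numpy as np
from PIL import Image

IMG_SIZE = 224  # MobileNet_V2 expects 224x224 RGB input


def load_image(img_url: str) -> np.ndarray:
    """Step 1: download the image and scale pixel values to [0, 1]."""
    img = Image.open(BytesIO(urlopen(img_url).read())).convert("RGB")
    img = img.resize((IMG_SIZE, IMG_SIZE))
    return np.asarray(img, dtype=np.float32) / 255.0


def run_classifier(img_url: str):
    """Steps 2-3: run the pre-trained model, return (class_index, probability)."""
    import tensorflow_hub as hub  # heavy import kept local to the function

    model = hub.KerasLayer(
        "https://tfhub.dev/google/tf2-preview/mobilenet_v2/classification/4"
    )
    batch = load_image(img_url)[np.newaxis, ...]  # add a batch dimension
    logits = model(batch).numpy()[0]
    exp = np.exp(logits - logits.max())           # numerically stable softmax
    probs = exp / exp.sum()
    best = int(np.argmax(probs))
    return best, float(probs[best])
```
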

Step: 3. Create main.py

So far we have implemented our image classification algorithm and its respective endpoint.

The next step is to implement the main.py file so that we can run the server and interact with our endpoint directly from the browser.

For this, we will use the uvicorn server, an ASGI web server implementation for Python.

Step: 4. View on http://127.0.0.1:8000/docs/

Voila! If you’ve successfully reached here, you should have your image classifier API up and running on http://127.0.0.1:8000/docs/ and should have a similar-looking page!

Image Classifier API using FastAPI

Next Steps

Below are some popular tools that can be used for testing, containerisation, and deployment of our application.

Testing

Once you have built the APIs, the next step is to test your endpoints. Thoroughly testing the endpoints comes with the following benefits:

  1. Fewer bugs
  2. Smooth deployments
  3. Better code quality
  4. Support for test-driven development

A follow-up tutorial on load testing our endpoints using the open-source load testing tool Locust can be found here.

Containerisation

Applications running in containers can be deployed easily to multiple different operating systems and hardware platforms.

They offer the following advantages:

  1. Performance consistency
    DevOps teams know applications in containers will run the same, regardless of where they are deployed.
  2. Greater efficiency
    Containers allow applications to be more rapidly deployed, patched, or scaled.
  3. Less overhead
    Containers require fewer system resources than traditional or hardware virtual machine environments because they don’t include operating system images.

A detailed tutorial on containerising your FastAPI application using Docker has been published here.
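For reference, a minimal Dockerfile for this project might look like the following sketch (the base image tag is an assumption; the paths match the structure from Step 0):

```dockerfile
FROM python:3.9-slim

WORKDIR /code

# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY src ./src

# Serve the FastAPI app with uvicorn, listening on all interfaces
CMD ["uvicorn", "src.app.app:app", "--host", "0.0.0.0", "--port", "8000"]
```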

Deployment

Once the APIs have been thoroughly tested and containerised, the next step is to deploy them to some cloud service so that they can be publicly accessible.

A multitude of options is available for this purpose.

The source code on GitHub can be accessed here.
The references and further readings on this topic have been summarised here.

If you like the article, please subscribe to get my latest ones.
To get in touch, either reach out to me on
LinkedIn or via ashmibanerjee.com.
