Why You Should Deploy Machine Learning Models on FastAPI

Nimish Verma · Published in Geek Culture · Aug 1, 2021


Developing, Documenting, Deploying, and Typing

(Cover image: FastAPI documentation)

A lot of you might have heard about FastAPI or even read my article on it. FastAPI, as the name suggests, is an ASGI, Python-based framework that is significantly faster than Django and Flask. It is built on Pydantic and Starlette and is proving to be a strong candidate among Python web frameworks.

Jumping right into the action, this article assumes you already have FastAPI set up with uvicorn as the server. We are going to use a frozen model trained on the MNIST dataset, which can be found on GitHub. Download it and place it in your main folder.
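If you still need to install the framework and server, both are available from PyPI (a minimal sketch; TensorFlow, NumPy, and the other libraries used below are assumed to be installed the same way):

pip install fastapi uvicorn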

Now we will write our server file, main.py. Let's first write the static code that instantiates the server and loads the model.

To load the model, we write the following:

import tensorflow as tf

frozen_file_path = 'mnist.model.best.hdf5'
print("Loading Model")
# Load the saved Keras model from the HDF5 file
new_model = tf.keras.models.load_model(frozen_file_path)

To instantiate the server, we import FastAPI and create the application object:

from fastapi import FastAPI
app = FastAPI()

Now we write an endpoint that calls the model on a random image.

import random
import numpy

@app.get("/random_prediction")
async def predict_random():
    # Pick a random image from the MNIST test set
    rand_int = random.randrange(len(X_test))
    random_image_data = X_test[rand_int]
    # Add a batch dimension; predict() is a regular (synchronous) call
    y_out = new_model.predict(numpy.expand_dims(random_image_data, axis=0))
    # The predicted digit is the class with the highest score
    predicted_value = int(numpy.argmax(y_out[0]))

    return {"prediction": predicted_value}

Make sure you load the MNIST data from Keras using

from keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
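With main.py in place, you can start the development server and hit the endpoint (a quick check, assuming the file is named main.py and the FastAPI instance is called app):

uvicorn main:app --reload

curl http://127.0.0.1:8000/random_prediction

You should get back a small JSON body such as {"prediction": 3}, where the digit depends on whichever test image was picked at random.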

Let us consider another example where the input data is numeric or categorical and arrives from our front end when a user hits the API. To take advantage of the Pydantic support, we declare a typed input model, such as:

@app.post("/input_predict")
def get_output_category(data: InputType):
    # Pydantic has already validated the body; turn it into a feature row
    features = [list(data.dict().values())]
    predicted_output = model.predict(features)
    return {'prediction': predicted_output[0]}

This is useful because invalid input is handled automatically: FastAPI rejects a bad request with a 422 response that describes which field failed validation. On top of that, the InputType schema is also shown in the generated API docs.

The InputType has to be defined, preferably in a separate file, using Pydantic's BaseModel.

from pydantic import BaseModel
from typing import Optional

class InputType(BaseModel):
    input_field1: float
    input_field2: Optional[float] = 0.1
    ...
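Once the server is running, the endpoint can be exercised with a JSON body matching InputType (the field values below are just placeholders):

curl -X POST http://127.0.0.1:8000/input_predict -H "Content-Type: application/json" -d '{"input_field1": 0.7, "input_field2": 0.3}'

If a field is missing or has the wrong type, FastAPI responds with a 422 error naming the offending field instead of ever calling the model.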

Now, FastAPI also supports HTMLResponse in case you want to return HTML directly. People experienced with Jinja and Django can also opt for the Jinja2 templating support available in FastAPI. More information is available in the FastAPI docs.
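As a minimal sketch of the HTMLResponse idea (the /hello path and the page content are my own illustrative choices, not part of the original example):

from fastapi.responses import HTMLResponse

@app.get("/hello", response_class=HTMLResponse)
async def hello_page():
    # Return raw HTML instead of the default JSON response
    return "<html><body><h1>MNIST demo is running</h1></body></html>"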

The auto-generated docs are hosted at http://127.0.0.1:8000/docs by default and automatically show information about your endpoints, the expected data types, and the return type if specified. This is useful if you don't want to build a fancy frontend for your web app, since the Swagger docs are interactive.

From the docs you can not only send requests (POST, GET, PUT, DELETE) but also specify default data for each endpoint. You can also include descriptions for each endpoint, field, and schema.
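A minimal sketch of how that metadata can be attached; the summary, description strings, and example value here are my own illustrative additions:

from pydantic import BaseModel, Field

class InputType(BaseModel):
    # description and example show up in the generated schema docs
    input_field1: float = Field(..., description="First numeric feature", example=0.5)
    input_field2: float = Field(0.1, description="Optional second feature")

@app.post("/input_predict", summary="Predict from typed input",
          description="Runs the model on the validated request body.")
def get_output_category(data: InputType):
    ...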

To use the ASGI support, you define a function with the async keyword and call other async utilities with await. This is especially helpful when processing big data, training models, or serving computer vision applications.

FastAPI has a very cool explanation of how concurrency and asynchronous code work, using burgers! Check that out here. Using asynchronous methods, you can train your model in the background while still serving the API, or run prediction on a large amount of data and return the output once it is ready.
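A rough sketch of that idea (the /heavy_prediction path and the use of a thread executor are my own additions, not from the article): a blocking predict call can be pushed off the event loop so other requests keep getting served in the meantime.

import asyncio

@app.get("/heavy_prediction")
async def heavy_prediction():
    loop = asyncio.get_running_loop()
    # Offload the blocking predict() call to a worker thread so the
    # event loop stays free to handle other incoming requests
    y_out = await loop.run_in_executor(
        None, lambda: new_model.predict(X_test)
    )
    return {"count": len(y_out)}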

For deploying on Docker, you can use the following image as a base and copy the contents of your app into it to create your own Docker image.

FROM tiangolo/uvicorn-gunicorn-fastapi:python3.7  
COPY ./app /app

Since we also depend on external libraries, we have to install them in our Dockerfile. This can be done with a RUN pip install instruction as follows:

RUN pip install -r requirements.txt

If your project does not have a requirements.txt file, you can simply install the libraries directly by including

RUN pip install scikit-learn numpy tensorflow tensorflow-gpu

and any other libraries that you have used.
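Putting the pieces together, a complete Dockerfile might look like the sketch below (assuming requirements.txt sits next to the Dockerfile and the application code lives in ./app):

FROM tiangolo/uvicorn-gunicorn-fastapi:python3.7
# Install dependencies first so this layer is cached between builds
COPY ./requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt
# Copy the application code (main.py, the model file, etc.)
COPY ./app /app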

Conclusion

Most of you are probably coming here from Flask, and you can see the similarity. What makes FastAPI better, beyond its speed, is the ASGI support, along with the auto-generated docs, which are really helpful. FastAPI is the missing middle ground between Django and Flask; for a Data Scientist or ML developer, Django is usually overkill, while FastAPI provides the much-needed upper hand in speed and asynchronous calling.

References

  1. https://fastapi.tiangolo.com/tutorial/response-model/
  2. https://fastapi.tiangolo.com/deployment/docker/
  3. A shoutout to this simple GitHub project, which can be cloned and run to see a basic random forest model working on FastAPI: https://github.com/kaustubhgupta/FastAPI-Demo
