Stop wasting time: deploy an ML model using FastAPI

Use FastAPI to get automatic API documentation with your model deployment

Scott Dallman
CodeX
3 min read · Oct 28, 2022


Photo by Brad Neathery on Unsplash

I had just recently finished creating an XGBoost regression model for a time series forecast predicting the future pressure drawn from a cylinder. Once my model was trained up and ready for the next stage, I wondered how I could quickly operationalize the output so it would be useful and production ready for others.

I had always used Flask to stand up a web endpoint and serve a model: take in a REST request, send back the prediction. That structure worked well when I was running and operating everything myself. Then I started hearing about people using FastAPI to deploy endpoints, and that FastAPI includes self-documentation in the deployment. I was like, wow, this could save me some time. Instead of explaining how to use the API, I can just point people to the /docs endpoint and off they go. No more writing additional documentation about schemas or datatypes.

Let’s walk through the deployment. This is an ML model, so the first item on the list is to load it. I do that outside of the POST request handler so it loads only once, on application startup. From the get-go the code felt very similar to Flask: you still create an app object from the library and map a path to a function. One of the good features I found in FastAPI is that it’s fully integrated with Pydantic, so in the code below I define all of my datatypes in a class and apply that class to the inbound request. To get a response from the model, I load the inbound data into a DataFrame. Because the model is time series based, I also need to move the timestamp to the index, but after that, it’s just a call to the predict function.

Just to work out any bugs and validate your code, I suggest running it in your IDE first (assuming you are already in your virtual environment). For a quick test, run `uvicorn mlapi:app --reload` in your command prompt; the `--reload` flag automatically restarts the server whenever you resave your code after an update.

I wanted to package everything up in a container to make it easily transportable, so I used Docker. You can review the Dockerfile below; it’s very simple: start from Python 3.9, set a working directory, install the requirements, copy the code, and run the server. (Yes, I know I could do a single COPY, but I like to break it up.)

Now let's review the results once the container is running. I run mine on Google Cloud using Cloud Run, deploying with `gcloud run deploy <service-name> --image <image_name>`.

To give a quick review of the generated docs: FastAPI uses the OpenAPI specification and creates a JSON schema for the API (served at /openapi.json, with the interactive docs at /docs). You can also try out the API right in the interface, so you do not have to use Postman.

FastAPI screenshot of the integrated docs

Writing the code for this model deployment took about the same time as it would with Flask, but this was one of the fastest deployments I have had for a model. There was no need to create any additional documentation or explain how the API call worked, because it was all generated from the code itself. So the next time you are deploying an endpoint, try out FastAPI; it worked well for me!

If you like the content I would appreciate a follow!


Scott Dallman

Writing about technology and tech trends as a husband, father, all-around technology guy, bad golfer and Googler