Model Deployment using TorchServe
Synopsis: Deploy a trained ConvNeXt-B model on TorchServe for real-time image classification using the categories present in the Food 101 dataset.
TorchServe is a framework for serving PyTorch-trained models in a seamless, performant manner.
In this tutorial, we will walk through the steps for deploying a PyTorch model locally and in a Docker container.
Environment Setup:
We are using Anaconda for managing Python libraries.
- Once Anaconda is installed, we can set up the Python environment as shown below.
conda create --prefix <env_name_location> \
-c pytorch -c nvidia -c conda-forge \
torchserve torch-model-archiver \
torch-workflow-archiver \
python=3.8.12
conda activate <env_name_location>
- Create the project base directory and clone the TorchServe Git repository.
mkdir <base_torchserve_dir>
git clone https://github.com/pytorch/serve.git
- TorchServe provides model deployment in two flavors: CPU and GPU.
In this tutorial, we will be using the CPU version for deployment.
Run the following commands to install all the dependencies for TorchServe.
cd <base_torchserve_dir>/serve
# Default for CPU based inference
python ./ts_scripts/install_dependencies.py
# For GPU based inference
python ./ts_scripts/install_dependencies.py --cuda=cu113
# Optional - create a conda requirements file
conda list -e > requirements_conda_torchserve_min.txt
- Create directories to store the trained model and the model definition for the trained weights.
For details on training an image classification model via transfer learning, see Part 2 of this series.
# creates a directory to store trained model definition / state files
mkdir -p <base_torchserve_dir>/serve/model_defs
# creates a directory to store the model in torchserve format
mkdir -p <base_torchserve_dir>/serve/model_store
Add trained model to TorchServe:
- Create a ConvNeXt-B classifier definition for loading the trained weights.
This is required because the model used in this tutorial was trained with a custom classifier head for learning the food categories.
- Create an index to user-friendly category name mapping for the food categories.
As we are using a model trained on the Food 101 dataset, we need to map indexes to food classes in the same order used during training.
Note: PyTorch's ImageFolder dataloader maps indexes to classes based on an alphabetical sort order. This sorted list of classes is available under Food101/meta/classes.txt.
We need to convert this plain list into a JSON object with key <index> and value <category name>.
Gist link: JSON file with the 101 indexed food categories from the Food 101 dataset.
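The conversion can be scripted in a few lines; here is a small sketch assuming the classes.txt layout of one category name per line (the function name is our own):

```python
import json

def build_index_to_name(classes_path: str, out_path: str) -> dict:
    """Convert a plain class list into TorchServe's index_to_name.json.

    ImageFolder assigns indexes in alphabetical order, so the list is
    sorted before numbering to reproduce the training-time mapping.
    """
    with open(classes_path) as f:
        classes = sorted(line.strip() for line in f if line.strip())
    mapping = {str(i): name for i, name in enumerate(classes)}
    with open(out_path, "w") as f:
        json.dump(mapping, f, indent=2)
    return mapping
```

Usage: build_index_to_name("Food101/meta/classes.txt", "index_to_name.json").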
- Optional: Extend the TorchServe image classifier to override preprocessing behavior or default attributes.
In this example, we return the top 10 results as opposed to the default top 5.
from ts.torch_handler.image_classifier import ImageClassifier

class ImageClassifierConvNeXtBTorchServe(ImageClassifier):
    topk = 10
- Copy the following files to the <base_torchserve_dir>/serve/model_defs directory:
– the PyTorch trained model .pth state dict file
– the ImageClassifierConvNeXtBTorchServe TorchServe custom image classifier file
– the ImageClassifierConvNeXtB model definition Python file
– the index_to_name.json file, which maps trained model indexes to user-friendly category names
Build the TorchServe model archive file, which will be used for generating real-time inferences.
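The archive is built with the torch-model-archiver tool installed during environment setup. A sketch of the invocation, assuming the file names from the previous step; the flags are standard torch-model-archiver options, while the exact weights file name is an assumption and should match how the trained state dict was saved:

```shell
# Package model definition, weights, handler and category mapping
# into a single .mar archive under model_store
torch-model-archiver --model-name convnextbfood101 \
    --version 1.0 \
    --model-file <base_torchserve_dir>/serve/model_defs/ImageClassifierConvNeXtB.py \
    --serialized-file <base_torchserve_dir>/serve/model_defs/<trained_weights>.pth \
    --handler <base_torchserve_dir>/serve/model_defs/ImageClassifierConvNeXtBTorchServe.py \
    --extra-files <base_torchserve_dir>/serve/model_defs/index_to_name.json \
    --export-path <base_torchserve_dir>/serve/model_store
```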
Once the archiver command has executed successfully, it should store a file convnextbfood101.mar under the <base_torchserve_dir>/serve/model_store folder.
Run TorchServe:
Run the following command, with the conda environment enabled, to start TorchServe.
torchserve --start --ncs \
--model-store <base_torchserve_dir>/serve/model_store \
--models foodclass=convnextbfood101.mar
- Optional: Add additional configuration properties for TorchServe, e.g. the number of workers per model.
Create a config.properties file and store it under the <base_torchserve_dir>/serve/model_configs directory.
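As an illustration, a minimal conf_food101.properties could look like the following; the keys shown are standard TorchServe configuration properties, while the values are example choices rather than the article's original settings:

```
inference_address=http://127.0.0.1:8080
management_address=http://127.0.0.1:8081
metrics_address=http://127.0.0.1:8082
default_workers_per_model=4
```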
To run TorchServe with the configuration file above, add an extra parameter that points to the configuration file.
torchserve --start --ncs \
--model-store <base_torchserve_dir>/serve/model_store \
--models foodclass=convnextbfood101.mar \
--ts-config <base_torchserve_dir>/serve/model_configs/conf_food101.properties
Real-Time Inferences:
To receive real-time inferences, we can use either the gRPC API or the REST API.
For example, let's use the REST API to classify the food category for the image below.
# Download the unsplash image
# Named as 'meal_salmon_zuch_caseylee.jpg'
# Url to download the image
https://unsplash.com/photos/awj7sRviVXo/download?force=true&w=1920

# Run the inference on the image
curl http://127.0.0.1:8080/predictions/foodclass -T meal_salmon_zuch_caseylee.jpg
Inference Result:
Based on the output of the REST API call, grilled_salmon has the highest probability, which is correct.
Note: As the trained model has around 85% accuracy, it will also assign incorrect categories to certain images.
{
"grilled_salmon": 0.6429812908172607,
"shrimp_and_grits": 0.14269301295280457,
"paella": 0.07746776193380356,
"gnocchi": 0.049594469368457794,
"pork_chop": 0.026730146259069443,
"scallops": 0.020880170166492462,
"risotto": 0.017987729981541634,
"chicken_curry": 0.006645840592682362,
"ceviche": 0.003106149611994624,
"escargots": 0.0023537822999060154
}
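When the response is consumed programmatically rather than read by eye, the top prediction can be pulled out with a few lines of standard-library Python (the helper name here is our own):

```python
import json

def top_category(response_text: str):
    """Return the (category, probability) pair with the highest score
    from a TorchServe classification response."""
    scores = json.loads(response_text)
    return max(scores.items(), key=lambda kv: kv[1])

# For the response above, this yields
# ("grilled_salmon", 0.6429812908172607)
```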
Deployment on a Docker Container:
- Navigate to the docker folder under the cloned torchserve repository.
- Update the Dockerfile with the following command, as we ran into a gnupg error during the image build.
RUN apt-get update && apt-get install -y gnupg
- Run the build_image.sh script for creating the container.
cd <base_torchserve_dir>/serve/docker
# CPU based inference container
./build_image.sh -b master -bt production -t food101convnextb1_0
# GPU based inference container, with CUDA already installed, say 11.3
./build_image.sh -b master -bt production -t food101convnextb1_0gpucu113 -g -cv cu113
- Create a new Docker configuration properties file.
Note: only the settings that differ from the initial configuration file described earlier in this tutorial need to change.
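A typical delta when moving the configuration into a container is to bind the listeners to all interfaces so that the published ports are reachable from the host; an illustrative config_food101_docker.properties fragment (values are examples, not the article's original settings):

```
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
```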
- Run the following commands, to deploy the docker container for the food classification model.
# Run from <base_torchserve_dir>
# Mount the config file
# Mount the model_store folder
# Override the entrypoint command to point to the custom config file
docker run --rm -it \
-p 8080:8080 \
-p 8081:8081 \
-p 8082:8082 \
-p 7070:7070 \
-p 7071:7071 \
--mount type=bind,source=$(pwd)/serve/model_store,target=/tmp/modelstore \
--mount type=bind,source=$(pwd)/serve/model_configs/config_food101_docker.properties,target=/tmp/config.properties \
food101convnextb1_0:latest \
torchserve --model-store=/tmp/modelstore \
--ts-config=/tmp/config.properties
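Once the container is up, TorchServe's standard health check endpoint can be used to confirm the frontend is reachable before sending images:

```shell
# Health check against the inference port published by the container;
# a healthy instance responds with {"status": "Healthy"}
curl http://127.0.0.1:8080/ping
```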
Docker-based Real-Time Inferences:
For testing the food classification inference when hosted on a Docker container, a sample image is provided below.
- Download the image and run it against the TorchServe model deployed on Docker.
# Url to download the image
https://unsplash.com/photos/RwDmUKpUP50/download?ixid=MnwxMjA3fDB8MXxzZWFyY2h8OHx8Y2FuYXBlfGVufDB8fHx8MTY1Mzk0NDE4NQ&force=true

# Call torchserve on docker for this downloaded image
curl http://127.0.0.1:8080/predictions/foodclass \
-T sebastian-coman-photography-RwDmUKpUP50-unsplash.jpg
Docker based Inference Results:
The model is able to classify the food as scallops with a probability of 96%.
{
"scallops": 0.9640653133392334,
"grilled_salmon": 0.027381636202335358,
"filet_mignon": 0.005994760897010565,
"hamburger": 0.0007836359436623752,
"bibimbap": 0.0006225404795259237,
"foie_gras": 0.0004113266186323017,
"steak": 0.00012357658124528825,
"crab_cakes": 0.00010399901657365263,
"tuna_tartare": 8.00026609795168e-05,
"sushi": 7.516794721595943e-05
}
Conclusion:
TorchServe is a convenient tool for deploying and scaling multiple models trained with PyTorch. It also provides integrations for Kubernetes, MLflow, and the Google Vertex AI managed platform, among others.
Other Articles in this series:
- Setting up multi GPU processing in Pytorch — Part 1
- Image Classification with ResNet, ConvNeXt using PyTorch — Part 2
References:
- PyTorch Documentation
- TorchServe Documentation
- Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie. (2022). A ConvNet for the 2020s
- Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo. (2021). Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
- Dan Hendrycks, Kevin Gimpel. (2016). Gaussian Error Linear Units (GELUs)
- Food 101 Dataset
- Bartt, Alvaro. (2021). Serving PyTorch models with TorchServe