Model Deployment using TorchServe

Kaustav Mandal · exemplifyML.ai · May 30, 2022

Synopsis: Deploy a trained ConvNeXt-B model instance on TorchServe for real-time image classification using the categories present in the Food 101 dataset.

TorchServe is a framework for serving PyTorch-trained models in a seamless, performant manner.

Photo by Kevin Ku on Unsplash

In this tutorial, we will walk through the steps for deploying a PyTorch model locally and in a Docker container.

Environment Setup:

We use Anaconda to manage Python libraries.

  • Once Anaconda is installed, set up the Python environment as shown below.
conda create --prefix <env_name_location> \
-c pytorch -c nvidia -c conda-forge \
torchserve torch-model-archiver \
torch-workflow-archiver \
python=3.8.12
conda activate <env_name_location>
  • Create the project base directory, and clone the TorchServe git repository.
mkdir <base_torchserve_dir>
git clone https://github.com/pytorch/serve.git
  • TorchServe provides model deployment in two flavors: CPU and GPU.
    In this tutorial, we will use the CPU version for deployment.
    Run the following commands to install all the dependencies for TorchServe.
cd <base_torchserve_dir>/serve
# Default for CPU-based inference
python ./ts_scripts/install_dependencies.py
# For GPU based inference
python ./ts_scripts/install_dependencies.py --cuda=cu113
# Optional - create a conda requirements file
conda list -e > requirements_conda_torchserve_min.txt
  • Create directories to store the trained model state and the model definition files, as shown below.
    For details on training an image classification model via transfer learning, see the earlier article in this series.
# creates a directory to store trained model definition / state files
mkdir -p <base_torchserve_dir>/serve/model_defs
# creates a directory to store the model in torchserve format
mkdir -p <base_torchserve_dir>/serve/model_store

Add the trained model to TorchServe:

  • Create a ConvNeXt-B classifier definition for loading the trained weights, as sketched below.
    This is required because the model used in this tutorial was trained with a custom classifier head for learning the food categories.
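The model definition itself appeared as an embedded gist in the original post; the following is a minimal sketch, assuming a torchvision ConvNeXt-B backbone whose final linear layer is replaced for the 101 Food 101 categories. The class name follows the one referenced later in this tutorial; the exact head must match whatever was used during training.

import torch.nn as nn
from torchvision import models

# Minimal sketch: ConvNeXt-B with the final linear layer swapped out
# for the 101 Food 101 categories. The replacement head must mirror the
# one used during training, or the saved state dict will not load.
class ImageClassifierConvNeXtB(nn.Module):
    def __init__(self, num_classes: int = 101):
        super().__init__()
        self.model = models.convnext_base(weights=None)
        in_features = self.model.classifier[2].in_features
        self.model.classifier[2] = nn.Linear(in_features, num_classes)

    def forward(self, x):
        return self.model(x)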
  • Create an index-to-name mapping for user-friendly food category names.
    Since the model was trained on the Food 101 dataset, the indexes must map to the food classes in the same order.
  • Note: PyTorch's ImageFolder dataloader maps indexes to classes in alphabetical sort order. The sorted class list is available under Food101/meta/classes.txt.
    We need to convert this plain list into a JSON object with key <index> and value <category name>; a helper sketch follows the image below.
    Gist link for the JSON file with the 101 indexed food categories from the Food 101 dataset.
Index to food categories json file ( Image by Author)
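A small helper along these lines can generate that mapping file (a sketch; the classes.txt path follows the dataset layout mentioned above, and "apple_pie" is the first class in alphabetical order):

import json

# Convert the alphabetically sorted Food 101 class list into the
# {"0": "apple_pie", "1": "baby_back_ribs", ...} mapping used by
# TorchServe's image classifier handler
with open("Food101/meta/classes.txt") as f:
    classes = [line.strip() for line in f if line.strip()]

index_to_name = {str(i): name for i, name in enumerate(classes)}

with open("index_to_name.json", "w") as out:
    json.dump(index_to_name, out, indent=2)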
  • Optional: extend the TorchServe image classifier handler, overriding default attributes.
    In this example, we return the top 10 results as opposed to the default top 5.
from ts.torch_handler.image_classifier import ImageClassifier

# Custom handler: return the top 10 predictions instead of the default 5
class ImageClassifierConvNeXtBTorchServe(ImageClassifier):
    topk = 10
  • Copy the following files to the <base_torchserve_dir>/serve/model_defs directory:
    – the PyTorch trained model, i.e. the .pth state dict file
    – the ImageClassifierConvNeXtBTorchServe custom TorchServe image classifier file
    – the ImageClassifierConvNeXtB model definition Python file
    – the index_to_name.json file, which maps the trained model's indexes to user-friendly category names

Build the TorchServe model archive file, which will be used for serving real-time inferences, via the torch-model-archiver command sketched below.
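The exact archiver invocation was not preserved in this copy; a representative torch-model-archiver command looks like the following, where the model, handler, and weight file names are assumptions based on the files copied into model_defs above.

# Run from <base_torchserve_dir>/serve; file names below are hypothetical
torch-model-archiver --model-name convnextbfood101 \
--version 1.0 \
--model-file model_defs/image_classifier_convnextb.py \
--serialized-file model_defs/convnextb_food101.pth \
--handler model_defs/image_classifier_convnextb_handler.py \
--extra-files model_defs/index_to_name.json \
--export-path model_store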

Once this command has executed successfully, it should store a file convnextbfood101.mar under the <base_torchserve_dir>/serve/model_store folder, as shown below.

Listing of files on the model defn, model store folder (Image by Author)

Run TorchServe:

Run the following command, with the conda environment activated, to start TorchServe.

torchserve --start --ncs \
--model-store <base_torchserve_dir>/serve/model_store \
--models foodclass=convnextbfood101.mar
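To confirm the model registered correctly, the TorchServe management API on port 8081 can be queried:

# List all registered models
curl http://127.0.0.1:8081/models
# Describe the food classification model
curl http://127.0.0.1:8081/models/foodclass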
  • Optional: add additional configuration properties for TorchServe, e.g. the number of workers per model.
    Create a config.properties file as shown below and store it under the <base_torchserve_dir>/serve/model_configs directory.
Example TorchServe config for CPU inference
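The configuration itself appeared as an image in the original post; a minimal sketch with commonly tuned TorchServe properties follows (the values shown are assumptions, not the author's exact settings).

# conf_food101.properties - example CPU inference settings (values assumed)
inference_address=http://127.0.0.1:8080
management_address=http://127.0.0.1:8081
metrics_address=http://127.0.0.1:8082
number_of_netty_threads=4
job_queue_size=100
default_workers_per_model=2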

To run TorchServe with the configuration file above, add the extra --ts-config parameter, which points to the configuration file.

torchserve --start --ncs \
--model-store <base_torchserve_dir>/serve/model_store \
--models foodclass=convnextbfood101.mar \
--ts-config <base_torchserve_dir>/serve/model_configs/conf_food101.properties

Real Time Inferences:

For real-time inferences, we can use either the gRPC API or the REST API.
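For the gRPC route, the cloned serve repository ships a sample Python client; after generating the gRPC stubs as described in the repository's gRPC documentation, an inference call looks roughly like this (model and file names taken from this tutorial):

# Infer via gRPC using the sample client from the serve repo
python ts_scripts/torchserve_grpc_client.py infer foodclass meal_salmon_zuch_caseylee.jpg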

For example, let’s use the REST API, to classify the food category for the image below.

Photo by Casey Lee on Unsplash — Testing classification for this food image
# Download the Unsplash image and save it as
# 'meal_salmon_zuch_caseylee.jpg'
curl -L -o meal_salmon_zuch_caseylee.jpg \
"https://unsplash.com/photos/awj7sRviVXo/download?force=true&w=1920"

# Run the inference on the image
curl http://127.0.0.1:8080/predictions/foodclass -T meal_salmon_zuch_caseylee.jpg

Inference Result:

Based on the output of the REST API call, grilled salmon has the highest probability, which is correct.

Note: As the trained model is only around 85% accurate, it will also assign incorrect categories to some images.

{
  "grilled_salmon": 0.6429812908172607,
  "shrimp_and_grits": 0.14269301295280457,
  "paella": 0.07746776193380356,
  "gnocchi": 0.049594469368457794,
  "pork_chop": 0.026730146259069443,
  "scallops": 0.020880170166492462,
  "risotto": 0.017987729981541634,
  "chicken_curry": 0.006645840592682362,
  "ceviche": 0.003106149611994624,
  "escargots": 0.0023537822999060154
}

Deployment on a Docker Container:

  • Navigate to the docker folder under the cloned TorchServe repository.
  • Update the Dockerfile with the command below; without this change, the build_image.sh script fails with a gnupg error.
RUN apt-get update && apt-get install -y gnupg
  • Run the build_image.sh script to create the container image.
cd <base_torchserve_dir>/serve/docker
# CPU-based inference container
./build_image.sh -b master -bt production -t food101convnextb1_0
# GPU-based inference container, with CUDA already installed, say 11.3
./build_image.sh -b master -bt production -t food101convnextb1_0gpucu113 -g -cv cu113
  • Create a new Docker configuration properties file, as sketched below.
    Note: only the deltas relative to the initial configuration file listed earlier in this tutorial change.
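The file itself was shown as an image in the original post; a plausible sketch of the Docker-specific deltas follows. Binding to 0.0.0.0 instead of 127.0.0.1 makes the endpoints reachable from outside the container, and the load_models entry is an assumption that mirrors the model name used by the later inference calls.

# config_food101_docker.properties - deltas from the earlier config (assumed)
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
# Register the model at startup, since the torchserve command used in the
# container below does not pass --models
load_models=foodclass=convnextbfood101.mar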
  • Run the following commands to deploy the Docker container for the food classification model.
# Run from <base_torchserve_dir>
# Mount the config file
# Mount the modelstore folder
# Override entrypoint command to point to the custom config file
docker run --rm -it \
-p 8080:8080 \
-p 8081:8081 \
-p 8082:8082 \
-p 7070:7070 \
-p 7071:7071 \
--mount type=bind,source=$(pwd)/serve/model_store,target=/tmp/modelstore \
--mount type=bind,source=$(pwd)/serve/model_configs/config_food101_docker.properties,target=/tmp/config.properties \
food101convnextb1_0:latest \
torchserve --model-store=/tmp/modelstore \
--ts-config=/tmp/config.properties
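Once the container is up, TorchServe's ping endpoint provides a quick health check before sending inference requests:

# Health check against the containerized TorchServe instance
curl http://127.0.0.1:8080/ping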

Docker based Real Time Inferences:

For testing the food classification inference hosted on the Docker container, a sample image is provided below.

Photo by Sebastian Coman Photography on Unsplash — Testing classification for this food image
  • Download the image and run it against the TorchServe model deployed on Docker.
# Download the image from Unsplash
curl -L -o sebastian-coman-photography-RwDmUKpUP50-unsplash.jpg \
"https://unsplash.com/photos/RwDmUKpUP50/download?ixid=MnwxMjA3fDB8MXxzZWFyY2h8OHx8Y2FuYXBlfGVufDB8fHx8MTY1Mzk0NDE4NQ&force=true"

# Call TorchServe on Docker for the downloaded image
curl http://127.0.0.1:8080/predictions/foodclass \
-T sebastian-coman-photography-RwDmUKpUP50-unsplash.jpg

Docker based Inference Results:

The model is able to classify the food as scallops with a probability of 96%.

{
  "scallops": 0.9640653133392334,
  "grilled_salmon": 0.027381636202335358,
  "filet_mignon": 0.005994760897010565,
  "hamburger": 0.0007836359436623752,
  "bibimbap": 0.0006225404795259237,
  "foie_gras": 0.0004113266186323017,
  "steak": 0.00012357658124528825,
  "crab_cakes": 0.00010399901657365263,
  "tuna_tartare": 8.00026609795168e-05,
  "sushi": 7.516794721595943e-05
}

Conclusion:

TorchServe is a convenient tool for deploying and scaling multiple models trained with PyTorch. It also provides integrations with Kubernetes, MLflow, and the Google Vertex AI managed platform, among others.

