Model Deployment using TorchServe

Kaustav Mandal · exemplifyML.ai · May 30, 2022

Synopsis: Deploy a trained ConvNeXt-B model instance on TorchServe for real-time image classification using the categories present in the Food 101 dataset.

TorchServe is a framework for serving PyTorch-trained models in a seamless, performant manner.

Photo by Kevin Ku on Unsplash

In this tutorial, we will walk through the steps for deploying a PyTorch model locally and in a Docker container.

Environment Setup:

We use Anaconda to manage Python libraries.

  • Once Anaconda is installed, set up the Python environment as shown below.
conda create --prefix <env_name_location> \
-c pytorch -c nvidia -c conda-forge \
torchserve torch-model-archiver \
torch-workflow-archiver \
python=3.8.12
conda activate <env_name_location>
  • Create the project base directory, and clone the TorchServe git repository.
mkdir <base_torchserve_dir>
git clone https://github.com/pytorch/serve.git
  • TorchServe provides model deployment in two flavors: CPU and GPU.
    In this tutorial, we will use the CPU version for deployment.
    Run the following commands to install all the dependencies for TorchServe.
cd <base_torchserve_dir>/serve
# Default for CPU-based inference
python ./ts_scripts/install_dependencies.py
# For GPU based inference
python ./ts_scripts/install_dependencies.py --cuda=cu113
# Optional - create a conda requirements file
conda list -e > requirements_conda_torchserve_min.txt
  • Create directories to store the trained model state and the model definition files, as shown below.
    For details on training an image classification model via transfer learning, see the earlier article in this series.
# creates a directory to store trained model definition / state files
mkdir -p <base_torchserve_dir>/serve/model_defs
# creates a directory to store the model in torchserve format
mkdir -p <base_torchserve_dir>/serve/model_store

Add the trained model to TorchServe:

  • Create a ConvNeXt-B classifier definition for loading the trained weights, as sketched below.
    This is required because the model used in this tutorial was trained with a custom classifier head for learning the food categories.
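The model definition itself appeared as an embedded gist in the original post; the following is a minimal sketch, assuming a torchvision ConvNeXt-B backbone whose final linear layer is replaced for the 101 Food 101 categories. The class name follows the one referenced later in this tutorial; the exact head must match whatever was used during training.

import torch.nn as nn
from torchvision import models

# Minimal sketch: ConvNeXt-B with the final linear layer swapped out
# for the 101 Food 101 categories. The replacement head must mirror the
# one used during training, or the saved state dict will not load.
class ImageClassifierConvNeXtB(nn.Module):
    def __init__(self, num_classes: int = 101):
        super().__init__()
        self.model = models.convnext_base(weights=None)
        in_features = self.model.classifier[2].in_features
        self.model.classifier[2] = nn.Linear(in_features, num_classes)

    def forward(self, x):
        return self.model(x)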
  • Create an index-to-name mapping for user-friendly food category names.
    Since the model was trained on the Food 101 dataset, the indexes must map to the food classes in the same order.
  • Note: PyTorch's ImageFolder dataloader maps indexes to classes in alphabetical sort order. The sorted class list is available under Food101/meta/classes.txt.
    We need to convert this plain list into a JSON object with key <index> and value <category name>; a helper sketch follows the image below.
    Gist link for the JSON file with the 101 indexed food categories from the Food 101 dataset.
Index to food categories json file ( Image by Author)
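A small helper along these lines can generate that mapping file (a sketch; the classes.txt path follows the dataset layout mentioned above, and "apple_pie" is the first class in alphabetical order):

import json

# Convert the alphabetically sorted Food 101 class list into the
# {"0": "apple_pie", "1": "baby_back_ribs", ...} mapping used by
# TorchServe's image classifier handler
with open("Food101/meta/classes.txt") as f:
    classes = [line.strip() for line in f if line.strip()]

index_to_name = {str(i): name for i, name in enumerate(classes)}

with open("index_to_name.json", "w") as out:
    json.dump(index_to_name, out, indent=2)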
  • Optional: extend the TorchServe image classifier handler, overriding default attributes.
    In this example, we return the top 10 results as opposed to the default top 5.
from ts.torch_handler.image_classifier import ImageClassifier

# Custom handler: return the top 10 predictions instead of the default 5
class ImageClassifierConvNeXtBTorchServe(ImageClassifier):
    topk = 10
  • Copy the following files to the <base_torchserve_dir>/serve/model_defs directory:
    – the PyTorch trained model, i.e. the .pth state dict file
    – the ImageClassifierConvNeXtBTorchServe custom TorchServe image classifier file
    – the ImageClassifierConvNeXtB model definition Python file
    – the index_to_name.json file, which maps the trained model's indexes to user-friendly category names

Build the TorchServe model archive file, which will be used for serving real-time inferences, via the torch-model-archiver command sketched below.
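The exact archiver invocation was not preserved in this copy; a representative torch-model-archiver command looks like the following, where the model, handler, and weight file names are assumptions based on the files copied into model_defs above.

# Run from <base_torchserve_dir>/serve; file names below are hypothetical
torch-model-archiver --model-name convnextbfood101 \
--version 1.0 \
--model-file model_defs/image_classifier_convnextb.py \
--serialized-file model_defs/convnextb_food101.pth \
--handler model_defs/image_classifier_convnextb_handler.py \
--extra-files model_defs/index_to_name.json \
--export-path model_store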

Once this command has executed successfully, it should store a file convnextbfood101.mar under the <base_torchserve_dir>/serve/model_store folder, as shown below.

Listing of files on the model defn, model store folder (Image by Author)

Run TorchServe:

Run the following command, with the conda environment activated, to start TorchServe.

torchserve --start --ncs \
--model-store <base_torchserve_dir>/serve/model_store \
--models foodclass=convnextbfood101.mar
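To confirm the model registered correctly, the TorchServe management API on port 8081 can be queried:

# List all registered models
curl http://127.0.0.1:8081/models
# Describe the food classification model
curl http://127.0.0.1:8081/models/foodclass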
  • Optional: add additional configuration properties for TorchServe, e.g. the number of workers per model.
    Create a config.properties file as shown below and store it under the <base_torchserve_dir>/serve/model_configs directory.
Example TorchServe config for CPU inference
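The configuration itself appeared as an image in the original post; a minimal sketch with commonly tuned TorchServe properties follows (the values shown are assumptions, not the author's exact settings).

# conf_food101.properties - example CPU inference settings (values assumed)
inference_address=http://127.0.0.1:8080
management_address=http://127.0.0.1:8081
metrics_address=http://127.0.0.1:8082
number_of_netty_threads=4
job_queue_size=100
default_workers_per_model=2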

To run TorchServe with the configuration file above, add the extra --ts-config parameter, which points to the configuration file.

torchserve --start --ncs \
--model-store <base_torchserve_dir>/serve/model_store \
--models foodclass=convnextbfood101.mar \
--ts-config <base_torchserve_dir>/serve/model_configs/conf_food101.properties

Real Time Inferences:

For real-time inferences, we can use either the gRPC API or the REST API.
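For the gRPC route, the cloned serve repository ships a sample Python client; after generating the gRPC stubs as described in the repository's gRPC documentation, an inference call looks roughly like this (model and file names taken from this tutorial):

# Infer via gRPC using the sample client from the serve repo
python ts_scripts/torchserve_grpc_client.py infer foodclass meal_salmon_zuch_caseylee.jpg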

For example, let’s use the REST API, to classify the food category for the image below.

Photo by Casey Lee on Unsplash — Testing classification for this food image
# Download the Unsplash image and save it as
# 'meal_salmon_zuch_caseylee.jpg'
curl -L -o meal_salmon_zuch_caseylee.jpg \
"https://unsplash.com/photos/awj7sRviVXo/download?force=true&w=1920"

# Run the inference on the image
curl http://127.0.0.1:8080/predictions/foodclass -T meal_salmon_zuch_caseylee.jpg

Inference Result:

Based on the output of the REST API call, grilled salmon has the highest probability, which is correct.

Note: As the trained model is only around 85% accurate, it will also assign incorrect categories to some images.

{
  "grilled_salmon": 0.6429812908172607,
  "shrimp_and_grits": 0.14269301295280457,
  "paella": 0.07746776193380356,
  "gnocchi": 0.049594469368457794,
  "pork_chop": 0.026730146259069443,
  "scallops": 0.020880170166492462,
  "risotto": 0.017987729981541634,
  "chicken_curry": 0.006645840592682362,
  "ceviche": 0.003106149611994624,
  "escargots": 0.0023537822999060154
}

Deployment on a Docker Container:

  • Navigate to the docker folder under the cloned TorchServe repository.
  • Update the Dockerfile with the command below; without this change, the build_image.sh script fails with a gnupg error.
RUN apt-get update && apt-get install -y gnupg
  • Run the build_image.sh script to create the container image.
cd <base_torchserve_dir>/serve/docker
# CPU-based inference container
./build_image.sh -b master -bt production -t food101convnextb1_0
# GPU-based inference container, with CUDA already installed, say 11.3
./build_image.sh -b master -bt production -t food101convnextb1_0gpucu113 -g -cv cu113
  • Create a new Docker configuration properties file, as sketched below.
    Note: only the deltas relative to the initial configuration file listed earlier in this tutorial change.
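The file itself was shown as an image in the original post; a plausible sketch of the Docker-specific deltas follows. Binding to 0.0.0.0 instead of 127.0.0.1 makes the endpoints reachable from outside the container, and the load_models entry is an assumption that mirrors the model name used by the later inference calls.

# config_food101_docker.properties - deltas from the earlier config (assumed)
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
# Register the model at startup, since the torchserve command used in the
# container below does not pass --models
load_models=foodclass=convnextbfood101.mar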
  • Run the following commands to deploy the Docker container for the food classification model.
# Run from <base_torchserve_dir>
# Mount the config file
# Mount the modelstore folder
# Override entrypoint command to point to the custom config file
docker run --rm -it \
-p 8080:8080 \
-p 8081:8081 \
-p 8082:8082 \
-p 7070:7070 \
-p 7071:7071 \
--mount type=bind,source=$(pwd)/serve/model_store,target=/tmp/modelstore \
--mount type=bind,source=$(pwd)/serve/model_configs/config_food101_docker.properties,target=/tmp/config.properties \
food101convnextb1_0:latest \
torchserve --model-store=/tmp/modelstore \
--ts-config=/tmp/config.properties
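Once the container is up, TorchServe's ping endpoint provides a quick health check before sending inference requests:

# Health check against the containerized TorchServe instance
curl http://127.0.0.1:8080/ping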

Docker based Real Time Inferences:

For testing the food classification inference hosted on the Docker container, a sample image is provided below.

Photo by Sebastian Coman Photography on Unsplash — Testing classification for this food image
  • Download the image and run it against the TorchServe model deployed on Docker.
# Download the image from Unsplash
curl -L -o sebastian-coman-photography-RwDmUKpUP50-unsplash.jpg \
"https://unsplash.com/photos/RwDmUKpUP50/download?ixid=MnwxMjA3fDB8MXxzZWFyY2h8OHx8Y2FuYXBlfGVufDB8fHx8MTY1Mzk0NDE4NQ&force=true"

# Call TorchServe on Docker for the downloaded image
curl http://127.0.0.1:8080/predictions/foodclass \
-T sebastian-coman-photography-RwDmUKpUP50-unsplash.jpg

Docker based Inference Results:

The model is able to classify the food as scallops with a probability of 96%.

{
  "scallops": 0.9640653133392334,
  "grilled_salmon": 0.027381636202335358,
  "filet_mignon": 0.005994760897010565,
  "hamburger": 0.0007836359436623752,
  "bibimbap": 0.0006225404795259237,
  "foie_gras": 0.0004113266186323017,
  "steak": 0.00012357658124528825,
  "crab_cakes": 0.00010399901657365263,
  "tuna_tartare": 8.00026609795168e-05,
  "sushi": 7.516794721595943e-05
}

Conclusion:

TorchServe is a convenient tool for deploying and scaling multiple models trained with PyTorch. It also provides integrations with Kubernetes, MLflow, and the Google Vertex AI managed platform, among others.

