Run inference with your custom-trained Object Detection Transformer (DETR) model on AWS SageMaker using the PyTorchModel library
Hello and welcome to a new tutorial! In this tutorial we deploy a custom-trained DETR model for object detection directly on AWS, utilizing the PyTorch Deep Learning Inference Containers. The object detection model is trained to classify and label particles found in SEM images of metal powder feedstock used for metal Additive Manufacturing (AM).
I will not go into detail about the custom model itself, as the purpose of this tutorial is to showcase how easy it is to deploy your DETR model using the AWS PyTorch Deep Learning Containers.
A great article on how to run inference with your own trained NLP model on AWS SageMaker with PyTorchModel or HuggingFaceModel is provided in this post by Spyros Dimitriadis.
To train your DETR model on a custom dataset, feel free to follow the instructions found in this GitHub repo by Niels Rogge. The training is based on the repo provided by the Facebook Research group, found here.
The post is split into 4 parts:
- Save the model
- Custom inference.py script
- Create a model.tar.gz file and upload it on S3
- Deploy the endpoint with AWS Sagemaker Python SDK
1. Save the model
To save the model you have trained, follow the code found below
import os
import torch

# ... train `model`, then save it to `model_dir`
with open(os.path.join(args.model_dir, 'model.pt'), 'wb') as f:
    torch.save(model.state_dict(), f)
2. Custom inference script
The inference.py script is used to adapt model inference to your needs. We will override the default handler functions provided by AWS. The functions that we will override are the following:
- model_fn: loads the model and returns it for use in predict_fn
- input_fn: handles data decoding. In our case the images are encoded and sent in a JSON payload, so this function decodes the input request
- predict_fn: after the inference request has been deserialized by input_fn, the SageMaker PyTorch model server invokes predict_fn on the return value of input_fn
- output_fn: after invoking predict_fn, the model server invokes output_fn for data post-processing
The following gist includes all the changes needed for the inference.py script. It also includes the definition of the DETR model as well as the DETR inference model.
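If you cannot open the gist, the skeleton below sketches what the four handlers might look like. This is a minimal, hedged sketch: the pretrained checkpoint name, the number of labels, the JSON field `image`, and the 0.9 confidence threshold are all assumptions, not the exact values from the gist.

```python
# inference.py -- minimal sketch of the four SageMaker handlers.
# Checkpoint name, num_labels, the 'image' JSON field, and the 0.9
# confidence threshold are illustrative assumptions.
import base64
import io
import json
import os

import torch
from PIL import Image


def model_fn(model_dir):
    """Load the trained weights saved as model.pt."""
    # Imported lazily so the other handlers stay independent of transformers.
    from transformers import DetrForObjectDetection

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = DetrForObjectDetection.from_pretrained(
        'facebook/detr-resnet-50',        # assumed base checkpoint
        num_labels=2,                     # assumed number of custom classes
        ignore_mismatched_sizes=True,
    )
    with open(os.path.join(model_dir, 'model.pt'), 'rb') as f:
        model.load_state_dict(torch.load(f, map_location=device))
    return model.to(device).eval()


def input_fn(request_body, content_type='application/json'):
    """Decode the base64-encoded image sent in the JSON request."""
    data = json.loads(request_body)
    image_bytes = base64.b64decode(data['image'])
    return Image.open(io.BytesIO(image_bytes)).convert('RGB')


def predict_fn(input_object, model):
    """Run the DETR forward pass on the decoded image."""
    from transformers import DetrImageProcessor

    processor = DetrImageProcessor.from_pretrained('facebook/detr-resnet-50')
    device = next(model.parameters()).device
    inputs = processor(images=input_object, return_tensors='pt').to(device)
    with torch.no_grad():
        return model(**inputs)


def output_fn(predictions, accept='application/json'):
    """Keep boxes whose class confidence exceeds an assumed 0.9 threshold."""
    probas = predictions.logits.softmax(-1)[0, :, :-1]  # drop no-object class
    keep = probas.max(-1).values > 0.9
    boxes = predictions.pred_boxes[0, keep].tolist()  # [xC, yC, w, h], normalized
    return json.dumps({'boxes': boxes})
```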
3. Create and upload the model.tar.gz file on S3
In order for the inference to run smoothly, a specific directory structure is needed within the tar.gz file:
model.tar.gz/
├── model.pt
└── code/
├── inference.py
└── requirements.txt
Make sure that the structure is correct, or else the inference will not succeed!
The requirements.txt file includes the following libraries:
numpy
pillow
torchvision
pytorch-lightning
transformers
timm
Let’s create the .tar.gz file:
!tar -cvpzf model.tar.gz model.pt ./code
We assume that you are already familiar with AWS services. We are going to use boto3 (AWS Python SDK) and sagemaker Python library to upload the artifacts directly on S3:
Note: You need to create an S3 bucket before you run the code found below
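The upload cell might look like the following sketch; the bucket name and key prefix are placeholders for your own values, and the call assumes valid AWS credentials are configured:

```python
# Upload model.tar.gz to an existing S3 bucket using the sagemaker library.
# Bucket name and prefix below are placeholders -- replace with your own.
import sagemaker

sess = sagemaker.Session()
model_uri = sess.upload_data(
    path='model.tar.gz',
    bucket='my-detr-bucket',         # assumed: a bucket you created beforehand
    key_prefix='detr-object-detection',
)
print(model_uri)  # e.g. s3://my-detr-bucket/detr-object-detection/model.tar.gz
```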
4. Deploy model and endpoint with the AWS PyTorchModel
In our view, it is easiest to run all the cells provided above and below in a single notebook.
Notes:
- PyTorchModel: creates a model within AWS SageMaker. You need to specify the model.tar.gz location on S3, the PyTorch version, the Python version, an IAM role with permissions to create an endpoint, and finally which inference code to use (the entry point) and where it is located (code/inference.py)
- predictor: these are the lines of code that create the endpoint. The user specifies the instance type, a chosen endpoint_name, and a (de)serializer. The serializer specifies that the model expects a JSON file, and the deserializer specifies that the output will be returned as JSON.
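Those two steps might look like the sketch below; the S3 URI, IAM role ARN, framework/Python versions, instance type, and endpoint name are all placeholder assumptions you should replace with your own values:

```python
# Sketch of creating the SageMaker model and deploying the endpoint.
# All identifiers below (S3 URI, role ARN, versions, endpoint name) are
# placeholders, not values from the original gist.
from sagemaker.pytorch import PyTorchModel
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

pytorch_model = PyTorchModel(
    model_data='s3://my-detr-bucket/detr-object-detection/model.tar.gz',
    role='arn:aws:iam::123456789012:role/SageMakerExecutionRole',  # assumed
    framework_version='1.9.0',        # assumed PyTorch DLC version
    py_version='py38',
    entry_point='inference.py',
    source_dir='code',
)

predictor = pytorch_model.deploy(
    initial_instance_count=1,
    instance_type='ml.g4dn.xlarge',   # assumed instance type
    endpoint_name='detr-object-detection-endpoint',
    serializer=JSONSerializer(),       # endpoint expects JSON in
    deserializer=JSONDeserializer(),   # and returns JSON out
)
```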
Wait a couple of minutes for the model to be deployed. Once it's finished, we can use it for inference with a few lines of code:
The output is a list of lists containing the bounding boxes of the identified objects, in the format [xC, yC, width, height].
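An invocation cell might look like the sketch below. It assumes the endpoint expects a base64-encoded image in a JSON field named `image` and returns a JSON object with a `boxes` field; both field names and the sample file path are hypothetical:

```python
# Sketch of calling the deployed endpoint through the predictor object.
# The file path and the 'image'/'boxes' JSON field names are assumptions.
import base64

with open('sample_sem_image.png', 'rb') as f:   # assumed local test image
    payload = {'image': base64.b64encode(f.read()).decode('utf-8')}

result = predictor.predict(payload)
print(result['boxes'])   # list of [xC, yC, width, height] bounding boxes
```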
That is it!
Don’t forget to delete the model and endpoint, or you will incur extra charges!
predictor.delete_model()
predictor.delete_endpoint()
EXTRA:
If the model has already been deployed and you want to test it, feel free to use the code found below:
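One way to do this, sketched under the assumption that the endpoint name matches the one used at deployment (a placeholder here), is to attach a new predictor to the existing endpoint:

```python
# Re-attach to an already-deployed endpoint; the endpoint name is a
# placeholder and must match the one you deployed with.
from sagemaker.pytorch.model import PyTorchPredictor
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

predictor = PyTorchPredictor(
    endpoint_name='detr-object-detection-endpoint',
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
)
# predictor.predict(...) can now be called exactly as before.
```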
Thanks for reading! Stay tuned for more!