Train FasterRCNN faster with 16-bit precision in Detectron2

Siladittya Manna
Published in The Owl
2 min read · Sep 2, 2023

One of the selling points of Detectron2 is that it is faster than its predecessor, Detectron. With PyTorch Lightning, it is also possible to train models in float16, bfloat16, float32, and other precisions. To run large models, fit a larger batch size, or speed up training, it is advisable to use Automatic Mixed Precision (AMP) in PyTorch.

Using PyTorch Lightning

This is easy to do in PyTorch Lightning: set the precision argument to ‘16’ or ‘bf16’ when instantiating the Trainer. Detectron2 supports training FasterRCNN with PyTorch Lightning through the training script lightning_train_net.py in the detectron2/tree/main/tools directory.

At the time of writing this article (2nd Sept 2023), the lightning_train_net.py script has some minor bugs that raise errors unless they are fixed first. To avoid those errors, please make the changes reported in this Issues thread.

Without using PyTorch Lightning

Detectron2 also supports automatic mixed precision without PyTorch Lightning, through the AMPTrainer class in detectron2/blob/main/detectron2/engine/train_loop.py. Thus, using the training script train_net.py, we can train object detection or segmentation models with 16-bit precision.

How to do it?

To train a FasterRCNN, MaskRCNN, or any other model available in the Detectron2 MODEL_ZOO with automatic mixed precision, we need to set the config option SOLVER.AMP.ENABLED to True, for example as a command-line override when launching the training script.
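One way to pass the option is as a trailing KEY VALUE override on the command line, which both training scripts accept after their regular arguments. The sketch below only assembles and prints such a command; the helper name build_train_command and the config path are assumptions chosen for illustration.

```python
import shlex


def build_train_command(config_file, num_gpus=1, amp_enabled=True):
    """Return the argv list for tools/train_net.py with an AMP override."""
    return [
        "python", "tools/train_net.py",
        "--config-file", config_file,
        "--num-gpus", str(num_gpus),
        # Trailing KEY VALUE pairs override entries of the loaded config.
        "SOLVER.AMP.ENABLED", str(amp_enabled),
    ]


command = build_train_command(
    "configs/PascalVOC-Detection/faster_rcnn_R_50_C4.yaml"  # assumed config path
)
print(shlex.join(command))
```

Swapping tools/train_net.py for tools/lightning_train_net.py in the list should give the equivalent command for the PyTorch Lightning script.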

With PyTorch Lightning

In lightning_train_net.py, this sets the precision argument of the PyTorch Lightning Trainer to ‘16’ for 16-bit precision. To train with bfloat16 instead, change this value to ‘bf16’ in the script, provided the hardware supports it.
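The mapping from the config flag to the Trainer argument can be sketched roughly as follows. This is an illustration, not the script's actual code: the helper lightning_trainer_kwargs and its use_bf16 switch are hypothetical, and the precision strings are the ones named in this article (newer Lightning releases may prefer different spellings).

```python
def lightning_trainer_kwargs(amp_enabled, max_steps, use_bf16=False):
    """Build keyword arguments for a Lightning Trainer (hypothetical helper)."""
    kwargs = {"max_steps": max_steps}
    if amp_enabled:
        # Only request reduced precision when SOLVER.AMP.ENABLED is True;
        # otherwise the Trainer keeps its default 32-bit precision.
        kwargs["precision"] = "bf16" if use_bf16 else "16"
    return kwargs


print(lightning_trainer_kwargs(amp_enabled=True, max_steps=18000))
print(lightning_trainer_kwargs(amp_enabled=False, max_steps=18000))
```

The resulting dictionary would then be unpacked into the Trainer constructor, for example Trainer(**kwargs).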

Without PyTorch Lightning

In train_net.py, however, setting this option to True makes the trainer an AMPTrainer instance, while setting it to False makes it a SimpleTrainer instance.
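Conceptually, a mixed-precision training step runs the forward pass under autocast and scales the loss before backpropagation. The following is a minimal plain-PyTorch sketch of that pattern, not Detectron2's AMPTrainer code; the function name amp_train_step, the toy linear model, and the MSE loss are assumptions made for illustration. Recent PyTorch releases may emit a deprecation warning for torch.cuda.amp.GradScaler in favour of torch.amp.GradScaler.

```python
import torch
from torch import nn


def amp_train_step(model, optimizer, scaler, inputs, targets):
    """Run one optimization step with autocast and gradient scaling."""
    use_cuda = inputs.is_cuda
    device_type = "cuda" if use_cuda else "cpu"
    # float16 is the usual choice on GPU; CPU autocast works with bfloat16.
    dtype = torch.float16 if use_cuda else torch.bfloat16

    optimizer.zero_grad()
    with torch.autocast(device_type=device_type, dtype=dtype):
        outputs = model(inputs)  # forward pass runs in reduced precision
    # Compute the loss in float32 for numerical stability.
    loss = nn.functional.mse_loss(outputs.float(), targets)
    scaler.scale(loss).backward()  # scale the loss to limit float16 underflow
    scaler.step(optimizer)         # unscale gradients, skip the step on inf/NaN
    scaler.update()                # adjust the scale factor for the next step
    return loss.item()


device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(4, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# Gradient scaling only matters for float16 on GPU, so it is disabled on CPU.
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(8, 4, device=device)
y = torch.randn(8, 1, device=device)
loss_value = amp_train_step(model, optimizer, scaler, x, y)
print(f"loss after one step: {loss_value:.4f}")
```

A plain 32-bit step is the same loop without the autocast context and the scaler, which is roughly the difference between the two trainer classes described above.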

Other ways to do the same thing

Alternatively, this option can be set in the ‘.yaml’ config file by adding the following lines under SOLVER:

SOLVER:
  .
  .
  .
  AMP:
    ENABLED: True

Finally

The whole configuration file given here for PASCAL VOC object detection will look like this once modified:

_BASE_: "../Base-RCNN-C4.yaml"
MODEL:
  WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
  MASK_ON: False
  RESNETS:
    DEPTH: 50
  ROI_HEADS:
    NUM_CLASSES: 20
INPUT:
  MIN_SIZE_TRAIN: (480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800)
  MIN_SIZE_TEST: 800
DATASETS:
  TRAIN: ('voc_2007_trainval', 'voc_2012_trainval')
  TEST: ('voc_2007_test',)
SOLVER:
  STEPS: (12000, 16000)
  MAX_ITER: 18000 # 17.4 epochs
  WARMUP_ITERS: 100
  IMS_PER_BATCH: 16
  BASE_LR: 0.1
  AMP:
    ENABLED: True

Clap and share if you like this article. Follow for more.

Siladittya Manna
The Owl

Senior Research Fellow @ CVPR Unit, Indian Statistical Institute, Kolkata || Research Interest : Computer Vision, SSL, MIA. || https://sadimanna.github.io