Hi Shiv, There are a lot of factors to consider. What GPU do you have? How many? What video resolution are you working on? What do you mean by realtime? What FPS would satisfy you?

On my machine, which has an RTX 2080 Ti, I get 10 FPS on the assets/video/traffic.small.mp4 video file when running the following command:

python process_video.py -i assets/video/traffic.small.mp4 -p -d
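If you want to compare numbers across machines, here is a minimal sketch of how throughput could be measured and compared against a target. It is not taken from process_video.py; `measure_fps` and `frame_budget_ms` are hypothetical helper names, and `process_frame` stands in for whatever per-frame work you do (for example, running the predictor on one frame).

```python
import time
from typing import Callable, Iterable, TypeVar

Frame = TypeVar("Frame")


def measure_fps(process_frame: Callable[[Frame], object],
                frames: Iterable[Frame]) -> float:
    """Return the average frames per second of process_frame over frames.

    Hypothetical helper: process_frame is a placeholder for your own
    per-frame work, such as inference on a single video frame.
    """
    count = 0
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = time.perf_counter() - start
    # Avoid dividing by zero on empty input or a zero-length interval.
    if count == 0 or elapsed <= 0:
        return 0.0
    return count / elapsed


def frame_budget_ms(target_fps: float) -> float:
    """Return the time available per frame, in milliseconds, at target_fps."""
    return 1000.0 / target_fps
```

For example, `frame_budget_ms(10)` gives 100.0, so 10 FPS means each frame takes roughly 100 ms end to end. If "realtime" means 30 FPS for you, the budget drops to about 33 ms per frame, which is roughly a 3x speedup over what I measured.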

There is always a trade-off between accuracy and speed. In my examples, I’m using mask_rcnn_R_50_FPN_3x from the Detectron2 model zoo, which looks like the fastest one.

You can try to create your own model with fewer parameters to train, but that’s another story.

There is an interesting discussion about “need for speed” here and here.

I’m looking forward to Detectron2go:

Detectron2go: Facebook AI’s computer vision engineers have implemented an additional software layer, Detectron2go, to make it easier to deploy advanced new models to production. These features include standard training workflows with in-house data sets, network quantization, and model conversion to optimized formats for cloud and mobile deployment. — Detectron2: A PyTorch-based modular object detection library

