Detectron2 vs. Yolov5 (Which One Suits Your Use Case Better?)

Published in iReadRx, Aug 3, 2021

Choosing an AI model for a particular problem can’t be that hard, right? For typical machine learning tasks like classification, good old Random Forest never lets you down. But when you enter the world of deep learning, with so many variations of the same model for the same task (all of them state of the art in their own way), it can get overwhelming. In our case, we started with a simple problem: detecting compound structures in PDF documents. This is essentially an object detection task, so any object detection model that could quickly run inference on all pages and detect compound structures would do.

We began by using YOLOv5 as it was pretty fast and decently accurate in detecting compound structures. As it rendered good results, we decided to broaden the problem scope. Now instead of just detecting all chemical compounds, we wanted to distinguish between “relevant structures,” “Markush structures,” “intermediates,” “substitutes,” and “reactions.”

After getting the required training data, we trained YOLOv5 on the new dataset, expecting it to do just as well as the last time. As you might have guessed, the results weren’t as good as expected. The reason was that the classes we were trying to distinguish looked pretty similar. Here are some samples from our training data to show what I mean.

As you can see, the “relevant structures” look pretty similar to “Markush structures” or “intermediates,” and the “substitutes” look like any other paragraph. Our problem was no longer a typical object detection problem; it was now a fine-grained object detection problem.

Detectron2 Results

We had to switch to a more accurate model that could better differentiate these classes. Detectron2’s Faster R-CNN turned out to be a great choice. We reached 92% accuracy within 1500 epochs (20 minutes of training). The mean average precision was the same as YOLOv5’s, but that didn’t matter much, as the model could accurately detect the required classes.
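For readers who want to try a similar setup, here is a minimal sketch of a Detectron2 Faster R-CNN fine-tuning config. It is not our exact configuration: the R50-FPN base config, the dataset name you pass in, and the helper name `build_cfg` are all assumptions made for illustration. Detectron2 sets training length in iterations through `SOLVER.MAX_ITER`, so mapping the 1500 mentioned above onto that setting is also an assumption.

```python
# Assumed values: the article only states five classes and the number 1500.
NUM_CLASSES = 5      # relevant, Markush, intermediates, substitutes, reactions
MAX_ITER = 1500      # assumption: treated here as solver iterations
BASE_CONFIG = "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"  # assumed backbone


def build_cfg(train_dataset_name: str):
    """Build a config for fine-tuning Faster R-CNN (requires detectron2).

    `train_dataset_name` must already be registered with Detectron2,
    for example through `register_coco_instances`.
    """
    # Imported lazily so the constants above can be inspected without detectron2.
    from detectron2 import model_zoo
    from detectron2.config import get_cfg

    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(BASE_CONFIG))
    # Start from COCO-pretrained weights rather than training from scratch.
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(BASE_CONFIG)
    cfg.DATASETS.TRAIN = (train_dataset_name,)
    cfg.DATASETS.TEST = ()
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = NUM_CLASSES
    cfg.SOLVER.MAX_ITER = MAX_ITER
    return cfg
```

The resulting config can be handed to `detectron2.engine.DefaultTrainer`, followed by `trainer.resume_or_load(resume=False)` and `trainer.train()`.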

Comparison With YOLOv5

So Which Model Should You Use?

The answer to this question depends on three crucial factors.

Training Data Size and Accuracy

If your dataset has fewer than, let’s say, 150 images, Detectron2 would be your best bet in most cases. If you don’t want to compromise on speed, you can still use YOLOv5, but only if your classes have unique, easily distinguishable features (a chair and a laptop, for example).

On the other hand, YOLOv5 should be the model of choice if you have enough training images. In that case, there would be no significant difference in accuracy between YOLOv5 and Detectron2, and YOLOv5 would be much faster.
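The rule of thumb above can be written down as a small helper. This is only a sketch of the heuristic, not a substitute for trying both models: the function name and parameters are hypothetical, and treating 150 images or more as "enough" is an assumption based on the rough threshold mentioned above.

```python
def suggest_model(
    num_images: int,
    classes_easily_distinguishable: bool,
    speed_critical: bool,
    small_dataset_threshold: int = 150,  # rough threshold, not a hard rule
) -> str:
    """Return a starting-point suggestion: "YOLOv5" or "Detectron2"."""
    # With enough data, accuracy is similar, so the faster model wins.
    if num_images >= small_dataset_threshold:
        return "YOLOv5"
    # Small dataset: YOLOv5 only if speed matters and the classes look distinct.
    if speed_critical and classes_easily_distinguishable:
        return "YOLOv5"
    return "Detectron2"


# Our case: few images and near-identical classes.
print(suggest_model(100, classes_easily_distinguishable=False, speed_critical=True))
```

For our fine-grained chemistry classes, this prints `Detectron2`, which matches what we ended up choosing.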

Model Size

This is rather simple. YOLOv5 is a much smaller model than Detectron2. So if both models perform similarly on your dataset, YOLOv5 would be the better choice. However, Detectron2 did a better job in our case, so we chose it even though it was much larger.
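One simple way to compare model size is to look at the saved checkpoint files on disk. The sketch below assumes you already have a checkpoint from each framework; the helper name is hypothetical, and file size is only a proxy for memory use at inference time.

```python
import os


def checkpoint_size_mb(path: str) -> float:
    """Return the size of a checkpoint file in mebibytes."""
    return os.path.getsize(path) / (1024 * 1024)


def compare_checkpoints(paths: dict) -> dict:
    """Map each model name to its checkpoint size in MiB, smallest first."""
    sizes = {name: checkpoint_size_mb(path) for name, path in paths.items()}
    return dict(sorted(sizes.items(), key=lambda item: item[1]))
```

Call `compare_checkpoints` with a dictionary that maps a label for each model to the path of its checkpoint file; the paths depend on where your training runs wrote their output.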

Resource Usage and Training Time

YOLOv5 uses fewer resources than Detectron2, partly because of its small size. If both models give you decent results, you should choose the one that uses fewer resources.

Surprisingly, YOLOv5 takes longer to train than Detectron2, nearly double the time in our case. Detectron2 makes it easier to experiment with different hyperparameters because you get to see results pretty fast.
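If you want to check training or inference time on your own data, a small wall-clock timer is usually enough for a first comparison. This is a generic sketch, not the way we measured our numbers; `time_call` is a hypothetical helper, and the function you pass in would wrap your own training or prediction call.

```python
import time


def time_call(fn, *args, repeats: int = 1, **kwargs):
    """Call fn(*args, **kwargs) `repeats` times.

    Returns (last result, mean wall-clock seconds per call).
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    start = time.perf_counter()
    for _ in range(repeats):
        result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed / repeats


# Stand-in workload; replace with your own train or predict function.
total, seconds = time_call(sum, range(1_000_000), repeats=3)
print(f"mean time per call: {seconds:.4f} s")
```

Run both models on the same hardware and the same images, otherwise the numbers are not comparable.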

Conclusion

In our experience, Detectron2 is more accurate on fine-grained classes, while YOLOv5 is faster at inference and more efficient.

We used Detectron2 for our problem as it was the only model that gave us decent predictions. It even predicted classes that were underrepresented in the dataset, while YOLOv5 ignored them.
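A single aggregate score can hide the fact that a model ignores rare classes, so it helps to look at recall per class. The sketch below is a simplified illustration rather than our evaluation code: it handles one image, matches boxes greedily at an assumed IoU threshold of 0.5, and the labels and boxes in the example are made up.

```python
from collections import defaultdict


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def per_class_recall(ground_truth, predictions, iou_threshold=0.5):
    """Both inputs are lists of (label, box). Returns {label: recall}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    used = set()  # indices of predictions already matched to a ground truth
    for gt_label, gt_box in ground_truth:
        totals[gt_label] += 1
        for i, (pred_label, pred_box) in enumerate(predictions):
            if i in used or pred_label != gt_label:
                continue
            if iou(gt_box, pred_box) >= iou_threshold:
                used.add(i)
                hits[gt_label] += 1
                break
    return {label: hits[label] / totals[label] for label in totals}


# Made-up example: the rare "reaction" class is never predicted.
gt = [("relevant", (0, 0, 10, 10)), ("relevant", (20, 20, 30, 30)),
      ("reaction", (40, 40, 60, 60))]
preds = [("relevant", (0, 0, 10, 10)), ("relevant", (21, 21, 30, 30))]
print(per_class_recall(gt, preds))  # {'relevant': 1.0, 'reaction': 0.0}
```

A recall of 0.0 for a class is the kind of result that told us a model was ignoring an underrepresented class.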