YOLOv4 — Version 4: Final Verdict

An Introductory Guide on the Fundamentals and Algorithmic Flows of YOLOv4 Object Detector

Shreejal Trivedi
VisionWizard
4 min read · May 28, 2020

Source: Photo by Joanna Kosinska on Unsplash

Welcome to the final part of the YOLOv4 [1] mini-series.

YOLOv4 — Version 0: Introduction

YOLOv4 — Version 1: Bag of Freebies

YOLOv4 — Version 2: Bag of Specials

YOLOv4 — Version 3: Proposed Workflow

YOLOv4 — Version 4: Final Verdict

I hope the earlier parts gave you a thorough walkthrough of all the nuts and bolts of this amazing research.

This article focuses on analytical results rather than explanations: it presents the comparisons between YOLOv4 and other object detectors. One last ride, let’s begin the finale.

1. Finalizing Bag of Freebies attributes

  • As discussed in the introduction of this series, many candidates were considered and then narrowed down to a small subset. The tables below show how each choice affects the accuracy of the model.
  • The tables below are largely self-explanatory. The numbers speak for themselves.
Fig. 1: Final Candidates for Bag of Freebies
Fig. 2: Bag of Freebies Abbreviations [1]

1.1 Results of BoF + Detector

Table 1: Detector + BoF Ablation Study. Architecture: CSPResNeXt-50-SPP-PANet, 512×512 [1].

1.2 Results of BoF + Classifier

Table 2: BoF + Classifier Ablation Study. Architecture: CSPResNeXt-50 [1].
Table 3: BoF + Classifier Ablation Study. Architecture: CSPDarknet53 [1].

2. Finalizing Bag of Specials attributes

Fig. 3: Bag of Specials Finalized Candidates
Table 4: Ablation Study of BoS on the CSPResNeXt-50 architecture [1].
  • Among the finalized candidates, Mish is an activation function used inside the network, while DIoU-NMS is a post-processing step applied at the inference stage.
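
To make these two components concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the YOLOv4 implementation: the function names (`mish`, `diou`, `diou_nms`), the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are all hypothetical choices made for this example. Mish is computed as x · tanh(softplus(x)), and DIoU-NMS suppresses a box when its IoU with the top-scoring box, minus a normalized center-distance penalty, reaches the threshold.

```python
import math


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    # Numerically stable softplus: log(1 + exp(x)) without overflow.
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def diou(a, b) -> float:
    """IoU minus the normalized squared center distance.

    Boxes are (x1, y1, x2, y2); this format is an assumption of the sketch.
    """
    # Intersection rectangle and plain IoU.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - d2 / c2 if c2 > 0 else iou


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that uses DIoU instead of IoU as the overlap measure."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)  # highest-scoring remaining box
        keep.append(best)
        # Drop every remaining box that overlaps `best` too strongly.
        order = [i for i in order if diou(boxes[best], boxes[i]) < threshold]
    return keep


# Toy input (made-up boxes, not data from the paper).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))  # [0, 2]
```

Because of the center-distance penalty, two boxes with the same IoU are less likely to be suppressed when their centers are far apart, which is the motivation for using DIoU in crowded scenes.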

3. Effect of BoF + BoS on Training MiniBatch Size

Table 5: Ablation study of different batch sizes with and without using BoF + BoS [1].

4. Results of Backbone + Neck + Head

Table 6: Selection of the backbone, neck, and head of the detector. The combination of backbone CSPDarknet53, neck PAN + SPP, and head YOLOv3 outperforms all the other models that were considered (BoF included) [1].
  • As discussed in the first article of this series, CSPDarknet53 proved to be the best backbone in terms of receptive field, FPS, FLOPs, etc.
  • Combining the techniques from the Bag of Specials and the Bag of Freebies with the CSPDarknet53 backbone gave the best result of 43% AP on COCO test-2017.

5. Comparison of mAP and FPS with different Object Detectors

Fig. 4: Graph showing mAP vs FPS results for different object detectors.

YOLOv4 is a state-of-the-art detector that is faster (FPS) and more accurate (MS COCO AP[50…95] and AP50) than all available alternative real-time detectors.
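
For readers unfamiliar with the two accuracy metrics named above: AP50 is the average precision at an IoU threshold of 0.50, while the COCO AP[50…95] figure averages AP over the ten IoU thresholds 0.50, 0.55, …, 0.95. The sketch below shows only that final averaging step. The helper names and the per-threshold AP values are made up for illustration and are not results from the paper.

```python
def coco_iou_thresholds():
    """The ten IoU thresholds 0.50, 0.55, ..., 0.95 used by COCO-style AP."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def coco_ap(ap_per_threshold):
    """Average per-threshold AP values into a single AP[50...95] number.

    `ap_per_threshold` maps each IoU threshold to the AP measured at it.
    """
    thresholds = coco_iou_thresholds()
    missing = [t for t in thresholds if t not in ap_per_threshold]
    if missing:
        raise ValueError(f"missing AP for IoU thresholds: {missing}")
    return sum(ap_per_threshold[t] for t in thresholds) / len(thresholds)


# Made-up AP values that fall as the IoU threshold gets stricter.
example = {t: 0.65 - 0.6 * (t - 0.50) for t in coco_iou_thresholds()}
print("AP50      :", example[0.5])
print("AP[50..95]:", coco_ap(example))
```

Since stricter thresholds demand tighter boxes, AP[50…95] is always at most AP50, which is why the two numbers are reported side by side.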

The original concept of one-stage anchor-based detectors has proven its viability. We have verified a large number of features, and selected for use such of them for improving the accuracy of both the classifier and the detector. [1]

Here ends the final part of this series. I know it was a long journey, but I hope you now have a good grasp of YOLOv4 from an algorithmic perspective.

Please check out the other parts of our series on YOLOv4 on our page, VisionWizard.

If you have read this far, you clearly have a real interest in quality research work. If you like our content and want more of it, please follow our page, VisionWizard.

Do clap if you have learned something interesting and useful. It would motivate us to curate more quality content for you.

Thanks for your time :).

References

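[1] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: Optimal Speed and Accuracy of Object Detection," arXiv:2004.10934, 2020.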