Comparing YOLOv10, YOLOv9, and YOLOv8: A Performance Study

Azam Kowalczyk
4 min read · Jun 19, 2024


In the growing field of computer vision, object detection models are continually being improved and refined. In this article, I share the results of my study comparing three versions of the YOLO (You Only Look Once) model family: YOLOv10 (released just last month), YOLOv9, and YOLOv8. The focus of this study was to evaluate these models on accuracy, speed, and parameter count, specifically in the context of object detection tasks.

Model Inference Video

To illustrate the differences in performance, I ran inference with each model on the same footage and recorded the results. Below are the inference results for YOLOv8m, YOLOv9c, and YOLOv10m.

YOLOv8m inference result
YOLOv9c inference result
YOLOv10m inference result
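For anyone who wants to reproduce runs like these, here is a minimal sketch using the Ultralytics Python API. The video path is a placeholder, and depending on your Ultralytics version, the YOLOv10 weights may need to be loaded via the THU-MIG YOLOv10 repository instead:

```python
from ultralytics import YOLO

# The three medium-sized checkpoints compared in this study.
checkpoints = {
    "YOLOv8m": "yolov8m.pt",
    "YOLOv9c": "yolov9c.pt",
    "YOLOv10m": "yolov10m.pt",
}

for name, weights in checkpoints.items():
    model = YOLO(weights)
    # Run inference on a test clip and save the annotated output video.
    # "sample_video.mp4" is a placeholder for your own footage.
    results = model.predict(source="sample_video.mp4", save=True, conf=0.25)
    print(f"{name}: processed {len(results)} frames")
```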

Comparison Results

To compare these models, I used YOLOv8m, YOLOv9c, and YOLOv10m. I chose the medium-sized variants because YOLOv9c was the only YOLOv9 pretrained weight available, and its size is similar to that of the medium-sized YOLOv8 and YOLOv10 models.
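To verify that the three checkpoints really are comparable in size, you can print their parameter counts. A minimal sketch, again assuming the Ultralytics package:

```python
from ultralytics import YOLO

for weights in ("yolov8m.pt", "yolov9c.pt", "yolov10m.pt"):
    model = YOLO(weights)
    # Sum the parameter counts of the underlying PyTorch module.
    n_params = sum(p.numel() for p in model.model.parameters())
    print(f"{weights}: {n_params / 1e6:.1f}M parameters")
```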

As you can see in the images below, YOLOv8m mislabeled an object, whereas YOLOv9c labeled it correctly. This is a minor mistake, and with further training the model would likely improve.

YOLOv8m detection
YOLOv9c detection

Interestingly, YOLOv10m correctly detected the object that YOLOv8m mislabeled, but its confidence scores were generally lower than those of YOLOv9c.

YOLOv10m detection

Detailed Comparison

Here’s another set of comparison results, highlighting the performance of YOLOv8n, YOLOv8m, YOLOv9c, and YOLOv10m.

YOLOv8n detected all marked objects correctly
YOLOv8m detected all marked objects correctly
YOLOv9c incorrectly labeled one object but correctly detected the two other marked objects
YOLOv10m failed to detect any of the marked objects

Observations

The training results show that while the YOLOv10 model is much smaller than YOLOv8, its accuracy was significantly lower on my dataset. These results can vary depending on the dataset used, so it is always crucial to benchmark new models on your specific use case: a newer model does not necessarily perform better for your particular needs.
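One straightforward way to run such a benchmark is to validate each fine-tuned checkpoint against your own dataset and compare the resulting mAP scores. A sketch, assuming a YOLO-format data.yaml and hypothetical checkpoint paths from your own training runs:

```python
from ultralytics import YOLO

# Hypothetical paths: each points to a checkpoint fine-tuned on the
# custom dataset described by data.yaml.
checkpoints = {
    "YOLOv8m": "runs/yolov8m/weights/best.pt",
    "YOLOv9c": "runs/yolov9c/weights/best.pt",
    "YOLOv10m": "runs/yolov10m/weights/best.pt",
}

for name, path in checkpoints.items():
    model = YOLO(path)
    # Evaluate on the validation split defined in data.yaml.
    metrics = model.val(data="data.yaml")
    print(f"{name}: mAP50-95 = {metrics.box.map:.3f}, mAP50 = {metrics.box.map50:.3f}")
```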

Comparison Table

One other thing I noticed is that the dataset I used was imbalanced. For frequent object classes, the accuracy of YOLOv8, YOLOv9, and YOLOv10 is similar. For rare classes (such as van and truck in my dataset), however, the accuracy of YOLOv10 drops significantly compared to versions 8 and 9.
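A quick way to quantify this imbalance is to count instances per class in the YOLO-format label files. A minimal sketch; the directory layout and class-id mapping below are assumptions based on a typical YOLO dataset, so adjust them to match yours:

```python
from collections import Counter
from pathlib import Path

# Assumed layout: one .txt label file per image, each line starting
# with an integer class id (standard YOLO annotation format).
label_dir = Path("dataset/labels/train")
class_names = {0: "car", 1: "truck", 2: "van"}  # hypothetical mapping

counts = Counter()
for label_file in label_dir.glob("*.txt"):
    for line in label_file.read_text().splitlines():
        if line.strip():
            counts[int(line.split()[0])] += 1

for class_id, n in counts.most_common():
    print(f"{class_names.get(class_id, class_id)}: {n} instances")
```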

YOLOv8m training output
YOLOv9c training output
YOLOv10m training output

By sharing these results, I hope to provide valuable insights into the performance of different YOLO models and highlight the importance of tailoring model selection to specific use cases. As always, continuous benchmarking and testing are key to achieving the best results in any machine learning project.

For those interested in the technical details, please check out the study on GitHub: https://github.com/Azitt/YOLO_V8-V9-V10_Object-detection_Comparison

Dataset:

Kaggle dataset: https://www.kaggle.com/datasets/javiersanchezsoriano/roundabout-aerial-images-for-vehicle-detection/code
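If you want to fetch the dataset programmatically, the official Kaggle API package can download it; this assumes you have a Kaggle API token configured:

```python
# Requires the "kaggle" package and an API token in ~/.kaggle/kaggle.json.
from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()
api.dataset_download_files(
    "javiersanchezsoriano/roundabout-aerial-images-for-vehicle-detection",
    path="dataset",
    unzip=True,
)
```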
