COMPUTER VISION | IMAGE DATA | INSTANCES VS IMAGES

MORE IMAGES OR MORE INSTANCES? Which is preferable for your computer vision project?

Data size, images, and instances: my experience with this trade-off

Chinmay Bhalerao

3 min read · Jan 27, 2024

In this brief blog post, I aim to answer the questions below, drawing on my own experience.

➤ The question is

“For better results in computer vision projects, should we train on more images or on more instances?”

Also,

“What is the minimum number of images we need for training to get good results?”
“What is the minimum number of instances we need for training to get good results?”

To answer these questions, we have to consider a few things.

✦ Minimum Dataset Size

According to Saleh Shahinfar et al. (2020) in the paper “How many images do I need?”, training an object detection model shows an inflection point at around 150–500 images per class, where the earlier sharp performance gain starts to level off.

Also, Changsin Lee explains how the number of images needed depends on the project-specific task. HERE is the link.

✦ Minimum instances

Normally, each class should have at least 200 bounding-box annotations.

This recommendation comes from this StackOverflow discussion.
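Here is a minimal sketch of how you might check a dataset against both minimums quoted above. It assumes your annotations can be flattened into (image_id, class_name) pairs, one pair per bounding box. The function names and the pair format are my own illustration, not part of any annotation tool, and the default thresholds are just the lower figures mentioned above.

```python
from collections import defaultdict

# Assumed thresholds, taken from the figures quoted in this post.
MIN_IMAGES_PER_CLASS = 150
MIN_INSTANCES_PER_CLASS = 200


def per_class_counts(annotations):
    """Return {class_name: (number_of_images, number_of_instances)}.

    `annotations` is an iterable of (image_id, class_name) pairs,
    one pair per bounding box (a hypothetical, simplified format).
    """
    images = defaultdict(set)
    instances = defaultdict(int)
    for image_id, class_name in annotations:
        images[class_name].add(image_id)   # distinct images containing the class
        instances[class_name] += 1         # every box counts as one instance
    return {c: (len(images[c]), instances[c]) for c in instances}


def underrepresented(annotations,
                     min_images=MIN_IMAGES_PER_CLASS,
                     min_instances=MIN_INSTANCES_PER_CLASS):
    """List the classes that fall below either minimum, sorted by name."""
    counts = per_class_counts(annotations)
    return sorted(
        c for c, (n_images, n_instances) in counts.items()
        if n_images < min_images or n_instances < min_instances
    )


if __name__ == "__main__":
    toy = [("img1", "cat"), ("img1", "cat"), ("img2", "cat"), ("img1", "dog")]
    print(per_class_counts(toy))
    print(underrepresented(toy, min_images=2, min_instances=3))
```

Counting images and instances separately matters because one crowded image can add many instances of a class without adding any new scene.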

My thoughts from experiences

✦ About Increasing the Number of Images 

Having a larger and more diverse dataset with a wide range of scenes, lighting conditions, backgrounds, and object appearances can help the model generalize better.
More images provide the model with more varied examples of objects, making it more robust and less prone to overfitting.
A larger dataset also allows the model to capture more of the ways an object can look, improving its ability to detect objects in new, unseen images.

✦ About Having More Class Instances

In object detection and segmentation, the model needs to learn to detect and differentiate between different classes of objects. Having more instances of each class in the dataset allows the model to better understand the characteristics of each class, leading to improved accuracy for those specific classes.
More instances of a class enable the model to learn class-specific features and better distinguish them from other objects. So, for good performance, do we need more images with fewer instances, or more instances with fewer images?

So the conclusion is

There are pros and cons to both approaches. A model trained on more images with fewer instances tends to generalize better, but it might perform poorly on some instances while overfitting on others. On the other hand, a model trained on more instances but fewer images might handle those specific instances well yet fail to generalize to unseen cases.
A good rule of thumb is to always look at the actual distribution in the real world and try to emulate a similar distribution in your training, validation, and test data. Of course, after the model has been deployed, you need to monitor the distribution and re-collect, re-annotate, and retrain as needed.
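One rough way to put this rule of thumb into practice is to compare the class proportions of your training labels with those of a sample from the real world (or from production, once deployed). The sketch below uses total variation distance as the comparison. That metric and the helper names are my own choice for illustration, and any alert threshold would be yours to decide.

```python
from collections import Counter


def class_distribution(labels):
    """Turn a list of class labels into {class_name: proportion}."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def total_variation_distance(p, q):
    """Half the sum of absolute differences between two distributions.

    0.0 means the class proportions are identical; 1.0 means the two
    sets share no classes at all.
    """
    classes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in classes)


if __name__ == "__main__":
    # Toy labels: one entry per annotated instance.
    train_labels = ["car"] * 8 + ["bike"] * 2
    real_world_labels = ["car"] * 5 + ["bike"] * 5

    drift = total_variation_distance(
        class_distribution(train_labels),
        class_distribution(real_world_labels),
    )
    print(f"distribution gap: {drift:.2f}")
```

The same comparison can be repeated on fresh production samples over time. A growing gap is one possible signal that it is time to re-collect and retrain.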

With these conclusions in mind, you can decide on the number of instances and images for your own project. The answer is largely dataset-dependent, but the points above should help you make a well-informed choice.

If you have found this article insightful

It is said that “Generosity makes you a happier person”, so give the article some claps if you liked it. If you found it insightful, follow me on LinkedIn and Medium. You can also subscribe to get notified when I publish articles. Let’s create a community! Thanks for your support!

Also, Medium doesn’t pay me anything for writing, so if you want to support me, you can click here to buy me a coffee.

You can also read my other blogs:

How to Choose the Best Algorithm for Your Machine Learning Project

Optimizing ML Model Performance: A Guide to Algorithm Selection

Comprehensive Guide: Top Computer Vision Resources All in One Blog

Save this blog for comprehensive resources for computer vision

Who Owns Copyright of AI-Generated Art: Legal Implications You Need to Know

AI and Copyright Law: The Debate Around Ownership of AI-Generated Art
