This post covers the points to watch when implementing YOLOv5 for simple custom object detection. YOLOv5 is one of the fastest object detection models (with competitive accuracy), which has made it a go-to choice for detection use cases. The code is written in PyTorch and is available from the Ultralytics repository. The convolutional architecture (along with its anchor boxes) can be used directly, with only minimal changes to the parameters to suit our requirements.
Table of contents:
- Image set and the labelling
- Formatting the bounding box coordinates and labels
- Splitting the dataset and creating folder structure for yolov5
- data.yaml to specify image & label location
- config file modification for custom label count
Image set and the labelling: We can either bring our own set of images or download one of the datasets available online for experiments. The next step is to use an annotation tool (CVAT, LabelImg or Roboflow) to mark the bounding boxes and the corresponding class names. The output of this step is, for each image, the four bounding box coordinates and the class label of every annotated object.
Formatting the bounding box coordinates and labels: The example image has width = 5184 and height = 3456 pixels. The bounding box coordinates are xmin = 1727.76, ymin = 1481.52, xmax = 2505.36 and ymax = 2215.92. In image coordinates the origin is the top-left corner and y increases downwards, so (xmin, ymin) is the top-left corner and (xmax, ymax) is the bottom-right corner of the bounding box.
YOLOv5 expects the labels and bounding box coordinates of each image as a .txt file, with each box described by its centre, width and height rather than by its corners. The labels/classes are usually indicated by numbers starting from zero.
x-center = (xmin + xmax) / 2
y-center = (ymin + ymax) / 2
width = xmax-xmin
height = ymax-ymin
All of the above values should be normalised: x-center and width are divided by the image width, y-center and height by the image height. The label file (text format) should have the same name as the image file. For instance, if our image file is ‘sun.PNG’, the label file is ‘sun.txt’ and holds one line per object in the order class, x-center, y-center, width, height. For the box above, that line is: 0 0.408287 0.534931 0.15 0.2125
Since we have only one label, it is given as ‘0’ in this example. Most annotation tools have an option to export directly in the YOLO format.
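If the annotation tool only exports corner coordinates, the conversion is easy to script. The snippet below is a minimal sketch of the formulas in this section; the function name `to_yolo_line` and the six-decimal formatting are my own choices for illustration, not part of YOLOv5.

```python
def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a corner-format box (pixels) to a normalised YOLO label line."""
    x_center = (xmin + xmax) / 2 / img_w   # centre, normalised by image width
    y_center = (ymin + ymax) / 2 / img_h   # centre, normalised by image height
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Box and image size from this post; class 0 is the only label.
line = to_yolo_line(0, 1727.76, 1481.52, 2505.36, 2215.92, 5184, 3456)
print(line)  # 0 0.408287 0.534931 0.150000 0.212500
```

Writing this string to ‘sun.txt’ (one line per object) gives the label file for ‘sun.PNG’.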
Splitting the dataset and creating folder structure for yolov5: The dataset is split into train, valid and test folders. Each of these folders has two subfolders, one holding a set of images and one holding the corresponding labels.
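The split itself can be scripted in a few lines. The following is a rough sketch only: the helper name `split_dataset`, the 70/20/10 split, the subfolder names `images` and `labels`, and the placeholder files in the demo are all assumptions made for illustration.

```python
import random
import shutil
import tempfile
from pathlib import Path


def split_dataset(image_dir, label_dir, out_root, train_pct=70, valid_pct=20, seed=0):
    """Copy image/label pairs into train, valid and test folders."""
    images = sorted(p for p in Path(image_dir).iterdir() if p.is_file())
    random.Random(seed).shuffle(images)  # fixed seed keeps the split repeatable
    n_train = len(images) * train_pct // 100
    n_valid = len(images) * valid_pct // 100
    splits = {
        "train": images[:n_train],
        "valid": images[n_train:n_train + n_valid],
        "test": images[n_train + n_valid:],  # whatever is left over
    }
    for split, files in splits.items():
        img_out = Path(out_root) / split / "images"
        lbl_out = Path(out_root) / split / "labels"
        img_out.mkdir(parents=True, exist_ok=True)
        lbl_out.mkdir(parents=True, exist_ok=True)
        for img in files:
            label = Path(label_dir) / (img.stem + ".txt")
            shutil.copy2(img, img_out / img.name)
            if label.exists():  # an image may have no label file
                shutil.copy2(label, lbl_out / label.name)
    return {split: len(files) for split, files in splits.items()}


# Tiny demo on empty placeholder files inside a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    raw_images = Path(tmp) / "raw" / "images"
    raw_labels = Path(tmp) / "raw" / "labels"
    raw_images.mkdir(parents=True)
    raw_labels.mkdir(parents=True)
    for i in range(10):
        (raw_images / f"img{i}.png").write_bytes(b"")
        (raw_labels / f"img{i}.txt").write_text("0 0.5 0.5 0.1 0.1\n")
    dataset = Path(tmp) / "dataset"
    counts = split_dataset(raw_images, raw_labels, dataset)
    layout = sorted(p.relative_to(dataset).as_posix() for p in dataset.glob("*/*"))

print(counts)
print(layout)
```

Shuffling before slicing avoids putting consecutive, possibly similar, images into the same split.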
data.yaml to specify image & label location: So far we have only prepared the dataset. For the training code to find it, the details are passed in a data.yaml file, which holds the paths to the train and validation image folders, the number of classes and the corresponding class names.
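As an illustration, the short script below builds the text of such a file. The keys `train`, `val`, `nc` and `names` are the ones YOLOv5 reads; the relative paths and the class name ‘sun’ are assumptions for this example, and the helper `build_data_yaml` is hypothetical.

```python
def build_data_yaml(train_images, val_images, class_names):
    """Return the text of a minimal YOLOv5 data.yaml file."""
    names = ", ".join(f"'{name}'" for name in class_names)
    return (
        f"train: {train_images}\n"
        f"val: {val_images}\n"
        "\n"
        f"nc: {len(class_names)}\n"   # number of classes
        f"names: [{names}]\n"         # class names, in class-index order
    )


# Assumed layout: data.yaml sits one level above the train/valid folders' parent.
yaml_text = build_data_yaml("../train/images", "../valid/images", ["sun"])
print(yaml_text)
```

The order of `names` matters: position 0 in the list is the class written as ‘0’ in the label files.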
Note: When we use Roboflow for dataset annotation and download, all of the above steps are taken care of automatically (even the creation of data.yaml). We only need to specify the architecture we are going to run, and the files described above are downloaded in the matching format.
config file modification for custom label count: The final step is the model configuration file. If we list the different variants of YOLOv5, we have four configurations (yolov5s, yolov5m, yolov5l and yolov5x), each represented by a ‘.yaml’ file. The main differentiating factor is model size, which trades speed against accuracy. For example, yolov5s is the small version whereas yolov5l is one of the larger ones. One of the two is always compromised: yolov5l is capable of more accurate detections, but its computational cost is considerably higher.
Since YOLOv5 is pre-trained on the COCO dataset, the config .yaml files set the number of classes to nc: 80. This needs to be changed to the number of labels we have. We can copy the existing .yaml file and make the necessary update in the copy.
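The edit can be made by hand, but a small script keeps it repeatable. This is a sketch under assumptions: `set_class_count` is a hypothetical helper, and `sample` is an abbreviated stand-in for the top of a model config rather than the full file.

```python
import re


def set_class_count(config_text, num_classes):
    """Return the config text with the top-level `nc:` value replaced."""
    new_text, replaced = re.subn(
        r"(?m)^nc:\s*\d+", f"nc: {num_classes}", config_text, count=1
    )
    if replaced == 0:
        raise ValueError("no 'nc:' line found in config")
    return new_text


# Abbreviated stand-in for the first lines of a model .yaml file.
sample = (
    "# parameters\n"
    "nc: 80  # number of classes\n"
    "depth_multiple: 0.33\n"
    "width_multiple: 0.50\n"
)
custom = set_class_count(sample, 1)
print(custom)
```

In practice we would read the original .yaml, pass its text through the function and write the result to a new file, leaving the original untouched.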
After this setup, we can train the model by running train.py and make predictions by running detect.py. We can refer to the Roboflow code for the actual execution.
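For orientation, the sketch below only assembles the two command lines and prints them; it does not start training. The flag spellings follow the style used in the Roboflow walkthrough and may differ between versions of the repository, and every value (image size, batch size, epochs, run name, file paths, the location of best.pt) is an assumption for illustration.

```python
import shlex


def train_command(data_yaml, model_cfg, weights="yolov5s.pt",
                  img_size=416, batch=16, epochs=100, name="custom_run"):
    """Build the argument list for a train.py run."""
    return ["python", "train.py",
            "--img", str(img_size), "--batch", str(batch), "--epochs", str(epochs),
            "--data", data_yaml, "--cfg", model_cfg,
            "--weights", weights, "--name", name]


def detect_command(weights, source, img_size=416, conf=0.4):
    """Build the argument list for a detect.py run."""
    return ["python", "detect.py",
            "--weights", weights, "--img", str(img_size),
            "--conf", str(conf), "--source", source]


train_cmd = train_command("data.yaml", "models/custom_yolov5s.yaml")
# Where best.pt ends up depends on the repository version; this path is a guess.
detect_cmd = detect_command("runs/train/custom_run/weights/best.pt", "../test/images")

for cmd in (train_cmd, detect_cmd):
    print(" ".join(shlex.quote(part) for part in cmd))
```

The printed lines can be pasted into a shell from the root of the YOLOv5 repository, or the lists can be passed to `subprocess.run`.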
The first two links below cover questions about input image sizes; the third is a walkthrough of training a custom model.
- input size ? · Issue #122 · ultralytics/yolov5
- Object sizing and Image size · Issue #700 · ultralytics/yolov5
- How to Train A Custom Object Detection Model with YOLO v5