Day 86(DL) — Custom Object detector setup(yolov5)

Nandhini N
Apr 4 · 5 min read

This post covers a few points to watch for when setting up yolov5 for simple custom object detection. YOLOv5 is one of the fastest object detection models (with competitive accuracy), and for that reason it has become a go-to choice for detection use cases. The code is written in PyTorch and is available in the ultralytics repository linked at the end of this post. The convolutional architecture (along with the anchor boxes) can be used directly, with minimal changes to the parameters to suit our requirements.

Table of contents:

  • Image set and the labelling
  • Formatting the bounding box coordinates and labels
  • Splitting the dataset and creating folder structure for yolov5
  • data.yaml to specify image & label location
  • config file modification for custom label count

Image set and the labelling: We can either bring our own set of images or download one of the datasets available online for experiments. The next step is to use an annotation tool (CVAT, LabelImg or Roboflow) to mark the bounding boxes and the corresponding class names. The output of this step is, for each image, the four bounding box coordinates and the class label of every annotated object.

Formatting the bounding box coordinates and labels: The image below has width = 5184 and height = 3456 (in pixels). The bounding box coordinates are xmin = 1727.76, ymin = 1481.52, xmax = 2505.36 and ymax = 2215.92. With the usual image convention, where the origin is the top-left corner and y increases downwards, (xmin, ymin) is the top-left corner of the bounding box and (xmax, ymax) is the bottom-right corner.

Photo by Ella Baxter on Unsplash

yolov5 expects the label and bounding box coordinates as a .txt file. Usually, the labels/classes are indicated by numbers starting from zero.

Label file format for yolov5

x-center = (xmin + xmax) / 2

y-center = (ymin + ymax) / 2

width = xmax-xmin

height = ymax-ymin

All of the above values should be normalised to the range 0 to 1: x-center and width are divided by the image width, y-center and height by the image height. The label file (plain text) must have the same base name as the image file. For instance, if our image file is ‘sun.PNG’, the label file is ‘sun.txt’, and each line in it describes one object in the order: class x-center y-center width height.

Since we have only one label, its class is given as ‘0’. Most annotation tools offer an option to export directly in the YOLO format.
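As a quick check of the formulas above, here is a minimal Python sketch that converts the example box into a YOLO label line. The function name to_yolo and the six-decimal formatting are choices made for this illustration, not something yolov5 requires.

```python
def to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corner coordinates to normalised (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# Values from the example above: a 5184 x 3456 image, single class with id 0.
xc, yc, w, h = to_yolo(1727.76, 1481.52, 2505.36, 2215.92, 5184, 3456)
label_line = f"0 {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"
print(label_line)
```

For the example box this gives a centre of roughly (0.408, 0.535), a width of 0.15 and a height of 0.2125, all relative to the image size.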

Splitting the dataset and creating folder structure for yolov5: The dataset is split into train, valid and test folders. Each of these folders contains an images subfolder and a labels subfolder holding the corresponding label files.
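One possible way to produce this layout is sketched below. The function name split_dataset, the 70/20/10 split ratio and the assumption that all images and labels start out in two flat source folders are illustrative choices, not part of yolov5 itself.

```python
import random
import shutil
from pathlib import Path


def split_dataset(src_images, src_labels, out_dir, ratios=(0.7, 0.2, 0.1), seed=0):
    """Copy image/label pairs into train, valid and test folders."""
    images = sorted(p for p in Path(src_images).iterdir() if p.is_file())
    random.Random(seed).shuffle(images)  # fixed seed keeps the split reproducible

    n_train = int(len(images) * ratios[0])
    n_valid = int(len(images) * ratios[1])
    splits = {
        "train": images[:n_train],
        "valid": images[n_train:n_train + n_valid],
        "test": images[n_train + n_valid:],  # whatever is left over
    }

    for split, files in splits.items():
        img_dir = Path(out_dir) / split / "images"
        lbl_dir = Path(out_dir) / split / "labels"
        img_dir.mkdir(parents=True, exist_ok=True)
        lbl_dir.mkdir(parents=True, exist_ok=True)
        for img in files:
            label = Path(src_labels) / (img.stem + ".txt")
            shutil.copy(img, img_dir / img.name)
            if label.exists():  # an image without objects may have no label file
                shutil.copy(label, lbl_dir / label.name)

    return {split: len(files) for split, files in splits.items()}
```

Calling split_dataset("all_images", "all_labels", "dataset") with hypothetical folder names would create dataset/train, dataset/valid and dataset/test, each with images and labels subfolders, and return the number of images placed in each split.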

data.yaml to specify image & label location: So far we have only prepared the dataset. For the training script to find it, the details are passed in a data.yaml file. It contains the paths to the train and validation image folders, the number of classes and the corresponding class names.
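To make the contents concrete, the sketch below builds the text of such a file in Python. The keys train, val, nc and names are the ones yolov5 reads; the helper name build_data_yaml, the relative paths and the single class called 'sun' are assumptions made for this example.

```python
def build_data_yaml(train_dir, val_dir, class_names):
    """Return the text of a data.yaml for the given image folders and classes."""
    names = ", ".join(f"'{name}'" for name in class_names)
    return (
        f"train: {train_dir}\n"
        f"val: {val_dir}\n"
        "\n"
        f"nc: {len(class_names)}\n"
        f"names: [{names}]\n"
    )


# Hypothetical paths and one hypothetical class; adjust both to the real dataset.
yaml_text = build_data_yaml("../train/images", "../valid/images", ["sun"])
print(yaml_text)
```

Saving the printed text as data.yaml gives a file with four entries: the two image folder paths, nc: 1 and names: ['sun']. The label folders are not listed because yolov5 looks for a labels folder next to each images folder.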

Note: When Roboflow is used for dataset annotation and download, all of the above steps are taken care of automatically (including the creation of data.yaml). We only need to select the export format that matches the architecture we are training, and the dataset is downloaded in that layout.

config file modification for custom label count: The final step is the model configuration file. YOLOv5 comes in four variants (yolov5s, yolov5m, yolov5l and yolov5x), each described by its own ‘.yaml’ file. The main differentiating factor is model size, which trades speed against accuracy. For example, yolov5s is the small version whereas yolov5l is a large one. One of the two (speed or accuracy) is always compromised: yolov5l can produce more accurate detections, but its computational cost is considerably higher.

Since yolov5 is pre-trained on the COCO dataset, the config .yaml files specify nc=80 as the number of classes. This needs to be changed to the number of labels we have. We can copy an existing .yaml file and make the necessary update.
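The update itself is a one-line change, so it can be done by hand. If it needs to be scripted, a minimal sketch could look like the following; the helper name set_class_count is hypothetical, and the sample text is only an abbreviated stand-in for the top of a model config.

```python
import re


def set_class_count(config_text, num_classes):
    """Return config_text with the value of the top-level `nc:` entry replaced."""
    new_text, n_subs = re.subn(
        r"^nc:\s*\d+", f"nc: {num_classes}", config_text, count=1, flags=re.MULTILINE
    )
    if n_subs == 0:
        raise ValueError("no 'nc:' line found in the config")
    return new_text


# Abbreviated stand-in for a model config; the real file continues with
# the anchors, backbone and head definitions, which stay untouched.
sample = "nc: 80  # number of classes\ndepth_multiple: 0.33\nwidth_multiple: 0.50\n"
custom = set_class_count(sample, 1)
print(custom)
```

Reading the copied .yaml file, passing its text through this function and writing it back under a new name leaves the rest of the architecture unchanged.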

After this setup, we can train the model by running train.py and make predictions by running detect.py. We can refer to the Roboflow code for the actual execution.
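As a rough illustration of what those two invocations look like, the sketch below assembles the command lines in Python without running them. The image size, batch size, epoch count, run name, config path and weights path are placeholder values assumed for this example, and the output location under runs/train may differ between yolov5 versions.

```python
import shlex


def train_command(data_yaml, model_cfg, img_size=416, batch=16, epochs=100,
                  weights="yolov5s.pt", name="custom_run"):
    """Build the argument list for a yolov5 training run."""
    return [
        "python", "train.py",
        "--img", str(img_size),
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--data", data_yaml,
        "--cfg", model_cfg,
        "--weights", weights,
        "--name", name,
    ]


def detect_command(weights, source, img_size=416, conf=0.4):
    """Build the argument list for running detection on a folder or file."""
    return [
        "python", "detect.py",
        "--weights", weights,
        "--img", str(img_size),
        "--conf", str(conf),
        "--source", source,
    ]


# Placeholder paths; replace them with the files created in the earlier steps.
train_cmd = train_command("data.yaml", "models/custom_yolov5s.yaml")
detect_cmd = detect_command("runs/train/custom_run/weights/best.pt", "test/images")
print(shlex.join(train_cmd))
print(shlex.join(detect_cmd))
```

The printed lines can be pasted into a terminal opened in the yolov5 repository folder, or the lists can be passed to subprocess.run.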

We can refer to the links below for questions related to the input image sizes.

Recommended Reading:

https://blog.roboflow.com/yolov5-improvements-and-evaluation/

https://github.com/ultralytics/yolov5/wiki/Tips-for-Best-Training-Results

https://github.com/ultralytics/yolov5

Nerd For Tech

