Complete Step-by-Step Guide to Build a Custom Object Detection Model with YOLOv5 — Part 1

Yinchuang Sum
Published in Classifai
8 min read · May 17, 2021

Object Detection

Object detection is a computer vision technique that allows machines to identify and locate objects present in an image or video. This article describes a workflow for training a custom object detection model, along with a step-by-step guided example of a wheelchair detection model built with YOLOv5.

This project is split into two posts. The first part covers the preparation of the custom dataset, whereas the second part covers model training, evaluation and inference.

By the end of these posts, you will be able to

  • Automate deep learning dataset creation
  • Train a custom YOLOv5 model efficiently
  • Understand the criteria of a high performing model

Codebase

The scripts for this project are in the GitHub repository. The project is completely reproducible, so make sure you follow along!
How to clone a repository

Clone the repository

git clone https://github.com/CertifaiAI/classifai-blogs.git

The codebase will be in the folder 0_Complete_Guide_To_Custom_Object_Detection_Model_With_Yolov5

Getting Started

Installations

To get the scripts running, Anaconda, ChromeDriver and ClassifAI must be installed on your machine. A quick installation guide is provided here.

Environment Setup

After installing all the required software, create a new conda environment.

  1. Start your terminal/Anaconda Prompt from the 0_Complete_Guide_To_Custom_Object_Detection_Model_With_Yolov5 folder.
  2. Create a new conda environment with all the required packages installed.
conda env create -f environment.yml

Machine Learning Project Lifecycle

A typical machine learning project lifecycle is summarized in the figure above. We will follow this workflow closely throughout the project.

Project Scoping

Before any deep learning project is started, the problem statement must be defined clearly. With a clear and fixed use case of the model, it will be easier to decide on trade-offs along the way.

Use Case

For the current example, the model is trained to detect wheelchairs. If applied to public utilities, priority could be granted automatically to a person with a disability.

Model Selection

Given this use case, YOLOv5 is chosen for two reasons:

  1. Only bounding boxes are required
    We are only interested in detecting the presence of objects, not in masking them. Information about what a bounding box is.
  2. Fast computation speed
    Real-time reaction is required for this application.

Classes to Detect

  1. Person: To determine whether a person is sitting in a wheelchair
  2. Wheel chair: The main object to be detected
  3. Not wheel chair: Vehicles that are not wheelchairs
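
As a rough illustration of how the person and wheel chair classes could be combined once the model produces detections, the sketch below measures how much of a wheel chair box is covered by a person box. The function names, the (x1, y1, x2, y2) pixel box format and the 0.3 threshold are assumptions made for this example; they are not taken from the repository.

```python
def overlap_ratio(person, chair):
    """Fraction of the chair box area that is covered by the person box.

    Boxes are (x1, y1, x2, y2) in pixels, with x1 < x2 and y1 < y2.
    """
    ix1, iy1 = max(person[0], chair[0]), max(person[1], chair[1])
    ix2, iy2 = min(person[2], chair[2]), min(person[3], chair[3])
    intersection = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    chair_area = (chair[2] - chair[0]) * (chair[3] - chair[1])
    return intersection / chair_area if chair_area > 0 else 0.0


def is_seated(person, chair, threshold=0.3):
    """Hypothetical rule: the person counts as seated if the overlap is large enough."""
    return overlap_ratio(person, chair) >= threshold
```

The threshold would need tuning on real detections; a person standing behind a wheel chair can overlap it as well.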

Data Collection

Since YOLOv5 is a deep learning model, it needs data to train on. One popular way to obtain data is through the internet, using a technique called web scraping.

Web Scraping

Web scraping is the process of using bots to extract content and data from a website. It is a widely used technique to get publicly available raw data that exists on the internet.

There are many tools to perform web scraping. Selenium is one of the more versatile ones because it can render JavaScript. It can simulate real users’ actions, which gives more flexibility when scraping data from various websites.

Here, the script to scrape images from Google is provided. To learn more about Selenium, here’s a good post to read.
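
To give a feel for what Selenium-based scraping looks like, here is a minimal sketch. It is not the script from the repository: the function names are hypothetical, the `tbm=isch` query parameter for Google Images is an assumption, and it only collects the image URLs already rendered on the first results page.

```python
from urllib.parse import urlencode


def google_images_url(keyword):
    """Build a Google Images search URL (the tbm=isch parameter is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": keyword, "tbm": "isch"})


def collect_image_urls(driver, keyword, limit=-1):
    """Return up to `limit` image URLs from the results page (-1 means no limit).

    `driver` is any Selenium WebDriver, e.g. selenium.webdriver.Chrome().
    """
    driver.get(google_images_url(keyword))
    urls = []
    # "css selector" is the locator strategy string behind By.CSS_SELECTOR
    for img in driver.find_elements("css selector", "img"):
        if 0 <= limit <= len(urls):
            break  # enough images collected
        src = img.get_attribute("src")
        # Skip missing sources, inline data: thumbnails and repeated URLs
        if src and src.startswith("http") and src not in urls:
            urls.append(src)
    return urls
```

With Selenium and ChromeDriver installed, calling `collect_image_urls(webdriver.Chrome(), "wheelchair", 50)` should return the thumbnails currently on the page. A full scraper would also need to scroll and click through to the full-size images.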

Running the Code

  1. Start your terminal from the WebScraping folder.
  2. Activate the conda environment with the command:
conda activate object-detection

3. Scrape the images. Image scraping might take some time.
Note: Replace the variables wrapped in <<>> with the respective arguments.

I. keyword to scrape: The keyword to be searched in Google.
II. folder name: The name of the folder that stores the scraped images. It is also used as the prefix of the scraped image file names. Names containing spaces are not recommended.
III. number of images: The number of images to be scraped. Scraping more images than you need is recommended. You may also enter “-1” to scrape all images.

python ./src/main.py <<keyword to scrape>> <<folder name>> <<number of images>>

E.g.

python ./src/main.py "wheelchair" "wheelchair1" -1

The scraped images will be in the WebScraping/images folder.

Results

Great! Now you have the data required to train the model. However, your collection will also contain some irrelevant data, so image filtering is required to remove the unwanted images. For example, cartoon images should be filtered out because the model is not expected to encounter them during inference.
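
Judging relevance (for example, spotting cartoons) still needs a human eye, but part of the cleanup can be automated. The sketch below removes byte-identical duplicates from a folder of scraped images. It is an illustrative helper with a hypothetical name, not part of the repository, and it will not catch resized or re-encoded copies.

```python
import hashlib
from pathlib import Path


def remove_exact_duplicates(folder):
    """Delete files whose bytes are identical to an earlier file in `folder`.

    Files are visited in name order, so the first copy is kept.
    Returns the list of removed paths.
    """
    seen = set()
    removed = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            path.unlink()  # same content already kept under another name
            removed.append(path)
        else:
            seen.add(digest)
    return removed
```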

Data Annotation

Data annotation is the process of labelling the data. In a supervised machine learning model, labelled data is used to provide the “correct answer” to the model for it to “learn from mistakes”.

The quality of annotation directly affects the model’s performance, because annotation is the step that determines the correctness and consistency of the data. The model learns faster and better when both are well taken care of.

ClassifAI

ClassifAI is one of the most comprehensive open-source data annotation tools. It supports the labelling of various data types for AI model training.

In this example, ClassifAI is used to label the data. The reasons for using ClassifAI are:

  1. Easy installation and multi-platform support
    It supports Windows, Mac and Linux. Besides, only a few clicks are required to install ClassifAI on your machine.
  2. User-friendly UI
    The interface requires no programming background to perform data labelling.
  3. Autosaving feature
    Every click is autosaved in the database, so projects can be paused and resumed easily. This is extremely useful for a project that involves a large amount of data.

User Guide

Dataset

Click here to obtain the dataset. The dataset consists of 1266 images with classes of ‘person’, ‘wheel chair’ and ‘not wheel chair’. They are already well labelled, but you can import them to try out ClassifAI. Alternatively, you may import images of your own into ClassifAI and label them to have your own hands-on experience.

Bounding Box Project Setup

  1. Launch ClassifAI from your computer.
  2. Click the left-most button to launch the ClassifAI browser application.

3. Click the Image column to navigate to project type selection.

4. Select the left option, which is the bounding box project. You will be taken to the project creation interface.

5. Enter the project name and click the ‘Create’ button to create a new project.

6. In the labelling interface, the label list must be configured first. Use the “+” and “-” buttons to add and remove labels.

7. This project has three classes: person, wheel chair and not wheel chair. Add them and remove the default class.

Image Import

  1. The first and second buttons of the toolbar are the import-from-folder and import-from-files buttons respectively. Here, we will import images from a folder.

2. When you click the import button, a folder selection window will pop up. Select the folder containing the images and click open.

3. After importing, you will see the first image in the center canvas. A complete image list will be in the image list column.

Bounding Box Labelling

  1. Click the “fit to screen” button to ensure that the image fills the whole screen. This makes it easier to label accurately.

2. Click the “draw rectangle” button to switch into drawing mode.

3. Hold the left-click and drag the mouse pointer to draw a bounding box. The label selection will pop up for you to pick the right label.
Tip: Use the yellow reference lines to draw a precise bounding box!

Label Export

  1. When you’re done labelling, click the “save” button. A window with multiple saving options appears.

2. Click the bulk saving option, then click the YOLO button to save all the labels in YOLO format.

3. All the labels will be zipped and downloaded to your browser’s default download location.
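
If you would rather unpack the download with a script, something like the sketch below could work. It assumes the archive contains one .txt label file per image, named after the image; the internal layout of the ClassifAI export has not been verified here, and both function names are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_labels(zip_path, labels_dir):
    """Extract every .txt entry of the archive into labels_dir, flattening folders."""
    labels_dir = Path(labels_dir)
    labels_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            name = Path(info.filename).name
            if info.is_dir() or not name.endswith(".txt"):
                continue
            target = labels_dir / name
            target.write_bytes(archive.read(info))
            extracted.append(target)
    return extracted


def images_without_labels(images_dir, labels_dir):
    """Names of images that have no same-named .txt label file."""
    labels = {p.stem for p in Path(labels_dir).glob("*.txt")}
    return sorted(p.name for p in Path(images_dir).iterdir()
                  if p.is_file() and p.stem not in labels)
```

Running `images_without_labels` after extraction is a quick way to spot images that were skipped during labelling.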

Dataset Structure

To train a YOLOv5 model using the script provided in the repository, the dataset must follow a fixed file structure. If you are using the dataset provided, the structure has already been configured for you. If you are building your own custom dataset, refer to the following diagram.

.
+-- dataset
    +-- train
    |   +-- images
    |   |   +-- <<images>>
    |   +-- labels
    |       +-- <<labels>>
    +-- valid
    |   +-- images
    |   |   +-- <<images>>
    |   +-- labels
    |       +-- <<labels>>
    +-- test
    |   +-- images
    |   |   +-- <<images>>
    |   +-- labels
    |       +-- <<labels>>
    +-- data.yaml

  1. train/valid/test folder

Each train/valid/test folder should contain two folders: images and labels.

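  • The images folder contains all the images
  • The labels folder contains one .txt label file per image, with one line per bounding box in the format below, where label is the class index and x y w h describe the box:
label x y w h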

Note: Data labelled in ClassifAI is already in this format. The only required step is putting the files into the right file structure.
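
If your annotations come from another source as pixel coordinates, they need converting first. In the YOLO text format, x and y are the center of the box and w and h are its width and height, each divided by the image width or height so that all four values fall between 0 and 1. The helper below is a small sketch of that conversion; its name and the corner-based input format are assumptions made for the example.

```python
def to_yolo_line(class_id, box, image_width, image_height):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x200 px box in a 400x500 px image, labelled as class 1
print(to_yolo_line(1, (100, 50, 300, 250), 400, 500))
# prints: 1 0.500000 0.300000 0.500000 0.400000
```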

2. data.yaml

data.yaml is a config file for the model containing data paths and class names. Make sure the number of classes is correct. Verify the names too, because they cannot be modified after model training.
Note: Replace <<number of classes>> and <<array of class names>> with your own values.

train: ../train/images
val: ../valid/images

nc: <<number of classes>>
names: [<<array of class names>>]

In this project, it will be:

train: ../train/images 
val: ../valid/images
nc: 3
names: ['person', 'wheel chair', 'not wheel chair']
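
Since nc and names must agree, and every class index in the label files must be smaller than nc, a quick check before training can save a failed run. The sketch below is one way to do it with the standard library only. Both function names are hypothetical, and the generated text relies on Python's list formatting, which matches the layout above for simple class names without quotes in them.

```python
from pathlib import Path


def render_data_yaml(names, train="../train/images", val="../valid/images"):
    """Return data.yaml text with nc derived from the class list."""
    return f"train: {train}\nval: {val}\n\nnc: {len(names)}\nnames: {list(names)!r}\n"


def find_bad_labels(labels_dir, num_classes):
    """List (file name, line number) pairs that do not have exactly five fields
    or whose class index is outside 0..num_classes-1."""
    problems = []
    for path in sorted(Path(labels_dir).glob("*.txt")):
        for number, line in enumerate(path.read_text().splitlines(), start=1):
            fields = line.split()
            if not fields:
                continue  # ignore blank lines
            valid = (len(fields) == 5 and fields[0].isdigit()
                     and int(fields[0]) < num_classes)
            if not valid:
                problems.append((path.name, number))
    return problems
```

For this project, `render_data_yaml(['person', 'wheel chair', 'not wheel chair'])` produces the same four settings shown above, and `find_bad_labels("dataset/train/labels", 3)` should return an empty list.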

In this project, data augmentation and the train-validation-test split are done using Roboflow. This is optional, as long as the dataset follows the structure mentioned above.
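
If you prefer to do the split locally, a short script is enough. The sketch below is not how the provided dataset was produced (that was Roboflow); the function name, the 70/20/10 ratios and the fixed random seed are assumptions for illustration, and no augmentation is performed.

```python
import random
import shutil
from pathlib import Path


def split_dataset(images_dir, labels_dir, output_dir, ratios=(0.7, 0.2, 0.1), seed=0):
    """Copy image/label pairs into train, valid and test folders under output_dir."""
    images = sorted(p for p in Path(images_dir).iterdir() if p.is_file())
    random.Random(seed).shuffle(images)  # fixed seed keeps the split repeatable
    n_train = int(len(images) * ratios[0])
    n_valid = int(len(images) * ratios[1])
    groups = {
        "train": images[:n_train],
        "valid": images[n_train:n_train + n_valid],
        "test": images[n_train + n_valid:],  # whatever is left over
    }
    for split, files in groups.items():
        image_out = Path(output_dir) / split / "images"
        label_out = Path(output_dir) / split / "labels"
        image_out.mkdir(parents=True, exist_ok=True)
        label_out.mkdir(parents=True, exist_ok=True)
        for image in files:
            shutil.copy2(image, image_out / image.name)
            label = Path(labels_dir) / (image.stem + ".txt")
            if label.exists():  # images without objects may have no label file
                shutil.copy2(label, label_out / label.name)
    return {split: len(files) for split, files in groups.items()}
```

The function copies rather than moves files, so the original folders stay untouched. Remember to place data.yaml next to the generated train, valid and test folders.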

What’s Next?

Tune in to Part 2 for interesting details on the following topics:

  1. Model Training
  2. Model Evaluation
  3. Model Inference
