Custom Object Detection with YOLOv4

Jun 21, 2022

This article explains how to build a YOLOv4 model that detects custom objects. It starts with an introduction to YOLOv4, then covers how to get or build your own dataset, and finally shows how to configure and train YOLOv4 on that dataset.

1. What is YOLO?

YOLO stands for You Only Look Once. It is a state-of-the-art, real-time object detection system, originally developed by Joseph Redmon, that can recognize multiple objects in a single frame.

YOLO applies a single neural network to the full image. This network divides the image into regions and predicts bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities.

2. YOLOv4 in a nutshell

YOLOv4 is an object detection algorithm created by Alexey Bochkovskiy, Chien-Yao Wang, and Hong-Yuan Mark Liao. On the Microsoft COCO benchmark, YOLOv4 achieves 43.5% AP (65.7% AP50) at 62 FPS on a TitanV or 34 FPS on an RTX 2070.

YOLOv4's architecture is composed of CSPDarknet53 as the backbone, an additional spatial pyramid pooling (SPP) module, a PANet path-aggregation neck, and a YOLOv3 head. Its features include Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), Cross mini-Batch Normalization (CmBN), self-adversarial-training (SAT), Mish activation, mosaic data augmentation, DropBlock regularization, and Complete Intersection over Union loss (CIoU loss).

YOLOv4 model comparison with other state-of-the-art models

3. How to get your own dataset

If you want a ready-to-use dataset, you can find one on Kaggle, COCO, ImageNet, or Google Open Images. Once you have found a dataset, make sure its annotations are in YOLO format (.txt). A YOLO-format dataset contains one text file per image (holding the annotations, each with a numeric class ID) and a label map that maps the numeric IDs to human-readable strings. The annotation coordinates are normalized to the range [0, 1], which makes them easier to work with even after scaling or stretching images.
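To make the format concrete, here is a minimal sketch of how a pixel-space box could be converted into one YOLO annotation line. The function name to_yolo_line and the example box are illustrative assumptions, not part of any tool mentioned here; the line layout (class ID, then box center x, center y, width, and height, each divided by the image size) is the commonly used YOLO convention.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box to a YOLO annotation line.

    Output layout: <class_id> <x_center> <y_center> <width> <height>,
    with the four box values normalized to [0, 1] by the image size.
    """
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: a 200x100 px box in a 640x480 image, class ID 0.
line = to_yolo_line(0, 100, 50, 300, 150, 640, 480)
print(line)
```

Because every value is a fraction of the image size, the same line stays valid if the image is later resized.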

If you want to build a custom dataset, start by collecting image files. You can gather them manually or download them automatically from a search engine. To download images automatically from Google, for example, you can use the Python script below. keywords is the search term, limit is the number of images you want to download, and extensions is the image file type.

from simple_image_download import simple_image_download as sid
response = sid.simple_image_download
response().download(keywords='Shiba Inu', limit=100, extensions='.jpeg')

After downloading the images, annotate them with LabelImg (local), MakeSense.ai (web-based), or another annotation tool that exports YOLO format. Once the dataset is ready, save each image and its annotation file in the same folder. The files in this folder will be used as training data.
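Before training, it can help to confirm that every image really has an annotation file next to it. The following is a minimal sketch under a few assumptions: the helper name find_unlabeled_images, the folder name dataset, and the list of image extensions are all placeholders to adapt to your own data.

```python
from pathlib import Path

# Assumption: adjust this set to the image types in your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_unlabeled_images(folder):
    """Return the names of images in `folder` without a matching .txt file."""
    folder = Path(folder)
    missing = []
    for image in sorted(folder.iterdir()):
        if image.suffix.lower() in IMAGE_EXTENSIONS:
            # The annotation shares the image's name, with a .txt extension.
            if not image.with_suffix(".txt").exists():
                missing.append(image.name)
    return missing


if __name__ == "__main__":
    # "dataset" is a placeholder for the folder holding images and annotations.
    if Path("dataset").is_dir():
        print(find_unlabeled_images("dataset"))
```

An empty list means every image has an annotation file with the same base name.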

4. YOLOv4 Prerequisites

To build the YOLOv4 model on your device, you have to install some prerequisites.

Python (with Anaconda)

Install Anaconda and build an environment with Python installed.

Visual Studio

Install the newest Visual Studio version (2022); the Community edition is fine. Select the Desktop development with C++ workload when installing.

CUDA and CUDNN

Install the CUDA and cuDNN versions that match the specification of your PC. After installing both, move all cuDNN files to the CUDA folder, C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.5 (replace v11.5 with your CUDA version).

CMake

Download and install CMake with default options.

OpenCV

Download OpenCV-Sources and place it in a new folder (e.g. OpenCV). Download OpenCV-Contrib with the same tag version as your OpenCV and place it in the OpenCV folder. Extract both files.

Open CMake and select the opencv-4.5.5 folder as the source code folder. For the Where to build the binaries box, create a folder named build in the OpenCV folder and select it. Check the Grouped box. Then select Configure, and choose Visual Studio 2022 as the generator and x64 as the platform.

To enable the cv2 Python bindings, wait until the configuration finishes loading, then delete the CMake cache with File > Delete Cache. Upgrade NumPy from Anaconda Prompt with the pip install --upgrade numpy command. Then open CMake and configure again.

After it finishes loading, set the following options:

Enable the WITH_CUDA, BUILD_opencv_dnn, OPENCV_DNN_CUDA, ENABLE_FAST_MATH, BUILD_opencv_world, and BUILD_opencv_python3 check boxes.
In OPENCV_EXTRA_MODULES_PATH, click Browse and select the path to opencv_contrib-4.5.5/modules.
Enable the CUDA_FAST_MATH check box.
In CUDA_ARCH_BIN, change the value to the compute capability of your GPU (you can look it up in NVIDIA's table of GPU compute capabilities).
In CMAKE_INSTALL_PREFIX, create a new folder named install in the main OpenCV folder and select it.
In CMAKE_CONFIGURATION_TYPES, remove Debug so the value is only Release.
Select Configure and, once it finishes loading, select Generate.

After it finishes generating, open the command prompt and build the install target using "CMake\bin\cmake.exe" --build "OpenCV\build" --target INSTALL --config Release.

You can check whether the build succeeded by opening Anaconda Prompt, running python, importing cv2, and printing the version with cv2.__version__.
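The same check can be written as a short script. This is only a sketch: the helper name describe_opencv_build is an assumption, and the CUDA device count is an extra check not described above, included because the goal of this build is GPU support.

```python
def describe_opencv_build():
    """Return (version, cuda_device_count), or (None, 0) if cv2 cannot be imported."""
    try:
        import cv2
    except ImportError:
        return None, 0
    try:
        # Reports 0 when OpenCV was built without CUDA or no GPU is visible.
        cuda_devices = cv2.cuda.getCudaEnabledDeviceCount()
    except (AttributeError, cv2.error):
        cuda_devices = 0
    return cv2.__version__, cuda_devices


version, cuda_devices = describe_opencv_build()
print("OpenCV version:", version)
print("CUDA-enabled devices seen by OpenCV:", cuda_devices)
```

If the version printed is not the one you just built, Python is probably picking up a different cv2 package from the active environment.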

5. Configuring YOLOv4 Default Model

5.1. Download Darknet

Download the Darknet zip file from the GitHub repository. Create a new folder (e.g. darknet) and extract the zip file into that folder.
From the C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.5\bin folder, copy the cudnn64_8.dll file and paste it into the darknet\darknet-master\build\darknet\x64 folder.
From the OpenCV_CUDA\build\install\x64\vc16\bin folder, copy the opencv_world440.dll file and paste it into the darknet\darknet-master\build\darknet\x64 folder. The digits in the file name follow your OpenCV version, so a 4.5.5 build produces opencv_world455.dll instead.

5.2. Configure Darknet Files

Open darknet.vcxproj in the darknet\darknet-master\build\darknet folder using a text editor. Find the two occurrences of CUDA 10.1, change them to your CUDA version, and save the file.
Open yolo_cpp_dll.vcxproj in the darknet\darknet-master\build\darknet folder using a text editor. Find the two occurrences of CUDA 10.1, change them to your CUDA version, and save the file.
Open yolo_cpp_dll.vcxproj in the darknet\darknet-master\build\darknet folder using Visual Studio. Change Debug to Release and Win32 to x64. Right-click on yolo_cpp_dll and click Build. Make sure it builds without any errors or failures.
NOTE: If the build fails with an error such as 'you need a lower version of Visual Studio (2019, 2017, or lower)', do not uninstall Visual Studio 2022. Just install the build tools for the older Visual Studio version (2019, 2017, or lower). Before building, right-click on the project and click Properties. Under Configuration Properties > General > General Properties, change Platform Toolset to Visual Studio 2019 or a lower version. Click OK and start the build.
Open darknet.sln in the darknet\darknet-master\build\darknet folder using Visual Studio. Change Debug to Release and Win32 to x64. Right-click on darknet and click Properties.

Go to C/C++ > General, edit Additional Include Directories, and add the OpenCV_CUDA\build\install\include path.
Go to C/C++ > Preprocessor, edit Preprocessor Definitions, and remove CUDNN_HALF.
Go to CUDA C/C++ > Device, edit Code Generation, and remove compute_75,sm_75.
Go to Linker > General, edit Additional Library Directories, and add the OpenCV_CUDA\build\install\x64\vc16\lib path.
Apply all changes.

Right-click on darknet and click Build.

5.3. YOLOv4 Weights File

Download the yolov4.weights file from the Darknet GitHub repository and save it in the darknet\darknet-master\build\darknet\x64 folder.

5.4. Test YOLOv4 on Images, Videos, and Webcam

To run Darknet on images, open Anaconda Prompt and change the directory to the darknet\darknet-master\build\darknet\x64 folder. Run darknet.exe detector test cfg/coco.data cfg/yolov4.cfg yolov4.weights and, when prompted, enter the path of the image you want to run detection on. The image with its predicted bounding boxes is saved automatically as predictions.jpg.

To run Darknet on videos, open Anaconda Prompt and change the directory to the darknet\darknet-master\build\darknet\x64 folder. Run darknet.exe detector demo cfg/coco.data cfg/yolov4.cfg yolov4.weights test.mp4. The test.mp4 file must be in the x64 darknet folder. If you want to save the video with its bounding boxes automatically, add -out_filename output.mp4 to the command.

To run Darknet on a webcam, open Anaconda Prompt and change the directory to the darknet\darknet-master\build\darknet\x64 folder. Run darknet.exe detector demo cfg/coco.data cfg/yolov4.cfg yolov4.weights -c 0.

6. YOLOv4 on Custom Dataset

6.1. Set Up Custom Configurations to Train on a Custom Dataset

Go to darknet\darknet-master\build\darknet\x64\data\ and create a dataset folder named obj.
Move your custom image and annotation files to the obj folder.
Copy coco.data and coco.names in darknet\darknet-master\build\darknet\x64\data\ and paste them in the same folder.
Rename coco - Copy.data to obj.data and coco - Copy.names to obj.names.
Open obj.names using a text editor and list all the classes in your dataset, one per line.
Open obj.data using a text editor. Follow these steps to configure the file:

Set the classes line to the number of classes in your custom dataset.
Set the train line to data/train.txt.
Remove the valid and #valid lines, because validation is not needed during training.
Set the names line to data/obj.names.
Remove the eval = coco line.
Save and close the file.
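If you prefer to generate these two files rather than edit copies by hand, the steps above can be sketched as a small script. The function name write_obj_files is hypothetical, the class names and their order are illustrative (they must match the class IDs used in your annotations), and the backup line is assumed to be carried over from coco.data.

```python
from pathlib import Path


def write_obj_files(data_dir, class_names):
    """Write obj.names and obj.data into `data_dir` for the given classes."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    # obj.names: one class name per line; the line index is the class ID.
    (data_dir / "obj.names").write_text("\n".join(class_names) + "\n")

    # obj.data: the fields that remain after the edits described above.
    # Assumption: the backup line is kept as it was in coco.data.
    obj_data = (
        f"classes = {len(class_names)}\n"
        "train = data/train.txt\n"
        "names = data/obj.names\n"
        "backup = backup/\n"
    )
    (data_dir / "obj.data").write_text(obj_data)
    return obj_data


if __name__ == "__main__":
    # Run from the x64 folder; the names below are illustrative only.
    if Path("data").is_dir():
        print(write_obj_files("data", ["missing_hole", "mouse_bite", "short"]))
```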

Go to the darknet\darknet-master\build\darknet\x64\cfg folder. Copy the yolov4-custom.cfg file and paste it in the same folder.
Rename yolov4-custom - Copy.cfg to yolov4-obj.cfg and open the file with a text editor. Follow these steps to configure the training parameters:

Change subdivisions to 32. If you encounter a CUDA out-of-memory error, change subdivisions to 64 or more.
You can also change batch according to your GPU memory. If the GPU runs out of memory when allocating the data, lower batch to 32, 16, or less.
Change width and height according to your images (each value must be a multiple of 32). Keep in mind that a bigger width and height provide more detail but consume more memory.
Change max_batches to 2000 * classes. In this use case, the PCB defect dataset has 6 classes, so max_batches is 12000.
Change steps to 80% and 90% of max_batches. In this use case, steps is 9600,10800.
Find the [yolo] layers (there are three of them) and change classes to your number of classes.
In the [convolutional] layer directly above each [yolo] layer, change the filters parameter to (num_of_classes + 5) * 3. In this use case there are 6 classes, so filters is 33.
After changing the parameters, save and close the file.
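The class-dependent values in the steps above follow simple arithmetic, which can be sketched as below. The helper name training_params is an assumption made for illustration; the formulas are the ones given in this section.

```python
def training_params(num_classes):
    """Compute the class-dependent cfg values described in this section."""
    max_batches = 2000 * num_classes
    # Integer arithmetic avoids floating-point rounding for 80% and 90%.
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    # Filters for the [convolutional] layer directly above each [yolo] layer.
    filters = (num_classes + 5) * 3
    return {"max_batches": max_batches, "steps": steps, "filters": filters}


# The PCB defect dataset used in this article has 6 classes.
params = training_params(6)
print(params)
```

For 6 classes this gives max_batches 12000, steps 9600 and 10800, and filters 33, matching the values above.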

6.2. Download Pre-Trained Weights

Download yolov4.conv.137 from the Darknet GitHub repository and place it in the darknet\darknet-master\build\darknet\x64 folder.

6.3. Create List of Training Images

Create a list containing the path of every training image, one path per line. Save the list as train.txt in the darknet\darknet-master\build\darknet\x64\data folder.
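One possible way to produce this list is the short script below. It is a sketch under assumptions: the function name write_train_list is hypothetical, the images are assumed to sit in data\obj as set up in section 6.1, and the paths are written relative to the x64 folder because that is where darknet.exe is run from.

```python
from pathlib import Path

# Assumption: adjust this set to the image types in your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def write_train_list(x64_dir):
    """Write data/train.txt with one relative image path per line."""
    x64_dir = Path(x64_dir)
    image_dir = x64_dir / "data" / "obj"
    # Annotation .txt files in the same folder are skipped by the filter.
    lines = [
        f"data/obj/{image.name}"
        for image in sorted(image_dir.iterdir())
        if image.suffix.lower() in IMAGE_EXTENSIONS
    ]
    (x64_dir / "data" / "train.txt").write_text("\n".join(lines) + "\n")
    return lines


if __name__ == "__main__":
    # Run from the x64 folder, after the obj folder has been filled.
    if Path("data/obj").is_dir():
        print(len(write_train_list(".")), "images listed")
```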

6.4. Start Training on Custom Dataset

To train the YOLOv4 model on the custom dataset, open Anaconda Prompt and change the working directory to darknet\darknet-master\build\darknet\x64.
Then run this command: darknet.exe detector train data/obj.data cfg/yolov4-obj.cfg yolov4.conv.137.
Training usually takes a long time; you can stop it once the loss graph has converged.
When you stop the training, the darknet\darknet-master\build\darknet\x64 folder will contain two new files, chart.png and chart_yolov4-obj.png, which show the loss graph. There are also new weights files in the darknet\darknet-master\build\darknet\x64\backup folder, where the weights are saved after every 1000 iterations.
If you want to continue the training from the last checkpoint, run this command: darknet.exe detector train data/obj.data cfg/yolov4-obj.cfg backup/yolov4-obj_last.weights.
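The choice between starting fresh and resuming can be sketched as a small helper that builds the training command. The function name training_command is an assumption; the file names and command arguments are the ones used in this section.

```python
from pathlib import Path


def training_command(x64_dir="."):
    """Build the training command, resuming from the last checkpoint if present."""
    last = Path(x64_dir) / "backup" / "yolov4-obj_last.weights"
    if last.exists():
        weights = "backup/yolov4-obj_last.weights"  # continue earlier training
    else:
        weights = "yolov4.conv.137"  # start from the pre-trained weights
    return [
        "darknet.exe", "detector", "train",
        "data/obj.data", "cfg/yolov4-obj.cfg", weights,
    ]


print(" ".join(training_command()))
```

The returned list can be pasted as a command line or passed to subprocess.run when launched from the x64 folder.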

6.5. Testing the Custom Trained YOLOv4 Model

To try your custom-trained YOLOv4 model, open Anaconda Prompt, change the directory to the darknet\darknet-master\build\darknet\x64 folder, and run one of these commands depending on the input type.

Inferencing on images :

darknet.exe detector test data/obj.data cfg/yolov4-obj.cfg backup/yolov4-obj_last.weights (enter the image path after running this command). The image with its predicted bounding boxes is saved automatically as predictions.jpg.

Inferencing on videos :

darknet.exe detector demo data/obj.data cfg/yolov4-obj.cfg backup/yolov4-obj_last.weights test.mp4 -thresh 0.6 (test.mp4 is the path to the test video, and the thresh value can be changed as needed). If you want to save the video with its bounding boxes automatically, add -out_filename output.mp4 to the command.

Inferencing on webcam :

darknet.exe detector demo data/obj.data cfg/yolov4-obj.cfg backup/yolov4-obj_last.weights -c 0
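The three commands differ only in the sub-command and the input argument, which the following sketch makes explicit. The function name inference_command and its mode values are assumptions made for illustration; the flags are the ones used in this section.

```python
def inference_command(mode, source=None, thresh=None, out_filename=None,
                      weights="backup/yolov4-obj_last.weights"):
    """Build a darknet.exe inference command as a list of arguments.

    mode: "image" (darknet asks for the image path), "video" (source is the
    video path), or "webcam" (source is the camera index).
    """
    model = ["data/obj.data", "cfg/yolov4-obj.cfg", weights]
    if mode == "image":
        cmd = ["darknet.exe", "detector", "test"] + model
    elif mode == "video":
        cmd = ["darknet.exe", "detector", "demo"] + model + [str(source)]
    elif mode == "webcam":
        cmd = ["darknet.exe", "detector", "demo"] + model + ["-c", str(source)]
    else:
        raise ValueError(f"unknown mode: {mode}")
    if thresh is not None:
        cmd += ["-thresh", str(thresh)]
    if out_filename is not None:
        cmd += ["-out_filename", out_filename]
    return cmd


print(" ".join(inference_command("video", "test.mp4", thresh=0.6)))
print(" ".join(inference_command("webcam", 0)))
```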

6.6. Result

I used the PCB defect dataset from The Open Lab on Human-Robot Interaction of Peking University, which I found on Kaggle. It contains 693 images with 6 types of defects that were created synthetically with Adobe Photoshop. The defects defined in the dataset are missing hole, mouse bite, open circuit, short, spur, and spurious copper.

custom YOLOv4 model result on validation image (1)

custom YOLOv4 model result on validation image (2)

From the results, we can see that the custom YOLOv4 detection model detects all defects in these PCB validation images.

References

Darknet repository: AlexeyAB/darknet
YOLOv4 paper: YOLOv4: Optimal Speed and Accuracy of Object Detection
PCB defect dataset: PCB Defects
OpenCV installation tutorial: Build and Install OpenCV With CUDA GPU Support on Windows 10 | OpenCV 4.5.1 | 2021
YOLO default model tutorial: Darknet YOLOv4 Object Detection Tutorial for Windows 10 on Images, Videos, and Webcams
YOLO on custom dataset: YOLOv4 Custom Object Detection Tutorial: Part 2 (Training YOLOv4 Darknet on Custom Dataset)
Github repository: https://github.com/AngeliaHartono/PCBDefect_YOLOv4.git