Yolo V3 in Use

7 min read · Sep 14, 2022

This article will help you use Yolo v3 in any object detection project you desire.

Introduction

This tutorial shows you how to use Yolo v3 with Darknet to detect objects in your own custom datasets or in the COCO dataset (which is the default).
It also helps you collect images and prepare them for the Yolo algorithm.

If you want to learn about Yolo v2, feel free to read my other article too:

Yolo V2 in Use

This Article will Help you to use Yolo v2 (Yolo 9000) in any Object Detection algorithm you desire.

medium.com

Here, we’ll talk about the following subjects:

What is Yolo?
What is Object Detection?
Dataset
Implementation
Predict
Wrap Up

1. What is Yolo?

“Yolo” is an abbreviation of “You Only Look Once.”

It is one of the most effective algorithms for detecting and identifying objects in a photo or video, and it is fast enough to run in real time.
It treats object detection as a regression problem and provides class probabilities for each detected object.

You can learn more about Yolo v3 in its paper:

YOLOv3: An Incremental Improvement

2. What is Object Detection?

One of the most important areas in the field of computer vision is object detection, which is used to locate and identify objects in an image or a video.
This technique attempts to identify the object by creating a bounding box around it; while this may seem easy to humans because we have the gift of sight, it is challenging for computers.

4. Implementation

You can get the original Yolo v3 files from Darknet:

YOLO: Real-Time Object Detection

You only look once (YOLO) is a state-of-the-art, real-time object detection system. On a Pascal Titan X it processes…

pjreddie.com

Or you can get my code from GitHub and run it on Google Colab:

YOLO/Yolo_v3.ipynb at main · mralamdari/YOLO

Contribute to mralamdari/YOLO development by creating an account on GitHub.

github.com

I’ll break down this notebook section by section so you can easily grasp it and develop your own Yolo method.

0. Step 0: Essentials

Import the necessary libraries: os, cv2, shutil, and matplotlib.

Then create an imShow function to display the predicted images.

I. Step 1: Darknet

The quickest way to get the Yolo v3 model is from Darknet.

Follow these instructions:

Clone Darknet from GitHub
Adjust the Makefile so that it builds with GPU (CUDA) and OpenCV support
Build (make) Darknet
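The Makefile adjustment can be sketched in Python as below. This is a minimal sketch, not the notebook's own code: it assumes the Darknet Makefile declares its build switches as lines such as GPU=0 and OPENCV=0, and the helper name enable_makefile_flags is hypothetical.

```python
import re


def enable_makefile_flags(makefile_text, flags=("GPU", "OPENCV")):
    """Return makefile_text with each `FLAG=0` line switched to `FLAG=1`.

    Assumption: build switches appear at the start of a line as `NAME=0`.
    """
    for flag in flags:
        # Anchor to the line start so that e.g. "GPU" never matches inside another name.
        pattern = rf"^{re.escape(flag)}[ \t]*=[ \t]*0[ \t]*$"
        makefile_text = re.sub(pattern, f"{flag}=1", makefile_text, flags=re.MULTILINE)
    return makefile_text


# A shortened stand-in for the top of the Makefile.
sample = "GPU=0\nCUDNN=0\nOPENCV=0\n"
print(enable_makefile_flags(sample))
```

To apply it, you would read the Makefile, pass its text through the function, write it back, and then run make. Adding "CUDNN" to flags is an option if cuDNN is available on your machine.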

II. Step 2: Dataset

In this step, we get some images from the Open Images Dataset, so we need a tool like the OIDv4 ToolKit to download the images and then convert their labels into Yolo’s input format. Below is a link to my version of OIDv4_ToolKit, which I forked from the original repository and adjusted in places because it wasn’t working well.

GitHub - mralamdari/OIDv4_ToolKit: Download and visualize single or multiple classes from the huge…

convert_annotations.py Use toolkit normally to gather images from open images dataset. After gathering images just run…

github.com

Follow these simple steps to get your data ready for training:

Clone OIDv4_ToolKit from my GitHub
Install the necessary libraries
Move to the OIDv4_ToolKit folder
Specify the number of objects you want to train on
Write the names of your desired objects and how many images you want of each. (For names with several words, such as Human arm, put a ‘_’ between the words: Human_arm.)

As you can see, I want two objects, ‘Person’ and ‘Car’, as the training dataset, with only 500 images of each.
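As a rough illustration of the last two steps, the download command could be assembled like this. It is a sketch under assumptions: the flags (downloader, --classes, --type_csv, --limit) are those of the upstream OIDv4_ToolKit command line, the fork used in this article may behave differently, and the function name oid_download_command is hypothetical.

```python
def oid_download_command(classes, split="train", limit=500):
    """Build the argument list for an OIDv4_ToolKit download.

    Assumption: the flags match the upstream OIDv4_ToolKit CLI.
    """
    # Multi-word class names are written with underscores, e.g. "Human_arm".
    names = [name.strip().replace(" ", "_") for name in classes]
    return [
        "python3", "main.py", "downloader",
        "--classes", *names,
        "--type_csv", split,
        "--limit", str(limit),
    ]


print(" ".join(oid_download_command(["Person", "Car"])))
```

Run from inside the OIDv4_ToolKit folder, the list can be handed to subprocess.run, or the printed line can be pasted into a notebook cell after a leading "!".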

Image 1 — OIDv4_ToolKit(image by author)

You simply need to answer the two prompts with “Y”, as shown in Image 1. As Image 2 shows, the toolkit then downloaded 500 photographs and generated their labels.

Image 2 — OIDv4_ToolKit Results(image by author)

Next, the labels must be converted into Yolo’s input format. As I said above, I modified the OIDv4 ToolKit so that you can complete these time-consuming tasks by running a single script (convert_annotations.py). The steps are:

1. Convert the labels (original type) into Yolo’s input format (target type)

Original type: <class name> <x_min> <y_min> <x_max> <y_max>
Target type: <class id> <center_x> <center_y> <width> <height>
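The conversion itself is a small piece of arithmetic, sketched below. This is not the toolkit's own code. It assumes the original coordinates are in pixels and that the image width and height are known; Darknet expects the target values normalized to the 0..1 range by the image size. The function names are hypothetical.

```python
def to_yolo_box(x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert corner coordinates in pixels to a normalized centre/size box."""
    center_x = (x_min + x_max) / 2 / img_w
    center_y = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return center_x, center_y, width, height


def convert_label_line(line, class_names, img_w, img_h):
    """Turn '<class name> <x_min> <y_min> <x_max> <y_max>' into a Yolo label line."""
    # The class name may contain spaces, so the last four fields are the box.
    *name_parts, x_min, y_min, x_max, y_max = line.split()
    class_id = class_names.index(" ".join(name_parts))
    box = to_yolo_box(float(x_min), float(y_min), float(x_max), float(y_max), img_w, img_h)
    return f"{class_id} " + " ".join(f"{value:.6f}" for value in box)


print(convert_label_line("Car 100 50 300 250", ["Person", "Car"], img_w=400, img_h=500))
```

The class id is simply the position of the class name in the list, so the list must use the same order as obj.names.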

2. Delete the labels folder

We no longer need the old labels in the labels folder.

3. Transfer ‘/content/darknet/OIDv4_ToolKit/OID/Dataset/{train/test}’ to ‘/content/darknet/data/obj/{train/test}’

We need to move the train or test folder from the ‘OIDv4_ToolKit/OID/Dataset’ folder to the ‘darknet/data/obj’ folder, so that all the train or test images are inside the darknet folder.

4. Create obj.names in /content/darknet/data/ directory

obj.names contains the names of the objects we downloaded, one per line; in our case, it contains ‘Person\nCar\n’.

5. Create {train/test}.txt in /content/darknet/data/ directory

The train.txt file contains the path of each image file in the ‘/content/darknet/data/obj/train’ directory, so during training Yolo can find every training image from this file alone.
The same applies to the test.txt file.

Now that obj.names, {train/test}.txt, and the images are prepared, we need to tie them together in a single file that is passed to the Yolo model. This is a small text file called “obj.data.”
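Steps 4 and 5 can be sketched with the standard library as follows. This is a minimal sketch rather than the notebook's code; the helper names and the set of image extensions are assumptions.

```python
from pathlib import Path

# Assumption: the dataset images use one of these extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def write_names_file(names_path, class_names):
    """Write one class name per line, e.g. 'Person\\nCar\\n'."""
    Path(names_path).write_text("\n".join(class_names) + "\n")


def write_image_list(image_dir, list_path):
    """Write the path of every image in image_dir to list_path, one per line."""
    images = sorted(
        p for p in Path(image_dir).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )
    Path(list_path).write_text("".join(f"{p.as_posix()}\n" for p in images))
    return len(images)
```

For example, write_image_list("data/obj/train", "data/train.txt") would list the training images, and the same call with the test folder would produce test.txt. The label .txt files that sit next to the images are skipped because of the extension filter.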
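A minimal sketch of how this file could be generated in Python. The key names (classes, train, valid, names, backup) follow Darknet's data-file format; the paths in the example, including the Google Drive backup folder, are assumptions for illustration, and build_obj_data is a hypothetical helper.

```python
def build_obj_data(num_classes, train_list, valid_list, names_file, backup_dir):
    """Return the text of a Darknet data file as 'key = value' lines."""
    lines = [
        f"classes = {num_classes}",
        f"train = {train_list}",
        f"valid = {valid_list}",
        f"names = {names_file}",
        f"backup = {backup_dir}",
    ]
    return "\n".join(lines) + "\n"


# Example paths are assumptions; adapt them to your own folder layout.
text = build_obj_data(
    num_classes=2,
    train_list="data/train.txt",
    valid_list="data/test.txt",
    names_file="data/obj.names",
    backup_dir="/mydrive/yolov3/backup",
)
print(text)
```

Writing the returned text to data/obj.data gives Darknet everything it needs to locate the dataset.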

The obj.data file specifies how many classes we downloaded, the locations of the train.txt, test.txt, and obj.names files, and a directory in which to store weight backups during training. It is advisable to store the backups on Google Drive so that you won’t lose the weights if Google Colab crashes.

The yolov3.cfg file is essential to the Yolo algorithm because it contains the entire Yolo structure, and it needs to be modified before training. Set “batch=64” and “subdivisions=16” in the file.
Then set the number of classes in each Yolo layer, the number of filters in the convolutional layer before each Yolo layer, and max_batches and its steps.

If you have a single object, you must set max_batches=4000 and not less; steps are 80% and 90% of max_batches. The max_batches value depends on the number of classes.
Additionally, you can experiment with num, batch, and subdivisions and compare the results.
Finally, you can save the changes as a new cfg file (for example, yolov3-custom.cfg) in the “darknet/cfg” folder (optional).
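The values that depend on the number of classes can be worked out as in the sketch below. The filters formula, (classes + 5) * 3, follows from Yolo v3 predicting three boxes per scale. The rule of 2000 iterations per class is a common Darknet rule of thumb and is an assumption here, since the article only states the minimum of 4000; the function name is hypothetical.

```python
def yolo_v3_cfg_values(num_classes, min_batches=4000):
    """Suggest cfg values for a custom Yolo v3 model.

    Assumption: max_batches grows by 2000 per class, with a floor of min_batches.
    """
    max_batches = max(min_batches, 2000 * num_classes)
    # Steps are 80% and 90% of max_batches (integer arithmetic avoids rounding issues).
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    # Each of the 3 boxes per scale predicts 4 coordinates, 1 objectness score,
    # and one score per class.
    filters = (num_classes + 5) * 3
    return {
        "classes": num_classes,
        "max_batches": max_batches,
        "steps": steps,
        "filters": filters,
    }


print(yolo_v3_cfg_values(2))
```

For the two classes used in this article (‘Person’ and ‘Car’), this gives max_batches=4000, steps=3200,3600, and filters=21.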

III. Step 3: Model

Notice:

If you wish to use the COCO dataset, download this pre-trained weights file, skip training, and go straight to the prediction step.

If you’d rather train your own Yolo, download the pre-trained darknet53 weights instead of starting from scratch.

Now, let’s start training.

You must tell the detector to train (train) and give it obj.data, yolov3-custom.cfg, and the weights (darknet53.conv.74).

Yolo is now training, and Image 3 shows the output:
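The training call can be assembled as in this sketch. The file locations are assumptions about the folder layout, the -dont_show flag assumes a Darknet build that supports it, and darknet_train_command is a hypothetical helper rather than part of Darknet.

```python
import shlex


def darknet_train_command(data_file, cfg_file, weights_file, show_window=False):
    """Build the argument list for 'darknet detector train'."""
    cmd = ["./darknet", "detector", "train", data_file, cfg_file, weights_file]
    if not show_window:
        # Colab has no display, so suppress the progress window.
        cmd.append("-dont_show")
    return cmd


cmd = darknet_train_command(
    "data/obj.data", "cfg/yolov3-custom.cfg", "darknet53.conv.74"
)
print(shlex.join(cmd))
```

The printed line can be pasted into a notebook cell after a leading "!", or the list can be passed to subprocess.run from inside the darknet folder.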

Image 3 — Training Results(image by author)

Notice:

If most of the values in the training output above are zero or stuck, your model is probably not learning well. Please get in touch with me, and I’ll be happy to help.

Notice:

Google Colab may disconnect if you are idle for a short period while training. To prevent this, you can press “ctrl+shift+i” and paste a keep-alive script into the browser console.

Don’t worry if Google Colab crashes; you can resume training from the weights kept in the backup folder.

Simply replace darknet53.conv.74 with yolov3-custom_last.weights.

To avoid an error, we add ‘-dont_show’ to the end of each train and predict command, because Google Colab cannot open the window that Darknet uses to show progress in real time.

However, once training is complete, you can review its progress by executing imShow(‘chart.png’).
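Choosing between the two weight files can be automated with a small check, sketched here. The backup folder location is an assumption, and pick_start_weights is a hypothetical helper; only the file names come from the article.

```python
from pathlib import Path


def pick_start_weights(backup_dir, cfg_name="yolov3-custom",
                       pretrained="darknet53.conv.74"):
    """Return the last backup if one exists, otherwise the pre-trained weights."""
    last = Path(backup_dir) / f"{cfg_name}_last.weights"
    return str(last) if last.exists() else pretrained
```

On a first run this returns darknet53.conv.74; after a crash it returns the path to yolov3-custom_last.weights, so the same training cell can be rerun unchanged.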

Image 4 — Training Plot(image by author)

5. Predict

If you want to predict with Yolo, first set batch=1 and subdivisions=1 in yolov3-custom.cfg.

Then run one command to predict an image and another to predict a video.

Use test for an image and demo for a video, then give the model obj.data, the cfg file, the trained weights, and finally the image or video.

For a video, you can specify the path where the prediction is stored; for an image, you can display the result simply by calling imShow(‘predictions.jpg’).
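The cfg change can be made by hand or with a small helper like the one sketched below. It assumes that options are written as key=value lines and that commented lines start with “#”; set_net_options is a hypothetical name.

```python
import re


def set_net_options(cfg_text, **options):
    """Set the first uncommented 'key=value' line for each given key."""
    for key, value in options.items():
        # Lines starting with '#' do not match, so commented examples are left alone.
        pattern = rf"^[ \t]*{re.escape(key)}[ \t]*=.*$"
        cfg_text, count = re.subn(
            pattern, f"{key}={value}", cfg_text, count=1, flags=re.MULTILINE
        )
        if count == 0:
            raise KeyError(f"option not found in cfg: {key}")
    return cfg_text


sample_cfg = "[net]\n# batch=1\nbatch=64\nsubdivisions=16\nwidth=416\n"
print(set_net_options(sample_cfg, batch=1, subdivisions=1))
```

Only the first match of each key is changed, which in a Yolo v3 cfg is the entry in the [net] section at the top of the file.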
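The two prediction calls can be sketched as follows. The paths are placeholders, and the -dont_show and -out_filename flags assume a Darknet build that supports them; both helper names are hypothetical.

```python
def darknet_test_image_command(data_file, cfg_file, weights_file, image_path):
    """Build the argument list for predicting a single image."""
    return ["./darknet", "detector", "test",
            data_file, cfg_file, weights_file, image_path, "-dont_show"]


def darknet_demo_video_command(data_file, cfg_file, weights_file, video_path,
                               output_path=None):
    """Build the argument list for predicting a video."""
    cmd = ["./darknet", "detector", "demo",
           data_file, cfg_file, weights_file, video_path, "-dont_show"]
    if output_path is not None:
        # Assumption: the build accepts -out_filename to save the annotated video.
        cmd += ["-out_filename", output_path]
    return cmd


args = ("data/obj.data", "cfg/yolov3-custom.cfg", "backup/yolov3-custom_last.weights")
print(" ".join(darknet_test_image_command(*args, "street.jpg")))
print(" ".join(darknet_demo_video_command(*args, "street.mp4", "result.avi")))
```

After the image command finishes, Darknet writes predictions.jpg to its working folder, which is the file that imShow displays.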

Results:

I took this image from Pixabay and ran the prediction on it; here is the result:

Labeled Image by Masashi Wakui from Pixabay

6. Wrap Up

As this story comes to a close, we should have a better understanding of what Yolo is and how to train a custom model, starting from pre-trained weights, to detect objects. As you can see, building a Yolo v3 model is not difficult; you only need to follow a few simple instructions.

Enjoy learning!👋

If you enjoyed reading this, follow me to get my new articles about data science, and if you have any questions or suggestions, contact me on LinkedIn or Gmail.

Written by Mr Alamdari
