Implementing YOLO v7 — Training with Your Own Dataset

Teddddd · Published in NTUST-AIVC · Aug 16, 2023

YOLO v7 is the current state-of-the-art object detection framework, offering improved accuracy and speed over previous versions. This article demonstrates how to fine-tune a pre-trained YOLO v7 model on your own dataset to recognize custom objects.

1. Install Git

If you haven’t installed Git yet, you need to install it first in order to download the official YOLO v7 project from GitHub.

sudo apt-get install git

2. Official YOLO v7 GitHub repository: WongKinYiu/yolov7 (github.com)

The README.md provides simple operational instructions and performance results for models of different scales and functionalities. You can compare their speed and accuracy to choose the model architecture that suits you. The following demonstration uses the standard YOLOv7 model.

WongKinYiu/yolov7: Implementation of paper — YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors (github.com)

3. Download the YOLO v7 project

Open the terminal and enter the following commands to download the project into the current directory, then install the required packages.


git clone https://github.com/WongKinYiu/yolov7.git
cd yolov7
pip install --upgrade pip
pip install -r requirements.txt

4. Preparing the Dataset

The layout below shows the common data structure used for a YOLO training set. Images and their corresponding labels in text format need to be stored separately. Additionally, separate directories for training and validation data are required.

(Original illustration: 【小白教学】如何用YOLOv7训练自己的数据集 ("[Beginner Tutorial] Training Your Own Dataset with YOLOv7"), Zhihu (zhihu.com))
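As a rough sketch, the expected layout looks like this (the directory and file names here are illustrative; the firm convention is that images/ and labels/ are sibling folders containing files with matching names):

dataset/
├── train/
│   ├── images/
│   │   ├── 0001.jpg
│   │   └── ...
│   └── labels/
│       ├── 0001.txt
│       └── ...
└── val/
    ├── images/
    └── labels/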

Here, we’ll demonstrate using D-Fire: an image dataset for fire and smoke detection (github.com). We aim to use this dataset to identify whether images contain fire or smoke. You can see that its data structure roughly matches the layout shown above.
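Each label file is plain text in the standard YOLO format: one object per line, given as the class index followed by the box center coordinates and box size, all normalized to the image dimensions. A hypothetical label file with one smoke box and one fire box (the values are made up for illustration) would look like this:

0 0.5012 0.4210 0.3055 0.2288
1 0.2500 0.6600 0.1200 0.0900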

5. Modify YOLO configuration file

Locate the cfg/training/yolov7.yaml file, make a copy of it in the same directory, and rename it (here, yolov7_D-Fire.yaml). Next, change the value of nc in the second line to 2, as we need to recognize smoke and fire, totaling two categories. In general, the number of classes should match the categories to be recognized in your dataset. The edited header is sketched below.
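After the edit, the top of the copied file should look roughly like this (the rest of the file, which defines the network architecture, stays untouched):

# parameters
nc: 2  # number of classes: smoke and fire
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple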

Next, locate the data/coco.yaml file, make a copy of it in the same directory, and rename it (here, coco_D-Fire.yaml). Change the paths for train, val, and test to point to the locations of your dataset. Then modify line 12 to set nc to 2 and line 15 to set names to ['smoke', 'fire'], since this dataset contains annotations for smoke and fire, totaling two classes. A sketch of the result follows.
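Assuming the D-Fire data sits in a D-Fire/ folder next to the yolov7 project (the paths are illustrative; point them at your actual layout), the modified file might read:

# data/coco_D-Fire.yaml
train: ../D-Fire/train/images
val: ../D-Fire/val/images
test: ../D-Fire/test/images

# number of classes
nc: 2

# class names
names: ['smoke', 'fire']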

6. Start training
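The training command below loads the pre-trained yolov7_training.pt checkpoint. If you don’t have it yet, it can be downloaded from the repository’s release page first (the URL assumes the official v0.1 release assets):

mkdir -p weights
wget -P weights https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7_training.pt

With the weights in place, start training: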

python train.py --weights weights/yolov7_training.pt --cfg cfg/training/yolov7_D-Fire.yaml --data data/coco_D-Fire.yaml --device 0 --batch-size 10 --epochs 30 --name yolov7-D-Fire

There are many parameters you can adjust:

  • weights: Pre-trained model weights. Loading pre-trained weights can help the model converge faster and reduce training time.
  • cfg: Path to the modified cfg/training/yolov7_D-Fire.yaml configuration file.
  • data: Path to the modified data/coco_D-Fire.yaml dataset configuration file.
  • device: Which GPU to train on. To train with multiple GPUs simultaneously, pass a comma-separated list such as 0,1 (see the launch sketch after this list).
  • batch-size: If you encounter out-of-memory errors on your GPU, decrease this parameter. A larger batch size can reduce training time but uses more memory.
  • epochs: The number of complete passes over the (shuffled) training data. If you have no specific requirements, 30 is a reasonable starting point.
  • name: Name of this training session.
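For multi-GPU training, the official README launches train.py through PyTorch’s distributed launcher. Adapted to this dataset, such a command might look like the following (the GPU count, port, and batch size are illustrative):

python -m torch.distributed.launch --nproc_per_node 2 --master_port 9527 train.py --weights weights/yolov7_training.pt --cfg cfg/training/yolov7_D-Fire.yaml --data data/coco_D-Fire.yaml --device 0,1 --sync-bn --batch-size 20 --epochs 30 --name yolov7-D-Fire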

7. Training Completion

After training, you can find the trained weights, log files, and result plots in the runs/train/yolov7-D-Fire folder. Now you can use this model to detect the objects you need.
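Before moving on, it is worth measuring the trained weights on your test split. The repository ships a test.py for this; a minimal invocation might look like the following (the flag values are illustrative, and --task test selects the test split defined in the data file):

python test.py --weights runs/train/yolov7-D-Fire/weights/best.pt --data data/coco_D-Fire.yaml --img-size 640 --batch-size 10 --task test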

8. Image detection

python detect.py --weights runs/train/yolov7-D-Fire/weights/best.pt --source test.mp4
  • weights: The location of the newly trained weights.
  • source: The source for detection, which can be a video, image, webcam, etc.
  • There are many other adjustable parameters; you can view explanations for all of them with the help command below.
python detect.py -h


usage: detect.py [-h] [--weights WEIGHTS [WEIGHTS ...]] [--source SOURCE]
                 [--img-size IMG_SIZE] [--conf-thres CONF_THRES]
                 [--iou-thres IOU_THRES] [--device DEVICE] [--view-img]
                 [--save-txt] [--save-conf] [--nosave]
                 [--classes CLASSES [CLASSES ...]] [--agnostic-nms]
                 [--augment] [--update] [--project PROJECT] [--name NAME]
                 [--exist-ok] [--no-trace]

optional arguments:
  -h, --help            show this help message and exit
  --weights WEIGHTS [WEIGHTS ...]
                        model.pt path(s)
  --source SOURCE       source
  --img-size IMG_SIZE   inference size (pixels)
  --conf-thres CONF_THRES
                        object confidence threshold
  --iou-thres IOU_THRES
                        IOU threshold for NMS
  --device DEVICE       cuda device, i.e. 0 or 0,1,2,3 or cpu
  --view-img            display results
  --save-txt            save results to *.txt
  --save-conf           save confidences in --save-txt labels
  --nosave              do not save images/videos
  --classes CLASSES [CLASSES ...]
                        filter by class: --class 0, or --class 0 2 3
  --agnostic-nms        class-agnostic NMS
  --augment             augmented inference
  --update              update all models
  --project PROJECT     save results to project/name
  --name NAME           save results to project/name
  --exist-ok            existing project/name ok, do not increment
  --no-trace            don`t trace model
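For example, to detect from a webcam, keep only predictions above 40% confidence, and filter for the fire class alone (index 1 under the names order from step 5), a command might look like:

python detect.py --weights runs/train/yolov7-D-Fire/weights/best.pt --source 0 --conf-thres 0.4 --classes 1 --view-img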

Here are the results: detect.py prints progress to the terminal and, by default, saves the annotated images or video under runs/detect/. (Result images omitted.)

Teddddd (NTUST-AIVC) is a bachelor of NTUST EE who is learning Linux, Docker, CV, and ML.