Object detection with YOLO

Elenche Zététique
Published in Analytics Vidhya · 19 min read · Aug 3, 2020

The following post shows how to train object detection models based on the YOLO architecture (links to the research articles on this topic are listed in «References» below), how to collect mAP and average-loss statistics in Google Colab, and how to test the trained models using custom Python scripts.

Repository preparation

1. Clone the repository (DarkNet framework):

!git clone https://github.com/AlexeyAB/darknet.git

Note: bash commands are prefixed with an exclamation mark

«The exclamation mark is used for executing commands from the underlying operating system.»
Source - https://towardsdatascience.com/an-effective-way-of-managing-files-on-google-colab-ac37f792690b?gi=7e7ac2742a2d

2. Create folder «build-release»:

cd /content/drive/My\ Drive/darknet/
!mkdir build-release
cd /content/drive/My\ Drive/darknet/build-release/
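Note: The paths above assume that Google Drive is already mounted in the current Colab session (and that the repository lives there); a minimal sketch using the standard Colab helper:

# Mount Google Drive so that paths under /content/drive/My Drive/ become available
from google.colab import drive
drive.mount('/content/drive')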

Note: Optionally, before compilation the user may comment out the lines in the source code that print per-layer output to the terminal. By default the training process produces very verbose output, which needlessly loads operating memory just to display it and at some point can cause freezing or even sudden termination (experienced several times by the author). The excerpt below contains the output for only two iterations:

truncated
7: 679.634888, 679.691040 avg loss, 0.000000 rate, 0.385118 seconds, 224 images, 419.996108 hours left
Loaded: 2.210867 seconds - performance bottleneck on CPU or Disk HDD/SSD
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.605842, GIOU: 0.574251), Class: 0.500359, Obj: 0.501755, No Obj: 0.500504, .5R: 1.000000, .75R: 0.000000, count: 5, class_loss = 271.598236, iou_loss = 0.801147, total_loss = 272.399384
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500954, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.291870, iou_loss = 0.000000, total_loss = 1087.291870
total_bbox = 248, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.614312, GIOU: 0.605125), Class: 0.500208, Obj: 0.501565, No Obj: 0.500505, .5R: 0.750000, .75R: 0.000000, count: 4, class_loss = 271.537018, iou_loss = 0.536957, total_loss = 272.073975
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500934, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.293945, iou_loss = 0.000000, total_loss = 1087.293945
total_bbox = 252, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.640582, GIOU: 0.634447), Class: 0.499148, Obj: 0.500193, No Obj: 0.500503, .5R: 0.600000, .75R: 0.200000, count: 5, class_loss = 271.477295, iou_loss = 0.716827, total_loss = 272.194122
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500943, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.298706, iou_loss = 0.000000, total_loss = 1087.298706
total_bbox = 257, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.578494, GIOU: 0.560010), Class: 0.500962, Obj: 0.502538, No Obj: 0.500505, .5R: 0.800000, .75R: 0.000000, count: 5, class_loss = 271.405792, iou_loss = 0.654083, total_loss = 272.059875
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500935, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.299683, iou_loss = 0.000000, total_loss = 1087.299683
total_bbox = 262, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.635607, GIOU: 0.608577), Class: 0.500960, Obj: 0.502537, No Obj: 0.500506, .5R: 0.750000, .75R: 0.250000, count: 4, class_loss = 271.092285, iou_loss = 0.535187, total_loss = 271.627472
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500919, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.300903, iou_loss = 0.000000, total_loss = 1087.300903
total_bbox = 266, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.571984, GIOU: 0.536152), Class: 0.500218, Obj: 0.501519, No Obj: 0.500505, .5R: 0.500000, .75R: 0.000000, count: 4, class_loss = 271.347656, iou_loss = 0.689331, total_loss = 272.036987
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500935, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.295410, iou_loss = 0.000000, total_loss = 1087.295410
total_bbox = 270, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.674173, GIOU: 0.652403), Class: 0.500960, Obj: 0.502531, No Obj: 0.500503, .5R: 1.000000, .75R: 0.250000, count: 4, class_loss = 271.155334, iou_loss = 0.497925, total_loss = 271.653259
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500935, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.292358, iou_loss = 0.000000, total_loss = 1087.292358
total_bbox = 274, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.639989, GIOU: 0.600127), Class: 0.500965, Obj: 0.502533, No Obj: 0.500506, .5R: 0.750000, .75R: 0.500000, count: 4, class_loss = 270.776886, iou_loss = 0.559204, total_loss = 271.336090
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500933, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.296143, iou_loss = 0.000000, total_loss = 1087.296143
total_bbox = 278, rewritten_bbox = 0.000000 %

8: 679.609314, 679.682861 avg loss, 0.000000 rate, 0.393729 seconds, 256 images, 418.879547 hours left
Loaded: 1.570684 seconds - performance bottleneck on CPU or Disk HDD/SSD
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.471209, GIOU: 0.434264), Class: 0.500210, Obj: 0.501556, No Obj: 0.500503, .5R: 0.500000, .75R: 0.000000, count: 4, class_loss = 271.537018, iou_loss = 0.976410, total_loss = 272.513428
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500933, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.296387, iou_loss = 0.000000, total_loss = 1087.296387
total_bbox = 282, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.616591, GIOU: 0.601557), Class: 0.500208, Obj: 0.501559, No Obj: 0.500503, .5R: 1.000000, .75R: 0.250000, count: 4, class_loss = 271.410980, iou_loss = 0.328857, total_loss = 271.739838
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500925, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.294678, iou_loss = 0.000000, total_loss = 1087.294678
total_bbox = 286, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.560825, GIOU: 0.559239), Class: 0.499954, Obj: 0.501222, No Obj: 0.500504, .5R: 0.833333, .75R: 0.000000, count: 6, class_loss = 271.661560, iou_loss = 0.975983, total_loss = 272.637543
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500941, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.299683, iou_loss = 0.000000, total_loss = 1087.299683
total_bbox = 292, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.594659, GIOU: 0.592429), Class: 0.499454, Obj: 0.500582, No Obj: 0.500505, .5R: 0.750000, .75R: 0.250000, count: 4, class_loss = 271.161102, iou_loss = 0.735657, total_loss = 271.896759
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500947, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.291870, iou_loss = 0.000000, total_loss = 1087.291870
total_bbox = 296, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.546971, GIOU: 0.527006), Class: 0.500188, Obj: 0.501606, No Obj: 0.500506, .5R: 0.500000, .75R: 0.000000, count: 4, class_loss = 271.537018, iou_loss = 0.620148, total_loss = 272.157166
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500932, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.293457, iou_loss = 0.000000, total_loss = 1087.293457
total_bbox = 300, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.703408, GIOU: 0.697468), Class: 0.500990, Obj: 0.502102, No Obj: 0.500505, .5R: 1.000000, .75R: 0.200000, count: 5, class_loss = 271.091461, iou_loss = 0.492645, total_loss = 271.584106
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.326191, GIOU: 0.083068), Class: 0.502488, Obj: 0.502020, No Obj: 0.500926, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.353516, iou_loss = 0.583740, total_loss = 1087.937256
total_bbox = 306, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.652279, GIOU: 0.637695), Class: 0.500962, Obj: 0.502520, No Obj: 0.500505, .5R: 0.750000, .75R: 0.250000, count: 4, class_loss = 270.902740, iou_loss = 0.349670, total_loss = 271.252411
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500928, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.301758, iou_loss = 0.000000, total_loss = 1087.301758
total_bbox = 310, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.673402, GIOU: 0.651781), Class: 0.500969, Obj: 0.502514, No Obj: 0.500506, .5R: 0.750000, .75R: 0.500000, count: 4, class_loss = 271.029327, iou_loss = 0.494690, total_loss = 271.524017
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500919, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.297485, iou_loss = 0.000000, total_loss = 1087.297485
total_bbox = 314, rewritten_bbox = 0.000000 %
truncated

The amount of output per iteration depends on the number of subdivisions (set, together with batch, in the [net] section of the «*.cfg» file; it affects only memory utilization: the fewer subdivisions, the higher the memory workload, since more images have to be processed at the same time) and on the backbone convnet (DarkNet19, DarkNet53, DenseNet, ResNet, etc.).

Open «darknet/src/region_layer.c» and comment out the following line:

printf("Region Avg IOU: %f, Class: %f, Obj: %f, No Obj: %f, Avg Recall: %f, count: %d\n", avg_iou/count, avg_cat/class_count, avg_obj/count, avg_anyobj/(l.w*l.h*l.n*l.batch), recall/count, count);

Open «darknet/src/yolo_layer.c» and comment out the following line:

fprintf(stderr, "v3 (%s loss, Normalizer: (iou: %.2f, cls: %.2f) Region %d Avg (IOU: %f, GIOU: %f), Class: %f, Obj: %f, No Obj: %f, .5R: %f, .75R: %f, count: %d, class_loss = %f, iou_loss = %f, total_loss = %f \n",
(l.iou_loss == MSE ? "mse" : (l.iou_loss == GIOU ? "giou" : "iou")), l.iou_normalizer, l.cls_normalizer, state.index, tot_iou / count, tot_giou / count, avg_cat / class_count, avg_obj / count, avg_anyobj / (l.w*l.h*l.n*l.batch), recall / count, recall75 / count, count,
classification_loss, iou_loss, loss);

In some configurations the final layer is a «region» layer instead of a «yolo» layer (for example, for DenseNet), therefore one should comment out the lines in both files.

3. Compile repository:

!cmake ..
!make
!make install

Note #1: If the user changes the source code, the repository has to be recompiled afterwards.
Note #2:
Change the privileges for «darknet» at the start of each new session:

!chmod -R 777 darknet

Otherwise every command using darknet will fail with:

/bin/bash: ./darknet: Permission denied

Data preparation

Note: data preparation must be done locally, not in Colab

1. Get images

2. For labeling images clone LabelImg and open it:

git clone https://github.com/tzutalin/labelImg.git
cd labelImg

3. Install required packages (for Linux):

sudo apt install pyqt5-dev-tools
sudo pip3 install -r requirements/requirements-linux-python3.txt
make qt5py3

Note: For other OS follow the installation guide at https://github.com/tzutalin/labelImg

4. Run LabelImg:

python3 labelImg.py

5. Choose PASCAL VOC labeling format:

Note: It is better to choose the PASCAL VOC format, since it contains more information: the labeled images can later be used with Tensorflow, and the annotations can be converted to YOLO format at any time using a Python script (discussed below), but not the other way around, because the YOLO format contains only float values relative to the width and height of the image (between 0 and 1), so the absolute values cannot be recovered from the relative ones without additional information.

6. Label images:

Choose «Create RectBox»
Highlight the object and assign the name to it
Object is labeled now

7. Image and XML label file:

Before moving further, it is better to get acquainted with a couple of scripts that can be useful for preparing the data and testing the final models. The code is based on «YOLO object detection with OpenCV», «OpenCV ‘dnn’ with NVIDIA GPUs: 1549% faster YOLO, SSD, and Mask R-CNN» and «Faster video file FPS with cv2.VideoCapture and OpenCV» by Adrian Rosebrock.
Note: The following software will be further improved.

8. First clone the repository:

git clone https://ElencheZetetique@bitbucket.org/ElencheZetetique/rtod.git

The script «prepare.py» contains several functions such as greyscaling, resizing, assigning unique names to images, PASCAL VOC label modification and conversion to the YOLO format.

9. Convert to YOLO-format:

python3 prepare.py -s /path/to/Documents/Dataset/ -yf 25 -cpy /path/to/

where

  • flag -s/--src points at the source folder (in this case the dataset folder);
  • flag -yf/--yolo_format converts the labels to YOLO format and takes an integer from 5 to 35 that is used to split the dataset into training and validation subsets;
  • flag -cpy/--change_path_yolo sets the path prefix before the image names in «train.txt» and «valid.txt»

Use flag -h/--help to get more details

Results of the command:

  • Each image now has an additional txt-file in YOLO format:

YOLO label file has the following contents:

0 0.571111 0.445141 0.764444 0.495298

<object-class> <x_center> <y_center> <width> <height>

where:
  • <object-class> - integer object number from 0 to (classes-1)
  • <x_center> <y_center> <width> <height> - float values relative to the width and height of the image, in the range (0.0, 1.0];
    for example: <x_center> = <absolute_x> / <image_width> or <height> = <absolute_height> / <image_height>
  • attention: <x_center> <y_center> are the center of the rectangle (not the top-left corner)

Source - https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects (item #5)
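The same conversion that «prepare.py» performs can be sketched in a few lines of Python. The snippet below only illustrates the arithmetic described above (the function name and file names are hypothetical, not part of the repository):

import xml.etree.ElementTree as ET

def voc_to_yolo(xml_path, class_names):
    """Convert one PASCAL VOC annotation file into YOLO-format lines."""
    root = ET.parse(xml_path).getroot()
    img_w = float(root.find('size/width').text)
    img_h = float(root.find('size/height').text)
    lines = []
    for obj in root.findall('object'):
        class_id = class_names.index(obj.find('name').text)
        box = obj.find('bndbox')
        xmin, ymin = float(box.find('xmin').text), float(box.find('ymin').text)
        xmax, ymax = float(box.find('xmax').text), float(box.find('ymax').text)
        # YOLO stores the box center and size relative to the image dimensions
        x_center = (xmin + xmax) / 2.0 / img_w
        y_center = (ymin + ymax) / 2.0 / img_h
        width = (xmax - xmin) / img_w
        height = (ymax - ymin) / img_h
        lines.append(f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}")
    return lines

# hypothetical usage: print("\n".join(voc_to_yolo("image_0001.xml", ["Nutria"])))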

  • «cfg» folder:

«cfg» folder contains:

  • «obj.data» file in the folder «cfg» with the following contents:
classes=num_of_classes
train=/path/to/train.txt
valid=/path/to/valid.txt
names=/path/to/obj.names
backup=/path/to/backup

Change «/path/to/backup» to «/mydrive», which is a symbolic link to the path where all «weights» files (trained models saved after a certain number of iterations) shall be stored during training, for example:

!ln -s "/content/drive/My Drive/YOLO/backup" /mydrive

Note #1: The link is used to avoid errors caused by the folder name «My Drive» (it contains a space and cannot be renamed).
Note #2: Do not use Colab's folder «sample_data» for saving weights: at every new session its whole contents are wiped out and replaced with the default example files, so you will lose all your results once the current session is over.

  • «obj.names» file with the list of all classes the model shall be trained on. For example, to train a model on a dataset with all COCO objects:
person
bicycle
car
motorbike
aeroplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
sofa
pottedplant
bed
diningtable
toilet
tvmonitor
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair drier
toothbrush

Downloaded from https://github.com/AlexeyAB/darknet

  • «train.txt» and «valid.txt» with the lists of paths to the training and validation images, respectively.

10. Get «*.cfg» from https://github.com/AlexeyAB/darknet/tree/master/cfg and modify it:

change [filters=255] to filters=(classes + 5)x3 in the 3 [convolutional] before each [yolo] layer, keep in mind that it only has to be the last [convolutional] before each of the [yolo] layers.
Source - https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects (item #1)

Note: the «*.cfg» file must correspond to the pretrained «weights» file, otherwise the execution fails. The compatibility list is shown in https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects (item #7)
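As a quick sanity check of the formula quoted above, the required filters value can be computed directly; a minimal sketch (the helper name is only for illustration):

def yolo_conv_filters(num_classes: int) -> int:
    # filters in the last [convolutional] before each [yolo] layer
    return (num_classes + 5) * 3

print(yolo_conv_filters(80))  # 255 - the value found in the default COCO configs
print(yolo_conv_filters(1))   # 18 - e.g. a single-class detector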

Training

1. Now one can start training:

!./darknet detector train /content/sample_data/obj.data /content/sample_data/yolov3-tiny.cfg /content/drive/My\ Drive/pretrained/darknet19_448.conv.23 -dont_show

Note #1: Check carefully the paths to all files, otherwise the execution fails (especially the image paths specified in «train.txt» and «valid.txt»).
Note #2: The flag -dont_show is obligatory in Colab, otherwise the execution is terminated at the very beginning:

truncated
If error occurs - run training with flag: -dont_show
Unable to init server: Could not connect: Connection refused

(chart_yolov3-tiny.png:2780): Gtk-WARNING **: 20:15:19.955: cannot open display:

(to disable Loss-Window use darknet.exe detector train data/obj.data yolo-obj.cfg yolov4.conv.137 -dont_show, if you train on computer without monitor like a cloud Amazon EC2)
Source - https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects (item #8)

Loss window (downloaded from https://github.com/AlexeyAB/darknet, source: https://camo.githubusercontent.com/d60dfdba6c007a5df888747c2c03664c91c12c1e/68747470733a2f2f6873746f2e6f72672f776562742f79642f766c2f61672f7964766c616775746f66327a636e6a6f64737467726f656e3861632e6a706567)

In order to save the output of the training process, which can be used for further analysis (explained in «Analyzing the output»), one can redirect it to a text file:

!./darknet detector train /content/sample_data/obj.data /content/sample_data/yolov3-tiny.cfg /content/drive/My\ Drive/pretrained/darknet19_448.conv.23 -dont_show > /content/drive/My\ Drive/backup/training_output.txt

2. Find the results in the backup folder linked to /mydrive.

Now one can use the trained model together with «obj.names» and the «*.cfg» file (see details in «Testing»).

Testing

Now it is time to test the trained model. Download the «weights» file you prefer to use for object detection, save it on the host machine and put it into one folder together with «obj.names» and the «*.cfg» file that was used to train the model.
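The test script in the repository builds on OpenCV's dnn module (see the Rosebrock articles in «References»). For orientation only, a minimal self-contained sketch of running such a model on a single image is shown below; the file names, input size and thresholds are illustrative assumptions, not the repository's actual code:

import cv2
import numpy as np

# Hypothetical file names - use the files from your own model folder
net = cv2.dnn.readNetFromDarknet("yolov3-tiny.cfg", "yolov3-tiny_final.weights")
with open("obj.names") as f:
    class_names = [line.strip() for line in f if line.strip()]

image = cv2.imread("test.jpg")
height, width = image.shape[:2]

# Darknet networks expect a normalized, resized RGB blob
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1] for i in np.array(net.getUnconnectedOutLayers()).flatten()]
detections = np.vstack(net.forward(output_layers))

boxes, confidences, class_ids = [], [], []
for det in detections:
    scores = det[5:]
    class_id = int(np.argmax(scores))
    confidence = float(scores[class_id])
    if confidence > 0.5:
        # each row holds center-x, center-y, box-width, box-height relative to the image
        cx, cy, bw, bh = det[:4] * np.array([width, height, width, height])
        boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
        confidences.append(confidence)
        class_ids.append(class_id)

# Non-maximum suppression removes overlapping boxes of the same object
for i in np.array(cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)).flatten():
    x, y, bw, bh = boxes[i]
    label = f"{class_names[class_ids[i]]}: {confidences[i]:.2f}"
    cv2.rectangle(image, (x, y), (x + bw, y + bh), (0, 255, 0), 2)
    cv2.putText(image, label, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

cv2.imwrite("result.jpg", image)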

1. Run the command for videocapture:

python3 test.py -d /path/to/Model -vc 0 -sf

where

  • flag -vc/--video_cap: process video from a camera; set the camera number (in this case '0')
  • flag -d/--darknet: path to the darknet(-based) model directory (must contain *.cfg, *.names (the list of objects to detect) and *.weights, each exactly once)
  • flag -sf/--save_frame: if set, save the frames with detected objects

Call -h/--help for details

One can find the frames with detected objects in the folder «Results». Each so-called session has a timestamp that corresponds to the time when the command was run:

Example of detection:

2. For inference on images, put the images into one folder and run:

python3 test.py -d /path/to/Model -is /path/to/folder/with/images

where

  • flag -is/--image_src: path to the folder with image(s)

The progress is shown in the terminal:

After the execution is over, all processed images can be found in the folder «Results».

Example of image with detected objects:

3. For video:

python3 test.py -d /path/to/Model -vs /path/to/video

where

  • flag -vs/--video_src: path to the video file

The progress is shown in the terminal:

Example of video with detected objects

After the execution is over, the processed video can be found in the folder «Results».

Analyzing the output

1. mAP

If you are unfamiliar with the mAP metric, see the corresponding articles in «References».
In order to get mAP for all saved weights, run the following bash script (download the script):

!bash /content/drive/My\ Drive/for_mAP/get_mAP.sh

The contents:

#!/usr/bin/bash
cd /content/drive/My\ Drive/darknet/build-release
for i in {1000..LAST_WEIGHT..1000}
do
./darknet detector map /content/sample_data/obj.data /content/drive/My\ Drive/for_mAP/yolov3.cfg /content/drive/My\ Drive/for_mAP/yolov3_$i.weights -points 0 >> /content/drive/My\ Drive/for_mAP/output$i.txt
./darknet detector map /content/sample_data/obj.data /content/drive/My\ Drive/for_mAP/yolov3.cfg /content/drive/My\ Drive/for_mAP/yolov3_$i.weights -iou_thresh 0.75 -points 0 >> /content/drive/My\ Drive/for_mAP/output$i.txt
./darknet detector map /content/sample_data/obj.data /content/drive/My\ Drive/for_mAP/yolov3.cfg /content/drive/My\ Drive/for_mAP/yolov3_$i.weights -points 11 >> /content/drive/My\ Drive/for_mAP/output$i.txt
./darknet detector map /content/sample_data/obj.data /content/drive/My\ Drive/for_mAP/yolov3.cfg /content/drive/My\ Drive/for_mAP/yolov3_$i.weights -iou_thresh 0.75 -points 11 >> /content/drive/My\ Drive/for_mAP/output$i.txt
./darknet detector map /content/sample_data/obj.data /content/drive/My\ Drive/for_mAP/yolov3.cfg /content/drive/My\ Drive/for_mAP/yolov3_$i.weights -points 101 >> /content/drive/My\ Drive/for_mAP/output$i.txt
./darknet detector map /content/sample_data/obj.data /content/drive/My\ Drive/for_mAP/yolov3.cfg /content/drive/My\ Drive/for_mAP/yolov3_$i.weights -iou_thresh 0.75 -points 101 >> /content/drive/My\ Drive/for_mAP/output$i.txt
done

where

  • instead of LAST_WEIGHT type in the iteration number of the last saved weights file

The script computes mAP at 0.5 IoU (default, no flag) and 0.75 IoU (-iou_thresh 0.75) according to the PASCAL VOC 2007 (-points 11), PASCAL VOC 2010-2012 (-points 0) and MS COCO (-points 101) standards.

The script produces one «output*.txt» file per weights file. Each file contains the following information:

net.optimized_memory = 0 
mini_batch = 1, batch = 16, time_steps = 1, train = 0
seen 64, trained: 64 K-images (1 Kilo-batches_64)calculation mAP (mean average precision)...detections_count = 136, unique_truth_count = 109
rank = 0 of ranks = 136
rank = 100 of ranks = 136
class_id = 0, name = Nutria, ap = 91.94% (TP = 99, FP = 4)
for conf_thresh = 0.25, precision = 0.96, recall = 0.91, F1-score = 0.93
for conf_thresh = 0.25, TP = 99, FP = 4, FN = 10, average IoU = 75.43 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.919386, or 91.94 %
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
net.optimized_memory = 0
mini_batch = 1, batch = 16, time_steps = 1, train = 0
seen 64, trained: 64 K-images (1 Kilo-batches_64)calculation mAP (mean average precision)...detections_count = 136, unique_truth_count = 109
rank = 0 of ranks = 136
rank = 100 of ranks = 136
class_id = 0, name = Nutria, ap = 49.81% (TP = 67, FP = 36)
for conf_thresh = 0.25, precision = 0.65, recall = 0.61, F1-score = 0.63
for conf_thresh = 0.25, TP = 67, FP = 36, FN = 42, average IoU = 54.34 %
IoU threshold = 75 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.75) = 0.498097, or 49.81 %
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
net.optimized_memory = 0
mini_batch = 1, batch = 16, time_steps = 1, train = 0
seen 64, trained: 64 K-images (1 Kilo-batches_64)calculation mAP (mean average precision)...detections_count = 136, unique_truth_count = 109
rank = 0 of ranks = 136
rank = 100 of ranks = 136
class_id = 0, name = Nutria, ap = 90.26% (TP = 99, FP = 4)
for conf_thresh = 0.25, precision = 0.96, recall = 0.91, F1-score = 0.93
for conf_thresh = 0.25, TP = 99, FP = 4, FN = 10, average IoU = 75.43 %
IoU threshold = 50 %, used 11 Recall-points
mean average precision (mAP@0.50) = 0.902615, or 90.26 %
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
net.optimized_memory = 0
mini_batch = 1, batch = 16, time_steps = 1, train = 0
seen 64, trained: 64 K-images (1 Kilo-batches_64)calculation mAP (mean average precision)...detections_count = 136, unique_truth_count = 109
rank = 0 of ranks = 136
rank = 100 of ranks = 136
class_id = 0, name = Nutria, ap = 49.67% (TP = 67, FP = 36)
for conf_thresh = 0.25, precision = 0.65, recall = 0.61, F1-score = 0.63
for conf_thresh = 0.25, TP = 67, FP = 36, FN = 42, average IoU = 54.34 %
IoU threshold = 75 %, used 11 Recall-points
mean average precision (mAP@0.75) = 0.496749, or 49.67 %
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
net.optimized_memory = 0
mini_batch = 1, batch = 16, time_steps = 1, train = 0
seen 64, trained: 64 K-images (1 Kilo-batches_64)calculation mAP (mean average precision)...detections_count = 136, unique_truth_count = 109
rank = 0 of ranks = 136
rank = 100 of ranks = 136
class_id = 0, name = Nutria, ap = 92.34% (TP = 99, FP = 4)
for conf_thresh = 0.25, precision = 0.96, recall = 0.91, F1-score = 0.93
for conf_thresh = 0.25, TP = 99, FP = 4, FN = 10, average IoU = 75.43 %
IoU threshold = 50 %, used 101 Recall-points
mean average precision (mAP@0.50) = 0.923423, or 92.34 %
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
net.optimized_memory = 0
mini_batch = 1, batch = 16, time_steps = 1, train = 0
seen 64, trained: 64 K-images (1 Kilo-batches_64)calculation mAP (mean average precision)...detections_count = 136, unique_truth_count = 109
rank = 0 of ranks = 136
rank = 100 of ranks = 136
class_id = 0, name = Nutria, ap = 49.90% (TP = 67, FP = 36)
for conf_thresh = 0.25, precision = 0.65, recall = 0.61, F1-score = 0.63
for conf_thresh = 0.25, TP = 67, FP = 36, FN = 42, average IoU = 54.34 %
IoU threshold = 75 %, used 101 Recall-points
mean average precision (mAP@0.75) = 0.498992, or 49.90 %
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset

Save all output files to the host machine, put them into one folder and run the Python script:

python3 prepare.py -msy -s /path/to/folder_with_mAP_files -d /path/to/output/folder

where

  • flag -msy/--mAP_stat_yolo collects the mAP statistics for YOLO from the given folder and saves them to an Excel file.

Output:
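If one prefers to check the numbers without Excel, the same values can be pulled out of the output files directly; a minimal sketch (the regular expression matches the «mean average precision» lines shown above; each file contains six such lines, one per IoU threshold and -points combination):

import glob
import re

# matches e.g. "mean average precision (mAP@0.50) = 0.919386, or 91.94 %"
MAP_LINE = re.compile(r"mean average precision \(mAP@([\d.]+)\) = ([\d.]+)")

for path in sorted(glob.glob("/path/to/folder_with_mAP_files/output*.txt")):
    with open(path) as f:
        for iou_threshold, value in MAP_LINE.findall(f.read()):
            print(f"{path}: mAP@{iou_threshold} = {float(value):.4f}")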

2. Avg. loss

In order to see the average loss dynamics (how it changes during training, whether it goes down or up), save the file «training_output.txt» to the host machine and run the following command:

python3 prepare.py -s /path/to/folder/with/output/file -tsy training_output.txt -d /path/to/output/folder

where

  • flag -tsy/--train_stat_yolo collects the training statistics for YOLO from the given folder and saves them to an Excel file.
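The same values can also be extracted without the script; a minimal sketch that pulls the iteration number and the average loss out of the redirected training output (based on the log format shown in «Repository preparation»):

import re

# matches e.g. " 7: 679.634888, 679.691040 avg loss, 0.000000 rate, ..."
ITER_LINE = re.compile(r"^\s*(\d+):\s*[\d.]+,\s*([\d.]+) avg loss")

iterations, avg_losses = [], []
with open("training_output.txt") as f:
    for line in f:
        match = ITER_LINE.match(line)
        if match:
            iterations.append(int(match.group(1)))
            avg_losses.append(float(match.group(2)))

print(avg_losses[:5])  # the values to plot against the iteration numbers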

3. With this information it is now possible to see which «weights» file is preferable to use, by plotting the graphs (using built-in Excel or OpenDocument functionality).

For avg.loss:

Models with different batch sizes

For mAP:

The same model for different mAP-assessments

Links to download

References

  1. You Only Look Once: Unified, Real-Time Object Detection
  2. YOLO9000: Better, Faster, Stronger
  3. YOLOv3: An Incremental Improvement
  4. YOLOv4: Optimal Speed and Accuracy of Object Detection
  5. Darknet: Open Source Neural Networks in C
  6. DarkNet framework on GitHub (original)
  7. DarkNet framework on GitHub (fork by AlexeyAB)
  8. An Effective Way of Managing Files on Google Colab
  9. mAP (mean Average Precision) for Object Detection
  10. Intuition behind Average Precision and MAP
  11. Why we use mAp score for evaluate object detectors in deep learning?
  12. Measuring Object Detection models — mAP — What is Mean Average Precision?
  13. What do we learn from single shot object detectors (SSD, YOLOv3), FPN & Focal loss (RetinaNet)?
  14. YOLO object detection with OpenCV
  15. OpenCV ‘dnn’ with NVIDIA GPUs: 1549% faster YOLO, SSD, and Mask R-CNN
  16. Faster video file FPS with cv2.VideoCapture and OpenCV
