Object detection with YOLO

Elenche Zététique
Published in Analytics Vidhya · 19 min read · Aug 3, 2020

The following post shows how to train object detection models based on the YOLO architecture (links to the research articles on this topic are listed in «References» below), how to collect mAP and average-loss statistics in Google Colab, and how to test the trained models using custom Python scripts.

Repository preparation

1. Clone the repository (DarkNet framework):

!git clone https://github.com/AlexeyAB/darknet.git

Note: bash commands are prefixed with an exclamation mark

«The exclamation mark is used for executing commands from the underlying operating system.»
Source - https://towardsdatascience.com/an-effective-way-of-managing-files-on-google-colab-ac37f792690b?gi=7e7ac2742a2d

2. Create folder «build-release»:

cd /content/drive/My\ Drive/darknet/
!mkdir build-release
cd /content/drive/My\ Drive/darknet/build-release/
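Note: The paths above assume that Google Drive is already mounted in the current Colab session (and that the repository lives there); a minimal sketch using the standard Colab helper:

# Mount Google Drive so that paths under /content/drive/My Drive/ become available
from google.colab import drive
drive.mount('/content/drive')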

Note: Optionally, before compilation the user may comment out the lines in the source code that print per-layer output to the terminal. By default the training process produces very verbose output, which needlessly loads operating memory just to display it and at some point can cause freezing or even sudden termination (experienced several times by the author). The excerpt below contains the output for only two iterations:

truncated
7: 679.634888, 679.691040 avg loss, 0.000000 rate, 0.385118 seconds, 224 images, 419.996108 hours left
Loaded: 2.210867 seconds - performance bottleneck on CPU or Disk HDD/SSD
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.605842, GIOU: 0.574251), Class: 0.500359, Obj: 0.501755, No Obj: 0.500504, .5R: 1.000000, .75R: 0.000000, count: 5, class_loss = 271.598236, iou_loss = 0.801147, total_loss = 272.399384
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500954, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.291870, iou_loss = 0.000000, total_loss = 1087.291870
total_bbox = 248, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.614312, GIOU: 0.605125), Class: 0.500208, Obj: 0.501565, No Obj: 0.500505, .5R: 0.750000, .75R: 0.000000, count: 4, class_loss = 271.537018, iou_loss = 0.536957, total_loss = 272.073975
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500934, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.293945, iou_loss = 0.000000, total_loss = 1087.293945
total_bbox = 252, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.640582, GIOU: 0.634447), Class: 0.499148, Obj: 0.500193, No Obj: 0.500503, .5R: 0.600000, .75R: 0.200000, count: 5, class_loss = 271.477295, iou_loss = 0.716827, total_loss = 272.194122
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500943, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.298706, iou_loss = 0.000000, total_loss = 1087.298706
total_bbox = 257, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.578494, GIOU: 0.560010), Class: 0.500962, Obj: 0.502538, No Obj: 0.500505, .5R: 0.800000, .75R: 0.000000, count: 5, class_loss = 271.405792, iou_loss = 0.654083, total_loss = 272.059875
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500935, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.299683, iou_loss = 0.000000, total_loss = 1087.299683
total_bbox = 262, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.635607, GIOU: 0.608577), Class: 0.500960, Obj: 0.502537, No Obj: 0.500506, .5R: 0.750000, .75R: 0.250000, count: 4, class_loss = 271.092285, iou_loss = 0.535187, total_loss = 271.627472
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500919, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.300903, iou_loss = 0.000000, total_loss = 1087.300903
total_bbox = 266, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.571984, GIOU: 0.536152), Class: 0.500218, Obj: 0.501519, No Obj: 0.500505, .5R: 0.500000, .75R: 0.000000, count: 4, class_loss = 271.347656, iou_loss = 0.689331, total_loss = 272.036987
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500935, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.295410, iou_loss = 0.000000, total_loss = 1087.295410
total_bbox = 270, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.674173, GIOU: 0.652403), Class: 0.500960, Obj: 0.502531, No Obj: 0.500503, .5R: 1.000000, .75R: 0.250000, count: 4, class_loss = 271.155334, iou_loss = 0.497925, total_loss = 271.653259
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500935, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.292358, iou_loss = 0.000000, total_loss = 1087.292358
total_bbox = 274, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.639989, GIOU: 0.600127), Class: 0.500965, Obj: 0.502533, No Obj: 0.500506, .5R: 0.750000, .75R: 0.500000, count: 4, class_loss = 270.776886, iou_loss = 0.559204, total_loss = 271.336090
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500933, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.296143, iou_loss = 0.000000, total_loss = 1087.296143
total_bbox = 278, rewritten_bbox = 0.000000 %

8: 679.609314, 679.682861 avg loss, 0.000000 rate, 0.393729 seconds, 256 images, 418.879547 hours left
Loaded: 1.570684 seconds - performance bottleneck on CPU or Disk HDD/SSD
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.471209, GIOU: 0.434264), Class: 0.500210, Obj: 0.501556, No Obj: 0.500503, .5R: 0.500000, .75R: 0.000000, count: 4, class_loss = 271.537018, iou_loss = 0.976410, total_loss = 272.513428
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500933, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.296387, iou_loss = 0.000000, total_loss = 1087.296387
total_bbox = 282, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.616591, GIOU: 0.601557), Class: 0.500208, Obj: 0.501559, No Obj: 0.500503, .5R: 1.000000, .75R: 0.250000, count: 4, class_loss = 271.410980, iou_loss = 0.328857, total_loss = 271.739838
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500925, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.294678, iou_loss = 0.000000, total_loss = 1087.294678
total_bbox = 286, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.560825, GIOU: 0.559239), Class: 0.499954, Obj: 0.501222, No Obj: 0.500504, .5R: 0.833333, .75R: 0.000000, count: 6, class_loss = 271.661560, iou_loss = 0.975983, total_loss = 272.637543
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500941, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.299683, iou_loss = 0.000000, total_loss = 1087.299683
total_bbox = 292, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.594659, GIOU: 0.592429), Class: 0.499454, Obj: 0.500582, No Obj: 0.500505, .5R: 0.750000, .75R: 0.250000, count: 4, class_loss = 271.161102, iou_loss = 0.735657, total_loss = 271.896759
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500947, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.291870, iou_loss = 0.000000, total_loss = 1087.291870
total_bbox = 296, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.546971, GIOU: 0.527006), Class: 0.500188, Obj: 0.501606, No Obj: 0.500506, .5R: 0.500000, .75R: 0.000000, count: 4, class_loss = 271.537018, iou_loss = 0.620148, total_loss = 272.157166
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500932, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.293457, iou_loss = 0.000000, total_loss = 1087.293457
total_bbox = 300, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.703408, GIOU: 0.697468), Class: 0.500990, Obj: 0.502102, No Obj: 0.500505, .5R: 1.000000, .75R: 0.200000, count: 5, class_loss = 271.091461, iou_loss = 0.492645, total_loss = 271.584106
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.326191, GIOU: 0.083068), Class: 0.502488, Obj: 0.502020, No Obj: 0.500926, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.353516, iou_loss = 0.583740, total_loss = 1087.937256
total_bbox = 306, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.652279, GIOU: 0.637695), Class: 0.500962, Obj: 0.502520, No Obj: 0.500505, .5R: 0.750000, .75R: 0.250000, count: 4, class_loss = 270.902740, iou_loss = 0.349670, total_loss = 271.252411
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500928, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.301758, iou_loss = 0.000000, total_loss = 1087.301758
total_bbox = 310, rewritten_bbox = 0.000000 %
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 16 Avg (IOU: 0.673402, GIOU: 0.651781), Class: 0.500969, Obj: 0.502514, No Obj: 0.500506, .5R: 0.750000, .75R: 0.500000, count: 4, class_loss = 271.029327, iou_loss = 0.494690, total_loss = 271.524017
v3 (mse loss, Normalizer: (iou: 0.75, cls: 1.00) Region 23 Avg (IOU: 0.000000, GIOU: 0.000000), Class: 0.000000, Obj: 0.000000, No Obj: 0.500919, .5R: 0.000000, .75R: 0.000000, count: 1, class_loss = 1087.297485, iou_loss = 0.000000, total_loss = 1087.297485
total_bbox = 314, rewritten_bbox = 0.000000 %
truncated

The amount of output per iteration depends on the number of subdivisions (set, together with batch, in the [net] section of the «*.cfg» file; it affects only memory utilization: the fewer subdivisions, the higher the memory workload, since more images have to be processed at the same time) and on the backbone convnet (DarkNet19, DarkNet53, DenseNet, ResNet, etc.).

Open «darknet/src/region_layer.c» and comment out the following line:

printf("Region Avg IOU: %f, Class: %f, Obj: %f, No Obj: %f, Avg Recall: %f, count: %d\n", avg_iou/count, avg_cat/class_count, avg_obj/count, avg_anyobj/(l.w*l.h*l.n*l.batch), recall/count, count);

Open «darknet/src/yolo_layer.c» and comment out the following line:

fprintf(stderr, "v3 (%s loss, Normalizer: (iou: %.2f, cls: %.2f) Region %d Avg (IOU: %f, GIOU: %f), Class: %f, Obj: %f, No Obj: %f, .5R: %f, .75R: %f, count: %d, class_loss = %f, iou_loss = %f, total_loss = %f \n",
(l.iou_loss == MSE ? "mse" : (l.iou_loss == GIOU ? "giou" : "iou")), l.iou_normalizer, l.cls_normalizer, state.index, tot_iou / count, tot_giou / count, avg_cat / class_count, avg_obj / count, avg_anyobj / (l.w*l.h*l.n*l.batch), recall / count, recall75 / count, count,
classification_loss, iou_loss, loss);

In some configurations the final layer is a «region» layer instead of a «yolo» layer (for example, for DenseNet), therefore one should comment out the lines in both files.

3. Compile repository:

!cmake ..
!make
!make install

Note #1: If the user changes the source code, the repository has to be recompiled afterwards.
Note #2:
Change the privileges for «darknet» at the start of each new session:

!chmod -R 777 darknet

Otherwise every command using darknet will fail with:

/bin/bash: ./darknet: Permission denied

Data preparation

Note: data preparation must be done locally, not in Colab

1. Get images

2. For labeling images clone LabelImg and open it:

git clone https://github.com/tzutalin/labelImg.git
cd labelImg

3. Install required packages (for Linux):

sudo apt install pyqt5-dev-tools
sudo pip3 install -r requirements/requirements-linux-python3.txt
make qt5py3

Note: For other OS follow the installation guide at https://github.com/tzutalin/labelImg

4. Run LabelImg:

python3 labelImg.py

5. Choose PASCAL VOC labeling format:

Note: It is better to choose the PASCAL VOC format, since it contains more information: the labeled images can later be used with Tensorflow, and the annotations can be converted to YOLO format at any time using a Python script (discussed below), but not the other way around, because the YOLO format contains only float values relative to the width and height of the image (between 0 and 1), so the absolute values cannot be recovered from the relative ones without additional information.

6. Label images:

Choose «Create RectBox»
Highlight the object and assign the name to it
Object is labeled now

7. Image and XML label file:

Before moving further, it is better to get acquainted with a couple of scripts that can be useful for preparing the data and testing the final models. The code is based on «YOLO object detection with OpenCV», «OpenCV ‘dnn’ with NVIDIA GPUs: 1549% faster YOLO, SSD, and Mask R-CNN» and «Faster video file FPS with cv2.VideoCapture and OpenCV» by Adrian Rosebrock.
Note: The following software will be further improved.

8. First clone the repository:

git clone https://ElencheZetetique@bitbucket.org/ElencheZetetique/rtod.git

The script «prepare.py» contains several functions such as greyscaling, resizing, assigning unique names to images, PASCAL VOC label modification and conversion to the YOLO format.

9. Convert to YOLO-format:

python3 prepare.py -s /path/to/Documents/Dataset/ -yf 25 -cpy /path/to/

where

  • flag -s/--src points at the source folder (in this case the dataset folder);
  • flag -yf/--yolo_format converts the labels to YOLO format and takes an integer from 5 to 35 that is used to split the dataset into training and validation subsets;
  • flag -cpy/--change_path_yolo sets the path prefix before the image names in «train.txt» and «valid.txt»

Use flag -h/--help to get more details

Results of the command:

  • Each image now has an additional txt-file in YOLO format:

YOLO label file has the following contents:

0 0.571111 0.445141 0.764444 0.495298

<object-class> <x_center> <y_center> <width> <height>

where:
  • <object-class> - integer object number from 0 to (classes-1)
  • <x_center> <y_center> <width> <height> - float values relative to the width and height of the image, in the range (0.0, 1.0];
    for example: <x_center> = <absolute_x> / <image_width> or <height> = <absolute_height> / <image_height>
  • attention: <x_center> <y_center> are the center of the rectangle (not the top-left corner)

Source - https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects (item #5)
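The same conversion that «prepare.py» performs can be sketched in a few lines of Python. The snippet below only illustrates the arithmetic described above (the function name and file names are hypothetical, not part of the repository):

import xml.etree.ElementTree as ET

def voc_to_yolo(xml_path, class_names):
    """Convert one PASCAL VOC annotation file into YOLO-format lines."""
    root = ET.parse(xml_path).getroot()
    img_w = float(root.find('size/width').text)
    img_h = float(root.find('size/height').text)
    lines = []
    for obj in root.findall('object'):
        class_id = class_names.index(obj.find('name').text)
        box = obj.find('bndbox')
        xmin, ymin = float(box.find('xmin').text), float(box.find('ymin').text)
        xmax, ymax = float(box.find('xmax').text), float(box.find('ymax').text)
        # YOLO stores the box center and size relative to the image dimensions
        x_center = (xmin + xmax) / 2.0 / img_w
        y_center = (ymin + ymax) / 2.0 / img_h
        width = (xmax - xmin) / img_w
        height = (ymax - ymin) / img_h
        lines.append(f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}")
    return lines

# hypothetical usage: print("\n".join(voc_to_yolo("image_0001.xml", ["Nutria"])))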

  • «cfg» folder:

«cfg» folder contains:

  • «obj.data» file in the folder «cfg» with the following contents:
classes=num_of_classes
train=/path/to/train.txt
valid=/path/to/valid.txt
names=/path/to/obj.names
backup=/path/to/backup

Change «/path/to/backup» to «/mydrive», which is a symbolic link to the path where all «weights» files (trained models saved after a certain number of iterations) shall be stored during training, for example:

!ln -s "/content/drive/My Drive/YOLO/backup" /mydrive

Note #1: The link is used to avoid errors caused by the folder name «My Drive» (it contains a space and cannot be renamed).
Note #2: Do not use Colab's folder «sample_data» for saving weights: at every new session its whole contents are wiped out and replaced with the default example files, so you will lose all your results once the current session is over.

  • «obj.names» file with the list of all classes the model shall be trained on. For example, to train a model on a dataset with all COCO objects:
person
bicycle
car
motorbike
aeroplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
sofa
pottedplant
bed
diningtable
toilet
tvmonitor
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair drier
toothbrush

Downloaded from https://github.com/AlexeyAB/darknet

  • «train.txt» and «valid.txt» with the lists of paths to the training and validation images, respectively.

10. Get «*.cfg» from https://github.com/AlexeyAB/darknet/tree/master/cfg and modify it:

change [filters=255] to filters=(classes + 5)x3 in the 3 [convolutional] before each [yolo] layer, keep in mind that it only has to be the last [convolutional] before each of the [yolo] layers.
Source - https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects (item #1)

Note: the «*.cfg» file must correspond to the pretrained «weights» file, otherwise the execution fails. The compatibility list is shown in https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects (item #7)
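As a quick sanity check of the formula quoted above, the required filters value can be computed directly; a minimal sketch (the helper name is only for illustration):

def yolo_conv_filters(num_classes: int) -> int:
    # filters in the last [convolutional] before each [yolo] layer
    return (num_classes + 5) * 3

print(yolo_conv_filters(80))  # 255 - the value found in the default COCO configs
print(yolo_conv_filters(1))   # 18 - e.g. a single-class detector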

Training

1. Now one can start training:

!./darknet detector train /content/sample_data/obj.data /content/sample_data/yolov3-tiny.cfg /content/drive/My\ Drive/pretrained/darknet19_448.conv.23 -dont_show

Note #1: Check carefully the paths to all files, otherwise the execution fails (especially the image paths specified in «train.txt» and «valid.txt»).
Note #2: The flag -dont_show is obligatory in Colab, otherwise the execution is terminated at the very beginning:

truncated
If error occurs - run training with flag: -dont_show
Unable to init server: Could not connect: Connection refused

(chart_yolov3-tiny.png:2780): Gtk-WARNING **: 20:15:19.955: cannot open display:

(to disable Loss-Window use darknet.exe detector train data/obj.data yolo-obj.cfg yolov4.conv.137 -dont_show, if you train on computer without monitor like a cloud Amazon EC2)
Source - https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects (item #8)

Loss window (downloaded from https://github.com/AlexeyAB/darknet, source: https://camo.githubusercontent.com/d60dfdba6c007a5df888747c2c03664c91c12c1e/68747470733a2f2f6873746f2e6f72672f776562742f79642f766c2f61672f7964766c616775746f66327a636e6a6f64737467726f656e3861632e6a706567)

In order to save the output of the training process, which can be used for further analysis (explained in «Analyzing the output»), one can redirect it to a text file:

!./darknet detector train /content/sample_data/obj.data /content/sample_data/yolov3-tiny.cfg /content/drive/My\ Drive/pretrained/darknet19_448.conv.23 -dont_show > /content/drive/My\ Drive/backup/training_output.txt

2. Find the results in the backup folder linked to /mydrive.

Now one can use the trained model together with «obj.names» and the «*.cfg» file (see details in «Testing»).

Testing

Now it is time to test the trained model. Download the «weights» file you prefer to use for object detection, save it on the host machine and put it into one folder together with «obj.names» and the «*.cfg» file that was used to train the model.
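The test script in the repository builds on OpenCV's dnn module (see the Rosebrock articles in «References»). For orientation only, a minimal self-contained sketch of running such a model on a single image is shown below; the file names, input size and thresholds are illustrative assumptions, not the repository's actual code:

import cv2
import numpy as np

# Hypothetical file names - use the files from your own model folder
net = cv2.dnn.readNetFromDarknet("yolov3-tiny.cfg", "yolov3-tiny_final.weights")
with open("obj.names") as f:
    class_names = [line.strip() for line in f if line.strip()]

image = cv2.imread("test.jpg")
height, width = image.shape[:2]

# Darknet networks expect a normalized, resized RGB blob
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1] for i in np.array(net.getUnconnectedOutLayers()).flatten()]
detections = np.vstack(net.forward(output_layers))

boxes, confidences, class_ids = [], [], []
for det in detections:
    scores = det[5:]
    class_id = int(np.argmax(scores))
    confidence = float(scores[class_id])
    if confidence > 0.5:
        # each row holds center-x, center-y, box-width, box-height relative to the image
        cx, cy, bw, bh = det[:4] * np.array([width, height, width, height])
        boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
        confidences.append(confidence)
        class_ids.append(class_id)

# Non-maximum suppression removes overlapping boxes of the same object
for i in np.array(cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)).flatten():
    x, y, bw, bh = boxes[i]
    label = f"{class_names[class_ids[i]]}: {confidences[i]:.2f}"
    cv2.rectangle(image, (x, y), (x + bw, y + bh), (0, 255, 0), 2)
    cv2.putText(image, label, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

cv2.imwrite("result.jpg", image)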

1. Run the command for videocapture:

python3 test.py -d /path/to/Model -vc 0 -sf

where

  • flag -vc/--video_cap: process video from a camera; set the camera number (in this case '0')
  • flag -d/--darknet: path to the darknet(-based) model directory (must contain *.cfg, *.names (the list of objects to detect) and *.weights, each exactly once)
  • flag -sf/--save_frame: if set, save the frames with detected objects

Call -h/--help for details

One can find the frames with detected objects in the folder «Results». Each so-called session has a timestamp that corresponds to the time when the command was run:

Example of detection:

2. For inference on images, put the images into one folder and run:

python3 test.py -d /path/to/Model -is /path/to/folder/with/images

where

  • flag -is/--image_src: path to the folder with image(s)

The progress is shown in the terminal:

After the execution is over, all processed images can be found in the folder «Results».

Example of image with detected objects:

3. For video:

python3 test.py -d /path/to/Model -vs /path/to/video

where

  • flag -vs/--video_src: path to the video file

The progress is shown in the terminal:

Example of video with detected objects

After the execution is over, the processed video can be found in the folder «Results».

Analyzing the output

1. mAP

If you are unfamiliar with the mAP metric, see the corresponding articles in «References».
In order to get mAP for all saved weights, run the following bash script (download the script):

!bash /content/drive/My\ Drive/for_mAP/get_mAP.sh

The contents:

#!/usr/bin/bash
cd /content/drive/My\ Drive/darknet/build-release
for i in {1000..LAST_WEIGHT..1000}
do
./darknet detector map /content/sample_data/obj.data /content/drive/My\ Drive/for_mAP/yolov3.cfg /content/drive/My\ Drive/for_mAP/yolov3_$i.weights -points 0 >> /content/drive/My\ Drive/for_mAP/output$i.txt
./darknet detector map /content/sample_data/obj.data /content/drive/My\ Drive/for_mAP/yolov3.cfg /content/drive/My\ Drive/for_mAP/yolov3_$i.weights -iou_thresh 0.75 -points 0 >> /content/drive/My\ Drive/for_mAP/output$i.txt
./darknet detector map /content/sample_data/obj.data /content/drive/My\ Drive/for_mAP/yolov3.cfg /content/drive/My\ Drive/for_mAP/yolov3_$i.weights -points 11 >> /content/drive/My\ Drive/for_mAP/output$i.txt
./darknet detector map /content/sample_data/obj.data /content/drive/My\ Drive/for_mAP/yolov3.cfg /content/drive/My\ Drive/for_mAP/yolov3_$i.weights -iou_thresh 0.75 -points 11 >> /content/drive/My\ Drive/for_mAP/output$i.txt
./darknet detector map /content/sample_data/obj.data /content/drive/My\ Drive/for_mAP/yolov3.cfg /content/drive/My\ Drive/for_mAP/yolov3_$i.weights -points 101 >> /content/drive/My\ Drive/for_mAP/output$i.txt
./darknet detector map /content/sample_data/obj.data /content/drive/My\ Drive/for_mAP/yolov3.cfg /content/drive/My\ Drive/for_mAP/yolov3_$i.weights -iou_thresh 0.75 -points 101 >> /content/drive/My\ Drive/for_mAP/output$i.txt
done

where

  • instead of LAST_WEIGHT type in the iteration number of the last saved weights file

The script computes mAP at 0.5 IoU (default, no flag) and 0.75 IoU (-iou_thresh 0.75) according to the PASCAL VOC 2007 (-points 11), PASCAL VOC 2010-2012 (-points 0) and MS COCO (-points 101) standards.

The script produces one «output*.txt» file per weights file. Each file contains the following information:

net.optimized_memory = 0 
mini_batch = 1, batch = 16, time_steps = 1, train = 0
seen 64, trained: 64 K-images (1 Kilo-batches_64)calculation mAP (mean average precision)...detections_count = 136, unique_truth_count = 109
rank = 0 of ranks = 136
rank = 100 of ranks = 136
class_id = 0, name = Nutria, ap = 91.94% (TP = 99, FP = 4)
for conf_thresh = 0.25, precision = 0.96, recall = 0.91, F1-score = 0.93
for conf_thresh = 0.25, TP = 99, FP = 4, FN = 10, average IoU = 75.43 %
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.919386, or 91.94 %
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
net.optimized_memory = 0
mini_batch = 1, batch = 16, time_steps = 1, train = 0
seen 64, trained: 64 K-images (1 Kilo-batches_64)calculation mAP (mean average precision)...detections_count = 136, unique_truth_count = 109
rank = 0 of ranks = 136
rank = 100 of ranks = 136
class_id = 0, name = Nutria, ap = 49.81% (TP = 67, FP = 36)
for conf_thresh = 0.25, precision = 0.65, recall = 0.61, F1-score = 0.63
for conf_thresh = 0.25, TP = 67, FP = 36, FN = 42, average IoU = 54.34 %
IoU threshold = 75 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.75) = 0.498097, or 49.81 %
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
net.optimized_memory = 0
mini_batch = 1, batch = 16, time_steps = 1, train = 0
seen 64, trained: 64 K-images (1 Kilo-batches_64)calculation mAP (mean average precision)...detections_count = 136, unique_truth_count = 109
rank = 0 of ranks = 136
rank = 100 of ranks = 136
class_id = 0, name = Nutria, ap = 90.26% (TP = 99, FP = 4)
for conf_thresh = 0.25, precision = 0.96, recall = 0.91, F1-score = 0.93
for conf_thresh = 0.25, TP = 99, FP = 4, FN = 10, average IoU = 75.43 %
IoU threshold = 50 %, used 11 Recall-points
mean average precision (mAP@0.50) = 0.902615, or 90.26 %
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
net.optimized_memory = 0
mini_batch = 1, batch = 16, time_steps = 1, train = 0
seen 64, trained: 64 K-images (1 Kilo-batches_64)calculation mAP (mean average precision)...detections_count = 136, unique_truth_count = 109
rank = 0 of ranks = 136
rank = 100 of ranks = 136
class_id = 0, name = Nutria, ap = 49.67% (TP = 67, FP = 36)
for conf_thresh = 0.25, precision = 0.65, recall = 0.61, F1-score = 0.63
for conf_thresh = 0.25, TP = 67, FP = 36, FN = 42, average IoU = 54.34 %
IoU threshold = 75 %, used 11 Recall-points
mean average precision (mAP@0.75) = 0.496749, or 49.67 %
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
net.optimized_memory = 0
mini_batch = 1, batch = 16, time_steps = 1, train = 0
seen 64, trained: 64 K-images (1 Kilo-batches_64)calculation mAP (mean average precision)...detections_count = 136, unique_truth_count = 109
rank = 0 of ranks = 136
rank = 100 of ranks = 136
class_id = 0, name = Nutria, ap = 92.34% (TP = 99, FP = 4)
for conf_thresh = 0.25, precision = 0.96, recall = 0.91, F1-score = 0.93
for conf_thresh = 0.25, TP = 99, FP = 4, FN = 10, average IoU = 75.43 %
IoU threshold = 50 %, used 101 Recall-points
mean average precision (mAP@0.50) = 0.923423, or 92.34 %
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
net.optimized_memory = 0
mini_batch = 1, batch = 16, time_steps = 1, train = 0
seen 64, trained: 64 K-images (1 Kilo-batches_64)calculation mAP (mean average precision)...detections_count = 136, unique_truth_count = 109
rank = 0 of ranks = 136
rank = 100 of ranks = 136
class_id = 0, name = Nutria, ap = 49.90% (TP = 67, FP = 36)
for conf_thresh = 0.25, precision = 0.65, recall = 0.61, F1-score = 0.63
for conf_thresh = 0.25, TP = 67, FP = 36, FN = 42, average IoU = 54.34 %
IoU threshold = 75 %, used 101 Recall-points
mean average precision (mAP@0.75) = 0.498992, or 49.90 %
Set -points flag:
`-points 101` for MS COCO
`-points 11` for PascalVOC 2007 (uncomment `difficult` in voc.data)
`-points 0` (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset

Save all output files to the host machine, put them into one folder and run the Python script:

python3 prepare.py -msy -s /path/to/folder_with_mAP_files -d /path/to/output/folder

where

  • flag -msy/--mAP_stat_yolo collects the mAP statistics for YOLO from the given folder and saves them to an Excel file.

Output:
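If one prefers to check the numbers without Excel, the same values can be pulled out of the output files directly; a minimal sketch (the regular expression matches the «mean average precision» lines shown above; each file contains six such lines, one per IoU threshold and -points combination):

import glob
import re

# matches e.g. "mean average precision (mAP@0.50) = 0.919386, or 91.94 %"
MAP_LINE = re.compile(r"mean average precision \(mAP@([\d.]+)\) = ([\d.]+)")

for path in sorted(glob.glob("/path/to/folder_with_mAP_files/output*.txt")):
    with open(path) as f:
        for iou_threshold, value in MAP_LINE.findall(f.read()):
            print(f"{path}: mAP@{iou_threshold} = {float(value):.4f}")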

2. Avg. loss

In order to see the average loss dynamics (how it changes during training, whether it goes down or up), save the file «training_output.txt» to the host machine and run the following command:

python3 prepare.py -s /path/to/folder/with/output/file -tsy training_output.txt -d /path/to/output/folder

where

  • flag -tsy/--train_stat_yolo collects the training statistics for YOLO from the given folder and saves them to an Excel file.
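The same values can also be extracted without the script; a minimal sketch that pulls the iteration number and the average loss out of the redirected training output (based on the log format shown in «Repository preparation»):

import re

# matches e.g. " 7: 679.634888, 679.691040 avg loss, 0.000000 rate, ..."
ITER_LINE = re.compile(r"^\s*(\d+):\s*[\d.]+,\s*([\d.]+) avg loss")

iterations, avg_losses = [], []
with open("training_output.txt") as f:
    for line in f:
        match = ITER_LINE.match(line)
        if match:
            iterations.append(int(match.group(1)))
            avg_losses.append(float(match.group(2)))

print(avg_losses[:5])  # the values to plot against the iteration numbers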

3. With this information it is now possible to see which «weights» file is preferable to use, by plotting the graphs (using built-in Excel or OpenDocument functionality).

For avg.loss:

Models with different batch sizes

For mAP:

The same model for different mAP-assessments

Links to download

References

  1. You Only Look Once: Unified, Real-Time Object Detection
  2. YOLO9000: Better, Faster, Stronger
  3. YOLOv3: An Incremental Improvement
  4. YOLOv4: Optimal Speed and Accuracy of Object Detection
  5. Darknet: Open Source Neural Networks in C
  6. DarkNet framework on GitHub (original)
  7. DarkNet framework on GitHub (fork by AlexeyAB)
  8. An Effective Way of Managing Files on Google Colab
  9. mAP (mean Average Precision) for Object Detection
  10. Intuition behind Average Precision and MAP
  11. Why we use mAp score for evaluate object detectors in deep learning?
  12. Measuring Object Detection models — mAP — What is Mean Average Precision?
  13. What do we learn from single shot object detectors (SSD, YOLOv3), FPN & Focal loss (RetinaNet)?
  14. YOLO object detection with OpenCV
  15. OpenCV ‘dnn’ with NVIDIA GPUs: 1549% faster YOLO, SSD, and Mask R-CNN
  16. Faster video file FPS with cv2.VideoCapture and OpenCV
