Training YOLOv3 Convolutional Neural Networks Using darknet

Tom Lever
38 min read · Nov 8, 2019

1) Purpose of this Blog Post

Hi. My name is Tom Lever. I’ve written this blog post, and recorded a corresponding video series starting with

https://www.youtube.com/watch?v=-2C3E_U4C-k&list=PLyZBahNf39pfdOIHMEO74kNUna6EY82Dk&index=2

to offer you a guide to getting started using Google Colaboratory to train a YOLOv3 convolutional neural network on the 2014 COCO dataset, the Pascal VOC 2007 and 2012 datasets, or a custom dataset, so as to detect objects in images, videos, and camera streams. More technically, I’m offering you a guide to getting started using a Python3 kernel on a Linux virtual machine hosted by a Google server to compile AlexeyAB’s deep-learning framework, called darknet, and to run a compiled executable file, also called darknet. In running darknet, the kernel forward-propagates a matrix representing training images through a YOLOv3 convolutional neural network; compares the resulting matrix to a matrix that encodes whether objects are present in different regions of each image, where they are, and of what classes they are; and, based on that comparison, updates a weights file that encodes the value of each cell in each filter in the network. The key take-away from training a YOLOv3 convolutional neural network is a weights file, which I can use to detect objects in images, video frames, or camera-stream frames by forward-propagating those images through the trained network.

Figure 1: A screenshot of “Training YOLOv3 CNN’s Using darknet: Part 1 of 14: Purpose of This Series” at https://www.youtube.com/watch?v=-2C3E_U4C-k&list=PLyZBahNf39pfdOIHMEO74kNUna6EY82Dk&index=2.

2) Jupyter Notebook and Google Colaboratory

Jupyter Notebook is a web-browser-based application installed on my Windows-10 PC that allows me to create documents that contain live code, equations, visualizations, and narrative text. The documents that I create using Jupyter Notebook are also called jupyter notebooks. While using Jupyter Notebook, I can employ the Python3 kernel to execute Python code, execute Command Prompt commands, access local files, access local hardware (like my webcam), and access folders and files on remote servers.

Google Colaboratory is a Jupyter-like application installed on a Linux virtual machine hosted by a Google server. The documents that I create using Google Colaboratory are also jupyter notebooks. While using Google Colaboratory, I can employ the Python3 kernel to execute Python code, execute Linux Terminal commands, access files local to my virtual machine, access information received by the browser on my local Windows-10 PC (like requested webcam images), and access folders and files on remote servers.

Figure 2: A screenshot of a presentation of a Jupyter-Notebook jupyter notebook and a Google-Colaboratory jupyter notebook.

3) Creating a Jupyter Notebook Using Google Colaboratory

To create a jupyter notebook using Google Colaboratory, I open Google Chrome, enter “https://drive.google.com” in Chrome’s OmniBar, press “enter”, scroll down a little bit, left click “Go to Google Drive”, enter my email address, press “enter”, enter my password, press “enter”, press “Save” when prompted to save my password, right click within the window showing all the folders and files in the root folder of my Google Drive, left click “New folder”, enter “darknet” in the dialog box that appears, press “enter”, double left click the “darknet” folder, right click within the window showing all the folders and files in my darknet folder, place my mouse cursor over “More”, left click “Google Colaboratory”, press “Stay Opted-In” when prompted to continue using a version of Google Colaboratory new as of October 30, 2019, left click “File”, left click “Rename…”, enter “YOLOv3_Training.ipynb” in the jupyter notebook’s title bar, and press “enter”.

Figure 3: Creating a “YOLOv3_Training.ipynb”, saved in my Google Drive, using Google Colaboratory.

4) Setting Up YOLOv3_Training.ipynb

To set up our jupyter notebook YOLOv3_Training.ipynb, I left click “Edit”, left click “Notebook settings”, left click the down-pointing triangle below “Hardware accelerator”, left click “GPU”, and left click “Save”. I left click the right-pointing chevron to the left of the first code cell and left click “Files”.

Figure 4: Enabling use of the graphics processing units on the Google server hosting our jupyter notebook.

5) Navigating My Virtual Machine

When I employ the Python3 kernel in accessing files, the kernel assumes by default that the files are stored in the “ROOT/content” directory, where ROOT is the root directory of my virtual machine. To navigate to the root directory of my virtual machine, type “%cd ..” in the first code cell and execute the command by left clicking the play button or pressing and holding “shift” and pressing “enter”. To return to the “ROOT/content” directory, execute the command “%cd content”. To view the contents of the “ROOT/content” directory, execute the command “!ls”.
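Collected in one place, these navigation commands look like this in code cells:

%cd ..
%cd content
!ls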

Figure 5: A screenshot presenting Linux commands for changing directory and listing files in directories.

6) Copying the darknet Deep-Learning Framework

To copy darknet, AlexeyAB’s deep-learning framework, to the “ROOT/content” directory of our virtual machine, I execute the command “!git clone https://github.com/AlexeyAB/darknet”.

Figure 6: Copying the darknet deep-learning framework into the “ROOT/content” directory of our virtual machine.

7) Compiling darknet

Now I want to prepare to compile darknet’s source files into the executable file darknet in the “ROOT/content/darknet” directory. (The Visual Studio solution “darknet.sln” in the “ROOT/content/darknet/build/darknet” directory serves the same purpose on Windows; on our Linux virtual machine we build with the Makefile instead.) To do this, I navigate to the “ROOT/content/darknet” directory by executing the command “%cd darknet”. I’m now in the “ROOT/content/darknet” directory. I modify AlexeyAB’s Makefile in this directory to configure the darknet binary to perform matrix computations using Google’s graphics processing units, to use NVIDIA Corporation’s CUDA Deep Neural Network library, and to use the OpenCV team’s OpenCV library. I modify AlexeyAB’s Makefile by executing the commands

!sed -i 's/GPU=0/GPU=1/' Makefile
!sed -i 's/CUDNN=0/CUDNN=1/' Makefile
!sed -i 's/OPENCV=0/OPENCV=1/' Makefile

I’m still in the “ROOT/content/darknet” directory. To compile darknet’s source files into the darknet binary, I execute the simple command “!make”.

Figure 7: Compiling darknet’s source files into the executable binary file “ROOT/content/darknet/darknet”.

8) Preparing Data Files

Before running my darknet executable file, I want to prepare data files (like obj.data). Each data file will provide darknet with the number of classes of objects I have identified in the images in my training dataset, the location of a text-file listing of the paths to images in my training dataset, the location of a text-file listing of the paths to images in my validation / development dataset, the location of a text-file listing of the names of classes of objects, and the location of a backup folder in which darknet will save weights files as training progresses. I can later use these saved weights files to resume training or to detect objects.

Joseph Chet Redmon is the creator of the original darknet deep-learning framework. AlexeyAB offers an optimized version of Redmon’s framework. Instructions on how to use Redmon’s, and AlexeyAB’s, frameworks are available on Redmon’s YOLOv3 webpage. To navigate to Redmon’s YOLOv3 webpage, open a new tab in Chrome, type “https://pjreddie.com” in Chrome’s OmniBar, press “enter”, left click “darknet”, scroll down a little bit, and left click “YOLO: Real-Time Object Detection”.

8.1) Preparing coco.data

The data file corresponding to Joseph Chet Redmon’s second presented command to train a YOLOv3 convolutional neural network (“./darknet detector train cfg/coco.data cfg/yolov3.cfg darknet53.conv.74 -dont_show”), which corresponds to training a YOLOv3 convolutional neural network on the 2014 COCO dataset, is coco.data. One version of coco.data is located on our virtual machine in “ROOT/content/darknet/cfg/coco.data”. Execute the command “!head -6 cfg/coco.data” to display the first six lines of coco.data.

Figure 8.1.1: The first six lines of the default version of “ROOT/content/darknet/cfg/coco.data”.

I want to revise coco.data to reference a text-file listing on my virtual machine of paths to training images, a text-file listing on my virtual machine of paths to validation / development images, a text-file listing on my virtual machine of the names of classes of objects, and a folder on my virtual machine in which to save a weights file. I want coco.data to look like this:

classes = 80
train = data/coco/trainvalno5k.txt
valid = data/coco/5k.txt
#valid = data/coco_val_5k.list
names = data/coco.names
backup = backup

To get coco.data to reference files and a folder on my virtual machine, I execute the commands

!sed -i 's|train = /home/pjreddie/data/coco/trainvalno5k.txt|train = data/coco/trainvalno5k.txt|' cfg/coco.data
!sed -i 's|valid = coco_testdev|valid = data/coco/5k.txt|' cfg/coco.data
!sed -i 's|backup = /home/pjreddie/backup/|backup = backup|' cfg/coco.data

Please note at this point that Medium’s blog post creation interface does not allow me to present the above stream-editing commands with the proper number of whitespace characters: Please edit the stream-editing commands to replace strings as presented in Figure 8.1.1.

I will check that my version of coco.data references files and a folder on my virtual machine by executing the command “!head -6 cfg/coco.data”.

Figure 8.1.2: The first six lines of my revised version of “ROOT/content/darknet/cfg/coco.data”.

8.2) Preparing voc.data

The data file corresponding to locating objects associated with the Pascal VOC 2007 and 2012 datasets is voc.data. One version of voc.data is located on our virtual machine in “ROOT/content/darknet/cfg/voc.data”. Execute the command “!head -6 cfg/voc.data” to display the first six lines of voc.data.

Figure 8.2.1: The first six lines of the default version of “ROOT/content/darknet/cfg/voc.data”.

I want to revise voc.data to reference a text-file listing on my virtual machine of paths to training images, a text-file listing on my virtual machine of paths to validation / development images, a text-file listing on my virtual machine of the names of classes of objects, and a folder on my virtual machine in which to save a weights file. I want voc.data to look like this:

classes = 20
train = VOC_data_from_2007_to_2012/train.txt
valid = VOC_data_from_2007_to_2012/2007_test.txt
names = data/voc.names
backup = backup

To get voc.data to reference files and a folder on my virtual machine, I execute the commands

!sed -i 's|train = /home/pjreddie/data/voc/train.txt|train = VOC_data_from_2007_to_2012/train.txt|' cfg/voc.data
!sed -i 's|valid = /home/pjreddie/data/voc/2007_test.txt|valid = VOC_data_from_2007_to_2012/2007_test.txt|' cfg/voc.data
!sed -i 's|backup = /home/pjreddie/backup/|backup = backup|' cfg/voc.data

Please note at this point that Medium’s blog post creation interface does not allow me to present the above stream-editing commands with the proper number of whitespace characters: Please edit the stream-editing commands to replace strings as presented in Figure 8.2.1.

I will check that my version of voc.data references files and a folder on my virtual machine by executing the command “!head -6 cfg/voc.data”.

Figure 8.2.2: The first six lines of my revised version of “ROOT/content/darknet/cfg/voc.data”.

8.3) Preparing obj.data

To create a data file corresponding to locating objects associated with my own dataset of one hundred images of oak leaves from various species in Quercus, the genus of oak trees, on my Windows-10 PC, I minimize my jupyter notebook, press and hold “shift”, right click my desktop, left click “Open PowerShell window here”, reposition and resize my PowerShell window, write “Notepad”, press “enter”, re-position and resize my Notepad window, left click “File”, left click “Save As…”, delete the suggested file name, press the down-pointing chevron right of “Save as type:”, left click “All Files”, enter “obj.data” as file name, and left click “Save”. To fill in obj.data, I write:

classes = 1
train = Quercus_Dataset/train.txt
valid = Quercus_Dataset/validate.txt
names = data/obj.names
backup = backup

I will save obj.data by left clicking “File” and left clicking “Save”. I will return to view our jupyter notebook YOLOv3_Training.ipynb.

To copy obj.data from my desktop on my Windows-10 PC into the file “ROOT/content/darknet/cfg/obj.data”, I navigate to my Google Drive’s darknet folder, right click within the window showing all the folders and files in my darknet folder, left click “Upload files”, left click “obj.data”, and left click “Open”. I then navigate back to my jupyter notebook YOLOv3_Training.ipynb and navigate to the “ROOT/content” directory by executing the command “%cd ..”.

While in the “ROOT/content” directory, I mount my Google Drive as a virtual drive accessible by my virtual machine by executing the Python code

from google.colab import drive
drive.mount("/content/drive")

I left click the URL, left click my name, scroll down a little bit, left click “Allow”, left click the “Copy” button, navigate back to my jupyter notebook, right click in the text box under “Enter your authorization code:”, left click “Paste”, press “enter”, left click “Refresh”, and close the authentication window.

I’m still in the “ROOT/content” directory. To copy obj.data from my Google Drive’s darknet folder to the “ROOT/content/darknet/cfg/obj.data” file, I execute the command “!cp drive/My\ Drive/darknet/obj.data darknet/cfg/obj.data”. I then navigate back to the “ROOT/content/darknet” directory by executing the command “%cd darknet”.

Figure 8.3: Creating “obj.data” on the Desktop of my Windows-10 PC.

9) Preparing Configuration Files

I next want to prepare configuration files. Each configuration file will provide darknet with information on how to organize images into batches and subdivisions (also known as mini-batches). When I train darknet’s YOLOv3 convolutional neural network, darknet divides my training set into batches with 64 images each and divides each batch into 16 subdivisions with 4 images each. Darknet then forward-propagates each matrix representing a subdivision of images through the YOLOv3 convolutional neural network, captures the “losses” output by the eighty-second, ninety-fourth, and one-hundred-sixth layers of the network, averages these losses into a “cost”, and adjusts each value in each cell in each filter in the network based on the existing cell value, the learning rate, and the value of a function representing the derivative of the cost with respect to each cell value. When I employ darknet to identify objects in an image, darknet divides my set of 1 image into batches with 1 image each and divides each batch into 1 subdivision with 1 image.

9.1) Preparing yolov3.cfg

The configuration file corresponding to Joseph Chet Redmon’s second presented command to train a YOLOv3 convolutional neural network (“./darknet detector train cfg/coco.data cfg/yolov3.cfg darknet53.conv.74 -dont_show”), which corresponds to training a YOLOv3 convolutional neural network on the 2014 COCO dataset, is yolov3.cfg. One version of yolov3.cfg is located on our virtual machine in “ROOT/content/darknet/cfg/yolov3.cfg”. Execute the command “!head -7 cfg/yolov3.cfg” to display the first seven lines of yolov3.cfg. Note that these seven lines presently configure darknet to divide a test set of 1 image into batches with 1 image each and to divide each batch into 1 subdivision with 1 image.

Figure 9.1.1: The first seven lines of the default version of “ROOT/content/darknet/cfg/yolov3.cfg”.

I want to revise yolov3.cfg so as to configure the YOLOv3 convolutional neural network to divide the set of training images into batches with 64 images each and to divide each batch into subdivisions with 4 images each. I want yolov3.cfg to look like this:

[net]
# Testing
# batch=1
# subdivisions=1
# Training
batch=64
subdivisions=16

To do this, I will execute the commands

!sed -i 's/batch=1/# batch=1/' cfg/yolov3.cfg
!sed -i 's/subdivisions=1/# subdivisions=1/' cfg/yolov3.cfg
!sed -i 's/# batch=64/batch=64/' cfg/yolov3.cfg
!sed -i 's/# # subdivisions=16/subdivisions=16/' cfg/yolov3.cfg

Please note the form of the last stream-editing command: the second command’s pattern “subdivisions=1” also matched the substring “subdivisions=1” inside the commented-out line “# subdivisions=16”, turning that line into “# # subdivisions=16”. The last command therefore matches “# # subdivisions=16”, which was necessary for me to successfully revise yolov3.cfg in one shot.

I will check that my version of yolov3.cfg defines batch=64 and subdivisions=16 by executing the command “!head -7 cfg/yolov3.cfg”.

Figure 9.1.2: The first seven lines of my revised version of “ROOT/content/darknet/cfg/yolov3.cfg”.

9.2) Preparing yolov3-voc.cfg

The configuration file corresponding to training a YOLOv3 convolutional neural network on the Pascal VOC 2007 and 2012 datasets (“./darknet detector train cfg/voc.data cfg/yolov3-voc.cfg darknet53.conv.74 -dont_show”) is yolov3-voc.cfg. One version of yolov3-voc.cfg is located on our virtual machine in “ROOT/content/darknet/cfg/yolov3-voc.cfg”. Execute the command “!head -7 cfg/yolov3-voc.cfg” to display the first seven lines of yolov3-voc.cfg. Note that these seven lines presently configure darknet to divide a test set of 1 image into batches with 1 image each and to divide each batch into 1 subdivision with 1 image.

Figure 9.2.1: The first seven lines of the default version of “ROOT/content/darknet/cfg/yolov3-voc.cfg”.

I want to revise yolov3-voc.cfg so as to configure the YOLOv3 convolutional neural network to divide the set of training images into batches with 64 images each and to divide each batch into subdivisions with 4 images each. I want yolov3-voc.cfg to look like this:

[net]
# Testing
# batch=1
# subdivisions=1
# Training
batch=64
subdivisions=16

To do this, I will execute the commands

!sed -i 's/batch=1/# batch=1/' cfg/yolov3-voc.cfg
!sed -i 's/subdivisions=1/# subdivisions=1/' cfg/yolov3-voc.cfg
!sed -i 's/# batch=64/batch=64/' cfg/yolov3-voc.cfg
!sed -i 's/## subdivisions=16/subdivisions=16/' cfg/yolov3-voc.cfg

Please note the form of the last stream-editing command: in yolov3-voc.cfg the commented-out line is “#subdivisions=16” (without a space after the pound sign), so the second command turned that line into “## subdivisions=16”. The last command therefore matches “## subdivisions=16”, which was necessary for me to successfully revise yolov3-voc.cfg in one shot.

I will check that my version of yolov3-voc.cfg defines batch=64 and subdivisions=16 by executing the command “!head -7 cfg/yolov3-voc.cfg”.

Figure 9.2.2: The first seven lines of my revised version of “ROOT/content/darknet/cfg/yolov3-voc.cfg”.

9.3) Preparing yolov3-obj.cfg

To create a configuration file corresponding to locating objects associated with my own dataset of one hundred images of oak leaves from various species in Quercus, the genus of oak trees, I first change directory to “ROOT/content” by executing the command “%cd ..”. Then, I copy yolov3.cfg from “ROOT/content/darknet/cfg” to my Google Drive’s darknet folder by executing the command “!cp darknet/cfg/yolov3.cfg drive/My\ Drive/darknet/yolov3.cfg”. Then, I navigate to the darknet folder of my Google Drive, right click yolov3.cfg, left click “Download”, and left click “Save”. Then, I enter “Jupyter Notebook” in my search bar, left click “Jupyter Notebook (Anaconda 3)”, left click “Desktop”, left click the check box to the left of “yolov3.cfg”, left click “Rename”, enter “yolov3-obj.cfg”, and left click “Save”. Then I left click “yolov3-obj.cfg”.

In the configuration file “yolov3-obj.cfg”, to configure the YOLOv3 convolutional neural network to divide the set of training images into batches with 64 images each and to divide each batch into subdivisions with 4 images each, I comment out “batch=1” and “subdivisions=1” with pound signs, uncomment “batch=64” and “subdivisions=16”, change “max_batches = 500200” to “max_batches=[max(2000 * number of classes, 4000) written as an integer]”, change “steps=400160,450180” to “steps=[floor(0.8 * max_batches) written as an integer],[floor(0.9 * max_batches) written as an integer]”, replace “classes=80” in lines 610, 696, and 783 with “classes=[number of classes written as an integer]”, and replace “filters=255” in lines 603, 689, and 776 with “filters=[3 * (number of classes + 5) written as an integer]”. I left click “File” and left click “Save”.
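As a sanity check, the arithmetic behind those edits can be scripted. A minimal sketch, assuming one class (Quercus), as in my dataset:

number_of_classes = 1  # Quercus is my only class

max_batches = max(2000 * number_of_classes, 4000)
steps = (int(0.8 * max_batches), int(0.9 * max_batches))
filters = 3 * (number_of_classes + 5)

print("max_batches=" + str(max_batches))               # max_batches=4000
print("steps=" + str(steps[0]) + "," + str(steps[1]))  # steps=3200,3600
print("filters=" + str(filters))                       # filters=18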

To copy yolov3-obj.cfg from my desktop on my Windows-10 PC into the file “ROOT/content/darknet/cfg/yolov3-obj.cfg”, I navigate to my Google Drive’s darknet folder, right click within the window showing all the folders and files in my darknet folder, left click “Upload files”, left click “yolov3-obj.cfg”, and left click “Open”. To copy yolov3-obj.cfg from my Google Drive’s darknet folder to the “ROOT/content/darknet/cfg/yolov3-obj.cfg” file, I execute the command “!cp drive/My\ Drive/darknet/yolov3-obj.cfg darknet/cfg/yolov3-obj.cfg”. I then navigate back to the “ROOT/content/darknet” directory by executing the command “%cd darknet”.

Figure 9.3: Editing yolov3-obj.cfg on the Desktop of my Windows-10 PC in Jupyter Notebook.

10) Preparing Weights Files

I next want to prepare a weights file. The darknet binary will use a weights file to initialize the values in each cell in each filter in the YOLOv3 convolutional neural network when either training the network or forward-propagating an image or video frame through the network. Any weights file in an appropriate format can be used to initialize the weights of a darknet YOLOv3 convolutional neural network; darknet53.conv.74 is such an appropriately formatted weights file, and has the added benefit of containing convolutional weights pretrained on the millions of images in the ImageNet database, which makes it a good starting point for detecting objects similar to objects in ImageNet.

We’re still in the “ROOT/content/darknet” directory. To download darknet53.conv.74 to this directory, execute the command “!wget https://pjreddie.com/media/files/darknet53.conv.74”.

Figure 10: A screenshot of downloading the weights file https://pjreddie.com/media/files/darknet53.conv.74.

11) Preparing Names Files

I next want to prepare names files. Each names file will provide darknet with a listing of the names of classes of objects (like Quercus) that I labeled in images when I developed a training dataset (like our dataset of images of oak leaves).

11.1) Preparing coco.names

The names file corresponding to Joseph Chet Redmon’s second presented command to train a YOLOv3 convolutional neural network (“./darknet detector train cfg/coco.data cfg/yolov3.cfg darknet53.conv.74 -dont_show”), which corresponds to training a YOLOv3 convolutional neural network on the 2014 COCO dataset, is coco.names. One version of coco.names is located in “ROOT/content/darknet/data/coco.names”. coco.names does not need to be modified during either training or detecting. Execute the command “!head -5 data/coco.names” to display the first five lines of coco.names.

Figure 11.1: The first five lines of “ROOT/content/darknet/data/coco.names”.

11.2) Preparing voc.names

The names file corresponding to locating objects associated with the Pascal VOC 2007 and 2012 datasets is voc.names. One version of voc.names is located in “ROOT/content/darknet/data/voc.names”. voc.names does not need to be modified during either training or detecting. Execute the command “!head -5 data/voc.names” to display the first five lines of voc.names.

Figure 11.2: The first five lines of “ROOT/content/darknet/data/voc.names”.

11.3) Preparing obj.names

To create a names file corresponding to locating objects associated with my own dataset of one hundred images of oak leaves from various species in Quercus, the genus of oak trees, I close obj.data, navigate to my PowerShell window, write “Notepad”, press “enter”, reposition and resize my Notepad window, left click “File”, left click “Save As…”, delete the suggested file name, press the down-pointing chevron right of “Save as type:”, left click “All Files”, enter “obj.names” as file name, and left click “Save”. To fill in obj.names, I write the single line

Quercus

I will save obj.names by left clicking “File” and left clicking “Save”. I will return to view our jupyter notebook YOLOv3_Training.ipynb.

To copy obj.names from my desktop on my Windows-10 PC into the file “ROOT/content/darknet/data/obj.names”, I navigate to my Google Drive’s darknet folder, right click within the window showing all the folders and files in my darknet folder, left click “Upload files”, left click “obj.names”, and left click “Open”. I then navigate back to my jupyter notebook and navigate to the “ROOT/content” directory by executing the command “%cd ..”. To copy obj.names from my Google Drive’s darknet folder to the “ROOT/content/darknet/data/obj.names” file, I execute the command “!cp drive/My\ Drive/darknet/obj.names darknet/data/obj.names”. I then navigate back to the “ROOT/content/darknet” directory by executing the command “%cd darknet”.

Figure 11.3: Creating “obj.names” on the Desktop of my Windows-10 PC.

12) Setting Up Datasets of Images and Information

Let’s now set up images and image information that we will use to train YOLOv3 convolutional neural networks.

First, a few notes:

  • I have discovered through experience that training YOLOv3 CNN’s using darknet succeeds when each image file is a JPEG file.
  • I have discovered that training YOLOv3 CNN’s succeeds when each image file has a corresponding label text file with the same prefix. Each line in each label file corresponds to one bounding box around an object in the appropriate image. Each line contains five data points that are separated by whitespace characters and ends with a newline character. The five data points record the index of the class name of the object in a list of class names of objects associated with the dataset, the ratio of the horizontal center of the bounding box to the width of the image, the ratio of the vertical center of the bounding box to the height of the image, the ratio of the width of the bounding box to the width of the image, and the ratio of the height of the bounding box to the height of the image.
  • I have discovered that training YOLOv3 CNN’s using darknet and a custom dataset succeeds when training images and label files are in the same directory. However, I have discovered that the images and label files associated with the COCO dataset and images and label files associated with the Pascal VOC 2007 and 2012 datasets are organized differently.

12.1) Setting Up 2014 COCO Training and Validation Datasets and Their Images and Information

Let’s explore downloading, unzipping, and organizing images and label files associated with the COCO dataset, creating a text-file listing of paths to training images, and creating a text-file listing of paths to validation / development images.

We are still in the “ROOT/content/darknet” directory. I change directory into the “ROOT/content/darknet/data” folder by executing the command “%cd data”. I create the folder “ROOT/content/darknet/data/coco” by executing the command “!mkdir coco”. I change directory into the “ROOT/content/darknet/data/coco” folder by executing the command “%cd coco”. I create the folder “ROOT/content/darknet/data/coco/images” by executing the command “!mkdir images”. I change directory into the “ROOT/content/darknet/data/coco/images” folder by executing the command “%cd images”.

I download to our present folder the zipped folder of 2014 COCO training images by executing the command “!wget -c http://images.cocodataset.org/zips/train2014.zip”. The “-c” or “continue” option allows wget to resume downloading train2014.zip in a case of communication with cocodataset.org failing. I unzip train2014.zip into a folder of 2014 COCO training images using the command “!unzip -q train2014.zip”. The “-q” option suppresses printing image file names to this command’s code cell. I download to our present folder “ROOT/content/darknet/data/coco/images” the zipped folder of 2014 COCO validation / development images by executing the command “!wget -c http://images.cocodataset.org/zips/val2014.zip”. I unzip val2014.zip into a folder of 2014 COCO validation / development images using the command “!unzip -q val2014.zip”.

Figure 12.1.1: A screenshot of beginning downloading http://images.cocodataset.org/zips/train2014.zip. After running this command, I will unzip train2014.zip, download val2014.zip, and unzip val2014.zip.

After unpacking the 2014 COCO training and validation images to the train2014 and val2014 folders in “ROOT/content/darknet/data/coco/images”, I return to the “ROOT/content/darknet/data/coco” directory by executing the command “%cd ..”. I will download to our present folder “labels.tgz”, an archive of two folders of label files corresponding to the 2014 COCO training and validation images, by executing the command “!wget -c https://pjreddie.com/media/files/coco/labels.tgz”. I will unpack into “ROOT/content/darknet/data/coco” a labels folder containing two folders of label files corresponding to the 2014 COCO training and validation images by executing the command “!tar -xzf labels.tgz”. The “x”, “z”, and “f” options indicate that the “labels” folder in the archive should be extracted, the archive is in a “gzip” file format, and the archive is a file, as opposed to a tape device.

Figure 12.1.2: A screenshot of completing downloading https://pjreddie.com/media/files/coco/labels.tgz. After running this command, I will unpack labels.tgz.

I’m still in the “ROOT/content/darknet/data/coco” directory. I will create a list of the file names of all of the 2014 COCO training images, and then print the number of 2014 COCO training images and a short sub-list of file names of 2014 COCO training images, by executing Python code like the sketch below.
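A minimal sketch of this step (the variable names are illustrative):

import os

# Create a list of the file names of all 2014 COCO training images.
list_of_file_names_of_training_images = os.listdir("images/train2014")

# Print the number of training images and a short sub-list of file names.
print(len(list_of_file_names_of_training_images))
print(list_of_file_names_of_training_images[0:3])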

Per the below output, there are 82,783 2014 COCO training images.

Figure 12.1.3: Metadata for 2014 COCO training images.

I will create a list of the names of files containing labels for our 2014 COCO training images, and then print the number of such label files and a short sub-list of their names, by executing Python code like the sketch below.
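A minimal sketch, continuing from the previous cell (labels.tgz unpacked label files into “labels/train2014” and “labels/val2014”):

# Create a list of the names of files containing labels for training images.
list_of_names_of_label_files_for_training_images = os.listdir("labels/train2014")

print(len(list_of_names_of_label_files_for_training_images))
print(list_of_names_of_label_files_for_training_images[0:3])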

Per the below output, there are 82,081 files containing labels for 2014 COCO training images.

Figure 12.1.4: Metadata for label files for 2014 COCO training images.

Unfortunately, the number of files containing labels for training images (82,081) is significantly smaller than the number of training images (82,783). When we construct our text file listing of paths to training images, we will have to be careful to only include paths to training images with corresponding label files. More on this later.

Given that I am focused on efficiently training a YOLOv3 convolutional neural network on the 2014 COCO dataset, I will follow Joseph Chet Redmon’s example and will lump most of the COCO validation dataset into my actual training dataset, which will contain many more image and label files than the original 2014 COCO training dataset. To get started with the COCO validation dataset, I will create a list of the file names of all of the 2014 COCO validation images, and then print the number of 2014 COCO validation images and a short sub-list of their file names, by executing Python code like the sketch below.
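A minimal sketch, continuing from the previous cells:

# Create a list of the file names of all 2014 COCO validation images.
list_of_file_names_of_validation_images = os.listdir("images/val2014")

print(len(list_of_file_names_of_validation_images))
print(list_of_file_names_of_validation_images[0:3])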

Per the below output, there are 40,504 2014 COCO validation images.

Figure 12.1.5: Metadata for 2014 COCO validation images.

I will create a list of the names of files containing labels for our 2014 COCO validation images, and then print the number of such label files and a short sub-list of their names, by executing Python code like the sketch below.
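A minimal sketch, continuing from the previous cells:

# Create a list of the names of files containing labels for validation images.
list_of_names_of_label_files_for_validation_images = os.listdir("labels/val2014")

print(len(list_of_names_of_label_files_for_validation_images))
print(list_of_names_of_label_files_for_validation_images[0:3])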

Per the below output, there are 40,137 files containing labels for 2014 COCO validation images.

Figure 12.1.6: Metadata for label files for 2014 COCO validation images.

Unfortunately, the number of files containing labels for validation images (40,137) is significantly smaller than the number of validation images (40,504). When we construct our text file listing of paths to training images (which will include most of our validation images), we will have to be careful to only include paths to training images from the validation dataset with corresponding label files.

Next, I’m going to create storage for a complete, master list of file names of images in both the original 2014 COCO training dataset and the original 2014 COCO validation dataset that have corresponding label files.

Now, I want to let you know about an assumption I’m going to make. The assumption is that every label file has a corresponding image file. It turns out to be true, but it’s still an assumption at this point. Assuming that every label file has a corresponding image file, let’s fill in our master list of image file names by iterating through each list of label file names and adding the file name of the corresponding image to the master list of image file names. Just in case every label file does not have a corresponding image file, I will check to see if the image file name corresponding to each label file is actually in the appropriate list of image file names.

I do all of this, and check to see if every label file actually has a corresponding image file, by executing Python code like the sketch below. The answer is, “Yes”.
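A minimal sketch, continuing from the previous cells and assuming each label file like “COCO_train2014_000000000009.txt” pairs with an image file “COCO_train2014_000000000009.jpg”:

# Storage for a master list of image file names that have corresponding label files.
master_list_of_image_file_names = []

# Sets make the membership checks fast.
set_of_training_image_file_names = set(list_of_file_names_of_training_images)
set_of_validation_image_file_names = set(list_of_file_names_of_validation_images)

every_label_file_has_a_corresponding_image_file = True

for label_file_name in list_of_names_of_label_files_for_training_images:
    image_file_name = label_file_name.replace(".txt", ".jpg")
    if image_file_name in set_of_training_image_file_names:
        master_list_of_image_file_names.append("train2014/" + image_file_name)
    else:
        every_label_file_has_a_corresponding_image_file = False

for label_file_name in list_of_names_of_label_files_for_validation_images:
    image_file_name = label_file_name.replace(".txt", ".jpg")
    if image_file_name in set_of_validation_image_file_names:
        master_list_of_image_file_names.append("val2014/" + image_file_name)
    else:
        every_label_file_has_a_corresponding_image_file = False

print(every_label_file_has_a_corresponding_image_file)
print(len(master_list_of_image_file_names))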

In summary, we have a master list of 2014 COCO image file names where every image exists in either train2014 or val2014 and every image has a corresponding label file.

Next, as is common practice in training neural networks, I will randomize our master list of image file names to prepare to divvy up paths to image files into a text file listing of paths of images we actually want to use to train a YOLOv3 convolutional neural network on the 2014 COCO dataset, and a text file listing of paths to extra images that we could hypothetically use for validating how well our YOLOv3 convolutional neural network trained and/or how useful we can expect our network to be. I will randomize our master list of image file names by executing the Python code
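A minimal sketch (the seed is an illustrative choice that makes the shuffle reproducible):

import random

random.seed(0)  # illustrative seed
random.shuffle(master_list_of_image_file_names)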

Finally, I will create our text file listing of paths to images we actually want to use to train a YOLOv3 convolutional neural network on the 2014 COCO dataset, and our text file listing of paths to images we actually want to use for validation, by executing the Python code
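A minimal sketch, assuming the listing names that our revised coco.data references and a hold-out of 5,000 validation images:

number_of_validation_images = 5000
path_prefix = "/content/darknet/data/coco/images/"

# Write absolute paths to actual training images to trainvalno5k.txt.
with open("trainvalno5k.txt", "w") as listing:
    for image_file_name in master_list_of_image_file_names[number_of_validation_images:]:
        listing.write(path_prefix + image_file_name + "\n")

# Write absolute paths to actual validation images to 5k.txt.
with open("5k.txt", "w") as listing:
    for image_file_name in master_list_of_image_file_names[0:number_of_validation_images]:
        listing.write(path_prefix + image_file_name + "\n")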

To wrap up, I will check to see if our text file listing of absolute paths to training images we actually want to use to train a YOLOv3 convolutional neural network was successfully written, and will check to see if our text file listing of absolute paths to actual validation images was successfully written by executing the Python code
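A minimal sketch of the check:

with open("trainvalno5k.txt") as listing:
    paths_to_actual_training_images = listing.read().splitlines()
with open("5k.txt") as listing:
    paths_to_actual_validation_images = listing.read().splitlines()

print(len(paths_to_actual_training_images))
print(len(paths_to_actual_validation_images))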

Per the below output, the number of absolute paths in our text file listing of paths to actual training images is 117,218, and the number of absolute paths in our text file listing of paths to actual validation images is 5,000. The sum is our total number of label files, or the number of images in the 2014 COCO training and validation datasets with associated label files.

Figure 12.1.7: Metadata for our text file listings of absolute paths to actual training and validation images.

Now, we are ready to execute a darknet binary configured to train a YOLOv3 convolutional neural network on the 2014 COCO dataset. Before we do, let’s set up the Pascal VOC 2007 and 2012 datasets and a text file listing of paths to training images associated with the Pascal VOC 2007 and 2012 datasets. Let’s reset back to our “ROOT/content/darknet” folder by executing “%cd ..” twice.

12.2) Setting Up a Pascal VOC 2007 and 2012 Training Dataset and Its Images and Information

Let’s explore downloading, unzipping, and organizing images and label files associated with the Pascal VOC 2007 and 2012 datasets and creating a text-file listing of paths to training images.

We are still in the “ROOT/content/darknet” directory. I create the folder “ROOT/content/darknet/VOC_data_from_2007_to_2012” by executing the command “!mkdir VOC_data_from_2007_to_2012”. I change directory into the “ROOT/content/darknet/VOC_data_from_2007_to_2012” folder by executing the command “%cd VOC_data_from_2007_to_2012”. I download to our present folder Tape ARchive files containing images and image information associated with the Pascal VOC 2012 training and validation datasets and images and image information associated with the Pascal VOC 2007 training and validation datasets by executing the commands “!wget https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar” and “!wget https://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar”.

Figure 12.2.1: A screenshot of completing downloading https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar. After this download completes, VOCtrainval_06-Nov-2007.tar will be downloaded automatically. I will then unpack these two archives.

I will unpack into “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit” (created by the following command) a “VOC2012” folder containing images and image-information associated with the Pascal VOC 2012 training and validation datasets by executing the command “!tar -xf VOCtrainval_11-May-2012.tar”. I will unpack into “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit” a “VOC2007” folder containing images and image-information associated with the Pascal VOC 2007 training and validation datasets by executing “!tar -xf VOCtrainval_06-Nov-2007.tar”.

Training and validation images associated with the Pascal VOC 2012 dataset live in “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit/VOC2012/JPEGImages”. Information regarding an image and objects and bounding boxes within that image is encoded in an XML file, with the same prefix as the image, that lives in “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit/VOC2012/Annotations”. Similarly, training and validation images associated with the Pascal VOC 2007 dataset live in “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit/VOC2007/JPEGImages”. Information regarding an image and objects and bounding boxes within that image is encoded in an XML file, with the same prefix as the image, that lives in “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit/VOC2007/Annotations”.

Given that the “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit” folder does not contain the label files for images that we will use to train a YOLOv3 convolutional neural network on the Pascal VOC 2007 and 2012 datasets, let’s write a Python script to create folders of label files, for all images associated with the Pascal 2007 and 2012 training and validation datasets, at “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit/VOC2012/labels” and “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit/VOC2007/labels”. Each image used for training will have an associated label file. Each line in each label file will encode information on one object / bounding box identified within the corresponding image. Each line will contain the index of the class name of the object in a list of class names associated with the Pascal VOC 2007 and 2012 datasets, a whitespace character, the ratio of the horizontal position of the center of the bounding box (in pixels from the upper left corner of the image) to the width of the image (in pixels), a whitespace character, the ratio of the vertical position of the center of the bounding box (in pixels from the upper left corner of the image) to the height of the image (in pixels), a whitespace character, the ratio of the width of the bounding box (in pixels) to the width of the image, a whitespace character, the ratio of the height of the bounding box (in pixels) to the height of the image, and a newline character.

In order to create folders of label files, let’s import the Python os library. In order to access image information encoded in XML files, let’s import the Python xml.etree.ElementTree library as ET. (The individual steps below are collected into one consolidated sketch at the end of this walkthrough.)

In order to enter indices representing class names into label files, let’s create a list of all twenty class names associated with the Pascal VOC 2007 and 2012 datasets.

To access each of four text file listings of paths to Pascal VOC 2007 and 2012 training and validation images, that we can use to find image-information files, that we can use to fill in label files, let’s define a list of tuples representing the Pascal VOC 2012 training dataset, the Pascal VOC 2012 validation dataset, the Pascal VOC 2007 training dataset, and the Pascal VOC 2007 validation dataset.

The four text file listings of paths to Pascal VOC 2007 and 2012 training and validation images represented by the four tuples are:

  • “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit/VOC2012/ImageSets/Main/train.txt”,
  • “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit/VOC2012/ImageSets/Main/val.txt”,
  • “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit/VOC2007/ImageSets/Main/train.txt”, and
  • “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit/VOC2007/ImageSets/Main/val.txt”.

Let’s actually create folders for our label files.

To prepare to create all of our label files, let’s first set up a for loop to allow us to use each of our four text files listings of Pascal VOC 2007 and 2012 training and validation images.

For each year and dataset, let’s create a list of image prefixes based on the text file listing “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit/VOC[year]/ImageSets/Main/[dataset].txt”.

To further prepare to create all of our label files, let’s set up a nested for loop to allow us to execute the same label-file creation code for each image prefix.

For each image prefix, let’s create and open the label file “ROOT/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit/VOC[year]/labels/[image_prefix].txt” for writing.

While our label file is open for writing, let’s open the corresponding image-information (annotation) file for reading.

While our annotation file is open for reading, let’s parse the annotation file into an XML tree.

Let’s save the root annotation object defined in the annotation file to root.

Let’s save root’s size object to size_branch.

Let’s save size_branch’s width and height objects to image_width and image_height.

For each object with tag “object” in root, let’s save this object to object_branch, then…

Let’s save this object’s name to object_class_name.

Let’s decide whether this object is a difficult object to identify as a member of the object’s class.

If the object’s class name is actually in the list of classes associated with the Pascal VOC 2007 and 2012 datasets and the object is not flagged as too difficult to identify as a member of the object’s class…

Let’s save the index of the object’s class name in our list of classes to index_of_object_class_name_in_list_of_classes.

Let’s calculate the ratio of the horizontal position of the center of our bounding box / object to the image width.

Let’s calculate the ratio of the vertical position of the center of our bounding box / object to the image height.

Let’s calculate the ratio of the bounding-box width to the image width.

Let’s calculate the ratio of the bounding-box height to the image height.

And finally, let’s write to our present label file a string containing:

  • The index of the object’s class name in our list of classes,
  • A whitespace character,
  • The ratio of the horizontal position of the center of the object’s bounding box to the image width,
  • A whitespace character,
  • The ratio of the vertical position of the center of the object’s bounding box to the image height,
  • A whitespace character,
  • The ratio of the width of the object’s bounding box to the image width,
  • A whitespace character,
  • The ratio of the height of the object’s bounding box to the image height, and
  • A newline character.
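Pulling the walkthrough above together, here is a minimal sketch of the label-file creation script, patterned after Joseph Chet Redmon’s voc_label.py. The variable names are illustrative, and I assume we are in the “ROOT/content/darknet/VOC_data_from_2007_to_2012” directory:

import os
import xml.etree.ElementTree as ET

# The twenty class names associated with the Pascal VOC 2007 and 2012 datasets.
list_of_classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car",
                   "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike",
                   "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]

# Tuples representing the four training and validation datasets.
list_of_tuples = [("2012", "train"), ("2012", "val"), ("2007", "train"), ("2007", "val")]

for year, dataset in list_of_tuples:
    # Create a folder for label files if it does not already exist.
    if not os.path.exists("VOCdevkit/VOC" + year + "/labels"):
        os.makedirs("VOCdevkit/VOC" + year + "/labels")
    # Create a list of image prefixes based on the appropriate text-file listing.
    with open("VOCdevkit/VOC" + year + "/ImageSets/Main/" + dataset + ".txt") as listing:
        list_of_image_prefixes = listing.read().split()
    for image_prefix in list_of_image_prefixes:
        label_file = open("VOCdevkit/VOC" + year + "/labels/" + image_prefix + ".txt", "w")
        annotation_file = open("VOCdevkit/VOC" + year + "/Annotations/" + image_prefix + ".xml")
        # Parse the annotation file into an XML tree and save its root.
        tree = ET.parse(annotation_file)
        root = tree.getroot()
        # Save the image's width and height.
        size_branch = root.find("size")
        image_width = int(size_branch.find("width").text)
        image_height = int(size_branch.find("height").text)
        for object_branch in root.iter("object"):
            object_class_name = object_branch.find("name").text
            difficult = int(object_branch.find("difficult").text)
            if object_class_name not in list_of_classes or difficult == 1:
                continue
            index_of_object_class_name = list_of_classes.index(object_class_name)
            bounding_box = object_branch.find("bndbox")
            xmin = float(bounding_box.find("xmin").text)
            xmax = float(bounding_box.find("xmax").text)
            ymin = float(bounding_box.find("ymin").text)
            ymax = float(bounding_box.find("ymax").text)
            # Calculate the four ratios described above.
            ratio_of_horizontal_center = ((xmin + xmax) / 2.0) / image_width
            ratio_of_vertical_center = ((ymin + ymax) / 2.0) / image_height
            ratio_of_box_width = (xmax - xmin) / image_width
            ratio_of_box_height = (ymax - ymin) / image_height
            label_file.write(str(index_of_object_class_name) + " " +
                             str(ratio_of_horizontal_center) + " " +
                             str(ratio_of_vertical_center) + " " +
                             str(ratio_of_box_width) + " " +
                             str(ratio_of_box_height) + "\n")
        annotation_file.close()
        label_file.close()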

Now that we’ve created all of our label files corresponding to all images in the Pascal VOC 2007 and 2012 training and validation datasets, let’s create a text file listing, called train.txt, of paths to images that we will actually use to train a YOLOv3 convolutional neural network on the Pascal VOC 2007 and 2012 datasets.

Let’s make our text file listing train.txt be the result of concatenating a text file listing of paths to all images in the 2007 training dataset, a text file listing of paths to all images in the 2007 validation dataset, a text file listing of paths to all images in the 2012 training dataset, and a text file listing of paths to all images in the 2012 validation dataset.

Let’s start a new code cell.

For each year (2007 or 2012) and dataset (train or val)…

Find again our list of image prefixes corresponding to this year and dataset.

Open for writing a text file listing of absolute paths to images associated with this year and dataset that lives in “ROOT/content/darknet/VOC_data_from_2007_to_2012”.

For each image prefix in our list of images prefixes…

Write to our open text file listing of absolute paths to images associated with the present year and dataset that lives in the “ROOT/content/darknet/VOC_data_from_2007_to_2012” folder the absolute path of the image with this prefix.
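A minimal sketch, continuing from the script above; the listing file names like “2012_train.txt” are my own choice:

for year, dataset in list_of_tuples:
    # Find again our list of image prefixes for this year and dataset.
    with open("VOCdevkit/VOC" + year + "/ImageSets/Main/" + dataset + ".txt") as listing:
        list_of_image_prefixes = listing.read().split()
    # Write absolute paths to the images associated with this year and dataset.
    with open(year + "_" + dataset + ".txt", "w") as listing_of_paths:
        for image_prefix in list_of_image_prefixes:
            listing_of_paths.write(
                "/content/darknet/VOC_data_from_2007_to_2012/VOCdevkit/VOC" + year +
                "/JPEGImages/" + image_prefix + ".jpg\n")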

Now that we have four text file listings of absolute paths to images associated with specific years and datasets, I execute the following command to concatenate these listings into one text file listing of paths to images that I actually want to use to train a YOLOv3 convolutional neural network on the Pascal VOC 2007 and 2012 datasets.
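Assuming the listing file names from the sketch above, the concatenation command would be:

!cat 2007_train.txt 2007_val.txt 2012_train.txt 2012_val.txt > train.txt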

To check that everything is ready for training, I will print the number of paths to images we actually want to use to train a YOLOv3 convolutional neural network on the Pascal VOC 2007 and 2012 datasets and a short sub-list of the paths to actual training images by executing the following Python code and the Linux command.
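A minimal sketch of the check (the Linux command might be “!head -3 train.txt”):

with open("train.txt") as listing:
    paths_to_actual_training_images = listing.read().splitlines()

print(len(paths_to_actual_training_images))
print(paths_to_actual_training_images[0:3])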

As you can see, we have 16,551 paths to actual training images. Now, we are ready to execute a darknet binary configured to train a YOLOv3 convolutional neural network on the Pascal VOC 2007 and 2012 datasets. Before we do, let’s set up a custom dataset and a text file listing of paths to training images associated with this custom dataset. Let’s reset back to our “ROOT/content/darknet” folder by executing “%cd ..” once.

Figure 12.2.2: Metadata for our text file listing of absolute paths to images we actually want to use to train a YOLOv3 convolutional neural network on the Pascal VOC 2007 and 2012 datasets.

12.3) Setting Up Custom Oak-Leaf Training and Validation Datasets and Their Images and Information

To extract the folder Quercus_Dataset, which holds training images of oak leaves and label files encoding the positions and sizes of bounding boxes in each image, from the zipped folder “Quercus_Dataset.zip” in my Google Drive’s darknet folder into our “ROOT/content/darknet” directory, I navigate to the “ROOT/content” directory by executing the command “%cd ..” and execute the Python code below.
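A minimal sketch, assuming Python’s zipfile library, that my Google Drive is still mounted at “/content/drive”, and that Quercus_Dataset.zip contains a top-level Quercus_Dataset folder:

import zipfile

# Extract Quercus_Dataset.zip from my Google Drive's darknet folder
# into the ROOT/content/darknet directory.
with zipfile.ZipFile("drive/My Drive/darknet/Quercus_Dataset.zip", "r") as zipped_folder:
    zipped_folder.extractall("darknet")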

I return to our darknet directory by executing the command “%cd darknet”.

To create listings of images corresponding to training and validation datasets of images of oak leaves in Quercus, the genus of oak trees, let’s find the number of images in Quercus_Dataset. (This step and the next three are collected into one sketch below.)

Let’s create an image-file / label-file / image-file / label-file… list of all of the files in Quercus_Dataset. Each label file of course records the positions and sizes of bounding boxes in an image and has the same prefix as that image, and immediately follows that image file in the list.

Let’s create a one-dimensional array of randomized image indices that can be used to create a randomized list of just image files.

Let’s create a randomized list of image files.
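A minimal sketch of these four steps, assuming Quercus_Dataset contains only image files and their label files, sorted so that each image file immediately precedes its label file (the variable names are illustrative):

import os

import numpy as np

# Create an image-file / label-file / ... list of all files in Quercus_Dataset.
list_of_all_files = sorted(os.listdir("Quercus_Dataset"))

# Find the number of images: half of the files are images.
number_of_images = len(list_of_all_files) // 2

# Create a one-dimensional array of randomized image indices.
array_of_randomized_image_indices = np.random.permutation(number_of_images)

# Create a randomized list of image files; image files sit at the even indices.
randomized_list_of_image_files = [
    list_of_all_files[2 * index] for index in array_of_randomized_image_indices
]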

Given that as of November 7, 2019 there are 111 images in my dataset of oak-leaf images and label files, I define the number of images in my training dataset to be 100. I define the number of images in my validation dataset to be the difference between the number of images in my oak-leaf dataset and the number of images in my training dataset. I then write paths to the first 100 images in the randomized list to darknet/Quercus_Dataset/train.txt and paths to the remaining images to darknet/Quercus_Dataset/validate.txt.
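A minimal sketch, continuing from the previous cell and writing paths relative to the “ROOT/content/darknet” directory (matching obj.data):

number_of_training_images = 100
number_of_validation_images = number_of_images - number_of_training_images

with open("Quercus_Dataset/train.txt", "w") as listing:
    for image_file in randomized_list_of_image_files[0:number_of_training_images]:
        listing.write("Quercus_Dataset/" + image_file + "\n")

with open("Quercus_Dataset/validate.txt", "w") as listing:
    for image_file in randomized_list_of_image_files[number_of_training_images:]:
        listing.write("Quercus_Dataset/" + image_file + "\n")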

I verify that train.txt and validate.txt are randomized listings of paths to Quercus_Dataset training and validation images.
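One way to verify, using Linux commands in a code cell:

!head -3 Quercus_Dataset/train.txt
!wc -l Quercus_Dataset/train.txt
!head -3 Quercus_Dataset/validate.txt
!wc -l Quercus_Dataset/validate.txt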

The text file listing of paths to training images has 100 paths, and the text file listing of paths to validation images has 11 paths. The paths seem well randomized.

Figure 12.3.1: Metadata for our text file listing of relative paths to images we actually want to use to train a YOLOv3 convolutional neural network on my custom oak-leaf dataset.

We are now ready to execute the darknet binary in our “ROOT/content/darknet” folder to train a YOLOv3 convolutional neural network on the 2014 COCO dataset, the Pascal VOC 2007 and 2012 datasets, or my custom dataset of images of oak leaves!

13) Executing Our darknet Binary

Now it’s time to finally execute our darknet binary in the “ROOT/content/darknet” directory. To create weights files in the darknet/backup folder after hundreds of iterations of forward and backward propagation of batches of images from the COCO dataset through a YOLOv3 convolutional neural network, execute the command “!./darknet detector train cfg/coco.data cfg/yolov3.cfg darknet53.conv.74 -dont_show”. Weights files will appear in the backup folder after 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, … iterations. You can use the resulting yolov3_last.weights, yolov3_1000.weights, yolov3_2000.weights, yolov3_3000.weights, … in lieu of darknet53.conv.74 to continue training from a weights file corresponding to partial training. You can also use the resulting weights files to attempt to verify that your YOLOv3 convolutional neural network is being trained successfully on the 2014 COCO dataset.

To create weights files in the darknet/backup folder after hundreds of iterations of forward and backward propagation of batches of images from the Pascal VOC 2007 and 2012 datasets through a YOLOv3 convolutional neural network, execute the command “!./darknet detector train cfg/voc.data cfg/yolov3-voc.cfg darknet53.conv.74 -dont_show”. Weights files will appear in the backup folder after 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, … iterations. You can use the resulting yolov3-voc_last.weights, yolov3-voc_1000.weights, yolov3-voc_2000.weights, yolov3-voc_3000.weights, … (darknet names these files after the configuration file’s base name) in lieu of darknet53.conv.74 to continue training from a weights file corresponding to partial training. You can also use the resulting weights files to attempt to verify that your YOLOv3 convolutional neural network is being trained successfully on the Pascal VOC 2007 and 2012 datasets.

To create weights files in the darknet/backup folder after hundreds of iterations of forward and backward propagation of batches of images from my custom oak-leaf dataset through a YOLOv3 convolutional neural network, I execute the command “!./darknet detector train cfg/obj.data cfg/yolov3-obj.cfg darknet53.conv.74 -dont_show”. Weights files will appear in the backup folder after 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, … iterations. I can use the resulting yolov3-obj_last.weights, yolov3-obj_1000.weights, yolov3-obj_2000.weights, yolov3-obj_3000.weights, … in lieu of darknet53.conv.74 to continue training from a weights file corresponding to partial training. I can also use the resulting weights files to attempt to verify that my YOLOv3 convolutional neural network is being trained successfully on my custom oak-leaf dataset.

Figure 13.1: Our three Linux commands to train three very different YOLOv3 convolutional neural networks on three very different datasets on the Google virtual machine associated with our Google-Colaboratory jupyter notebook.

At the beginning of execution of our darknet binary, we see a presentation of the core architecture of any YOLOv3 convolutional neural network, as well as the status of some hyperparameters that were defined in our configuration file.

Figure 13.2: A screenshot presenting a few lines of output associated with executing our darknet binary that describe some of the last convolutional layers in any YOLOv3 convolutional neural network.

A little later, we see output characteristic of successful training of any YOLOv3 convolutional network. Each line beginning with “v3 (” corresponds to one subdivision of four images and one of the three loss-calculating layers in any YOLOv3 convolutional neural network. The line beginning with “ 1: ” presents an average loss after forward propagation of one batch of sixty-four training images through the network, as well as the time our Google virtual machine took to perform this forward propagation. It is my guess that the presented average loss is the average of forty-eight losses, and that each of the forty-eight losses corresponds to forward propagation of one subdivision and is the output of one loss-calculating layer. It is my guess that our Google virtual machine will speed up in training over time, and that later presented times each correspond to one back-propagation and one forward propagation of a batch of training images.

Figure 13.3: A screenshot presenting a few lines of output characteristic of successful training of any YOLOv3 convolutional neural network.

Later on in training our YOLOv3 convolutional neural network on the 2014 COCO dataset, after thirty or so iterations, I found an average loss of about 1,000 (we started at about 1,700) and an average batch time over ten batches of about 7.5 seconds.

For perspective, I want our neural network to have an average loss as close to zero as possible; in practice, loss falls quickly at first and then decays roughly exponentially over time.

On another note, AlexeyAB recommends training for 2,000 propagation cycles per class, or for a minimum of 4,000 iterations. The 2014 COCO dataset has 80 classes. If one propagation cycle takes 7.5 seconds, we’re looking at 1.2 million seconds, or about 2 weeks, of continuous computation time. Please consider here that my Google virtual machine is about thirty times faster than my 2007 Windows-10 PC without GPU on forward propagation, but that my connection to my Google virtual machine times out after ninety minutes unless I interact with it. Additionally, after not too long it seems that Google Chrome crashes due to overloading memory available for storing code-cell output. To solve both of these problems, I run a simple Python script from PowerShell to imitate my mouse and periodically click the “Clear Output” button that becomes visible if you hover your cursor near the top left of the output of a code cell.
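My mouse-clicking script isn’t reproduced in this post. Here is a minimal sketch of such a clicker, assuming the third-party pyautogui library (installable with “pip install pyautogui”) and hypothetical screen coordinates, which you would replace with the actual location of the “Clear Output” button:

# keep_colab_alive.py: periodically click Colab's "Clear Output" button so the
# session stays interactive and code-cell output does not accumulate.
import time

import pyautogui

# Hypothetical screen coordinates of the "Clear Output" button; find yours by
# hovering over the button and reading pyautogui.position().
X_OF_CLEAR_OUTPUT_BUTTON = 150
Y_OF_CLEAR_OUTPUT_BUTTON = 400

while True:
    pyautogui.click(X_OF_CLEAR_OUTPUT_BUTTON, Y_OF_CLEAR_OUTPUT_BUTTON)
    time.sleep(600)  # click every ten minutes, well within the ninety-minute timeout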

Another note: I have been told that Google kindly invites you to take a break from training after 12 hours. If you’ve generated yolov3_5000.weights and are working on generating yolov3_6000.weights, you might lose 2 hours’ worth of computation time. Additionally, while I haven’t hit Google’s limit yet, I imagine it could take significant time, as well as significant mental resources, to get training up and running again from yolov3_5000.weights. So let’s add another week to the training time for the 2014 COCO dataset. Three weeks total of continuous computation time on a Google virtual machine with GPUs, as an estimate. Wow. Really makes you think about how amazing animals are.

Similarly, the Pascal VOC 2007 and 2012 datasets have 20 classes. If we have 2,000 iterations per class, and each iteration takes about 7.5 seconds, we’re looking at about 300,000 seconds, or about 4 days of continuous computation time. Let’s make it a week considering the need to ask our virtual machine to train multiple times.

Likewise, my custom oak-leaf dataset right now has 1 class. If we have 4,000 iterations total, and if each propagation cycle takes about 7.5 seconds, we’re looking at about 9 hours of continuous computation time. While it fits within one training window (assuming my Python mouse clicker is active), it’s still a sizable chunk of time.

A new thought: I believe AlexeyAB recommends restarting training from scratch when introducing new classes. So I have to be very careful when I prepare a leaf detector that can work with dozens of genera. But I could start training from one of my later weights files, because a YOLOv3 CNN can train when initialized from any appropriately formatted weights file. Transfer learning to a broader set of classes might work okay. Something to try.

Regarding optimization: At this point, some members of our Artificial-Intelligence community seem to train systematically, but only after developing some intuition regarding appropriate architectures, hyperparameters, and parameter initializations. If training takes three weeks, how will I go about tweaking hyperparameters?

Regarding transfer learning and optimization, I might look for a way to parallelize and stagger training… Multiple PC’s? Cloud computing on multiple virtual machines? All the training all the time!

More thoughts later…

14) Summary

In summary, I have provided you a guide in downloading darknet, Joseph Chet Redmon and AlexeyAB’s deep-learning framework; compiling darknet’s source files into a darknet executable file; arranging data files, configuration files, weights files, and names files; preparing datasets of images, label files, and text-file listings of paths to training images; and executing commands to run the darknet binary and train three YOLOv3 convolutional neural networks, configured very differently, on three very different datasets. We projected that it would take about 9 hours of continuous computational time to train a YOLOv3 convolutional neural network capable of bounding objects associated with the single class associated with my custom dataset, about 1 week of continuous computational time to train a YOLOv3 convolutional neural network capable of bounding objects associated with the twenty classes associated with the Pascal VOC 2007 and 2012 datasets, and about 3 weeks of continuous computational time to train a YOLOv3 convolutional neural network capable of bounding objects associated with the eighty classes associated with the 2014 COCO dataset.

All that being said, while perhaps the darknet training process for YOLOv3 convolutional neural networks could be made more efficient, darknet is still one of the fastest, most accurate, and most powerful deep-learning frameworks for detecting objects in images, videos, and camera streams. Darknet is used in security, medicine, and the sciences.

In the near future, I hope to finish developing a Graphical User Interface to quickly and efficiently create label files for training images associated with a custom dataset; to develop a procedure for using darknet to label objects in a camera stream; to study the architecture of the darknet deep-learning framework and the mathematics behind deep neural networks and YOLOv3 convolutional neural networks; and to begin to conduct quantitative analyses of images, videos, and camera streams by accessing bounding-box position and size, number of bounding boxes appearing over time, number of objects, and/or relationships of objects. I hope you enjoyed this guide and will find it useful and inspiring. I’m excited to hear what you can do with your procedure for using darknet. I hope to create future videos presenting developing an image dataset, forward-propagating video frames in a camera stream through a YOLOv3 convolutional neural network, building neural networks using NumPy or PyTorch, and conducting data analyses. See you out there.
