Train YOLOv8 on a Custom Object Detection Dataset with Python

Kazi Mushfiqur Rahman
8 min read · Jul 24, 2023


Photo by BoliviaInteligente on Unsplash

Python project folder structure

Here, the project name is yoloProject and the dataset contains three folders: train, test, and valid. Each of these folders has two subfolders: images and labels.

Describe the directory structure and some configuration details for training a YOLO object detection model
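The configuration file itself is not shown in the extracted text; the sketch below is reconstructed from the explanation that follows (the paths, class count, and class names are as described there, while the file name data.yaml is an assumption):

```yaml
# data.yaml -- dataset configuration reconstructed from the description below
train: C:\Users\USER\PycharmProjects\yoloProject\data\train\images
val: C:\Users\USER\PycharmProjects\yoloProject\data\valid\images

nc: 5
names: ['Helmet', 'Goggles', 'Jacket', 'Gloves', 'Footwear']
```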

Explanation of the above code:

The provided text describes a directory structure and some configuration details for setting up a YOLO object detection model. Let's dissect it step by step:

Directories of Train and Validation Data
train: The directory containing the training images. In this instance it is C:\Users\USER\PycharmProjects\yoloProject\data\train\images.
val: The directory containing the validation images. In this instance it is C:\Users\USER\PycharmProjects\yoloProject\data\valid\images.
You store your labeled images for training and validation in these directories. During training and parameter optimization, the model uses the images from the train directory; during validation, it uses the images from the val directory to assess its performance.

Number of Classes (nc):

nc: The number of object categories or classes that the YOLO model will be trained to recognize. Five classes are present in this instance.
Class Names (names)

names: A list of names for the object classes the YOLO model is trained to recognize. The model in this instance is trained to recognize objects in the following classes:
‘Helmet’, ‘Goggles’, ‘Jacket’, ‘Gloves’, ‘Footwear’
Each class name in the names list corresponds to a type of object that the YOLO model will attempt to recognize in the images during both training and validation.

Overall, this information defines the dataset and the object classes you want the YOLO model to learn to detect. Using the provided training and validation images, along with the corresponding annotations or bounding boxes for the objects in the images, you can now begin to train the YOLO model.

Load YOLO model from GitHub and perform prediction on an image

When we run the above code we get the following output

Here, the result of prediction is visible

Explanation of the above code:

I’ll lay out the code in simple terms, step by step:

Library Imports: The code begins by importing the required libraries. It does so by utilizing the “ultralytics” package, which offers a YOLO (You Only Look Once) object detection algorithm implementation.

The model is downloaded and loaded: The code initializes a YOLO object detection model by passing the path to a pre-trained model file, “yolov8n.pt”. This file contains a pre-trained YOLO model that has been trained on a sizable dataset. The “n” in “yolov8n” stands for the nano variant, the smallest and fastest of the YOLOv8 models.

The code then uses the initialized YOLO model to predict the objects in an online image. The predict function is used, which accepts the following inputs:

source: The URL of the image to be used for prediction. In this instance, it uses the dog image from the supplied URL.
conf: A confidence threshold (0.25 in this instance), which specifies the minimum confidence level required for a valid object detection. Lower values may yield more detections with lower confidence.
Results of Prediction: The YOLO model examines the image and predicts the presence and locations of the objects it has been trained to recognize. The outcomes are stored in the results variable.

The pre-trained YOLO model file (“yolov8n.pt”) must be present in your current working directory or at the provided path; if it is not, the ultralytics library downloads the official checkpoint automatically.

Furthermore, as the code downloads an image from the internet, a working internet connection is necessary in order to obtain the image for prediction.

The YOLO model’s predictions, which typically comprise the class labels, confidence scores, and bounding box coordinates of the detected objects in the image, are stored in the results variable. These outcomes can be further processed and visualized to show which objects the model has identified in the provided dog image.

Download segmentation model from GitHub and predict an image from the local machine

After executing the above code we get the following output:

Here, the result of prediction is visible

Explanation of the above code:

In the 5th line of the code above:

Download and Loading Segmentation Model: To use the pre-trained segmentation model, you must download it from the internet and load it using the correct model class and library.

In the 6th line:

An image is predicted using the predict() function.

Download detection model from GitHub and train it

After executing the above code we get the following output:


Explanation of the above code:

The model is downloaded and loaded: The code initializes a YOLO object detection model by passing the path to a pre-trained model file, “yolov8s.pt”. This file contains a pre-trained YOLO model that has been trained on a sizable dataset; the “s” stands for the small variant.

Training the Model: The code then starts training the YOLO model using the train function, with the following parameters:

data: The location of a configuration file (dfire.yaml) that contains details about the dataset, such as the number of classes, the location of the training images and annotations, etc.

epochs: The total number of training epochs, which determines how many times the complete dataset will be used to update the model during training.
imgsz: The size of the input images used for training (in this case, 64x64 pixels).
plots: A boolean value that specifies whether training progress will be plotted and shown.
Training: The model will iterate over the dataset several times (determined by the number of epochs) during training. It will learn how to recognize objects, such as fires in this case, in each epoch by using the images and annotations from the dataset.

Please be aware that the code relies on the given pre-trained model file (“yolov8s.pt”) as well as the dataset configuration file (“dfire.yaml”) containing the necessary dataset and model information.

Remember that deep learning model training can be computationally expensive and time-consuming, especially for object detection, so you need a capable GPU and sufficient resources to complete this operation successfully. For precise fire detection, a small image size might not be the best option; larger image sizes generally give stronger detection performance.

Perform validation with the trained model

After executing the above code we get the following output:

Explanation of the above code:

In the 5th line:

Loading the Detection Model: The code initializes a YOLO object detection model by loading a pre-trained weights file called “best.pt” from runs/detect/train6/weights/. Note that this path is run-specific: if that file does not exist, for example because training wrote its weights to a different run directory, the model will fail to load.

In the 6th line:

Validation on the Test Dataset: Using the val function, the code attempts to validate the model. The val function is used to assess how well a trained model performs on a separate validation dataset. However, if the model failed to load because of an incorrect file path, the validation step will not run as intended.

Typically, the following steps are required to correctly train a detection model:

Get the dataset ready: Split your dataset into training and testing sets and add annotations (such as bounding boxes or masks) for the objects you want the model to recognize.

Define the model architecture: Choose a YOLO variant (such as YOLOv3, YOLOv4, or YOLOv8) and build it with a deep learning framework like PyTorch or TensorFlow.

Train the model: Use the training set to train the selected YOLO variant. This involves loading the data, defining loss functions, and iterating over the dataset to optimize the model's parameters.

Validate the model: After training, assess the model's performance on the test dataset to gauge its precision and generalizability.

Because the provided code is incomplete and the model file path may be wrong, check the documentation and usage examples of the “ultralytics” library, or consult the GitHub repository where you obtained the code, to confirm the right procedures for downloading, training, and validating the YOLO model.

Show an image from the training results
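The display code is missing from the extracted text; the sketch below reconstructs it from the step-by-step explanation that follows (image_path, Image.open, plt.imshow, plt.show). The existence check is an addition so the sketch only runs where the training run actually produced the file:

```python
from pathlib import Path
from PIL import Image
import matplotlib.pyplot as plt

# Path to the confusion matrix produced by the training run
image_path = "runs/detect/train6/confusion_matrix.png"

# Guard added here: only display the image if the file exists
if Path(image_path).exists():
    image = Image.open(image_path)
    plt.imshow(image)
    plt.axis("off")
    plt.show()
```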

After executing the above code we get the following output:

Explanation of the above code:

The Python script uses the PIL (Python Imaging Library) and matplotlib libraries to load and display an image. Let's dissect it step by step:

Importing Libraries: The code first imports the necessary libraries:

PIL, the Python Imaging Library, is used in Python to manipulate images.
Matplotlib is a well-known Python plotting library used to produce visuals.
Setting the Image Path: The variable image_path stores the file path of the image you want to display. In this instance, the path points to the ‘confusion_matrix.png’ file in the ‘runs/detect/train6/’ directory.

Opening the Image: The image file specified by image_path is opened and loaded into memory using the Image.open() function from the PIL package. Image.open() returns an Image object that represents the loaded image.

Rendering the Image: The code uses the matplotlib library to display the loaded image. The plt.imshow(image) method takes the Image object as input and renders it.

Displaying the Image: The plt.show() function is then called, which causes the image to be displayed in a separate window.

To sum up, the code loads an image from the supplied image_path and displays it on the screen using matplotlib. The program assumes that the “confusion_matrix.png” file exists at the specified location and is a valid image file that can be opened and shown.

Load an image from local project and display it
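This final code block was also lost in extraction; the sketch below shows the same PIL-plus-matplotlib pattern for a project-local image. Since the article's actual file path is not shown, a small generated image stands in for it:

```python
from PIL import Image
import matplotlib.pyplot as plt

# Stand-in for a local project image; the article's actual path is not shown
image = Image.new("RGB", (64, 64), color=(200, 50, 50))

fig, ax = plt.subplots()
ax.imshow(image)
ax.axis("off")
# plt.show() would open a window in an interactive session;
# saving to a file keeps the sketch headless-friendly
fig.savefig("shown_image.png")
```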

After executing the above code we get the following output:



Kazi Mushfiqur Rahman

B.Sc. in CSE, Software Engineer, Techneous | Python | Django | DRF | Computer Vision | OpenVINO - AI Framework | JS | Ajax | DevOps | Technical writer