YOLOv8 Fine-Tuning

Detecting Pug Images

Robert Hernández Martínez
LatinXinAI
Jul 9, 2024


Toto pug: Who is this?

Use case

Detect pugs in pictures with Ultralytics YOLOv8, a state-of-the-art (SOTA) model for object detection and other tasks. We improve its performance and flexibility for this task by fine-tuning the model in Google Colab.

Step 1: Collect images

For this example, we collected two sets of images from the internet, one for training and the other for validation, and organized the folders as follows:

My path to images in Google Drive.
Example of images set.
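
Before annotating, it can help to confirm that both folders contain the images you expect. The following is a minimal sketch, assuming the two folders are named Train and Validation under the project folder. The helper names count_images and summarize_dataset are hypothetical, and the extension list mirrors the one used in the prediction snippet of this article.

```python
from pathlib import Path

# Extensions treated as images (same list as the prediction snippet in this article)
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def count_images(folder):
    """Return the number of image files directly inside `folder` (0 if it is missing)."""
    folder = Path(folder)
    if not folder.is_dir():
        return 0
    return sum(
        1
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


def summarize_dataset(root, splits=("Train", "Validation")):
    """Map each split folder name (assumed names) to its image count."""
    return {split: count_images(Path(root) / split) for split in splits}


if __name__ == "__main__":
    # Placeholder path: replace with your own Drive location
    print(summarize_dataset("your_drive_path/Project Pug Image Detection"))
```

A split that reports zero images usually means the path or the folder name is wrong, which is cheaper to find here than after a failed training run.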

Step 2: Create annotations

There are many annotation tools on the market. Make Sense is open source, free to use, and works in your browser. Here is an example of how to create bounding boxes and label images:

Creating annotations in images: Pugs bounding boxes and labels.

Step 3: Download YOLO labels and create yaml file

Once you are done annotating the Training and Validation datasets, download the YOLO zip file. It contains one txt file per image with the normalized values of every bounding box. It is recommended to keep the images and their txt files in the corresponding folder, for example, the Train folder.

Seven annotations in the txt file for the same number of pugs in the image.
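
To make the "normalized values" concrete: each line of a YOLO txt file holds a class index followed by the box center x, center y, width, and height, all divided by the image size so they fall between 0 and 1. The sketch below shows that conversion under those assumptions. The function names to_yolo_line and parse_yolo_line and the example pixel values are hypothetical, not taken from the project files.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel bounding box into one normalized YOLO label line."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def parse_yolo_line(line):
    """Split a label line into (class_id, (x_center, y_center, width, height))."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    values = tuple(float(value) for value in parts[1:])
    if not all(0.0 <= value <= 1.0 for value in values):
        raise ValueError(f"values must be normalized to [0, 1]: {line!r}")
    return class_id, values


# Hypothetical pug box in a 400x500 pixel image, class 0 = pug
example = to_yolo_line(0, 100, 50, 300, 250, img_w=400, img_h=500)
print(example)  # 0 0.500000 0.300000 0.500000 0.400000
```

An image with seven pugs therefore gets a txt file with seven such lines, one per bounding box.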

Configure the yaml file with the path for training and validation images and labels. In this case, there is only one class: pug.

Yaml file for pug class.
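
Since the yaml file appears here only as an image, the sketch below generates a comparable file from Python. It assumes the usual Ultralytics dataset layout with the keys path, train, val, and names. The folder names Train and Validation, the placeholder root path, and the helper build_dataset_yaml are assumptions for illustration and may differ from the author's actual file.

```python
import json


def build_dataset_yaml(root, train_dir, val_dir, class_names):
    """Return dataset yaml text; strings are double-quoted so spaces in paths are safe."""
    lines = [
        f"path: {json.dumps(root)}",
        f"train: {json.dumps(train_dir)}",
        f"val: {json.dumps(val_dir)}",
        "names:",
    ]
    # One entry per class, indexed from 0 to match the class ids in the txt labels
    lines += [f"  {index}: {json.dumps(name)}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


yaml_text = build_dataset_yaml(
    root="your_drive_path/Project Pug Image Detection",  # placeholder path
    train_dir="Train",
    val_dir="Validation",  # assumed folder name
    class_names=["pug"],
)
print(yaml_text)

# To save it next to the dataset, something like:
# with open("pug.yaml", "w") as handle:
#     handle.write(yaml_text)
```

With more classes, extend class_names in the same order as the class indices used during annotation.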

If you have several classes, make sure to follow the structure of this file: yolov5/data/coco128.yaml at master · ultralytics/yolov5 (github.com)

Step 4: Training the model

Here is a simple snippet to train the model. You can also test other models available on the Hugging Face website: Ultralytics/YOLOv8 at main (huggingface.co)

from google.colab import drive
from ultralytics import YOLO

# Mount Google Drive so the dataset and the yaml file are reachable from Colab
drive.mount("/content/drive")

# Load a pretrained model (recommended for training)
model = YOLO("yolov8s.pt")

# Train the model on the dataset described in pug.yaml
model.train(data="your_drive_path/Project Pug Image Detection/pug.yaml", epochs=5, conf=0.5)

# Predict on the images folder, keeping detections with confidence of at least 0.5
results = model.predict(source="your_drive_images_folder_path/Project Pug Image Detection/Train/", show=True, conf=0.5)

The inference argument conf sets the minimum confidence threshold for detections. We use 0.5 as the proposed threshold to avoid false detections, so objects detected with confidence below this value are disregarded. Adjusting this value can help reduce false positives.
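
The effect of the threshold can be illustrated without running the model. The following sketch filters a hypothetical list of detections by confidence. The detection values and the helper filter_by_confidence are made up for illustration and do not reproduce how Ultralytics applies conf internally.

```python
def filter_by_confidence(detections, conf=0.5):
    """Keep only detections whose confidence reaches the threshold."""
    return [detection for detection in detections if detection["confidence"] >= conf]


# Hypothetical detections for one image (not real model output)
detections = [
    {"label": "pug", "confidence": 0.91},
    {"label": "pug", "confidence": 0.62},
    {"label": "pug", "confidence": 0.34},
]

kept = filter_by_confidence(detections, conf=0.5)
print([detection["confidence"] for detection in kept])  # [0.91, 0.62]

# A stricter threshold removes more uncertain boxes, at the risk of missing real pugs
strict = filter_by_confidence(detections, conf=0.8)
print(len(strict))  # 1
```

Raising conf trades recall for precision: fewer false positives, but weaker true detections are dropped as well.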

The output of the training provides metrics of great value:

Basic configurations for model training. See Train — Ultralytics YOLO Docs for more details.

For this use case, we tested the yolov8s.pt model. Training took three hours to complete, and the output contains two weight files that can be used for prediction: last.pt and best.pt.

The results are saved in a temporary path, runs/detect/train2 in this case. The number changes with each new execution.
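
Because the folder number changes between executions, it can be handy to look up the most recent run instead of hard-coding train2. This is a minimal sketch, assuming run folders are named train, train2, train3, and so on, and that no earlier run folder was deleted (otherwise the highest number may not be the newest run). The helpers run_number and latest_run are hypothetical.

```python
import re
from pathlib import Path


def run_number(name, prefix="train"):
    """Return the numeric suffix of a run folder name, or None if it does not match."""
    match = re.fullmatch(rf"{re.escape(prefix)}(\d*)", name)
    if match is None:
        return None
    # A bare "train" folder is treated as run number 1
    return int(match.group(1)) if match.group(1) else 1


def latest_run(root="runs/detect", prefix="train"):
    """Return the run folder with the highest number, or None if there is none."""
    root = Path(root)
    if not root.is_dir():
        return None
    candidates = []
    for path in root.iterdir():
        number = run_number(path.name, prefix)
        if path.is_dir() and number is not None:
            candidates.append((number, path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


if __name__ == "__main__":
    print(latest_run())  # e.g. runs/detect/train2 after the second training run
```

The returned path can then be joined with weights/last.pt or weights/best.pt to load the trained model.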

Model results after 70 epochs in 3 hours training.

Step 5: Model prediction

Since training time grows with the number of epochs and can be long, it is highly recommended to run it in Google Colab. This use case used 50 images for training, 50 images for validation, and 70 epochs. However, the documentation recommends starting with 300 epochs: Tips for Best Training Results — Ultralytics YOLO Docs

Use the model to predict on new images, different from the ones used for training and validation. These images can be placed in a “New_Images” folder.

Here’s a snippet that uses the output of training run number 2 (train2). Note that we load the last.pt weights from that training:

from ultralytics import YOLO
import os

# Load the fine-tuned weights saved by the previous training run
model = YOLO("runs/detect/train2/weights/last.pt")

# Specify the directory containing your images
image_dir = "your_drive_path/Project Pug Image Detection/New_Images/"

# Get a list of image file names
image_filenames = [
    os.path.join(image_dir, filename)
    for filename in os.listdir(image_dir)
    if filename.lower().endswith((".jpg", ".jpeg", ".png", ".webp"))
]

# Specify the directory path for saving results
save_dir = "your_drive_path/Project Pug Image Detection/Results_New_Images/"

os.makedirs(save_dir, exist_ok=True)

# Process results for each image
for image_filename in image_filenames:
    # Run inference on the image
    results = model([image_filename], stream=True, conf=0.5)

    for result in results:
        # Extract the original image name (without extension)
        original_image_name = os.path.splitext(os.path.basename(image_filename))[0]

        # Save the annotated image with a modified filename
        result_filename = f"{original_image_name}_result.jpg"
        result.save(filename=os.path.join(save_dir, result_filename))

        # Optionally display the result

print("Results saved successfully!")

You will have prediction results in the new folder as follows:

Results folder for images different from the ones used in training and validation. See Predict — Ultralytics YOLO Docs for more details.
Pug on a rainy day.
Pug in a costume party.
Toto pug - Attention.

Performance metrics

The model summary presents metrics to evaluate the accuracy and efficiency of the object detection model, that is, how effectively the model can identify and localize objects within images.

According to YOLO Performance Metrics — Ultralytics YOLO Docs, granular information is provided to understand how well the model is doing for a specific class. The Box (P, R, mAP50, mAP50–95) columns give insights into the model’s performance in detecting objects:

P (Precision): The accuracy of the detected objects, indicating how many detections were correct.

R (Recall): The ability of the model to identify all instances of objects in the images.

mAP50: Mean average precision calculated at an intersection over union (IoU) threshold of 0.50. It’s a measure of the model’s accuracy considering only the “easy” detections.

mAP50–95: The average of the mean average precision calculated at varying IoU thresholds, ranging from 0.50 to 0.95. It gives a comprehensive view of the model’s performance across different levels of detection difficulty.
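
The definitions above all rest on two small calculations: the IoU between a predicted box and a labeled box, and precision and recall from counts of correct and incorrect detections. The sketch below shows both, assuming boxes in (x_min, y_min, x_max, y_max) pixel format. The function names and example numbers are hypothetical, and this is not the full mAP computation that Ultralytics performs.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero when the boxes do not touch
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def precision_recall(true_positives, false_positives, false_negatives):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical labeled pug box and a slightly smaller predicted box
label_box = (0, 0, 100, 100)
predicted_box = (10, 10, 100, 100)
overlap = iou(label_box, predicted_box)  # about 0.81

# The same prediction counts as correct at IoU 0.50 but not at IoU 0.95
print(overlap >= 0.50, overlap >= 0.95)  # True False

# Hypothetical counts: 6 pugs found correctly, 2 false detections, 1 pug missed
print(precision_recall(6, 2, 1))
```

This is why mAP50–95 is usually lower than mAP50: the stricter thresholds only accept boxes that fit the labeled object tightly.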

Model summary for the use case:

Model summary for Project Pug Image Detection.

Fine-tuning

Fine-tuning is an iterative process aimed at optimizing the machine learning model’s performance metrics, such as accuracy, precision, and recall.

The tuning process produces several files and directories with the results of the tuning. For the train2 example, it looks like this:

Output for training number 2.

Snippet to save from Google Colab to your Drive:

# Save the output from path "runs/detect/train2/" into my Drive

!cp -r /content/runs/detect/train2/ "your_drive_path/Project Pug Image Detection/runs_Colab/"

Make sure the results plots show the loss going down, as you can see here after 70 epochs.

Results plots from YOLOv8 in train2.
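
The same check can be done numerically. The sketch below reads one loss column from a CSV file and compares its average over the first and last few epochs. It assumes the run folder contains a results.csv file with a column named train/box_loss, which is typical of Ultralytics training output but may vary by version. The helpers read_column and loss_is_decreasing are hypothetical.

```python
import csv


def read_column(csv_path, column):
    """Read one numeric column from a CSV file, ignoring padding around names and values."""
    values = []
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            cleaned = {
                key.strip(): (value or "").strip()
                for key, value in row.items()
                if key is not None
            }
            values.append(float(cleaned[column]))
    return values


def loss_is_decreasing(values, window=5):
    """Compare the mean of the first `window` epochs with the mean of the last ones."""
    if len(values) < 2:
        return False
    window = max(1, min(window, len(values) // 2))
    first = sum(values[:window]) / window
    last = sum(values[-window:]) / window
    return last < first


# Usage (assumed path and column name):
# losses = read_column("runs/detect/train2/results.csv", "train/box_loss")
# print(loss_is_decreasing(losses))
```

Comparing averages rather than single epochs avoids being misled by the normal epoch-to-epoch noise in the loss curve.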

A guide to explore performance metrics associated with YOLOv8, their significance, and how to interpret them can be found here: YOLO Performance Metrics — Ultralytics YOLO Docs

Now let’s see how the model performs with real data for the validation set:

Validation set labels.
Validation set prediction.

Things are not perfect, so model tuning needs to continue and will eventually require more computational power, such as a Graphics Processing Unit (GPU). A GPU is a specialized hardware component that efficiently handles parallel mathematical operations, surpassing the general-purpose capabilities of a Central Processing Unit (CPU). Essentially, a GPU can do this quickly enough to be useful in real-time graphics applications.

More about Cloud GPU Rental Providers here: Cloud GPUs // The Best Servers, Services & Providers [RANKED!] (github.com)

Conclusion

YOLOv8 is easily adaptable to many use cases and provides features and capabilities that help machine learning practitioners maximize the potential of their projects.

Combined with a GPU service, YOLOv8 models can be implemented quickly to respond to specific requirements in an efficient way, and of course, detect pugs.

Keep up sharing. 😊

