YOLOv9 vs. YOLOv8: Segmentation & Fine-Tuning Guide

Comparison, demo, and training on your own dataset

Oliver Lövström

Published in

Internet of Technology

4 min read · Apr 29, 2024

YOLOv9 segmentation models were released recently and outperform the previous YOLOv8 models. This article compares YOLOv8 and YOLOv9, showcases YOLOv9 segmentation, and walks through fine-tuning YOLOv9 on your own dataset.

Source: Photo by Kam Idris on Unsplash, Modified by Oliver Lövström

Comparison and Showcase

In this section, we will compare YOLOv8 and YOLOv9 performance and quickly showcase YOLOv9 segmentation.

YOLOv8 vs. YOLOv9

YOLOv9 uses techniques such as Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN) to improve performance [1]. YOLOv8 accuracy and efficiency:

Recently, two models using YOLOv9 for object segmentation were released, improving upon the performance of the previous generation:

In summary, YOLOv9 improves accuracy and efficiency for object segmentation.

Demo

YOLOv9e segmentation demo:

Training

It’s possible to train YOLOv9 on your own segmentation data. The process can be divided into three steps: (1) Installation, (2) Dataset Creation, and (3) Fine-tuning/Training.

Installation

Begin by installing the Ultralytics framework:

pip install ultralytics

Dataset

Choose or create the dataset you need. The dataset must be in YOLO segmentation format: each image has a corresponding text file (.txt) with the same base name and the following content:

<class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>
...
<class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>

Each row describes one object: class-index is an integer identifying the class, and x1 y1 ... xn yn are the vertices of the segmentation polygon, normalized to the range 0 to 1 by dividing x by the image width and y by the image height. Make sure to include background images, i.e., images that contain none of the target objects. Background images don’t need an accompanying annotation file.
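
As an illustration of the format, the following sketch converts a polygon given in pixel coordinates into one annotation row. The helper name polygon_to_yolo_row and the example values are hypothetical and not part of Ultralytics; the sketch only assumes that x is normalized by the image width and y by the image height.

```python
def polygon_to_yolo_row(class_index, polygon_px, img_width, img_height):
    """Format one object as a YOLO segmentation row.

    polygon_px is a list of (x, y) vertices in pixel coordinates.
    """
    if len(polygon_px) < 3:
        raise ValueError("a polygon needs at least three vertices")
    coords = []
    for x, y in polygon_px:
        # Normalize to [0, 1] and clamp points lying slightly outside the image.
        coords.append(min(max(x / img_width, 0.0), 1.0))
        coords.append(min(max(y / img_height, 0.0), 1.0))
    return " ".join([str(class_index)] + [f"{c:.6f}" for c in coords])


# A triangle in a 640x480 image, labelled as class 2 (hypothetical values).
row = polygon_to_yolo_row(2, [(64, 48), (320, 48), (320, 240)], 640, 480)
print(row)  # 2 0.100000 0.100000 0.500000 0.100000 0.500000 0.500000
```

Writing one such row per object into the image's .txt file yields an annotation file in the format described above.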

After finding a dataset and completing the image annotations, organize the dataset in the following way:

path/to/dataset/
├─ train/
│  ├─ img_0000.jpg
│  ├─ img_0000.txt
│  ├─ ...
│  ├─ img_0999.jpg
│  ├─ img_0999.txt
├─ val/
│  ├─ img_1000.jpg
│  ├─ img_1000.txt
│  ├─ ...
│  ├─ img_1099.jpg
│  ├─ img_1099.txt
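
Before training, it can help to confirm that the folder matches this layout. The sketch below lists the images in a split directory that have no matching .txt file; under the convention described earlier, those are treated as background images. The function name images_without_labels and the set of image extensions are assumptions made for illustration.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed; extend as needed


def images_without_labels(split_dir):
    """Return sorted names of images in split_dir that have no .txt annotation."""
    missing = []
    for image in sorted(Path(split_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skip annotation files and anything that is not an image
        if not image.with_suffix(".txt").exists():
            missing.append(image.name)
    return missing


if __name__ == "__main__":
    for name in ("train", "val"):
        split_dir = Path("path/to/dataset") / name
        if split_dir.is_dir():
            background = images_without_labels(split_dir)
            print(f"{name}: {len(background)} image(s) without annotations")
```

If the reported count is much higher than the number of background images you intended to include, some annotation files are probably missing or misnamed.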

Fine-Tuning

This section is for you if you want to train YOLOv9 on your custom data. If you’re just looking to use the model, skip ahead to the section Inference and Segmentation.

First, create a training configuration file:

# train.yaml
path: path/to/dataset
train: train
val: val

names:
  0: person
  1: bicycle
  2: car
  # ...
  77: teddy bear
  78: hair drier
  79: toothbrush

The configuration file must contain the dataset root (path), the training and validation directories relative to that root, and the mapping from class indices to class names.
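
For datasets with many classes, writing the names mapping by hand is error-prone. Here is a minimal sketch that renders the same structure from a Python list. build_dataset_yaml is a hypothetical helper, and it assumes the class names contain no characters that would need YAML quoting.

```python
def build_dataset_yaml(dataset_path, class_names, train="train", val="val"):
    """Render a dataset configuration with path, train, val, and names entries."""
    lines = [
        f"path: {dataset_path}",
        f"train: {train}",
        f"val: {val}",
        "",
        "names:",
    ]
    # Class indices are assigned in list order, starting at 0.
    for index, name in enumerate(class_names):
        lines.append(f"  {index}: {name}")
    return "\n".join(lines) + "\n"


config_text = build_dataset_yaml("path/to/dataset", ["person", "bicycle", "car"])
print(config_text)
# To save it: open("train.yaml", "w").write(config_text)
```

The order of the list determines the class indices, so it has to match the class-index values used in the annotation files.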

Finally, train the model using the Ultralytics framework:

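from ultralytics import YOLO

# A .yaml file builds the architecture with randomly initialized weights,
# i.e. training from scratch. To fine-tune from pretrained weights, load the
# corresponding checkpoint instead, e.g. YOLO("yolov9c-seg.pt").
model = YOLO("yolov9c-seg.yaml")
model.train(data="path/to/train.yaml", epochs=100)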

Choose the segmentation model that fits your time constraints and hardware. YOLOv9c is the smaller and faster of the two; YOLOv9e is larger and slower, but more accurate:

- yolov9c-seg.yaml
- yolov9e-seg.yaml

Best Practices

If you want to optimize the training performance, read this guide:

YOLOv8: Best Practices for Training. Guide for data augmentation and hyperparameter tuning with YOLOv8 (medium.com)

Inference and Segmentation

Run inference:

results = model("images/sofa.jpg")

Plot segmented masks:

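import cv2
import matplotlib.pyplot as plt
import numpy as np

for result in results:
    if result.masks is None:  # no objects were detected in this image
        continue

    height, width = result.orig_img.shape[:2]
    # Start from a white canvas with the same size as the input image.
    background = np.full((height, width, 3), 255, dtype=np.uint8)

    # result.masks.xy holds one polygon per object, in pixel coordinates.
    for polygon in result.masks.xy:
        polygon = polygon.astype(np.int32)  # OpenCV expects 32-bit integer points
        cv2.drawContours(background, [polygon], -1, (0, 255, 0), thickness=cv2.FILLED)

    plt.imshow(background)
    plt.title('Segmented objects')
    plt.axis('off')
    plt.show()

    plt.imsave('segmented_objects.jpg', background)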

Plot segmentation mask with original colors:

import cv2
import matplotlib.pyplot as plt
import numpy as np

for result in results:
    if result.masks is None:  # no objects were detected in this image
        continue

    orig_img = result.orig_img  # BGR image, as loaded by OpenCV
    background = np.full_like(orig_img, 255)  # white canvas

    for polygon in result.masks.xy:
        polygon = polygon.astype(np.int32)  # OpenCV expects 32-bit integer points

        # Rasterize the polygon, then copy the original pixels it covers.
        mask_img = np.zeros(orig_img.shape[:2], dtype=np.uint8)
        cv2.fillPoly(mask_img, [polygon], 255)
        background[mask_img == 255] = orig_img[mask_img == 255]

    # Matplotlib expects RGB, while OpenCV images are BGR.
    background_rgb = cv2.cvtColor(background, cv2.COLOR_BGR2RGB)

    plt.imshow(background_rgb)
    plt.title('Segmented objects')
    plt.axis('off')
    plt.show()

    cv2.imwrite('segmented_objects.jpg', background)

References