💣Notes to Self: Image Segmentation with YOLOv8

Ali Cenk Baytop

5 min read · Mar 20, 2024

I write articles to enhance my abilities and keep them as a reminder for my future.

“Artificial Intelligence is whatever hasn’t been done yet.” — Larry Tesler

📚📖📃Content

1. What is Image Segmentation?

2. What is YOLO?

3. What is YOLOv8 Segmentation Model?

4. Dataset Preparation

5. Train, Validation and Test

In this article, I explain how to apply the YOLOv8 segmentation model. We will go through the core concepts of image segmentation and the basic code for YOLOv8. From the underlying principles to practical implementation, the goal is to help you build your own YOLOv8 segmentation model on a custom dataset.

1. What is Image Segmentation?

In the field of computer vision, the ability to understand and process images is essential. Image segmentation, a fundamental task within this domain, plays a pivotal role in partitioning visual data into meaningful components, enabling machines to understand and interact with the visual world more effectively.

Imagine an autonomous vehicle that needs to detect pedestrians, vehicles, and obstacles on the road to navigate safely. Or consider medical imaging, where precisely outlining a region of interest is crucial for accurate diagnosis and treatment planning. These are just a few of the real-world applications of image segmentation.
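To make the idea concrete, here is a minimal sketch of what a segmentation result looks like as data. The tiny grid and the class ids (0 = background, 1 = duck) are made up for illustration; a real mask has one entry per image pixel.

```python
# A toy 5x6 "image" where every pixel already carries a class id.
# 0 = background, 1 = duck (hypothetical labels, for illustration only).
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]


def pixel_count(mask, class_id):
    """Count how many pixels are labeled with class_id."""
    return sum(row.count(class_id) for row in mask)


def bounding_box(mask, class_id):
    """Return (x_min, y_min, x_max, y_max) around the pixels of class_id, or None."""
    points = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value == class_id
    ]
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


print(pixel_count(mask, 1))   # 7 pixels belong to the duck
print(bounding_box(mask, 1))  # (1, 1, 3, 3): a 3x3 box that covers 9 pixels
```

The box covers 9 pixels but only 7 of them are duck. That gap is the extra information a segmentation mask gives you over a plain bounding box.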

2. What is YOLO?

YOLO, which stands for “You Only Look Once,” is a pioneering object detection algorithm that revolutionized the field of computer vision.

The key principles behind YOLO’s functioning can be summarized as follows:

Single Shot Detection: YOLO adopts a single-shot detection approach, wherein it predicts bounding boxes and class probabilities directly from the full image in a single evaluation. This results in faster inference times compared to multi-stage detection methods.

Grid-based Prediction: YOLO divides the input image into a grid and predicts bounding boxes and class probabilities for each grid cell. Each grid cell is responsible for predicting objects whose center falls within the cell.

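Anchor Boxes: To handle variations in object sizes and aspect ratios, many YOLO versions (starting with YOLOv2) use anchor boxes, which are predefined bounding boxes of different shapes and sizes. They are usually derived from the training data, for example by clustering the labeled box sizes, and they help improve detection accuracy. Note that YOLOv8 itself is anchor-free and predicts boxes without predefined anchors.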

Feature Extraction: YOLO utilizes a convolutional neural network (CNN) as its backbone for feature extraction. This CNN processes the input image to extract high-level features that are subsequently used for object detection.
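The grid-based prediction idea above can be sketched in a few lines. This is only an illustration of the "cell that contains the center is responsible" rule. The function name and the 7x7 grid are my assumptions (7x7 is the grid size of the first YOLO version), and newer versions predict on several grids at different scales.

```python
def responsible_cell(cx, cy, img_w, img_h, grid_size=7):
    """Return (row, col) of the grid cell that contains the box center.

    cx, cy are the center coordinates in pixels. grid_size=7 is an
    assumption for illustration, not a fixed property of every YOLO model.
    """
    col = min(int(cx / img_w * grid_size), grid_size - 1)
    row = min(int(cy / img_h * grid_size), grid_size - 1)
    return row, col


# On a 448x448 image with a 7x7 grid, every cell is 64x64 pixels.
print(responsible_cell(100, 300, 448, 448))  # (4, 1)
```

The min() call only matters for centers that lie exactly on the right or bottom image border, where the plain division would land one cell outside the grid.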

YOLO has gained widespread popularity for its efficiency in real-time object detection tasks, making it suitable for applications such as video surveillance, autonomous driving, and augmented reality.

3. What is YOLOv8 Segmentation Model?

YOLOv8 Segmentation Model is an extension of the YOLO algorithm tailored for image segmentation tasks. While the original YOLO algorithm handles object detection, the YOLOv8 segmentation model also performs pixel-level instance segmentation: for each detected object, it predicts a mask that marks which pixels belong to that object.

YOLOv8 Segmentation Model achieves this by incorporating additional layers and modules into the YOLO architecture, enabling it to produce segmentation masks alongside bounding box predictions. By leveraging the strengths of YOLO for efficient and accurate object detection, YOLOv8 Segmentation Model offers a unified solution for both detection and segmentation tasks in computer vision.

4. Dataset Preparation

First, choose the object you want to segment. If you would rather use my Duck dataset, the link is here.

If you want to build a custom dataset yourself, I suggest collecting images with simple-image-download:

pip install simple-image-download

from simple_image_download import simple_image_download as simp

OBJECT = "NAME OF OBJECT HERE"  # search keyword for the object you want to segment
SIZE = 100  # number of images to download per keyword

keywords = [OBJECT]
downloader = simp.simple_image_download()

for kw in keywords:
    downloader.download(kw, SIZE)

Then label the images with the labelme tool by drawing a polygon around each object.

pip install labelme

After labeling, you have JSON files, but YOLO models expect txt label files. To convert them all, you can use labelme2yolo:

pip install labelme2yolo

# --json_dir: the folder that contains your labeled JSON files

labelme2yolo --json_dir path/Dataset
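To see what the conversion produces, here is a rough sketch of the YOLO segmentation label format: one line per object, with a class index followed by the polygon points normalized to the 0 to 1 range. This is not how labelme2yolo is implemented. The function name and the class_ids mapping are hypothetical, and I assume the usual labelme JSON keys (imageWidth, imageHeight, shapes, label, points).

```python
def labelme_to_yolo_lines(annotation, class_ids):
    """Turn one labelme-style annotation dict into YOLO segmentation label lines.

    annotation: dict loaded from a labelme JSON file (assumed keys:
                imageWidth, imageHeight, shapes).
    class_ids:  hypothetical mapping from label name to class index.
    """
    width = annotation["imageWidth"]
    height = annotation["imageHeight"]
    lines = []
    for shape in annotation["shapes"]:
        # Only polygons are handled in this sketch.
        if shape.get("shape_type", "polygon") != "polygon":
            continue
        parts = [str(class_ids[shape["label"]])]
        for x, y in shape["points"]:
            # Normalize pixel coordinates by the image size.
            parts.append(f"{x / width:.6f}")
            parts.append(f"{y / height:.6f}")
        lines.append(" ".join(parts))
    return lines


example = {
    "imageWidth": 200,
    "imageHeight": 100,
    "shapes": [
        {"label": "duck", "shape_type": "polygon",
         "points": [[20, 10], [100, 10], [100, 50]]},
    ],
}
print(labelme_to_yolo_lines(example, {"duck": 0}))
```

Each returned line would go into a txt file with the same base name as its image.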

Reorganize your directory as below:

tree

C:.
└───Duck
    ├───test
    │   └───images
    ├───train
    │   ├───images
    │   └───labels
    └───val
        ├───images
        └───labels
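Before training, it can help to check that the images and labels in each split actually match. Below is a small sketch that lists images without a label file. The function name and the set of image extensions are my own choices, and it assumes each label file has the same base name as its image.

```python
from pathlib import Path

# Extensions treated as images in this sketch (an assumption, extend as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def images_without_labels(split_dir):
    """Return the names of images in split_dir/images with no matching labels/*.txt."""
    split_dir = Path(split_dir)
    images_dir = split_dir / "images"
    labels_dir = split_dir / "labels"
    missing = []
    for image_path in sorted(images_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if not (labels_dir / f"{image_path.stem}.txt").is_file():
            missing.append(image_path.name)
    return missing
```

For the layout above, you would call images_without_labels("Duck/train") and images_without_labels("Duck/val"). An image without a label file is not necessarily an error, since it may simply contain no objects, but an unexpected name in the list usually means a file was lost while reorganizing.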

Next, set up your dataset.yaml file to tell YOLOv8 where the folders are.

# train: directory of train images.
# val: directory of validation images.
# nc: number of classes.
# names: names of classes.

train: path/Duck/train/images
val: path/Duck/val/images
nc: 1
names: ["duck"]
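If you change classes often, you may prefer to generate this file so that nc and names never get out of sync. A minimal sketch with a hypothetical helper name, using plain string formatting and no YAML library (so it assumes simple paths and class names without special characters):

```python
def dataset_yaml(train_dir, val_dir, class_names):
    """Build the text of a dataset.yaml with nc derived from class_names."""
    names = ", ".join(f'"{name}"' for name in class_names)
    return (
        f"train: {train_dir}\n"
        f"val: {val_dir}\n"
        f"nc: {len(class_names)}\n"
        f"names: [{names}]\n"
    )


text = dataset_yaml("path/Duck/train/images", "path/Duck/val/images", ["duck"])
print(text)
# To save it: open("Duck/dataset.yaml", "w").write(text)
```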

5. Train, Validation and Test

I used the YOLOv8 nano segmentation model (YOLOv8n-seg), but depending on your task and hardware, you may use the other model sizes too.

The original YOLOv8 model is available on GitHub (Ultralytics).

Train and Validation:

pip install ultralytics

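from ultralytics import YOLO

if __name__ == "__main__":
    # Start from the pretrained nano segmentation weights.
    model = YOLO("yolov8n-seg.pt")
    # By default, validation runs after each epoch on the val split from dataset.yaml.
    results = model.train(data="Duck/dataset.yaml", epochs=100, batch=8)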

After training, you can examine the runs/segment/train folder for the training and validation results (later runs are saved as train2, train3, and so on).
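The run folder normally contains a results.csv file with one row per epoch. Here is a small sketch for finding the epoch with the best value of a metric. The function name is mine, and the metric column name in the usage line is an assumption, so check the header of your own results.csv first. Headers are stripped because they are sometimes padded with spaces.

```python
import csv


def best_epoch(csv_path, metric):
    """Return (epoch, value) for the row with the highest value of metric."""
    with open(csv_path, newline="") as f:
        rows = [
            {k.strip(): v.strip() for k, v in row.items()
             if k is not None and v is not None}
            for row in csv.DictReader(f)
        ]
    if not rows:
        raise ValueError(f"no rows found in {csv_path}")
    if metric not in rows[0]:
        raise KeyError(f"column {metric!r} not found, available: {list(rows[0])}")
    best = max(rows, key=lambda row: float(row[metric]))
    return int(float(best["epoch"])), float(best[metric])


# Usage (the column name is an assumption, adjust it to your file):
# print(best_epoch("runs/segment/train/results.csv", "metrics/mAP50-95(M)"))
```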

Test:

# Other predict() arguments you can try:
# show=True, save=True, show_labels=True, show_conf=True, conf=0.1, save_txt=True, save_crop=True, line_width=2, box=True, visualize=True

from ultralytics import YOLO

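if __name__ == "__main__":
    # Load the best checkpoint saved during training.
    model = YOLO("runs/segment/train/weights/best.pt")
    # conf: minimum confidence to keep a detection; iou: overlap threshold for non-maximum suppression.
    model.predict(source="Duck/test/images/duck23.jpeg", show=True, save=True, conf=0.5, iou=0.5)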

You can check the results in the runs/segment/predict folder.
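A note on the iou=0.5 argument used in the test code: it is the intersection-over-union threshold for non-maximum suppression, which decides when two overlapping detections count as the same object so that only the more confident one is kept. The sketch below shows how IoU is computed for two boxes in (x1, y1, x2, y2) format. It is a plain illustration of the formula, not the library's own code.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap, clamped to 0 when the boxes do not touch.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Two 10x10 boxes shifted by half their width: overlap 50, union 150.
print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))  # about 0.33, below the 0.5 threshold
```

With iou=0.5, these two boxes would both survive as separate detections. Lowering the threshold makes suppression more aggressive, which can help when the model draws duplicate boxes on one duck.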