Custom DataSet in YOLO V8 !

Aug 16, 2023

Let’s use a custom dataset to train our own YOLO model !

First, install YOLOv8 with a single command.

!pip install ultralytics

For more detailed information, or for the various other models, see the links below.

Docs: docs.ultralytics.com

GitHub: https://github.com/ultralytics/ultralytics

Prepare the images for training

Before training on a custom dataset, we first need to collect one ! In this automated world, data collection can be automated too. I describe how to automate data collection with Python in the article linked below; it should be very useful.

Link

Automatically Collecting Datasets for AI Training — Python

medium.com

[⭐UPDATE⭐]

Auto labeling — YOLO v8

Auto Labeling is not a dream !

medium.com

Using the auto-labeling feature in an annotation app

Very Powerful Annotation APP !!

medium.com

Import libraries

from ultralytics import YOLO
import cv2
import matplotlib.pyplot as plt

import pandas as pd
import numpy as np

YOLOv8 comes in five model sizes, from the smallest, YOLOv8n, through YOLOv8s, YOLOv8m and YOLOv8l, to the largest, YOLOv8x. For this exercise, I will use the largest model, YOLOv8x.

https://github.com/ultralytics/ultralytics

model = YOLO("yolov8x.pt")
dict_classes = model.model.names

Download the dataset

hamster recognition Object Detection Dataset by 승강

193 open source hamster images. hamster recognition dataset by 승강

universe.roboflow.com

You can download the dataset from the website above. As you may have guessed, its creator is the author (me), hahahahaha.

You can use the command below to download the dataset in zip format.

!wget -O hamster_Data.zip "https://app.roboflow.com/ds/jbCLWvQOTl?key=pAdnrL5TmX"

Using the zipfile library, extract all the downloaded files to the specified path.

import zipfile

with zipfile.ZipFile('/content/hamster_Data.zip') as target_file:
    target_file.extractall('/content/hamster_Data/')
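
As a quick sanity check after extraction, a minimal sketch like the one below can count the images and label files in each split. It assumes the usual YOLO export layout (train, valid and test folders, each with images and labels subfolders); count_split_files and IMAGE_SUFFIXES are hypothetical names used only for illustration, not part of any library.

```python
from pathlib import Path

# Assumption: these are the image extensions present in the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def count_split_files(dataset_root, splits=("train", "valid", "test")):
    """Return {split: (n_images, n_labels)} for a YOLO-style dataset folder."""
    root = Path(dataset_root)
    counts = {}
    for split in splits:
        image_dir = root / split / "images"
        label_dir = root / split / "labels"
        n_images = 0
        n_labels = 0
        if image_dir.is_dir():
            # Count files whose extension looks like an image.
            n_images = sum(
                1 for p in image_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
            )
        if label_dir.is_dir():
            # YOLO labels are one .txt file per image.
            n_labels = sum(1 for p in label_dir.glob("*.txt"))
        counts[split] = (n_images, n_labels)
    return counts
```

For example, count_split_files('/content/hamster_Data') returns a dict keyed by split name. A split that reports (0, 0) usually means the folder is missing or the archive was extracted somewhere else.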

Let’s look at the .yaml file.

!cat /content/hamster_Data/data.yaml

train: ../train/images
val: ../valid/images
test: ../test/images

nc: 1
names: ['hamster']

roboflow:
  workspace: -vftqe
  project: hamster-recognition
  version: 1
  license: CC BY 4.0
  url: https://universe.roboflow.com/-vftqe/hamster-recognition/dataset/1

The crucial part we need to focus on is the first five entries:

train: ../train/images
val: ../valid/images
test: ../test/images

nc: 1
names: ['hamster']

The first three lines (train, val, test) should be changed to match where your own copy of the dataset is stored. The last two lines do not need to change, since the goal is to detect only one class of object, a hamster.

Fix the contents of the .yaml file

import yaml

data = {'train' :  '/content/hamster_Data/train/images',
        'val' :  '/content/hamster_Data/valid/images',
        'test' :  '/content/hamster_Data/test/images',
        'nc': 1,
        'names': ['hamster']
        }

# write the corrected config to a new .yaml file
with open('/content/hamster_Data/hamster_data.yaml', 'w') as f:
    yaml.dump(data, f)

# read the .yaml file back to check its contents
with open('/content/hamster_Data/hamster_data.yaml', 'r') as f:
    hamster_yaml = yaml.safe_load(f)
    display(hamster_yaml)

{'nc': 1,
 'names': ['hamster'],
 'test': '/content/hamster_Data/test/images',
 'train': '/content/hamster_Data/train/images',
 'val': '/content/hamster_Data/valid/images'}
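
Before training, it can help to confirm that the config points at real folders. The sketch below checks a config dict like the one loaded above; check_dataset_config is a hypothetical helper, and it assumes that train and val are the keys that must be present and that nc should equal the number of entries in names.

```python
from pathlib import Path


def check_dataset_config(config):
    """Return a list of human-readable problems found in a dataset config dict."""
    problems = []
    # Assumption: 'train' and 'val' are required, 'test' is optional.
    for key in ("train", "val"):
        if key not in config:
            problems.append(f"missing required key: {key}")
    # Every path that is present should be an existing directory.
    for key in ("train", "val", "test"):
        path = config.get(key)
        if path is not None and not Path(path).is_dir():
            problems.append(f"{key} path does not exist: {path}")
    # The class count should agree with the list of class names.
    names = config.get("names", [])
    if "nc" in config and config["nc"] != len(names):
        problems.append(f"nc={config['nc']} but {len(names)} class name(s) listed")
    return problems
```

Calling check_dataset_config(hamster_yaml) should return an empty list when the paths and class count are consistent; any returned message points at the entry to fix.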

Training on our own dataset !

print(len(model.names))
# 80 COCO dataset labels

model.train(data='/content/hamster_Data/hamster_data.yaml', epochs=50, batch=8)

The results are automatically saved in the ‘runs/detect/train’ path.

!cp -r /content/runs/ "/Destination path/"
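
If training is run more than once, later runs go to sibling folders such as train2 and train3. The sketch below picks the most recently modified run folder and builds the path where the best checkpoint is expected. It assumes the default output layout (runs/detect/<run>/weights/best.pt); latest_run_dir and best_weights_path are hypothetical helper names.

```python
from pathlib import Path


def latest_run_dir(runs_root="/content/runs/detect", prefix="train"):
    """Return the most recently modified run folder starting with prefix, or None."""
    root = Path(runs_root)
    if not root.is_dir():
        return None
    candidates = [
        p for p in root.iterdir() if p.is_dir() and p.name.startswith(prefix)
    ]
    if not candidates:
        return None
    # Use the modification time to decide which run is the newest.
    return max(candidates, key=lambda p: p.stat().st_mtime)


def best_weights_path(run_dir):
    """Return the expected location of the best checkpoint inside a run folder."""
    # Assumption: the default layout stores checkpoints under weights/.
    return Path(run_dir) / "weights" / "best.pt"
```

The path returned by best_weights_path(latest_run_dir()) can then be passed to YOLO(...) to reload the trained model in a later session.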

Predict with the model

results = model.predict(source='/content/hamster_Data/test/images', save=True)

!cp -r /content/runs/ "/Destination path/"
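
To get a quick overview of what was detected, the predictions can be tallied per class. The sketch below works on plain lists so it can be tried without a model; summarize_detections and the min_conf threshold are hypothetical, chosen only for illustration. The commented usage assumes each result exposes boxes.cls, boxes.conf, names and path, as the Ultralytics results objects do.

```python
def summarize_detections(class_ids, confidences, names, min_conf=0.25):
    """Count detections per class name, ignoring those below min_conf."""
    counts = {}
    for class_id, conf in zip(class_ids, confidences):
        if conf < min_conf:
            continue
        # Class ids may arrive as floats (e.g. 0.0), so cast before the lookup.
        label = names[int(class_id)]
        counts[label] = counts.get(label, 0) + 1
    return counts


# Hypothetical usage with the `results` list returned by model.predict(...):
# for r in results:
#     print(r.path, summarize_detections(r.boxes.cls.tolist(),
#                                        r.boxes.conf.tolist(), r.names))
```

Images with an empty summary are the ones where no hamster was found above the threshold, which makes them good candidates for a manual look.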

Enjoy it !

YOLOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO

Written by ChengKang Tan