Converting a custom dataset from COCO format to YOLO format

Abdul Rehman · Published in Red Buffer · Jul 28, 2022


Recently, I had to use YOLOv5 for object detection. To train the model, your custom dataset must be in the YOLO format; if it is not, online tools are available that will convert it for you. Likewise, if your dataset is in COCO (JSON) format, you can use online tools to convert it into the YOLO format. The problem I faced was that the dataset given to me was not in complete COCO format, and when I tried an online tool to convert it into the YOLO format, the conversion was not accurate. That is when I realized I needed to build my own tool.

In this article, we will walk through the complete procedure for converting a dataset into the YOLO format, step by step.

Table of Contents

  • What is YOLO format?
  • Current Dataset Format
  • Goal
  • Implementation
  • Testing
  • Outcome

What is YOLO format?

In the YOLO format, every image in the dataset has a single text file; if an image contains no objects, there is no text file for it. Inside the text file, each row describes one object and contains the following information:

(class_id, x_centre,  y_centre,  width,  height)
The first column contains the class id (here, an integer from 0 to 27), the second and third columns contain the midpoint coordinates of the bounding box, and the fourth and fifth columns contain the width and height of the bounding box, respectively.

Let's suppose we have ten images in the dataset; then there must be ten text files, provided every image contains at least one object. For example, if there is a zidane.txt label file, there must be a zidane.jpg image in the dataset. Following is the directory structure of the YOLO-format dataset:
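images
→ zidane.jpg
labels
→ zidane.txt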

Current Dataset Format (COCO-like):

dataset_folder
→ images_folder
→ ground_truth.json

In the dataset folder, we have a subfolder named “images” containing all the images, plus a JSON file containing the annotations for every image in the folder. The annotation file holds the information about each image, including the image id, category, bbox, etc. Note, however, that the first two elements (x, y) of each bounding box are the top-left corner coordinates of the box, not its midpoint.

The dataset's ground_truth.json file contains the information about the images, annotations, and categories.
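The exact contents vary from dataset to dataset, but a COCO-style annotation file broadly has the shape sketched below; the values are only illustrative, and the keys shown are the ones the conversion actually uses.

{
  "images":      [ { "id": 0, "file_name": "0001.jpg", "width": 640, "height": 480 }, ... ],
  "annotations": [ { "image_id": 0, "category_id": 1, "bbox": [x, y, width, height] }, ... ],
  "categories":  [ { "id": 1, "name": "person" }, ... ]
}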

Goal (YOLO Format):

Creating a new dataset having the following properties:

The new dataset will contain two folders named “images” and ”labels”. All the images will be saved in the images folder, while their respective text files will be kept in the labels folder.
  • Each image file in the images folder and its corresponding text file in the labels folder must have the same filename.
    For example:
    images
    →0001.jpg
    labels
    → 0001.txt
  • Each text file must fulfill all the properties of a YOLO-format text file, which are the following (a small worked example is given after this list):
    1. The first element of each row is the class id, followed by the bounding box properties (x, y, width, height).
    2. The bounding box properties must be normalized (0–1).
    3. (x, y) should be the midpoint of the box.
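As a quick, made-up example: for a 640×480 image containing one object of class 0 with the COCO box [100, 50, 200, 100] (top-left x, top-left y, width, height), the midpoint is (200, 100), and after normalization the row written to the text file would be:

0 0.312500 0.208333 0.312500 0.208333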

Implementation

Imports

Importing required packages.
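The original code blocks are not reproduced here, so each snippet below is a minimal sketch of what the step might look like; any names I introduce (paths, helper functions) are my own choices, not the article's exact code. For the imports, the standard library is enough:

import json    # to parse the COCO-style annotation file
import os      # to build paths and create the output folders
import shutil  # to copy images into the new dataset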

Setting Paths

  • The input path is the path of the dataset that we want to convert into the YOLO format.
  • The output path is the path where the new, converted dataset will be saved.
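A sketch of the path setup, assuming the folder names used in this article (dataset_folder as input, converted_dataset as output; adjust them to your own layout):

input_path = "dataset_folder"       # contains the images folder and ground_truth.json
output_path = "converted_dataset"   # will contain the images and labels folders

# create the output folders for the YOLO-format dataset
os.makedirs(os.path.join(output_path, "images"), exist_ok=True)
os.makedirs(os.path.join(output_path, "labels"), exist_ok=True)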

Reading JSON Annotation file

Reading the annotations file from the custom dataset (in COCO format) using Python's json module.
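This is a single json.load call; the sketch below assumes the file sits directly inside input_path and is named ground_truth.json:

with open(os.path.join(input_path, "ground_truth.json")) as f:
    data = json.load(f)   # a dict with 'images', 'annotations' and 'categories' keys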

Processing Images

Reading each source image from the current dataset, renaming it to match its corresponding labels file, and saving it to the new dataset location. We also save the images' original filenames for later use.
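A sketch of this step: it copies every image into the new images folder under the name imgN.jpg and remembers the original filenames so the labels can be matched later. The imgN naming scheme is an assumption based on the img0.jpg example used in the Testing section.

file_names = []   # original filenames, kept so the labels can be matched later

def load_images_from_folder(folder):
    count = 0
    for filename in sorted(os.listdir(folder)):
        source = os.path.join(folder, filename)
        destination = os.path.join(output_path, "images", f"img{count}.jpg")
        shutil.copy(source, destination)   # save the image under its new name
        file_names.append(filename)        # remember the original name
        count += 1

load_images_from_folder(os.path.join(input_path, "images"))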

Helper Functions

To get the annotations belonging to a given image, I created a helper function that takes an image_id as a parameter and returns the annotations of that image. Its implementation details are given below (a sketch of the function follows the list):

  • Create an empty list to store the image annotations: img_ann.
  • Use a for loop to traverse all the annotations in the data.
  • Use an if condition to check for the required image_id.
  • If the image id matches, append that annotation to the list we created, i.e. img_ann.
  • An isFound flag records whether any annotations were found for the image; if none are found, the function returns None.
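Putting those steps together, the helper might look like this (the name get_img_ann is my own):

def get_img_ann(image_id):
    img_ann = []
    isFound = False
    for ann in data['annotations']:
        if ann['image_id'] == image_id:   # this annotation belongs to the image
            img_ann.append(ann)
            isFound = True
    if isFound:
        return img_ann
    return None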

Getting an image's record from the dataset by providing the filename of the image.

data['images'] is a list of dictionaries containing the image information, one per image. For example:
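An entry might look roughly like this (the values are only illustrative; the keys we actually need are 'id', 'file_name', 'width', and 'height'):

{'id': 0, 'file_name': '0001.jpg', 'width': 640, 'height': 480}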

In this second helper, get_img(), I extract the relevant image information, such as 'id', 'height', and 'width', by matching the image filename. If the filename matches, the function returns the image information in the form of a dictionary.
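A sketch of that helper (again, the name get_img is my own):

def get_img(filename):
    for img in data['images']:
        if img['file_name'] == filename:   # match on the original filename
            return img
    return None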

Processing Labels

Applying the conversion. Following are the steps we perform (a sketch of the full loop is given after the list):

  • Extract the image information, such as image_id, image_width, image_height, etc.
  • Get the annotations for this image using image_id.
  • Open a text file for this image in the output path given by the user.
  • Extract the bounding box properties for each object in the image.
  • Find the midpoint coordinates.
  • Apply normalization.
  • Set the precision.
  • Write the updated annotations for this image into the text file.
  • After processing all the annotations for the current image, close the text file.
  • Repeat these steps for all images.
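Putting the steps together, the conversion loop might look like the sketch below. It relies on the names from the earlier snippets (data, file_names, output_path, get_img, get_img_ann), and it assumes the dataset's category ids start at 1, so 1 is subtracted to get zero-based YOLO class ids; drop that adjustment if your ids already start at 0.

count = 0
for filename in file_names:
    # 1. image information
    img = get_img(filename)
    img_id = img['id']
    img_w = img['width']
    img_h = img['height']

    # 2. annotations for this image
    img_ann = get_img_ann(img_id)

    if img_ann:
        # 3. one text file per image, named like the copied image (imgN.txt)
        with open(os.path.join(output_path, "labels", f"img{count}.txt"), "w") as f:
            for ann in img_ann:
                # 4. bounding box properties for the current object
                current_category = ann['category_id'] - 1   # zero-based class id (assumption)
                x, y, w, h = ann['bbox']                    # COCO: top-left corner + size

                # 5. midpoint coordinates
                x_centre = x + w / 2
                y_centre = y + h / 2

                # 6. normalize everything to the 0-1 range
                x_centre /= img_w
                y_centre /= img_h
                w /= img_w
                h /= img_h

                # 7-8. fixed precision, one row per object
                f.write(f"{current_category} {x_centre:.6f} {y_centre:.6f} {w:.6f} {h:.6f}\n")
    count += 1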

Conversion Completed
A new dataset (converted_dataset) will be created at the output path you provided, containing the images and labels in the YOLO format.

Testing
Before training the model on the newly converted dataset, we should check it. To verify that the converted dataset meets the requirements and was converted correctly, I am using an online tool, Roboflow; the results for some sample images are shown below:

Picking a sample image from the newly converted dataset and testing whether its corresponding text file contains the correct labels.

Image:

Checking img0.jpg from the images folder of the newly converted dataset.

Labels:

Checking img0.txt from the labels folder of the newly converted dataset.

Results:

Roboflow draws the bounding boxes around all the objects using the labels file.

Outcome

This article was a step-by-step guide to creating your own custom dataset in the YOLO format for training an object detection model. I converted the dataset using simple Python code. To run the code, you can copy the code blocks and paste them into any notebook you like, such as Jupyter or Colab.

Hopefully, you now have an idea of how to apply these techniques to convert a dataset into the format you need.
