How to train an Object Detector with your own COCO dataset in PyTorch (Common Objects in Context format)

Understanding the Dataset & DataLoader in PyTorch

Takashi Nakamura
Nov 5 · 4 min read

Background

PyTorch ships with several well-known Computer Vision models, which can readily be used for transfer learning as well as for training your own models. There are many examples and official tutorials, e.g.

Dataset class

What is the Dataset class?

The Dataset class defines how your data is generated (or pulled), one sample at a time. Paired with a DataLoader, it lets you load samples using multiple cores and feed them to the model. In short, it is an efficient data generation utility.

Do we need to use the Dataset class?

We can run deep learning models without the Dataset class, but loading a whole dataset at once is generally memory-intensive, so using the Dataset class is highly recommended.

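How do we use the Dataset class?

You need to define the following:

  1. __getitem__(): how to generate one sample from the data for a given index (i.e. what kind of data you want);
  2. (Optional) __len__(): the total number of samples. Samplers and the default DataLoader options expect it to be defined.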
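As a minimal sketch of these two methods, the toy class below returns a number and its square for each index. The name SquaresDataset and the data it produces are hypothetical and only serve to illustrate the interface.

```python
import torch
from torch.utils.data import Dataset


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: sample i is the pair (i, i squared)."""

    def __init__(self, n_samples):
        self.n_samples = n_samples

    def __getitem__(self, index):
        # Defines what a single sample looks like for a given index.
        if index < 0 or index >= self.n_samples:
            raise IndexError(index)
        x = torch.tensor(float(index))
        return x, x ** 2

    def __len__(self):
        # Total number of samples.
        return self.n_samples


dataset = SquaresDataset(4)
x, y = dataset[3]
print(len(dataset), x.item(), y.item())  # prints: 4 3.0 9.0
```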

Example

  1. I have 3 JPEG images in the folder my_data;
  2. The image names are img1.jpg, img2.jpg, and img3.jpg;
  3. The image labels are [0, 1, 1] (e.g. img1.jpg is a black cat image, whereas both img2.jpg and img3.jpg are tabby cat images);
  4. I would like to efficiently load the images and labels using the Dataset class.

Simple Dataset Class for returning images and labels

Usage of the simple Dataset Class

Loading your simple Dataset and visualising the results
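The example can be sketched roughly as follows. This is not necessarily the code the article originally embedded: the class name SimpleImageDataset is hypothetical, and three small placeholder JPEGs are written to a temporary folder that stands in for my_data, so that the snippet runs without the real cat photos.

```python
import os
import tempfile

import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset


class SimpleImageDataset(Dataset):
    """Hypothetical name: returns (image tensor, label) pairs from a folder."""

    def __init__(self, root, file_names, labels):
        assert len(file_names) == len(labels)
        self.root = root
        self.file_names = file_names
        self.labels = labels

    def __getitem__(self, index):
        path = os.path.join(self.root, self.file_names[index])
        img = Image.open(path).convert("RGB")
        # H x W x C uint8 array -> C x H x W float tensor scaled to [0, 1]
        arr = np.asarray(img, dtype=np.float32) / 255.0
        img_tensor = torch.from_numpy(arr).permute(2, 0, 1)
        return img_tensor, self.labels[index]

    def __len__(self):
        return len(self.file_names)


# Stand-in for the my_data folder: three solid-colour placeholder images.
root = tempfile.mkdtemp()
file_names = ["img1.jpg", "img2.jpg", "img3.jpg"]
labels = [0, 1, 1]
for name in file_names:
    Image.new("RGB", (32, 32), color=(120, 90, 60)).save(os.path.join(root, name))

dataset = SimpleImageDataset(root, file_names, labels)
loader = DataLoader(dataset, batch_size=2, shuffle=False)

# Each iteration yields a batch of stacked images and their labels.
for imgs, batch_labels in loader:
    print(imgs.shape, batch_labels)
```

To visualise a sample, one option is matplotlib's imshow applied to img_tensor.permute(1, 2, 0), which puts the channel axis last again.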

Step-by-step solution for COCO data

Tasks

  1. I followed the tutorial linked above. I needed to install pycocotools, which requires a C compiler and Cython to build;
  2. As the official tutorial mentions (and as the simplified example above shows), the PyTorch data loading utility is the torch.utils.data.DataLoader class, which represents a Python iterable over a dataset. For my dataset, I needed to create my own Dataset class by subclassing torch.utils.data.Dataset;
  3. An example of the COCO format can be found in this great post;
  4. I wanted to implement a Faster R-CNN model for object detection.
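For orientation, a COCO-style detection annotation file has three top-level lists: images, annotations and categories. The sketch below builds a minimal one in Python; the file name, sizes, box and category are made-up placeholder values, not data from this article.

```python
import json

# Minimal COCO-style detection annotations (all values are placeholders).
coco_dict = {
    "images": [
        {"id": 1, "file_name": "img1.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,       # refers to images[...]["id"]
            "category_id": 1,    # refers to categories[...]["id"]
            "bbox": [100.0, 120.0, 200.0, 150.0],  # [x, y, width, height]
            "area": 30000.0,     # width * height of the box
            "iscrowd": 0,
        },
    ],
    "categories": [
        {"id": 1, "name": "cat"},
    ],
}

json_text = json.dumps(coco_dict, indent=2)
print(json_text)
```

Note that bbox is stored as top-left corner plus width and height, which matters when converting to the corner format used by torchvision's detection models.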

Modify Dataset class for COCO data

First, as the official documentation mentions, I needed to override __getitem__() to fetch a data sample for a given key. Subclasses can also optionally override __len__().

  1. Opening the corresponding image files

Example COCO Dataset class

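A rough sketch of such a class is shown below. To keep the snippet dependency-light, it indexes the annotation JSON directly with the standard json module instead of using pycocotools; the COCO helper class in pycocotools offers equivalent lookups (getAnnIds, loadAnns, loadImgs). The class name CocoDetectionDataset is hypothetical, and every annotation is assumed to carry the bbox, category_id, area and iscrowd fields.

```python
import json
import os
from collections import defaultdict

import torch
from PIL import Image
from torch.utils.data import Dataset


class CocoDetectionDataset(Dataset):
    """Hypothetical name: yields (image, target) pairs from a COCO JSON file."""

    def __init__(self, root, annotation_file, transforms=None):
        self.root = root
        self.transforms = transforms
        with open(annotation_file) as f:
            coco = json.load(f)
        self.images = sorted(coco["images"], key=lambda im: im["id"])
        # Group annotations by the image they belong to.
        self.anns_by_image = defaultdict(list)
        for ann in coco["annotations"]:
            self.anns_by_image[ann["image_id"]].append(ann)

    def __getitem__(self, index):
        info = self.images[index]
        # Open the corresponding image file.
        img = Image.open(os.path.join(self.root, info["file_name"])).convert("RGB")
        anns = self.anns_by_image.get(info["id"], [])

        # COCO stores [x, y, width, height]; torchvision detection models
        # expect [xmin, ymin, xmax, ymax].
        boxes = [[x, y, x + w, y + h] for x, y, w, h in (a["bbox"] for a in anns)]

        my_annotation = {
            "boxes": torch.as_tensor(boxes, dtype=torch.float32).reshape(-1, 4),
            "labels": torch.as_tensor([a["category_id"] for a in anns], dtype=torch.int64),
            "image_id": torch.tensor([info["id"]]),
            "area": torch.as_tensor([a["area"] for a in anns], dtype=torch.float32),
            "iscrowd": torch.as_tensor([a["iscrowd"] for a in anns], dtype=torch.int64),
        }

        if self.transforms is not None:
            img = self.transforms(img)
        return img, my_annotation

    def __len__(self):
        return len(self.images)
```

It would be constructed with the image folder and the annotation file, e.g. CocoDetectionDataset(root, annotation_file, transforms=...), where both paths depend on your own data layout.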
  1. The tutorial above implements Mask R-CNN, which needs “mask” information in my_annotation. This is not required for Faster R-CNN.
  2. The inputs for a PyTorch model must be in tensor format. I defined get_transform() as below.

Set up your own DataLoader

Once I had created my own Dataset class, it was time to set up a DataLoader.
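One detail matters for detection data: images may differ in size and each image has a different number of boxes, so the default batching (which stacks tensors) does not apply. A common approach, used in the torchvision detection tutorial, is a collate_fn that simply regroups the samples into tuples. The sketch below uses a plain list of stand-in samples in place of the real Dataset; the tensor values and the batch size are hypothetical.

```python
import torch
from torch.utils.data import DataLoader


def collate_fn(batch):
    # [(img1, tgt1), (img2, tgt2), ...] -> ((img1, img2, ...), (tgt1, tgt2, ...))
    return tuple(zip(*batch))


# Stand-in samples with different image sizes and box counts.
samples = [
    (
        torch.rand(3, 40, 60),
        {"boxes": torch.tensor([[5.0, 5.0, 20.0, 30.0]]), "labels": torch.tensor([1])},
    ),
    (
        torch.rand(3, 50, 50),
        {
            "boxes": torch.tensor([[1.0, 2.0, 10.0, 12.0], [8.0, 8.0, 30.0, 40.0]]),
            "labels": torch.tensor([1, 1]),
        },
    ),
]

# A list already behaves like a map-style dataset (it has __getitem__ and __len__).
data_loader = DataLoader(
    samples, batch_size=2, shuffle=True, num_workers=0, collate_fn=collate_fn
)

imgs, targets = next(iter(data_loader))
print(len(imgs), len(targets))  # prints: 2 2
```

In practice, samples would be replaced by an instance of your own Dataset class, and num_workers can be raised to load data in parallel.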

Check DataLoader

Let’s check whether our DataLoader pulls images and annotations iteratively.
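A check along these lines iterates over the loader, moves every image and annotation tensor to the selected device, and prints what came out. The samples below are random stand-ins (hypothetical values) so that the loop runs on its own.

```python
import torch
from torch.utils.data import DataLoader

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Three stand-in (image, annotation) samples.
samples = [
    (
        torch.rand(3, 32, 32),
        {"boxes": torch.tensor([[2.0, 3.0, 20.0, 25.0]]), "labels": torch.tensor([1])},
    )
    for _ in range(3)
]
loader = DataLoader(samples, batch_size=2, collate_fn=lambda batch: tuple(zip(*batch)))

batch_sizes = []
for imgs, annotations in loader:
    # Move every tensor in the batch to the chosen device.
    imgs = [img.to(device) for img in imgs]
    annotations = [{k: v.to(device) for k, v in t.items()} for t in annotations]
    batch_sizes.append(len(imgs))
    print(len(imgs), annotations[0]["boxes"].shape)
```

With three samples and a batch size of two, the loop runs twice: one full batch and one batch holding the remaining sample.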

Run the model

Now we have prepared our own COCO-formatted data, ready for the Faster R-CNN model. It is straightforward to modify a few parameters in order to customise the model (e.g. number of anchor boxes, etc.). A simplified implementation using some lines from the official tutorial is presented below:

Conclusion

This article covered how to prepare your own COCO dataset for use with an object detection model in PyTorch. During the exercise, I found PyTorch less complicated than other deep learning frameworks, especially for Computer Vision tasks. That said, I could not grasp the idea of the Dataset and DataLoader classes at the beginning, so hopefully this article helps you develop some intuition!

FullStackAI

Musings on AI, Engineering and more…

Written by Takashi Nakamura

Data scientist and consultant. PhD in Signal Processing.