Create Your Own Dataset for Instance Segmentation

Preparing data for Deep Learning

Muhammad Arnaldo
Analytics Vidhya
3 min read · Jan 17, 2021


The Cats photo by Jari Hytönen on Unsplash

Maybe you have already built several image classification models, using datasets from Kaggle or the built-in datasets that ship with TensorFlow and PyTorch. Building your own dataset for image classification is easy. But what about instance segmentation?

Overview

Before we can start training an instance segmentation model, we need to annotate the image data. For image classification, we only need to collect images and sort them into folders according to their class.
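
To make the folder-per-class idea concrete, here is a minimal sketch in Python that walks such a layout and pairs every image with its class name. The function name `list_labeled_images` and the list of accepted file extensions are assumptions made for this illustration; they are not part of any library.

```python
from pathlib import Path

# Assumed set of image extensions for this sketch; adjust to your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_labeled_images(root):
    """Return (image_path, class_name) pairs from a folder-per-class layout.

    Expected layout (hypothetical example):
        root/cat/001.jpg
        root/dog/002.jpg
    """
    pairs = []
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # ignore stray files next to the class folders
        for image_path in sorted(class_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                # The folder name is the only label the image needs.
                pairs.append((image_path, class_dir.name))
    return pairs
```

The folder name is the whole label here, which is why classification datasets are quick to assemble. Instance segmentation needs more than that.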

Instance segmentation requires an additional step: annotating each object of interest. In this way, we tell our machine learning model which pixels in an image belong to each object and which class that object has.

Installing Labelme

Here we will use labelme to help us do the job. So, what is labelme?

Labelme is a graphical image annotation tool inspired by http://labelme.csail.mit.edu. It is written in Python and uses Qt for its graphical interface. — Kentaro Wada

You can get labelme and read the docs at this link:

There are several ways to install labelme: through Anaconda, through Docker, from the command line, or as a standalone executable. We can get the standalone executable/app at this link:

Here you’ll see several versions of the standalone app; download the one that matches your operating system.

Labelme standalone installation

After the installation is successful, we can launch the app from the terminal by typing this command and pressing Enter:
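
The command itself is not reproduced in this copy of the article. As a hedged sketch, assuming a standard install that puts an executable named `labelme` on the PATH, the launch can also be scripted from Python. The helper names `build_labelme_command` and `launch_labelme` are hypothetical and exist only for this example.

```python
import shutil
import subprocess


def build_labelme_command(image_or_dir=None):
    """Build the argument list for starting labelme.

    Assumption: the executable is called `labelme` and accepts an optional
    image file or image folder as its first positional argument.
    """
    command = ["labelme"]
    if image_or_dir is not None:
        command.append(str(image_or_dir))
    return command


def launch_labelme(image_or_dir=None):
    """Start the labelme GUI as a child process, if it is installed."""
    if shutil.which("labelme") is None:
        raise FileNotFoundError("labelme was not found on the PATH")
    return subprocess.Popen(build_labelme_command(image_or_dir))


# Usage (opens the GUI, so it is left commented out here):
# launch_labelme("path/to/images")
```

Typing the same words in a terminal does the same thing; the Python wrapper only adds a check that the tool is installed.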

We can start by opening a single image, or click Open Dir to open all the images in a folder. Here’s how it looks.

The user interface of Labelme

Annotating the image data

To begin annotating, click Create Polygons in the left sidebar and draw a polygon along the edge of the object of interest. This is similar to the Pen tool in Photoshop. Once the polygon is finished, give the annotation a name (its label). Repeat this process until every instance in the image is annotated.
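
To see why a polygon is enough to tell the model which pixels belong to an object, here is a small pure-Python sketch that turns a list of polygon vertices into a binary mask. It uses the even-odd ray casting test and samples each pixel at its center. This is only an illustration of the idea under those assumptions; it is not the rasterizer labelme itself uses, and the function names are made up for this example.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting: is the point (x, y) inside the polygon?

    `polygon` is a list of (x, y) vertices in drawing order.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        # Only edges that straddle the horizontal line through y can be hit.
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


def polygon_to_mask(polygon, width, height):
    """Return a height x width grid of 0/1 values for one annotated object."""
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, polygon) else 0
         for col in range(width)]
        for row in range(height)
    ]


# A square drawn from (1, 1) to (4, 4) inside a 6 x 6 image.
square = [(1, 1), (4, 1), (4, 4), (1, 4)]
mask = polygon_to_mask(square, width=6, height=6)
```

Each annotated instance gets its own mask like this, paired with the name we typed for it.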

Annotating the image data

Output

Labelme saves the annotations for each image as a JSON file. Make sure to save the annotation file in the same folder as the image file.
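
As a rough sketch of how these files can be read back, the snippet below loads one annotation file and checks that every JSON file in a folder still has its image next to it. It assumes the key names written by recent labelme versions (`imagePath`, `shapes`, `label`, `points`); if your files look different, adjust the keys. The function names are hypothetical.

```python
import json
from pathlib import Path


def load_annotation(json_path):
    """Read one labelme JSON file and return (image_path, shapes).

    `shapes` is a list of (label, points) pairs, where points is the list
    of [x, y] polygon vertices. Key names are assumed, see the note above.
    """
    json_path = Path(json_path)
    with json_path.open(encoding="utf-8") as f:
        data = json.load(f)
    # The image path is stored relative to the folder of the JSON file.
    image_path = json_path.parent / data["imagePath"]
    shapes = [(shape["label"], shape["points"]) for shape in data["shapes"]]
    return image_path, shapes


def find_missing_images(folder):
    """Return the names of annotation files whose image cannot be found."""
    missing = []
    for json_path in sorted(Path(folder).glob("*.json")):
        image_path, _ = load_annotation(json_path)
        if not image_path.exists():
            missing.append(json_path.name)
    return missing
```

Running `find_missing_images` on the dataset folder before training is a cheap way to catch annotation files that were saved somewhere else by mistake.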

Done! You can now train instance segmentation models such as Mask R-CNN on your own dataset.
