Create Your Own Dataset for Instance Segmentation
Preparing data for Deep Learning
Maybe you have already built several image classification models, using datasets from Kaggle or the built-in datasets from TensorFlow and PyTorch. You can also build your own dataset for image classification easily. But what about instance segmentation?
Overview
One of the tasks that must be done before we can start training an instance segmentation model is annotating the image data. In an image classification task, we only need to collect the images and separate them into folders according to their class.
Instance segmentation requires an additional step: annotating each object of interest. In this way, we tell the machine learning model which pixels in an image belong to which object and class.
Installing Labelme
Here we will use labelme to help us do the job. So, what is labelme?
Labelme is a graphical image annotation tool inspired by http://labelme.csail.mit.edu. It is written in Python and uses Qt for its graphical interface. — Kentaro Wada
You can get labelme and read the docs at this link:
https://github.com/wkentaro/labelme
There are several ways to install labelme. We can install it with Anaconda or Docker, from the command line (for example with pip), or as a standalone executable/app. The standalone executable/app is available at this link:
https://github.com/wkentaro/labelme/releases
Here you’ll see standalone builds for several versions, and you can download the one that matches your operating system.
After the installation succeeds, we can launch the app from the terminal by typing this command and hitting Enter:
labelme
We can start by opening a single image, or click open dir to open all the images in a folder.
Annotating the image data
To begin annotating the image data, click the create polygon menu on the left sidebar and then draw a polygon around the edge of the object of interest. This is similar to the Pen tool in Photoshop. Once the polygon is finished, give the annotation a name. Repeat this process until every instance in the image is annotated.
Output
Labelme writes the annotations for each image to a JSON file. Make sure to save the annotation file in the same folder as the image file.
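Before moving on, it can help to sanity-check the saved annotations with a few lines of Python. The sketch below assumes the JSON layout that labelme typically writes, with a top-level `shapes` list whose entries carry `label` and `points` keys. The folder name `data`, the list of image suffixes, and the helper names `count_labels` and `images_without_annotation` are hypothetical and only serve the illustration.

```python
import json
from collections import Counter
from pathlib import Path

# Assumption: adjust this set to the image formats in your own dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def count_labels(folder):
    """Count how many annotated instances each label has across all JSON files."""
    counts = Counter()
    for json_path in sorted(Path(folder).glob("*.json")):
        with open(json_path, encoding="utf-8") as f:
            annotation = json.load(f)
        # Assumed layout: {"shapes": [{"label": ..., "points": [...]}, ...]}
        for shape in annotation.get("shapes", []):
            counts[shape["label"]] += 1
    return counts


def images_without_annotation(folder):
    """Return names of image files that have no JSON file with the same stem."""
    folder = Path(folder)
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
        and not p.with_suffix(".json").exists()
    )


if __name__ == "__main__":
    data_dir = "data"  # hypothetical folder holding images and their JSON files
    if Path(data_dir).is_dir():
        print(count_labels(data_dir))
        print(images_without_annotation(data_dir))
```

A label that appears only once or twice, or an image that shows up in the second list, usually points to a typo in a label name or an annotation that was saved somewhere else.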
Done! You can now train instance segmentation algorithms such as Mask R-CNN on your own dataset.
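Many training pipelines expect pixel masks rather than polygons, so the polygon points usually have to be rasterized first. The following is a minimal, dependency-free sketch of that idea using the even-odd (ray casting) rule on pixel centers. It is not how labelme or any particular Mask R-CNN implementation does the conversion, the function names `point_in_polygon` and `polygon_to_mask` are hypothetical, and it assumes the points are `[x, y]` pairs in pixel coordinates.

```python
def point_in_polygon(x, y, points):
    """Even-odd rule: toggle for every polygon edge crossed by a ray going right."""
    inside = False
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_to_mask(points, height, width):
    """Return a height x width grid of 0/1, testing each pixel at its center."""
    return [
        [
            1 if point_in_polygon(col + 0.5, row + 0.5, points) else 0
            for col in range(width)
        ]
        for row in range(height)
    ]


if __name__ == "__main__":
    # Assumed point format: [x, y] pairs in pixel coordinates.
    square = [[1, 1], [4, 1], [4, 4], [1, 4]]
    for row in polygon_to_mask(square, height=5, width=6):
        print("".join(str(v) for v in row))
```

Looping over every pixel in pure Python is slow for real images. In practice a library routine such as Pillow's `ImageDraw.polygon` would do the same job much faster, but the loop makes it explicit how one polygon becomes one per-instance mask.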