Training your own Data set using Mask R-CNN for Detecting Multiple Classes

SriRam Govardhanam · Published in Analytics Vidhya · Jan 22, 2020 · 4 min read

Mask R-CNN is a popular model for object detection and segmentation.

There are four main types of image recognition tasks:

The picture itself is self-explanatory; here we are dealing with instance segmentation. [image credits: Slide 19, http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture11.pdf]

Goal

To train a model so that it can differentiate (mask) different classes in an image (like cat, dog, car, etc.) while masking out every class precisely.

This is what it actually looks like:

Starting from scratch, the first step is to annotate our dataset, followed by training the model, and then using the resulting weights to predict/segment classes in an image.

Let’s Dive in

Annotate the images with the VGG Image Annotator (VIA), and make sure that you select the polygon tool; for other tools, update the code corresponding to that tool.
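For reference, a single entry in the exported via_region_data.json looks roughly like the sketch below (the file name, size, and coordinate values are made up for illustration, and the exact layout varies slightly between VIA versions); the loader code further down only relies on 'shape_attributes' holding the polygon points and 'region_attributes' holding the class name under the key 'object':

# Illustrative shape of one VIA export entry (values are placeholders)
example_annotation = {
    "image001.jpg12345": {
        "filename": "image001.jpg",
        "size": 12345,
        "regions": [
            {
                "shape_attributes": {
                    "name": "polygon",
                    "all_points_x": [120, 180, 175, 130],
                    "all_points_y": [60, 65, 140, 135],
                },
                "region_attributes": {"object": "bottle"},
            }
        ],
        "file_attributes": {},
    }
}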

Segregate the images into two folders, one for training (train) and one for validation (val), ideally in a 3:2 ratio. The project structure should look like this:

Project
|-- logs (created after training)
|   `-- weights.h5
`-- main
    |-- dataset
    |   |-- train
    |   `-- val
    `-- Mask_RCNN
        |-- train.py
        |-- .gitignore
        |-- LICENSE
        `-- etc.

Thanks to Google Colab, which offers a 13GB GPU at no cost and can be used continuously for up to 12 hours (Google takes the ML realm to the next level by giving away free resources 👏🏻👏🏻).

Now, for no reason at all, an eye-opening line for you: “Some people have lives; some people have masks”; you know who has both 😉.

The Mask_RCNN folder above comes from the “Download ZIP” option on GitHub: https://github.com/matterport/Mask_RCNN. For the train.py and model.ipynb files, refer to my GitHub: https://github.com/SriRamGovardhanam/wastedata-Mask_RCNN-multiple-classes

I have made some modifications to the original code available in Mask_RCNN/samples/balloon/balloon.py.

In the Configurations part, change the number of classes as per your requirement:

NUM_CLASSES = 1 + 4  # Background + number of classes
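For context, here is a minimal sketch of how the full config subclass might look, modeled on the balloon sample; apart from NUM_CLASSES, the name and values below (steps per epoch, confidence threshold) are just the sample's defaults and can be tuned:

from mrcnn.config import Config

class CustomConfig(Config):
    """Configuration for training on the custom dataset."""
    # A recognizable name for this configuration
    NAME = "object"

    # Number of images to train with on each GPU (reduce if memory is tight)
    IMAGES_PER_GPU = 2

    # Number of classes (including background)
    NUM_CLASSES = 1 + 4  # Background + bottle, glass, paper, trash

    # Number of training steps per epoch
    STEPS_PER_EPOCH = 100

    # Skip detections with < 90% confidence
    DETECTION_MIN_CONFIDENCE = 0.9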

In the Dataset part, modify the existing code to the following:

class CustomDataset(utils.Dataset):

    def load_custom(self, dataset_dir, subset):
        # Add classes as per your requirement and order
        self.add_class('object', 1, 'bottle')
        self.add_class('object', 2, 'glass')
        self.add_class('object', 3, 'paper')
        self.add_class('object', 4, 'trash')

        # Train or validation dataset?
        assert subset in ['train', 'val']
        dataset_dir = os.path.join(dataset_dir, subset)

        # Load the VIA annotations and keep only images that have regions
        annotations = json.load(open(os.path.join(dataset_dir,
                                                  'via_region_data.json')))
        annotations = list(annotations.values())
        annotations = [a for a in annotations if a['regions']]

        for a in annotations:
            polygons = [r['shape_attributes'] for r in a['regions']]
            objects = [s['region_attributes'] for s in a['regions']]

            # Map each region's class name to its numeric class id
            num_ids = []
            for n in objects:
                print(n)
                try:
                    if n['object'] == 'bottle':
                        num_ids.append(1)
                    elif n['object'] == 'glass':
                        num_ids.append(2)
                    elif n['object'] == 'paper':
                        num_ids.append(3)
                    elif n['object'] == 'trash':
                        num_ids.append(4)
                except KeyError:
                    pass

            # load_mask() needs the image size to convert polygons to masks
            image_path = os.path.join(dataset_dir, a['filename'])
            image = skimage.io.imread(image_path)
            height, width = image.shape[:2]

            self.add_image(
                'object',
                image_id=a['filename'],
                path=image_path,
                width=width,
                height=height,
                polygons=polygons,
                num_ids=num_ids,
            )

Also change the return value of load_mask():

        num_ids = np.array(num_ids, dtype=np.int32)
        return mask, num_ids
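For completeness, here is a sketch of what the whole modified load_mask() might look like, following the structure of the balloon sample (it assumes numpy is imported as np and skimage.draw is available, as in the original script; the polygon keys all_points_x / all_points_y come from the VIA export format):

    def load_mask(self, image_id):
        """Generate instance masks for an image.
        Returns:
            mask: a bool array of shape [height, width, instance count]
            num_ids: a 1D array of the class IDs of the instance masks
        """
        info = self.image_info[image_id]
        # Delegate to the parent class for images that are not ours
        if info['source'] != 'object':
            return super(self.__class__, self).load_mask(image_id)

        num_ids = info['num_ids']

        # Convert polygons to a bitmap mask [height, width, instance_count]
        mask = np.zeros([info['height'], info['width'], len(info['polygons'])],
                        dtype=np.uint8)
        for i, p in enumerate(info['polygons']):
            # Indexes of the pixels inside the polygon, set them to 1
            rr, cc = skimage.draw.polygon(p['all_points_y'], p['all_points_x'])
            mask[rr, cc, i] = 1

        num_ids = np.array(num_ids, dtype=np.int32)
        return mask.astype(bool), num_ids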

After these changes, we are now able to train on multiple classes.

Open a terminal, go to the directory containing train.py, and use the following command.

python3 train.py train --dataset='dataset path' --weights=coco

Now we get the weights for each epoch in the logs folder.
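Under the hood, the train command calls a train() function that loads both dataset splits and fine-tunes the network heads starting from the COCO weights; a sketch along the lines of the balloon sample follows (args and config come from the script's argument parsing and config setup, and the 30-epoch / 'heads' choice is just the sample's default):

def train(model):
    """Train the model on the custom dataset."""
    # Training dataset
    dataset_train = CustomDataset()
    dataset_train.load_custom(args.dataset, 'train')
    dataset_train.prepare()

    # Validation dataset
    dataset_val = CustomDataset()
    dataset_val.load_custom(args.dataset, 'val')
    dataset_val.prepare()

    # Starting from COCO weights, training only the head layers is usually enough
    print('Training network heads')
    model.train(dataset_train, dataset_val,
                learning_rate=config.LEARNING_RATE,
                epochs=30,
                layers='heads')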

Now that we have the model's weights, we check them and keep the required weights file for the inspect_model_data.ipynb file. For that we need to run the .ipynb file in Jupyter Notebook, so open Jupyter Notebook and carefully update the dataset path and the weights.h5 path in the notebook.
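A minimal sketch of the inference step inside the notebook, assuming the CustomConfig class sketched earlier; the weights file name and the test image path are placeholders you should replace with your own:

import skimage.io
import mrcnn.model as modellib
from mrcnn import visualize

class InferenceConfig(CustomConfig):
    # Run detection on one image at a time
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

config = InferenceConfig()
model = modellib.MaskRCNN(mode='inference', model_dir='logs', config=config)

# Path to the epoch weights you decided to keep (placeholder name)
model.load_weights('logs/mask_rcnn_object_0030.h5', by_name=True)

# Class names in the same order as the IDs, with background first
class_names = ['BG', 'bottle', 'glass', 'paper', 'trash']

image = skimage.io.imread('dataset/val/test_image.jpg')  # placeholder path
results = model.detect([image], verbose=1)
r = results[0]
visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'],
                            class_names, r['scores'])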

Result

Here we defined 4 classes:

  • bottle
  • glass
  • paper
  • trash

Below are some examples of the model's accuracy.

glass prediction
The left side is the input containing different classes; the input picture itself is a collage.

Since we only tweaked Matterport's original Mask R-CNN code a little, it still has all of the step-by-step detection outputs (an example of the color splash step follows this list):

  • color splash
  • anchor sorting and filtering
  • bounding box prediction
  • glass mask and prediction
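As an example, the color splash effect keeps the detected objects in color and turns the rest of the picture grayscale; it is implemented in the sample roughly like this (carried over essentially unchanged from balloon.py):

import numpy as np
import skimage.color

def color_splash(image, mask):
    """Apply the color splash effect.
    image: RGB image [height, width, 3]
    mask: instance segmentation mask [height, width, instance count]
    Returns the result image.
    """
    # Grayscale copy of the image (still 3 channels)
    gray = skimage.color.gray2rgb(skimage.color.rgb2gray(image)) * 255
    if mask.shape[-1] > 0:
        # Treat all instances as one and collapse the mask into a single layer
        mask = (np.sum(mask, -1, keepdims=True) >= 1)
        # Keep original colors where the mask is set, grayscale elsewhere
        splash = np.where(mask, image, gray).astype(np.uint8)
    else:
        splash = gray.astype(np.uint8)
    return splash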

Code

Full implementation details of the code can be found here.

Conclusion

We learnt pixel-wise segmentation of multiple classes. I hope you understood this article; if you have any questions, comment below. The edges of the masks can be improved with more data and careful labeling, or maybe those dumb pixels just aren't the brightest ones around (bad puns are how eye roll).
