Training your own Data set using Mask R-CNN for Detecting Multiple Classes
Mask R-CNN is a popular model for object detection and segmentation.
There are four basic types of image recognition tasks: classification, object detection, semantic segmentation, and instance segmentation — Mask R-CNN performs instance segmentation.
Goal
To train a model so that it can differentiate the classes in an image (like cat, dog, car etc.) and mask each of them precisely.
Starting from scratch, the first step is to annotate our dataset, followed by training the model, and finally using the resulting weights to predict/segment the classes in an image.
Let’s Dive in
- Firstly, open the annotator [https://www.robots.ox.ac.uk/~vgg/software/via/via_demo.html] — we are using the latest online version of the VGG annotation tool (VIA).
- Load the images by selecting Project -> Add local files.
- After labeling all the images, export the annotations as JSON.
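To get a feel for what the exported JSON contains before feeding it to the training script, here is a minimal sketch of reading a VIA-style export. The entry keys, filename, and the `object` attribute name are illustrative assumptions — match them to how you named your region attribute in VIA:

```python
# A trimmed-down, illustrative VIA 2.x export: one entry per annotated image,
# each with a list of regions (polygon points + the class label you assigned).
via_export = {
    "img1.jpg123456": {
        "filename": "img1.jpg",
        "regions": [
            {
                "shape_attributes": {"name": "polygon",
                                     "all_points_x": [10, 50, 30],
                                     "all_points_y": [10, 10, 40]},
                "region_attributes": {"object": "bottle"},
            }
        ],
    }
}

# Same pattern the training script uses: keep only annotated images.
annotations = list(via_export.values())
annotations = [a for a in annotations if a["regions"]]

for a in annotations:
    for r in a["regions"]:
        print(a["filename"],
              r["region_attributes"]["object"],
              len(r["shape_attributes"]["all_points_x"]), "polygon points")
# → img1.jpg bottle 3 polygon points
```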
Segregate the images into two folders, one for training (train) and one for validation (val), ideally in a 3:2 ratio. The project structure should look like this:
Project
|-- logs (created after training)
| `-- weights.h5
`-- main
|-- dataset
| |-- train
| `-- val
`-- Mask_RCNN
|-- train.py
|-- .gitignore
|-- LICENCE
`-- etc..
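The 3:2 segregation above can be scripted. Below is a small helper I'd use for that — the function name and directory layout are my own, not part of the Mask_RCNN repo, and the demo at the bottom runs on a throwaway temp folder just to show the split:

```python
import random
import shutil
import tempfile
from pathlib import Path

def split_dataset(src_dir, dst_dir, train_ratio=0.6, seed=42):
    """Copy images from src_dir into dst_dir/train and dst_dir/val (3:2 by default)."""
    images = sorted(Path(src_dir).glob("*.jpg"))
    random.Random(seed).shuffle(images)          # reproducible shuffle
    n_train = int(len(images) * train_ratio)
    for subset, files in (("train", images[:n_train]), ("val", images[n_train:])):
        out = Path(dst_dir) / subset
        out.mkdir(parents=True, exist_ok=True)
        for f in files:
            shutil.copy(f, out / f.name)

# Demo on a temporary folder with 10 dummy files
tmp = Path(tempfile.mkdtemp())
src = tmp / "all_images"
src.mkdir()
for i in range(10):
    (src / f"img{i}.jpg").write_bytes(b"")
split_dataset(src, tmp / "dataset")
print(len(list((tmp / "dataset" / "train").glob("*.jpg"))),
      len(list((tmp / "dataset" / "val").glob("*.jpg"))))  # → 6 4
```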
Thanks to Google Colab, which offers a GPU with about 13 GB of memory at no cost and can be used continuously for up to 12 hours (Google takes the ML realm to the next level by giving away free resources 👏🏻👏🏻).
Now, for no reason at all, an eye-opening line for you: “Some people have lives; some people have masks” — you know who has both 😉.
The Mask_RCNN folder above comes from the “Download ZIP” option on GitHub: https://github.com/matterport/Mask_RCNN. For the train.py and model.ipynb files, refer to my GitHub: https://github.com/SriRamGovardhanam/wastedata-Mask_RCNN-multiple-classes
I have made some modifications to the original code available in Mask_RCNN/samples/balloon/balloon.py.
In the configurations part, change the number of classes as per your requirement:
NUM_CLASSES = 1 + 4  # Background + number of classes
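For context, the surrounding config class in balloon.py looks roughly like this — a sketch based on Matterport's balloon sample, with the NAME changed to match the 'object' class used below; tune STEPS_PER_EPOCH and DETECTION_MIN_CONFIDENCE to your own dataset:

```python
from mrcnn.config import Config

class CustomConfig(Config):
    """Configuration for training on our 4-class dataset."""
    NAME = "object"
    IMAGES_PER_GPU = 1                 # fits Colab's single free GPU
    NUM_CLASSES = 1 + 4                # Background + number of classes
    STEPS_PER_EPOCH = 100
    DETECTION_MIN_CONFIDENCE = 0.9     # skip low-confidence detections
```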
In the dataset part, modify the existing code to the following:
class CustomDataset(utils.Dataset):
    def load_custom(self, dataset_dir, subset):
        # Add classes as per your requirement and order
        self.add_class('object', 1, 'bottle')
        self.add_class('object', 2, 'glass')
        self.add_class('object', 3, 'paper')
        self.add_class('object', 4, 'trash')

        assert subset in ['train', 'val']
        dataset_dir = os.path.join(dataset_dir, subset)

        annotations = json.load(open(os.path.join(dataset_dir,
                                                  'via_region_data.json')))
        annotations = list(annotations.values())
        annotations = [a for a in annotations if a['regions']]

        for a in annotations:
            polygons = [r['shape_attributes'] for r in a['regions']]
            objects = [s['region_attributes'] for s in a['regions']]
            num_ids = []
            for n in objects:
                try:
                    if n['object'] == 'bottle':
                        num_ids.append(1)
                    elif n['object'] == 'glass':
                        num_ids.append(2)
                    elif n['object'] == 'paper':
                        num_ids.append(3)
                    elif n['object'] == 'trash':
                        num_ids.append(4)
                except KeyError:
                    pass

            image_path = os.path.join(dataset_dir, a['filename'])
            image = skimage.io.imread(image_path)
            height, width = image.shape[:2]

            self.add_image(
                'object',
                image_id=a['filename'],
                path=image_path,
                width=width,
                height=height,
                polygons=polygons,
                num_ids=num_ids,
            )

# Also change the return value of load_mask():
num_ids = np.array(num_ids, dtype=np.int32)
return mask, num_ids
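The shape contract of load_mask() is what makes the int32 conversion above matter: it must return a boolean mask array of shape [height, width, instance_count] plus one class ID per instance. A NumPy-only illustration of that return pair (dummy values standing in for real polygon fills, which the actual code does with skimage.draw.polygon):

```python
import numpy as np

height, width = 4, 5
num_ids = [1, 3]                    # e.g. one 'bottle' and one 'paper' instance

# One channel per instance; real code fills these from the polygon points.
mask = np.zeros([height, width, len(num_ids)], dtype=np.uint8)
mask[1:3, 1:3, 0] = 1               # fake region for instance 0
mask[0:2, 3:5, 1] = 1               # fake region for instance 1

num_ids = np.array(num_ids, dtype=np.int32)   # the changed return value
mask = mask.astype(np.bool_)                  # Mask R-CNN expects boolean masks
print(mask.shape, num_ids.dtype)              # → (4, 5, 2) int32
```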
After these changes, we are now able to train on multiple classes.
Open a terminal, go to the directory containing train.py, and run the following command:
python3 train.py train --dataset='dataset path' --weights=coco
Now that we have the model's weights, we inspect them and keep the required weights file using the inspect_model_data.ipynb file. To do that, we need to run the .ipynb file in Jupyter Notebook: open Jupyter Notebook and carefully update the dataset path and the weights.h5 path in the notebook.
Result
Here we defined 4 classes :
- bottle
- glass
- paper
- trash
Below are examples of the model's accuracy.
Since we only tweaked Matterport's original Mask R-CNN code a bit, it still provides all the step-by-step detection visualizations.
Code
Full implementation details can be found here.
Conclusion
We learnt pixel-wise segmentation of multiple classes. I hope you understood this article; if you have any questions, comment below. The edges of the masks can be improved by increasing the data and labeling carefully — otherwise, those dumb pixels are not the brightest ones around (bad puns are how eye roll).