Mask-RCNN on a custom COCO-like dataset on a Windows machine

Jul 6, 2020

I have spent the past month studying object detection and trying to figure out how to train a model from scratch. That meant looking closely at what annotation is and how annotation files are fed into a model. I worked on YOLO first and then moved to Mask-RCNN, which matters a lot if you want to solve instance segmentation problems. In other words, it can separate the individual objects in an image or a video: you give it an image, and it returns each object’s bounding box, class and mask.

I am a Windows 10 user and ran into several obstacles during this learning period, so I decided to share what I learned and save others the time I spent fixing these problems. Let’s get started with the necessary commands and tools. I am not going to walk through the code line by line here, as all the files are available in my GitHub repo:

Yunus0or1/Object-Detection-Python

This repo contains different projects on object detection using deep learning algorithms such as Yolo, mask-RCNN etc. …

github.com

First install tensorflow and tensorflow-gpu. I am assuming that you have an Nvidia GPU, since tensorflow-gpu relies on CUDA and cuDNN. If you have an AMD GPU, no worries: you can train on your CPU or use Google Colab. (I do not know how to use an AMD GPU for model training.)

pip install tensorflow==1.14
pip install tensorflow-gpu==1.14
pip install keras==2.2.0
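
Before moving on, it can help to confirm that Python actually picks up the versions pinned above. The following is only a small sketch: the helper names installed_version and version_report are made up for this example, and it assumes each package exposes a __version__ attribute (tensorflow-gpu is imported under the name tensorflow).

```python
import importlib

# Versions pinned by the pip commands above.
EXPECTED = {"tensorflow": "1.14", "keras": "2.2.0"}


def installed_version(module_name):
    """Return a module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:  # missing package, or an import that fails (e.g. a DLL load error)
        return None
    return getattr(module, "__version__", None)


def version_report(expected=EXPECTED):
    """Map each package name to (installed version, whether it matches the pin)."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        report[name] = (found, found is not None and found.startswith(wanted))
    return report
```

Calling version_report() and printing the result should show both packages with a matching version; a None entry means the import failed and the install is worth repeating.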

Go to this link: https://developer.nvidia.com/cuda-10.0-download-archive
Download and install CUDA 10.0, as this is the version TensorFlow 1.14 is built against.

Go to this link: https://developer.nvidia.com/rdp/cudnn-archive
Download cuDNN for CUDA 10.0. Extract it and copy the contents of its “bin”, “include” and “lib” folders into the matching folders in

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0

(I am assuming you installed CUDA in the default directory.)
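
A quick way to double-check the copy step is to look for the cuDNN files inside the CUDA folder. This is a sketch under assumptions: the file names below are the ones I would expect from a cuDNN 7.x archive for Windows and may differ for other cuDNN versions, and missing_cudnn_files is a hypothetical helper name.

```python
from pathlib import Path

# Default CUDA 10.0 location, as in the text above.
CUDA_ROOT = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0")

# Assumed cuDNN 7.x file names; other cuDNN versions may use different names.
CUDNN_FILES = [
    Path("bin") / "cudnn64_7.dll",
    Path("include") / "cudnn.h",
    Path("lib") / "x64" / "cudnn.lib",
]


def missing_cudnn_files(cuda_root=CUDA_ROOT, expected=CUDNN_FILES):
    """Return the expected cuDNN files that are absent under cuda_root."""
    root = Path(cuda_root)
    return [str(rel) for rel in expected if not (root / rel).is_file()]
```

An empty list from missing_cudnn_files() suggests the files landed in the right place. After that, tf.test.is_gpu_available() in a Python session should tell you whether TensorFlow 1.14 can see the GPU.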

Now you will need pycocotools. So what is the COCO API?
COCO API: http://cocodataset.org/ COCO is a large image dataset designed for object detection, segmentation, person keypoint detection, stuff segmentation, and caption generation. The package provides Matlab, Python, and Lua APIs that assist in loading, parsing, and visualizing the annotations in COCO. Installing pycocotools on a Windows machine is hard (at least it was for me). First install Visual C++ 2015; you will find it in my Git repo. Then run this command in CMD:

pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI

You can find more detailed instructions at this link, which is also where I learned this:
https://github.com/philferriere/cocoapi

Next we need to annotate the data. Before annotating, you must resize your images; I resized mine to 512×512. For annotation I chose labelme, which is a Python package. Run these commands first:

pip install pyqt5
pip install labelme
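
For the resizing step mentioned just before these commands, a short script saves a lot of manual work. This is only a sketch: it assumes Pillow is installed (pip install pillow), it stretches every image to 512×512 without preserving the aspect ratio, and the names list_images and resize_folder are made up for this example.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(folder):
    """Return the image files in folder, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def resize_folder(src, dst, size=(512, 512)):
    """Resize every image in src to `size` and save it under dst.

    Images are stretched to the target size; the aspect ratio is not kept.
    """
    from PIL import Image  # imported here so list_images works without Pillow

    dst = Path(dst)
    dst.mkdir(parents=True, exist_ok=True)
    for path in list_images(src):
        with Image.open(path) as img:
            img.resize(size, Image.LANCZOS).save(dst / path.name)
```

For example, resize_folder("raw_images", "images_512") would write resized copies into a new folder and leave the originals untouched (both folder names are placeholders).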

Open CMD and type labelme, and the annotator will pop up. Choose your image directory and start drawing polygons on the images. One annotation JSON file is saved for every image, so 50 images give you 50 annotation files. We need to convert them into COCO-like annotations. For this I used a script, convertToCoco.py, which is included in my Git repo. I tweaked line 67 slightly: I added 1 so that my categories start from index 1. Remember that index 0 is reserved for the background class. Now put all your JSON files in one folder, say “annotation”. Put convertToCoco.py and the “annotation” folder in the same directory and run this command:

python convertToCoco.py annotation

This will create a JSON file named trainval.json, which is a COCO-like dataset file. More details about image annotation are at this link, where I learned it:
https://www.dlology.com/blog/how-to-create-custom-coco-data-set-for-instance-segmentation/
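
To give a feel for what such a conversion involves, here is a minimal sketch. It is not the convertToCoco.py script from the repo. It assumes every labelme shape is a polygon and relies on the labelme JSON keys shapes, label, points, imagePath, imageHeight and imageWidth; the function names are hypothetical. Category ids start at 1 so that 0 stays free for the background class.

```python
import json
from pathlib import Path


def polygon_area(points):
    """Shoelace formula for the area of a polygon given as [[x, y], ...]."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def labelme_to_coco(labelme_docs):
    """Build a COCO-like dict from a list of parsed labelme JSON documents."""
    coco = {"images": [], "annotations": [], "categories": []}
    category_ids = {}  # label name -> category id
    for image_id, doc in enumerate(labelme_docs, start=1):
        coco["images"].append({
            "id": image_id,
            "file_name": doc["imagePath"],
            "height": doc["imageHeight"],
            "width": doc["imageWidth"],
        })
        for shape in doc["shapes"]:
            label = shape["label"]
            if label not in category_ids:
                # Start at 1; id 0 is left for the background class.
                category_ids[label] = len(category_ids) + 1
                coco["categories"].append({"id": category_ids[label], "name": label})
            points = shape["points"]
            xs = [x for x, _ in points]
            ys = [y for _, y in points]
            coco["annotations"].append({
                "id": len(coco["annotations"]) + 1,
                "image_id": image_id,
                "category_id": category_ids[label],
                # COCO stores a polygon as one flat list [x1, y1, x2, y2, ...].
                "segmentation": [[coord for point in points for coord in point]],
                "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
                "area": polygon_area(points),
                "iscrowd": 0,
            })
    return coco


def convert_folder(folder, output="trainval.json"):
    """Read every labelme .json file in folder and write one COCO-like file."""
    docs = [json.loads(p.read_text(encoding="utf-8"))
            for p in sorted(Path(folder).glob("*.json"))]
    Path(output).write_text(json.dumps(labelme_to_coco(docs), indent=2),
                            encoding="utf-8")
```

The bbox follows the COCO convention [x, y, width, height], and each polygon is flattened into a single coordinate list under segmentation.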

Now for Mask-RCNN itself. There is a good Python implementation of this algorithm from Matterport. First, clone the repository:

git clone https://github.com/matterport/Mask_RCNN.git

Go inside the Mask_RCNN directory, open CMD and write this command:

python setup.py install

Inside Mask_RCNN, take the mrcnn folder and copy it to the same directory as Mask_RCNN. Inside Mask_RCNN/samples, take the coco folder and copy it to the same directory as Mask_RCNN as well. Hopefully you will then not get any directory errors. I have added an image of my directory so that you can see how it is laid out.

Now visit my GitHub repo mentioned above and look at this file: mask-RCNN-custom.py

I will explain some of the code. First, I import all the necessary modules. Then I define class_names, which is not actually needed in this file. I have also added COCO_MODEL_URL, which is used to download the COCO model weights if they are not already available.
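
The download-if-missing idea can be sketched with the standard library alone. This is not the code from mask-RCNN-custom.py: ensure_weights is a hypothetical helper, and the URL below is what I believe the Matterport release uses for the COCO weights, so treat it as an assumption and verify it before relying on it.

```python
import urllib.request
from pathlib import Path

# Assumed release URL of the pre-trained COCO weights in the Matterport repo.
COCO_MODEL_URL = (
    "https://github.com/matterport/Mask_RCNN/releases/download/v2.0/mask_rcnn_coco.h5"
)


def ensure_weights(path="mask_rcnn_coco.h5", url=COCO_MODEL_URL):
    """Download the weights file to `path` unless it is already there."""
    path = Path(path)
    if not path.exists():
        print(f"Downloading weights to {path} ...")
        urllib.request.urlretrieve(url, str(path))
    return path
```

The Matterport package also ships its own helper for this, mrcnn.utils.download_trained_weights, which is the more natural choice once the package is installed.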

Next, look at the CustomConfig class. I added a constructor that takes num_class as an argument; otherwise you might get errors. In this class I also set the size of my images, which is 512 pixels wide and 512 pixels high.
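
Here is a minimal sketch of what such a config class could look like; it is not a copy of the class in the repo. As far as I understand the Matterport code, the base Config constructor computes derived values that depend on NUM_CLASSES, so the class count has to be set before the base constructor runs. The NAME value and the class count are placeholders, and the small stand-in Config exists only so the snippet runs without Mask_RCNN installed.

```python
try:
    from mrcnn.config import Config  # real base class from matterport/Mask_RCNN
except ImportError:
    class Config:
        """Tiny stand-in, used only so this sketch runs without Mask_RCNN."""
        NUM_CLASSES = 1

        def __init__(self):
            # The real Config derives several values here, including one
            # whose size depends on NUM_CLASSES.
            self.IMAGE_META_SIZE = 1 + 3 + 3 + 4 + 1 + self.NUM_CLASSES


class CustomConfig(Config):
    NAME = "custom"        # hypothetical experiment name
    IMAGES_PER_GPU = 1
    IMAGE_MIN_DIM = 512    # images were resized to 512x512
    IMAGE_MAX_DIM = 512

    def __init__(self, num_class):
        # Set NUM_CLASSES before the base constructor computes derived values.
        self.NUM_CLASSES = num_class
        super().__init__()


# Background class plus two hypothetical object classes.
config = CustomConfig(num_class=1 + 2)
```

Passing the class count in through the constructor means the same config class can be reused for datasets with different numbers of categories.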

Then look at the CocoLikeDataset class, which loads your data from the JSON file. It reads the file and registers the categories with the self.add_class method. The rest of the functions are self-explanatory. The last five lines:

show_data()  # Show some images with your label masks.
dataset_train = train_model()  # Train the model on the dataset
print("Dataset Class Names")
print(dataset_train.class_names)  # Shows your defined classes + the background class
predict_static_image("test/test3.jpg")  # Predict on a new image
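
To make the loading step of CocoLikeDataset more concrete, here is a sketch of the parsing it has to do, written with the standard library only. It is not the class from the repo: read_coco_like is a hypothetical function, and it assumes the usual COCO keys images, annotations and categories.

```python
import json
from collections import defaultdict


def read_coco_like(json_path):
    """Return (classes, images) parsed from a COCO-like annotation file.

    classes: list of (category_id, name)
    images:  dict image_id -> {"file_name", "width", "height", "annotations"}
    """
    with open(json_path, encoding="utf-8") as f:
        coco = json.load(f)

    classes = [(c["id"], c["name"]) for c in coco["categories"]]
    if any(class_id < 1 for class_id, _ in classes):
        raise ValueError("category ids must start at 1; 0 is the background class")

    # Group the annotations by the image they belong to.
    by_image = defaultdict(list)
    for annotation in coco["annotations"]:
        by_image[annotation["image_id"]].append(annotation)

    images = {
        img["id"]: {
            "file_name": img["file_name"],
            "width": img["width"],
            "height": img["height"],
            "annotations": by_image[img["id"]],
        }
        for img in coco["images"]
    }
    return classes, images
```

In a subclass of mrcnn.utils.Dataset, each entry of classes would typically be passed to self.add_class and each image, together with its annotations, to self.add_image.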

This is the link where I learned all of this before implementing it on my own:
https://www.immersivelimit.com/tutorials/using-mask-r-cnn-on-custom-coco-like-dataset

You can also get a notebook file in my Git repo that can be used directly on Colab without touching any code.

If you face any problems, feel free to reach me via email: ahmedyunuspilot@gmail.com

Written by Ahmed Yunus