Extending CORe50 for Object Detection and Segmentation

Giacomo Bartoli · ContinualAI · Jan 21, 2019

CORe50 is a dataset specifically designed for Continual Learning. However, it can also be used as a static dataset for tasks such as Object Recognition, Object Detection and even Object Segmentation.

In this article, we extend CORe50 to support Object Detection and Segmentation. If you wish to know more about CORe50, V. Lomonaco already wrote an extensive article about it.

Object Detection

Object Detection is the capability of an algorithm to correctly identify an object inside an image and to locate it within a specific region of that image. The fact that the algorithm also provides a location makes object detection particularly relevant for many practical use cases:

  • Is there a child crossing the road?
  • How many cars are queued in front of the traffic light?
  • Surveillance systems in an IoT scenario
  • Eye tracking for ads analytics
  • Medical image analysis

The idea behind this project is to apply detection to the original CORe50 images, at their native 350x350 resolution. We want to prove that CORe50 can be exploited as a dataset in a concrete case study. Moreover, being able to distinguish 50 objects is certainly a challenging task.

We used the Tensorflow Object Detection APIs and trained an SSD_MobileNet_v1 for detection. MobileNets are a deep architecture recently introduced by Google that can easily trade off efficiency and accuracy by simply setting two hyper-parameters. This way, the same neural network can be adapted to a large number of devices, typically embedded systems. However, MobileNets alone are not suitable for detection: they can only be used for classification. For this reason we chose SSD_MobileNet, which combines a meta-architecture with a backbone: SSD (Single Shot Detector) is one of the fastest neural networks specifically designed for detection, and we exploited MobileNet as its feature-extraction module. The final result is a hybrid model called SSD_MobileNet.
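
For reference, here is a minimal inference sketch for such a model once it has been exported as a frozen graph with the Tensorflow Object Detection APIs (TF1 style). The graph path and the sample image are placeholders; the tensor names are the standard ones exposed by exported detection graphs.

```python
# Minimal sketch: run a frozen SSD_MobileNet detection graph on one CORe50 image.
# PATH_TO_FROZEN_GRAPH and the image file are placeholders.
import numpy as np
import tensorflow as tf
from PIL import Image

PATH_TO_FROZEN_GRAPH = "ssd_mobilenet_core50/frozen_inference_graph.pb"

detection_graph = tf.Graph()
with detection_graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, "rb") as fid:
        graph_def.ParseFromString(fid.read())
    tf.import_graph_def(graph_def, name="")

with detection_graph.as_default(), tf.Session() as sess:
    # CORe50 frames are 350x350 RGB; the graph expects a batch dimension.
    image = np.array(Image.open("core50_sample.png").convert("RGB"))
    image_batch = np.expand_dims(image, axis=0)

    tensors = [detection_graph.get_tensor_by_name(name) for name in
               ("detection_boxes:0", "detection_scores:0",
                "detection_classes:0", "num_detections:0")]
    boxes, scores, classes, num = sess.run(
        tensors,
        feed_dict={detection_graph.get_tensor_by_name("image_tensor:0"): image_batch})

    # Keep only confident detections.
    keep = scores[0] > 0.5
    print(list(zip(classes[0][keep].astype(int), scores[0][keep])))
```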

We started from an SSD_MobileNet pretrained on the COCO dataset and retrained only the last layers of the network. Results are promising, reaching an accuracy (mAP) close to 70%.

These are some visual results, showing detections of a light bulb, a marker and a remote control.

Object Segmentation

Segmentation is the task of drawing the segments that delimit a specific object. This means that each pixel is coloured depending on whether it belongs to the object or not.

In this case we are lucky, because there is no need to train a deep (and complex) neural network. CORe50 already provides depth information, and we can exploit it to delete the background: pixels that belong to the background have high depth values, while pixels belonging to the hand or to the object do not.

1. Original image

It is sufficient to find a threshold for the depth value and discard all the pixels on the far side of it, i.e. the background. By running this simple procedure, what we get is a segmentation of the object together with the holding hand:

2. Deleting the background
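
A minimal numpy sketch of this thresholding step follows; the file names and the threshold value are illustrative, and the comparison assumes, as noted above, that background pixels have the highest depth values.

```python
# Sketch of the background-removal step: keep only pixels closer than a depth threshold.
# File names and the threshold value are illustrative.
import numpy as np
from PIL import Image

rgb = np.array(Image.open("core50_rgb.png").convert("RGB"))
depth = np.array(Image.open("core50_depth.png")).astype(np.float32)

DEPTH_THRESHOLD = 900.0  # to be tuned on the actual depth encoding

# Hand and object are closer to the camera, so their depth is below the threshold;
# flip the comparison if the depth maps are encoded the other way around.
foreground = depth < DEPTH_THRESHOLD

segmented = rgb.copy()
segmented[~foreground] = 0  # blank out background pixels

Image.fromarray(segmented).save("core50_no_background.png")
```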

Now, what we want to do is delete the hand, so that the final image contains only the segmented object. For this we use a simple SVM classifier. I chose some random pictures from CORe50 and manually segmented the hand. Then I wrote a simple script that, using the above segmentation, saves all the pixels representing the hand into a numpy array. From this point the process is quite straightforward: I used those pixels to train an SVM classifier, saved the model to disk after training, and reloaded it for the inference phase. This way, I was able to delete the hand from almost all the CORe50 images.
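
Below is a hedged sketch of this step, assuming the manually labelled hand and non-hand pixels have been dumped to numpy arrays; the file names, pixel features and SVM parameters are illustrative, not the exact ones used in the experiments.

```python
# Sketch of the hand-removal step: train an SVM on labelled hand / non-hand RGB
# pixels, then blank out the pixels predicted as hand. File names are illustrative.
import numpy as np
from sklearn import svm
from joblib import dump, load
from PIL import Image

# hand_pixels.npy / other_pixels.npy: (N, 3) arrays of RGB values collected
# from the manually segmented frames.
hand = np.load("hand_pixels.npy")
other = np.load("other_pixels.npy")

X = np.vstack([hand, other]).astype(np.float32)
y = np.concatenate([np.ones(len(hand)), np.zeros(len(other))])

clf = svm.SVC(kernel="rbf", gamma="scale")
clf.fit(X, y)
dump(clf, "hand_svm.joblib")          # save the trained model to disk

# Inference: classify every pixel of a background-free image.
clf = load("hand_svm.joblib")
img = np.array(Image.open("core50_no_background.png").convert("RGB"))
flat = img.reshape(-1, 3).astype(np.float32)
is_hand = clf.predict(flat).reshape(img.shape[:2]).astype(bool)

img[is_hand] = 0                      # delete hand pixels
Image.fromarray(img).save("core50_segmented_object.png")
```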

This is the final result:

3. Final Segmentation

A short recap:
1. First, I deleted the background by finding an optimal threshold on the depth values.
2. Then, I trained an SVM classifier to identify pixels belonging to the holding hand.
3. Finally, I ran the trained model to delete the hand pixels.

All the files and scripts used for these experiments are available through the GitHub repo. Moreover, if you wish to test Object Detection, we can already provide all the files needed to configure the training pipeline according to the Tensorflow Object Detection APIs.

Conclusions

In this article we have shown that CORe50, despite being specifically designed for Continual Learning, can also be used for object detection and segmentation. The next goal will be testing Continual Learning techniques for detection and even segmentation. Which one will achieve the best performance?

