Generating synthetic dataset for object detection

Noufal Samsudin
Published in Analytics Vidhya
3 min read · Jan 5, 2021


Using OpenCV to generate a text localization dataset for training an object detection model (Faster R-CNN) and evaluating the model on a real dataset

Data is currently the major limiting factor in deep learning. Good data is hard to come by: collecting and labeling it is manual labor that is expensive, time-consuming and difficult.

“It’s not who has the best algorithm that wins. It’s who has the most data.”

- Andrew Ng

One “sub-optimal” solution to this issue is generating synthetic data programmatically. Fake data is undoubtedly inferior to real-world data. However, in the absence or scarcity of real data, it is the best alternative.

Introduction

In this article I build an object detection model using detectron2 to detect English and Arabic text on signboards. Typically, building a multi-class object detection model takes about 300 training images per class. Creating such a dataset by bulk downloading images and labeling them by hand would take about 6 hours of manual work. That would probably give the best results. For experimental purposes, I decided not to take this approach. Instead, I create a synthetic dataset programmatically.

The synthetic dataset comprises rectangular or circular “sign boards” with some English or Arabic text, randomly placed on a background image. The bounding box coordinates of the texts are also available for each image.

Methodology:

  1. Create Synthetic Dataset comprising images and bounding box coordinates for Arabic and English texts.
  2. Train a Faster R-CNN model on the synthetic dataset
  3. Collect a few real-life examples of English and Arabic sign boards and evaluate the model on them

Synthetic Dataset

Steps involved in creation of the dataset:

  1. Select a signboard shape
  2. Select an Arabic and English Phrase — randomly select 1–4 words and form a phrase
  3. Create Signboard — Select a font, location of text on the signboard
  4. Select texture — overlay texture on the signboard
  5. Select background image
  6. Select scale, rotation, place image on background

Figure: Fake images
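
The steps above can be sketched in plain Python. This is a minimal sketch, not the implementation used for this article: the word lists, the function names `pick_phrase` and `place_box`, and the scale and rotation ranges are all assumptions made for illustration. It covers the two parts that matter for labeling: forming a random phrase of 1 to 4 words, and carrying the text's bounding box through the scale, rotation and placement so that the label still matches the final image.

```python
import math
import random

# Hypothetical vocabularies; a real run would load much larger word lists.
ENGLISH_WORDS = ["open", "exit", "parking", "school", "market", "street"]
ARABIC_WORDS = ["مفتوح", "خروج", "موقف", "مدرسة", "سوق", "شارع"]


def pick_phrase(words, rng):
    """Randomly select 1-4 distinct words and join them into a phrase."""
    count = rng.randint(1, 4)
    return " ".join(rng.sample(words, count))


def place_box(box, scale, angle_deg, offset, sign_size):
    """Map a text box on the signboard to an axis-aligned box on the background.

    box is (x_min, y_min, x_max, y_max) in signboard pixels. The signboard is
    scaled, rotated about its own center, and its center is then moved to
    `offset` on the background.
    """
    x0, y0, x1, y1 = box
    cx, cy = sign_size[0] / 2, sign_size[1] / 2
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    xs, ys = [], []
    # Transform all four corners, then take the enclosing axis-aligned box.
    for x, y in [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]:
        dx, dy = (x - cx) * scale, (y - cy) * scale
        xs.append(offset[0] + dx * cos_a - dy * sin_a)
        ys.append(offset[1] + dx * sin_a + dy * cos_a)
    return (min(xs), min(ys), max(xs), max(ys))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
english = pick_phrase(ENGLISH_WORDS, rng)
arabic = pick_phrase(ARABIC_WORDS, rng)
scale = rng.uniform(0.5, 1.5)   # assumed scale range
angle = rng.uniform(-20, 20)    # assumed rotation range, in degrees
label = place_box((20, 30, 180, 70), scale, angle, (320, 240), (200, 100))
```

The sign convention of the angle has to match whatever function warps the pixels (for example an affine warp in OpenCV); otherwise the box and the image rotate in opposite directions.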

Train Faster R-CNN

Facebook’s detectron2 package can be used to quickly train and evaluate object detection models. Check out the training script on GitHub for the full implementation.
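
detectron2 reads custom datasets as a list of dictionaries, one per image. Here is a minimal sketch of building one such record from the labels of a synthetic image. The class ids, the file name and the helper `to_record` are assumptions for illustration; the training script remains the reference for what was actually run. The integer `0` stands in for `BoxMode.XYXY_ABS` so that the snippet runs without detectron2 installed.

```python
# Assumed class ids; any consistent mapping works.
CATEGORY_IDS = {"english": 0, "arabic": 1}

# Integer value of detectron2's BoxMode.XYXY_ABS (absolute x0, y0, x1, y1).
XYXY_ABS = 0


def to_record(image_id, file_name, width, height, boxes):
    """Build one detectron2-style dataset dict.

    boxes is a list of (label, (x_min, y_min, x_max, y_max)) pairs.
    """
    annotations = []
    for label, (x0, y0, x1, y1) in boxes:
        annotations.append({
            "bbox": [float(x0), float(y0), float(x1), float(y1)],
            "bbox_mode": XYXY_ABS,
            "category_id": CATEGORY_IDS[label],
        })
    return {
        "file_name": file_name,
        "image_id": image_id,
        "width": width,
        "height": height,
        "annotations": annotations,
    }


# Hypothetical image with one English and one Arabic text box.
record = to_record(
    0, "synthetic/0000.jpg", 640, 480,
    [("english", (240, 200, 400, 240)), ("arabic", (240, 250, 400, 290))],
)
```

A function that returns a list of such records can be registered with `DatasetCatalog.register(name, func)`, after which the dataset name can be used in the training config.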

Evaluation

Evaluate the model on some real images:

Figure: Faster R-CNN evaluation on real data

The model seems to do reasonably well on this small evaluation set.
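
With a larger evaluation set, the visual check could be backed by a number. A common choice is intersection over union (IoU) between a predicted box and a hand-labeled one. The sketch below is an illustration only and was not part of the evaluation reported here; the function name and the example boxes are made up.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the overlapping rectangle, if any.
    ix0, iy0 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix1, iy1 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical predicted and hand-labeled boxes for one text region.
predicted = (100, 100, 200, 150)
labeled = (110, 100, 210, 150)
score = iou(predicted, labeled)
```

A score of 1.0 means the two boxes coincide and 0.0 means they do not overlap at all.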

Although the synthetic data is evidently crude, hastily put together and obviously fake, the model still manages to learn basic text localization from it. It detects the text in the real images we tested it on and classifies it correctly.

Please note that I did not follow any scientific method to ensure that these results are unbiased or statistically relevant. This is not an academic study, just an anecdotal account. Typically, synthetic data is used to augment and balance real-life datasets, not to replace them.

The full code is available on my GitHub.

Shoulders of giants:

  1. Playing card detection with YOLO: https://www.youtube.com/watch?v=pnntrewH0xg
  2. Detectron2: https://github.com/facebookresearch/detectron2
  3. https://towardsdatascience.com/object-detection-in-6-steps-using-detectron2-705b92575578

About the author

I work at Dubai Holding, UAE, as a data scientist. You can reach out to me at kvsnoufal@gmail.com or https://www.linkedin.com/in/kvsnoufal/
