Creating a Custom Object Detector Using Transfer Learning

How to train your own object detector using Open Images dataset and TensorFlow Object Detection API

Joyce Varghese
The Startup
8 min readSep 10, 2020


Photo by Nick Morrison on Unsplash

This article aims to help beginners in machine learning create their own custom object detector. When I was trying to build a simple object detector, I had to go through many articles spread across the internet to find all the required information, so I figured I'd gather everything I found in one place to make things easier for the next me. I'll keep this as easy and informative as possible.

This article describes everything required to create a working object detector, from gathering data to exporting the model for our use.

  1. Prerequisites
  2. Setting up the work environment
  3. Making the dataset
  4. Downloading and configuring the pre-trained model
  5. Training and evaluation
  6. Exporting the model

Prerequisites

This tutorial does not assume any previous knowledge of TensorFlow. I have tried to keep it as simple as I can so that anyone could get a working model at the end. For beginners, I definitely suggest this exercise as I found it to be an excellent first step into the world of transfer learning.

This tutorial uses Python, and several Python packages are required to get things going. I'll list them below. If you are reading this, you probably have them installed already; if not, all of these packages are very popular and there are plenty of tutorials on the internet on how to install them.

  1. pandas
  2. TensorFlow
  3. TensorBoard (optional)
  4. openimages
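
If any are missing, a single pip command should cover them all (assuming pip points at the Python 3 environment you will be working in):

pip install pandas tensorflow tensorboard openimages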

Installing TensorFlow Object Detection API

We are going to use TensorFlow Object Detection API to perform transfer learning.

To install the TensorFlow Object Detection API, git clone the TensorFlow Models repository to your computer.
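
On the command line, that is:

git clone https://github.com/tensorflow/models.git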

The TensorFlow Object Detection API uses Protobufs to configure model and training parameters. To use Protobufs, the protoc compiler needs to be downloaded and the .proto files compiled.

  1. Go to the Protobuf releases page and download the latest version of protoc for your system.
  2. Extract the contents of the downloaded zip to a folder. You can find the protoc binary inside the bin folder; add that folder to your PATH (or note the binary's full path for the next step).
  3. Run the following commands from within the TensorFlow/models/research folder inside the repository that you cloned.
protoc object_detection/protos/*.proto --python_out=.
cp object_detection/packages/tf2/setup.py .
python -m pip install .

Testing your installation

To test the installation, run the following command from within TensorFlow/models/research:

python3 object_detection/builders/model_builder_tf2_test.py

Successful installation will result in an output similar to


...
[ OK ] ModelBuilderTF2Test.test_create_ssd_models_from_config
[ RUN ] ModelBuilderTF2Test.test_invalid_faster_rcnn_batchnorm_update
[ OK ] ModelBuilderTF2Test.test_invalid_faster_rcnn_batchnorm_update
[ RUN ] ModelBuilderTF2Test.test_invalid_first_stage_nms_iou_threshold
[ OK ] ModelBuilderTF2Test.test_invalid_first_stage_nms_iou_threshold
[ RUN ] ModelBuilderTF2Test.test_invalid_model_config_proto
[ OK ] ModelBuilderTF2Test.test_invalid_model_config_proto
[ RUN ] ModelBuilderTF2Test.test_invalid_second_stage_batch_size
[ OK ] ModelBuilderTF2Test.test_invalid_second_stage_batch_size
[ RUN ] ModelBuilderTF2Test.test_session
[ SKIPPED ] ModelBuilderTF2Test.test_session
[ RUN ] ModelBuilderTF2Test.test_unknown_faster_rcnn_feature_extractor
[ OK ] ModelBuilderTF2Test.test_unknown_faster_rcnn_feature_extractor
[ RUN ] ModelBuilderTF2Test.test_unknown_meta_architecture
[ OK ] ModelBuilderTF2Test.test_unknown_meta_architecture
[ RUN ] ModelBuilderTF2Test.test_unknown_ssd_feature_extractor
[ OK ] ModelBuilderTF2Test.test_unknown_ssd_feature_extractor
----------------------------------------------------------------------
Ran 20 tests in 68.510s
OK (skipped=1)

We are all set to create our own object detector model.

Setting up the work environment

Since the project files have similar names and can get very confusing, I have made a project file structure to keep things straight. Git clone this repository to get started.

Inside the Project folder, you will find six subfolders:

  1. Annotations: It will hold all the TFRecords and label maps.
  2. Images: It will hold all the images for training and test in two folders.
  3. Pretrained models: It will hold our pretrained models in different folders.
  4. Models: It will hold all the model checkpoints after training.
  5. Exported models: It will hold all the models that are exported after training.
  6. Scripts: It will hold all the code required for this project.

Copy the files model_main_tf2.py and exporter_main_v2.py from TensorFlow/models/research/object_detection/ (in the TensorFlow Models repository you cloned while installing the Object Detection API) into the Scripts folder of the Project folder.
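
On Linux or macOS, that amounts to something like the following (adjust both paths to wherever the two folders live on your machine):

cp models/research/object_detection/model_main_tf2.py Project/scripts/
cp models/research/object_detection/exporter_main_v2.py Project/scripts/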

Making the Dataset

In this tutorial, we are going to use Google's Open Images dataset, which contains millions of images grouped into thousands of labels with bounding boxes. For any regular object, chances are you will find it in this dataset.

Since it is impractical to download such a huge dataset in full, we employ the openimages package for this, which we already installed with our prerequisites.

You can use the following command to download the dataset. Its arguments:

  • csv_dir — where the CSV files will be downloaded to
  • base_dir — where the images and annotations in XML format will be downloaded to; it is best to point both locations at the same folder
  • labels — the labels for which images are to be downloaded
  • format — pascal or darknet, two popular annotation formats; we are going to use pascal
  • limit — the maximum number of images of each class type

Replace the locations and labels with your requirements and run the command. Downloading the data will take some time.

oi_download_dataset --csv_dir ~/<dir_A> --base_dir ~/<dir_A> --labels Zebra Binoculars --format pascal --limit 200

What if the required label is not in the Open Images dataset?

We can also use the LabelImg tool for labelling. How to use it is a story for another blog, but it is simple to install and use. You'll figure it out.

Once the data is downloaded, you will find it divided into folders based on labels, each of which contains images and annotations as sub-folders. Move all these files out into our images folder.
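
If you'd rather script the move than drag files around by hand, a pair of find commands like these should do it (assuming <dir_A> is the base_dir used above and you run them from inside the Project folder):

find ~/<dir_A> -name "*.jpg" -exec mv {} images/ \;
find ~/<dir_A> -name "*.xml" -exec mv {} images/ \;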

So now we have a folder with all images and XML files in one place.

Splitting to train and test

Inside the scripts folder, you will find a python file partition_dataset.py.

Run this file using the following command

python3 partition_dataset.py -x -i [PATH_TO_IMAGES_FOLDER] -r 0.1

Here 10% of the images are used for the test set. You can change this by modifying the -r argument. Give the images folder as the -i argument.

After running the command, you will find the images folder containing images and XML files split into two folders train and test using the given ratio.
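
After the split, the images folder should look roughly like this:

images/
├── train/   (about 90% of the images with their XML files)
└── test/    (about 10% of the images with their XML files)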

Once you make sure the files have been safely copied, you can delete the originals.

Creating the label map

Inside the annotations folder, you will find a file label_map.pbtxt containing labels in the format

item {
  id: 1
  name: 'bed'
}
item {
  id: 2
  name: 'bench'
}

Modify this file by adding your labels in the given format and save it.
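
For example, with the Zebra and Binoculars labels downloaded earlier, the file would become:

item {
  id: 1
  name: 'Zebra'
}
item {
  id: 2
  name: 'Binoculars'
}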

Converting to TFRecords

A TFRecord file stores your data as a sequence of binary strings, which can drastically improve the performance of your input pipeline. So here we are going to convert the data in our XML files to the tfrecord format for faster execution.

To convert XML files to tfrecords, we use the generate_tfrecord.py script inside the scripts folder with the commands

# Create train data:
python3 generate_tfrecord.py -x [PATH_TO_IMAGES_FOLDER]/train -l [PATH_TO_ANNOTATIONS_FOLDER]/label_map.pbtxt -o [PATH_TO_ANNOTATIONS_FOLDER]/train.record -c [PATH_TO_ANNOTATIONS_FOLDER]/train.csv
# Create test data:
python3 generate_tfrecord.py -x [PATH_TO_IMAGES_FOLDER]/test -l [PATH_TO_ANNOTATIONS_FOLDER]/label_map.pbtxt -o [PATH_TO_ANNOTATIONS_FOLDER]/test.record -c [PATH_TO_ANNOTATIONS_FOLDER]/test.csv

Once we run these commands, we will find two tfrecord files and two CSV files inside our annotations folder.

Make sure that the data from the XML files has been copied over to the CSV files correctly. This can be done by simply comparing the values in one XML file against its corresponding rows in the CSV file. Inconsistency between these can lead to errors down the line that will be difficult to debug.
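
A quick way to spot-check a single file is a sketch like this (the image name is hypothetical, and the CSV column names assume the usual generate_tfrecord.py output):

import xml.etree.ElementTree as ET
import pandas as pd

# Print every box recorded in one annotation file...
tree = ET.parse("../images/train/some_image.xml")  # hypothetical file
for obj in tree.getroot().iter("object"):
    box = obj.find("bndbox")
    print(obj.find("name").text,
          [box.find(tag).text for tag in ("xmin", "ymin", "xmax", "ymax")])

# ...and compare against the matching rows in the generated CSV
df = pd.read_csv("../annotations/train.csv")
print(df[df["filename"] == "some_image.jpg"])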

Some of you may have your dataset in the form of .csv files. I have included a dataset.csv file inside the annotations folder that shows the format the data should be in for use with these scripts. Convert your CSV files to that format, then convert them to tfrecords to continue with the tutorial. Project/scripts/generate_tfrecord_csv.py can help you with the conversion.

Downloading and configuring the pre-trained model

Plenty of pre-trained detection models are available in the TensorFlow 2 Detection Model Zoo. Download your preferred model from it as a compressed file.

Extract this tar.gz file into the pretrained models folder. An example is placed in the folder.

To configure this pretrained model for our use, create a sub-folder named after the model you are using inside the models folder (this folder will be referred to as modelname from here on). Copy the pipeline.config file from the downloaded folder into the modelname folder. This file contains the configuration for the training pipeline. An example file is kept in the models folder.

We have to modify this configuration to suit our needs. Some parameters you will need to change are listed below.

  1. num_classes: Number of class labels detected using the detector. Change this to the number of class labels required in your detector.
  2. batch_size: Batch size used for training. Change this value according to your memory availability. A higher batch size requires higher memory.
  3. fine_tune_checkpoint: Path to the checkpoint of the pretrained model.
  4. fine_tune_checkpoint_type: Set this to “detection”.
  5. label_map_path: Path to the label_map.pbtxt file we created earlier.
  6. input_path: Path to the tfrecord files we created earlier.

I have marked these parameters with comments in the example file; a trimmed sketch is shown below.

Note that label_map_path and input_path have to be modified for both train_input_reader and eval_input_reader. Both can use the same label_map_path but input_path should point to train.record and test.record respectively. All paths should be absolute or relative to the scripts folder.
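
To give a feel for where these parameters live, here is a rough sketch of the relevant fragments of a pipeline.config (the layout follows the TF2 SSD configs; the model sub-folder name and the relative paths are illustrative):

model {
  ssd {
    num_classes: 2        # number of labels in your detector
    ...
  }
}
train_config {
  batch_size: 8           # lower this if you run out of memory
  fine_tune_checkpoint: "../pretrained models/<downloaded model>/checkpoint/ckpt-0"
  fine_tune_checkpoint_type: "detection"
  ...
}
train_input_reader {
  label_map_path: "../annotations/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "../annotations/train.record"
  }
}
eval_input_reader {
  label_map_path: "../annotations/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "../annotations/test.record"
  }
}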

Training and evaluation

Training

Now that we have pre-processed our dataset and configured our training pipeline, let's get to training our model. Inside the scripts folder, you will find the model_main_tf2.py file. This script will be used to train our model.

python3 model_main_tf2.py \
  --model_dir=<MODEL PATH> \
  --pipeline_config_path=<CONFIG PATH>

Here replace

<MODEL PATH> — with the location of the modelname folder.

<CONFIG PATH> — with the location of the pipeline.config file inside the modelname folder.

The model should start training once you run the above command. As it trains, checkpoints are written into modelname, and training metrics are logged as events.out.tfevents.* files inside modelname/train. To evaluate the model while it trains, launch a second instance of the same script with the additional flag --checkpoint_dir=<CHECKPOINT PATH> (again pointing at the modelname folder); in that mode the script picks up the latest checkpoint, evaluates it, and writes its own events.out.tfevents.* files to modelname/eval. These files can be used to monitor the model's performance in TensorBoard as shown in the next step.

Evaluation

TensorBoard is a feature of TensorFlow that allows us to monitor our model's performance. It can be used to analyze whether the model is over-fitting or under-fitting, or whether it is learning anything at all. We can feed it the events.out.tfevents.* files generated under modelname during training and evaluation.

To whip up TensorBoard, run the following command from within the Project folder.

tensorboard --logdir=models/modelname

If everything went well, the following message should appear.

TensorBoard 2.2.2 at http://localhost:6006/

Now open http://localhost:6006/ in your browser to watch the training metrics in TensorBoard as the model trains.

Exporting the model

Now that we have trained our model to suit our requirements, we have to export it for use in our desired applications. Inside the scripts folder, you will find the exporter_main_v2.py file. This script will be used to export our model.

python3 exporter_main_v2.py \
  --input_type image_tensor \
  --pipeline_config_path <CONFIG PATH> \
  --trained_checkpoint_dir <CHECKPOINT PATH> \
  --output_directory <OUTPUT PATH>

Here replace

<CONFIG PATH> — with the location of the pipeline.config file inside the modelname folder.

<CHECKPOINT PATH> — with the location of the modelname folder.

<OUTPUT PATH> — with the path of a folder where you want to save the trained model. You can create a sub-folder within the exported models folder and give its location here.

That’s it! You now have a model retrained to suit your exact needs.
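
To sanity-check the exported model, you can load the SavedModel and run it on a picture. A minimal sketch, assuming a test image named test.jpg (the dictionary keys are the standard Object Detection API outputs):

import tensorflow as tf

# Load the exported model (the saved_model folder inside <OUTPUT PATH>)
detect_fn = tf.saved_model.load("<OUTPUT PATH>/saved_model")

# Read an image and add a batch dimension; the model expects uint8 [1, H, W, 3]
image = tf.io.decode_image(tf.io.read_file("test.jpg"), channels=3)
detections = detect_fn(tf.expand_dims(image, 0))

# Boxes are [ymin, xmin, ymax, xmax] in normalized coordinates
print(detections["detection_boxes"][0][:5])
print(detections["detection_scores"][0][:5])
print(detections["detection_classes"][0][:5])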

As I mentioned in the introduction, I do not own any of this code. All of it is already available on the internet; I just made some tweaks to get it working. I'm thankful to all the developers who made my life easy.

This was my first article, so leave your suggestions for improvement. Feel free to email me or ping me on LinkedIn with doubts or suggestions.

Peace✌️
