STEP BY STEP GUIDE

TensorFlow 2 Object Detection API With Google Colab

This article will guide you through all the steps required for object recognition model training, from collecting images for the model to testing the model!

Nisarg Kapkar

Published in

The Startup

9 min read · Sep 21, 2020

A simple Dog Breed Classifier! Can recognize 10 different breeds of dogs!

UPDATE:
Thank you all for your amazing support. Recently hit almost 35k views on this blog!

I am also currently sharing more amazing content about javascript, web development, software engineering, etc. on my Twitter. Let’s connect: @nnkkaapp

In this tutorial, we will use Google Colab (for model training) and Google Drive (for storage).

Colab is a free Jupyter Notebook environment hosted by Google that runs in the cloud. Google Colab provides free access to GPUs (Graphics Processing Units) and TPUs (Tensor Processing Units).

You can read more about Google Colab on their Intro and FAQ page.

NOTE:
A session on Google Colab can last up to 12 hours. When the session ends, everything in Colab storage is wiped (notebooks also disconnect from their virtual machines if they are left idle for too long). So, it is advisable to keep your files on Google Drive rather than in Colab’s storage.

Considering that you know the basics of Colab, let’s start with our Object Recognition Model!

Step 1- Prerequisites (Gather/Label images, Create label_map… )

Gather and Label images

We need to provide properly labeled images to the Object Detection API. These images will be used to train our model.

The first step is to gather images of all the objects you want your model to detect. You can collect images from the internet or use a public dataset. You can search for public datasets using Google’s Dataset Search.

Next, we need to label all the desired objects in the collected images. LabelImg is a superb tool for annotating images. You can find the installation and usage instructions on its GitHub page. (Skip this step if you are using a public dataset that already comes with labeled images.)

After labeling, divide the dataset into two parts- train (80% of images with their corresponding XML files) and test (remaining 20% of images with their corresponding XML files).
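
The 80/20 split described above can be scripted. The following is only a sketch under a few assumptions: every image has an XML file with the same base name, all images share one extension (.jpg here), and the function names, the fixed seed, and the folder arguments are illustrative choices rather than part of the original workflow.

```python
import random
import shutil
from pathlib import Path


def split_train_test(stems, test_fraction=0.2, seed=42):
    """Shuffle file stems reproducibly and return (train, test) lists."""
    stems = sorted(stems)
    random.Random(seed).shuffle(stems)
    n_test = round(len(stems) * test_fraction)
    return stems[n_test:], stems[:n_test]


def copy_split(source_dir, dest_dir, image_ext=".jpg"):
    """Copy each image/XML pair into dest_dir/train or dest_dir/test."""
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    # Use the XML annotations to decide which pairs exist.
    stems = [p.stem for p in source_dir.glob("*.xml")]
    train, test = split_train_test(stems)
    for subset, names in (("train", train), ("test", test)):
        target = dest_dir / subset
        target.mkdir(parents=True, exist_ok=True)
        for name in names:
            for ext in (image_ext, ".xml"):
                shutil.copy2(source_dir / (name + ext), target / (name + ext))


# Hypothetical usage (both folder names are placeholders):
# copy_split("labeled_images", "dataset")
```

Fixing the seed keeps the split identical between runs, which makes later comparisons between training attempts easier.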

For this tutorial, I am using the Fruit Images for Object Detection dataset from Kaggle. The dataset already contains labeled images divided into two sets (train and test).

Create label_map.pbtxt

A label_map maps each class (label) to an integer value. The label_map file must have the .pbtxt extension.

Below is the label_map file for the Fruit Detection dataset:

item {
    id: 1
    name: 'apple'
}
item {
    id: 2
    name: 'orange'
}
item {
    id: 3
    name: 'banana'
}

Similarly, you must make a label_map.pbtxt file for your dataset.
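
If your dataset has many classes, writing the file by hand gets tedious. Here is a minimal sketch that builds the same structure from a list of class names. The helper name build_label_map is made up for illustration, and the ids are simply assigned in list order starting from 1.

```python
def build_label_map(class_names):
    """Return label_map.pbtxt content with ids assigned from 1 in list order."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append(f"item {{\n    id: {index}\n    name: '{name}'\n}}")
    return "\n".join(items) + "\n"


label_map_text = build_label_map(["apple", "orange", "banana"])
print(label_map_text)

# To save it (the file name matches the one used in this tutorial):
# with open("label_map.pbtxt", "w") as f:
#     f.write(label_map_text)
```

The class names must match the labels used while annotating the images exactly, including letter case.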

Download a pre-trained model to apply transfer learning

We will use pre-trained models provided by TensorFlow for training.
Download any pre-trained model of your choice from the TensorFlow 2 Detection Model Zoo. (Just click on the name of the model you want to use to start the download.)

For this tutorial, I am using the SSD ResNet50 V1 FPN 640x640 model.

Download the generate_tfrecord.py script

This script (generate_tfrecord.py) will be used to convert the annotations into the TFRecord format. Download the script from here.
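
To get a feel for what such a conversion has to read, here is a rough sketch that pulls the bounding boxes out of one annotation. It is not the code of the downloaded script. It assumes the XML files follow the Pascal VOC layout, and the sample annotation (including the file name apple_1.jpg) is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample annotation in Pascal VOC layout.
SAMPLE_XML = """<annotation>
  <filename>apple_1.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>apple</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>200</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_annotation(xml_text):
    """Return one dict per labeled object found in the annotation."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        rows.append({
            "filename": filename,
            "width": width,
            "height": height,
            "class": obj.findtext("name"),
            "xmin": int(box.findtext("xmin")),
            "ymin": int(box.findtext("ymin")),
            "xmax": int(box.findtext("xmax")),
            "ymax": int(box.findtext("ymax")),
        })
    return rows


print(parse_voc_annotation(SAMPLE_XML))
```

Running something like this over a few of your own XML files is a quick way to spot misspelled class names before generating the records.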

Huge thanks to Lyudmil Vladimirov for allowing me to use some of the content from their amazing TensorFlow 2 Object Detection API Tutorial for Local Machines!

Step 2- Set up the directory structure on Google Drive

Go to your Google Drive and make a new folder named “TensorFlow”.

Make a directory structure in your TensorFlow folder as shown below.
(You can name the folders as you like. If you use different names, change all the paths in the Jupyter Notebook accordingly.)

TensorFlow
├───scripts
│   └───preprocessing
└───workspace
    └───training_demo
        ├───annotations
        ├───exported-models
        ├───images
        │   ├───test
        │   └───train
        ├───models
        └───pre-trained-models
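
If you prefer not to create the folders by hand, the same tree can be created from a Colab cell once Google Drive is mounted (see Step 4). This is only a sketch: the helper names are illustrative, and the root path in the usage comment assumes the mount point used later in this tutorial.

```python
from pathlib import Path

# Leaf folders of the directory tree shown above.
SUBFOLDERS = [
    "scripts/preprocessing",
    "workspace/training_demo/annotations",
    "workspace/training_demo/exported-models",
    "workspace/training_demo/images/test",
    "workspace/training_demo/images/train",
    "workspace/training_demo/models",
    "workspace/training_demo/pre-trained-models",
]


def workspace_dirs(root):
    """Return the full path of every folder in the tree under root."""
    return [Path(root) / sub for sub in SUBFOLDERS]


def create_workspace(root):
    """Create the folders; existing folders are left untouched."""
    for folder in workspace_dirs(root):
        folder.mkdir(parents=True, exist_ok=True)


# Hypothetical usage after mounting Drive:
# create_workspace('/content/gdrive/My Drive/TensorFlow')
```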

We will now add all the collected files (from Step 1) to their respective directories.

Add the train and test images (with their corresponding XML files) to the ‘training_demo/images/train’ and ‘training_demo/images/test’ folders respectively.
Add the label_map.pbtxt file to ‘training_demo/annotations’.
Add the generate_tfrecord.py script to ‘scripts/preprocessing’.
Extract the downloaded pre-trained model and add the extracted folder to ‘training_demo/pre-trained-models’.
Go to ‘training_demo/models’ and make a new folder named ‘my_ssd_resnet50_v1_fpn’ (name the folder according to the pre-trained model you have downloaded).
Copy the pipeline.config file from ‘training_demo/pre-trained-models/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8’ (or from the folder of the pre-trained model you downloaded and extracted) and paste it into the newly created ‘my_ssd_resnet50_v1_fpn’ folder (or the new folder you created in ‘training_demo/models’ for your pre-trained model).

After uploading all the files, your directory structure should look like this:

TensorFlow
├───scripts
│   └───preprocessing
│       └───generate_tfrecord.py
└───workspace
    └───training_demo
        ├───annotations
        │   └───label_map.pbtxt
        ├───exported-models
        ├───images
        │   ├───test
        │   │   └───test images with corresponding XML files
        │   └───train
        │       └───train images with corresponding XML files
        ├───models
        │   └───my_ssd_resnet50_v1_fpn
        │       └───pipeline.config
        └───pre-trained-models
            └───ssd_resnet50_v1_fpn_640x640_coco17_tpu-8

We will now do most of the steps on Google Colab.

I have made a Notebook containing all the steps and the relevant code. (Run the cell with a particular step number to execute that step.)
You can download the Notebook from my GitHub Repository.

Open Colab and load the downloaded Notebook.

Step 3- Select the Hardware Accelerator

On Colab, go to Runtime→Change Runtime Type and select Hardware accelerator as GPU.

NOTE:
If you have given different names to your folders and files, don’t forget to change the paths in the Colab Notebook cells accordingly!

Step 4- Mount Google Drive

from google.colab import drive
drive.mount('/content/gdrive')

You will be given a URL and asked to enter an authentication code to mount your Google Drive.

Step 5- Download TensorFlow Model Garden

#cd into the TensorFlow directory in your Google Drive
%cd '/content/gdrive/My Drive/TensorFlow'

#and clone the TensorFlow Model Garden repository
!git clone https://github.com/tensorflow/models.git

#use an older version of the repo
%cd '/content/gdrive/My Drive/TensorFlow/models'
!git checkout -f e04dafd04d69053d3733bb91d47d0d95bc2c8199

You should now have a new folder named ‘models’ in your TensorFlow directory!

NOTE:
Some steps in the tutorial are not compatible with the latest version of TensorFlow 2. So, we will be using an older version of the repository for this tutorial (Date of older version: 21st Sept 2020)

Step 6- Install some required libraries and tools

!apt-get install protobuf-compiler python-lxml python-pil
!pip install Cython pandas tf-slim lvis

Step 7- Compile the Protobuf libraries

#cd into 'TensorFlow/models/research'
%cd '/content/gdrive/My Drive/TensorFlow/models/research/'
!protoc object_detection/protos/*.proto --python_out=.

Step 8- Set the environment

import os
import sys

os.environ['PYTHONPATH']+=":/content/gdrive/My Drive/TensorFlow/models"
sys.path.append("/content/gdrive/My Drive/TensorFlow/models/research")
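
One caveat: using += on os.environ['PYTHONPATH'] raises a KeyError if the variable is not defined yet. A slightly more defensive variant is sketched below. The helper name extend_pythonpath is made up for illustration, and the paths in the usage comment are the ones from this step.

```python
import os
import sys


def extend_pythonpath(new_paths, environ=os.environ):
    """Append paths to PYTHONPATH and sys.path, skipping duplicates."""
    existing = [p for p in environ.get("PYTHONPATH", "").split(os.pathsep) if p]
    for path in new_paths:
        if path not in existing:
            existing.append(path)
        if path not in sys.path:
            sys.path.append(path)
    environ["PYTHONPATH"] = os.pathsep.join(existing)
    return environ["PYTHONPATH"]


# Usage with the paths from this step:
# extend_pythonpath([
#     "/content/gdrive/My Drive/TensorFlow/models",
#     "/content/gdrive/My Drive/TensorFlow/models/research",
# ])
```

Because duplicates are skipped, rerunning the cell does not keep growing the variable.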

Step 9- Build and Install setup.py

!python setup.py build
!python setup.py install

Step 10- Test the Installation

#cd into 'TensorFlow/models/research/object_detection/builders/'
%cd '/content/gdrive/My Drive/TensorFlow/models/research/object_detection/builders/'
!python model_builder_tf2_test.py
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils
print('Done')

If all the installations were successful, you should see output similar to the one shown below.

…
[       OK ] ModelBuilderTF2Test.test_invalid_model_config_proto
[ RUN      ] ModelBuilderTF2Test.test_invalid_second_stage_batch_size
[       OK ] ModelBuilderTF2Test.test_invalid_second_stage_batch_size
[ RUN      ] ModelBuilderTF2Test.test_session
[  SKIPPED ] ModelBuilderTF2Test.test_session
[ RUN      ] ModelBuilderTF2Test.test_unknown_faster_rcnn_feature_extractor
[       OK ] ModelBuilderTF2Test.test_unknown_faster_rcnn_feature_extractor
[ RUN      ] ModelBuilderTF2Test.test_unknown_meta_architecture
[       OK ] ModelBuilderTF2Test.test_unknown_meta_architecture
[ RUN      ] ModelBuilderTF2Test.test_unknown_ssd_feature_extractor
[       OK ] ModelBuilderTF2Test.test_unknown_ssd_feature_extractor
----------------------------------------------------------------------
Ran 20 tests in 42.274s

OK (skipped=1)
Done

Step 11- Generate TFRecords

#cd into preprocessing directory
%cd '/content/gdrive/My Drive/TensorFlow/scripts/preprocessing'

#run the cell to generate test.record and train.record
!python generate_tfrecord.py -x '/content/gdrive/My Drive/TensorFlow/workspace/training_demo/images/train' -l '/content/gdrive/My Drive/TensorFlow/workspace/training_demo/annotations/label_map.pbtxt' -o '/content/gdrive/My Drive/TensorFlow/workspace/training_demo/annotations/train.record'

!python generate_tfrecord.py -x '/content/gdrive/My Drive/TensorFlow/workspace/training_demo/images/test' -l '/content/gdrive/My Drive/TensorFlow/workspace/training_demo/annotations/label_map.pbtxt' -o '/content/gdrive/My Drive/TensorFlow/workspace/training_demo/annotations/test.record'

# !python generate_tfrecord.py -x '[path_to_train_folder]' -l '[path_to_annotations_folder]/label_map.pbtxt' -o '[path_to_annotations_folder]/train.record'
# !python generate_tfrecord.py -x '[path_to_test_folder]' -l '[path_to_annotations_folder]/label_map.pbtxt' -o '[path_to_annotations_folder]/test.record'

You should now have two new files “test.record” and “train.record” in ‘workspace/training_demo/annotations’ folder.

Step 12- Copying some files

Copy the “model_main_tf2.py” file from ‘TensorFlow/models/research/object_detection’ and paste it into the ‘training_demo’ folder. We will need this file to train the model.
Copy the “exporter_main_v2.py” file from ‘TensorFlow/models/research/object_detection’ and paste it into the ‘training_demo’ folder. We will need this file to export the trained model.
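
The two copies can also be done from a notebook cell instead of the Drive interface. A small sketch using only the standard library follows. The helper names are illustrative, and the paths in the usage comment assume the folder names used in this tutorial.

```python
import shutil
from pathlib import Path

# The two scripts this step asks you to copy.
SCRIPTS = ("model_main_tf2.py", "exporter_main_v2.py")


def plan_copies(object_detection_dir, training_dir):
    """Return (source, destination) pairs for the scripts to copy."""
    return [
        (Path(object_detection_dir) / name, Path(training_dir) / name)
        for name in SCRIPTS
    ]


def copy_training_scripts(object_detection_dir, training_dir):
    """Copy both scripts into the training folder."""
    for source, destination in plan_copies(object_detection_dir, training_dir):
        shutil.copy(source, destination)


# Hypothetical usage:
# copy_training_scripts(
#     '/content/gdrive/My Drive/TensorFlow/models/research/object_detection',
#     '/content/gdrive/My Drive/TensorFlow/workspace/training_demo',
# )
```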

Step 13- Configure the pipeline file

Go to ‘training_demo/models/my_ssd_resnet50_v1_fpn’. (or the folder you have created for the downloaded model in your ‘training_demo/models’ directory)

Open the pipeline.config file. (you can open a file in Colab by simply double-clicking it)

Change the lines shown below according to your dataset. (set paths according to your folders name and downloaded pre-trained-model)

Line 3:
num_classes: 3 (#number of classes your model can classify/ number of different labels)

Line 131:
batch_size: 16 (#you can read more about batch_size here)

Line 161:
fine_tune_checkpoint: "pre-trained-models/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/checkpoint/ckpt-0" (#path to checkpoint of downloaded pre-trained-model)

Line 162:
num_steps: 250000 (#maximum number of steps to train model, note that this specifies the maximum number of steps, you can stop model training on any step you wish)

Line 167:
fine_tune_checkpoint_type: "detection" (#since we are training full detection model, you can read more about model fine-tuning here)

Line 168:
use_bfloat16: false (#Set this to true only if you are training on a TPU)

Line 172:
label_map_path: "annotations/label_map.pbtxt" (#path to your label_map file)

Line 174:
input_path: "annotations/train.record" (#path to train.record)

Line 182:
label_map_path: "annotations/label_map.pbtxt" (#path to your label_map file)

Line 186:
input_path: "annotations/test.record" (#Path to test.record)
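
Line numbers can differ between models, so it may be safer to locate the fields by name. The sketch below edits a few of them with plain text substitution. It is an illustration, not part of the Object Detection API: the helper names are made up, only the first batch_size is changed (on the assumption that the training batch size appears before any evaluation batch size), and the input_path, num_steps and use_bfloat16 fields are left for manual editing.

```python
import re


def _set_field(text, pattern, replacement, count=0):
    """Replace matches of pattern; count=0 means every match."""
    return re.sub(pattern, lambda match: replacement, text, count=count)


def update_pipeline_config(text, num_classes, batch_size,
                           fine_tune_checkpoint, label_map_path):
    """Return the config text with the given fields rewritten."""
    text = _set_field(text, r"num_classes: \d+", f"num_classes: {num_classes}")
    # Only the first batch_size is changed (assumed to be the training one).
    text = _set_field(text, r"batch_size: \d+", f"batch_size: {batch_size}", count=1)
    text = _set_field(text, r'fine_tune_checkpoint: ".*?"',
                      f'fine_tune_checkpoint: "{fine_tune_checkpoint}"')
    text = _set_field(text, r'fine_tune_checkpoint_type: ".*?"',
                      'fine_tune_checkpoint_type: "detection"')
    text = _set_field(text, r'label_map_path: ".*?"',
                      f'label_map_path: "{label_map_path}"')
    return text


# Hypothetical usage from the training_demo folder:
# path = "models/my_ssd_resnet50_v1_fpn/pipeline.config"
# with open(path) as f:
#     config = f.read()
# config = update_pipeline_config(
#     config, 3, 16,
#     "pre-trained-models/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/checkpoint/ckpt-0",
#     "annotations/label_map.pbtxt")
# with open(path, "w") as f:
#     f.write(config)
```

Open the file afterwards and check the result by eye, since a text substitution like this knows nothing about the structure of the config.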

Step 14- Start TensorBoard

TensorBoard allows you to track and visualize various training metrics while training is ongoing.
You can read more about TensorBoard here.

#cd into training_demo
%cd '/content/gdrive/My Drive/TensorFlow/workspace/training_demo'

#start the Tensorboard
%load_ext tensorboard
%tensorboard --logdir=models/my_ssd_resnet50_v1_fpn

# %load_ext tensorboard
# %tensorboard --logdir=models/[name_of_pre-trained-model_you_downloaded]

Initially, you will get a message saying “No dashboards are active for the current data set”.
But once the training starts, you will see various training metrics.

Step 15- Train the Model

!python model_main_tf2.py --model_dir=models/my_ssd_resnet50_v1_fpn --pipeline_config_path=models/my_ssd_resnet50_v1_fpn/pipeline.config

# !python model_main_tf2.py --model_dir=models/[name_of_pre-trained-model_you_downloaded] --pipeline_config_path=models/[name_of_pre-trained-model_you_downloaded]/pipeline.config

Once your model training starts, you should see output similar to the one shown below:

INFO:tensorflow:Step 100 per-step time 1.154s loss=0.899
I0918 04:22:33.549013 140442778175360 model_lib_v2.py:652] Step 100 per-step time 1.154s loss=0.899
INFO:tensorflow:Step 200 per-step time 1.133s loss=0.861
I0918 04:24:27.194712 140442778175360 model_lib_v2.py:652] Step 200 per-step time 1.133s loss=0.861
INFO:tensorflow:Step 300 per-step time 1.138s loss=0.685
I0918 04:26:20.992518 140442778175360 model_lib_v2.py:652] Step 300 per-step time 1.138s loss=0.685
INFO:tensorflow:Step 400 per-step time 1.131s loss=0.546
I0918 04:28:14.755549 140442778175360 model_lib_v2.py:652] Step 400 per-step time 1.131s loss=0.546
…

You can see various training metrics (like classification_loss, total_loss, learning_rate…) in TensorBoard. The training log displays the loss once every 100 steps.

Training time depends on several factors, such as batch_size, the complexity of the objects, hyper-parameters, etc., so be patient and don’t cancel the process.

A new checkpoint file is saved every 1000 steps. (These checkpoints can be used to restore training progress and continue model training)

It is advisable to train the model until the loss is consistently below 0.3! If you do not achieve good results, you can continue training the model (the checkpoints will allow you to restore training progress) until you get satisfactory results!
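
If you save the training output to a text file, a short script can check the 0.3 rule of thumb for you. The sketch below assumes log lines formatted like the sample output shown above. The function names and the window of 5 consecutive log entries are arbitrary choices for illustration.

```python
import re

# Matches lines like: INFO:tensorflow:Step 100 per-step time 1.154s loss=0.899
STEP_PATTERN = re.compile(
    r"^INFO:tensorflow:Step (\d+) per-step time [\d.]+s loss=([\d.]+)"
)


def parse_losses(log_text):
    """Return (step, loss) pairs from the INFO lines of a training log."""
    losses = []
    for line in log_text.splitlines():
        match = STEP_PATTERN.match(line.strip())
        if match:
            losses.append((int(match.group(1)), float(match.group(2))))
    return losses


def loss_is_stable(losses, threshold=0.3, window=5):
    """True if the last `window` logged losses are all below threshold."""
    recent = [loss for _, loss in losses[-window:]]
    return len(recent) == window and all(loss < threshold for loss in recent)


SAMPLE_LOG = """INFO:tensorflow:Step 100 per-step time 1.154s loss=0.899
I0918 04:22:33.549013 140442778175360 model_lib_v2.py:652] Step 100 per-step time 1.154s loss=0.899
INFO:tensorflow:Step 200 per-step time 1.133s loss=0.861
"""

print(parse_losses(SAMPLE_LOG))
print(loss_is_stable(parse_losses(SAMPLE_LOG)))
```

Only the lines starting with INFO:tensorflow are parsed, so each step is counted once even though the log prints it twice.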

Step 16- Export the Trained Model

We have finished training our model, so it’s time to export our saved_model. This saved_model will be used to perform object recognition.

!python exporter_main_v2.py --input_type image_tensor --pipeline_config_path ./models/my_ssd_resnet50_v1_fpn/pipeline.config --trained_checkpoint_dir ./models/my_ssd_resnet50_v1_fpn/ --output_directory ./exported-models/my_model

# !python exporter_main_v2.py --input_type image_tensor --pipeline_config_path ./models/[name_of_pre-trained-model_you_downloaded]/pipeline.config --trained_checkpoint_dir ./models/[name_of_pre-trained-model_you_downloaded]/ --output_directory ./exported-models/my_model

You should now have a new folder named ‘my_model’ inside your ‘training_demo/exported-models’ directory. This folder contains our saved_model.

Now it’s time to test our trained model!

Step 17- Testing the model (Loading saved_model)

#Loading the saved_model (change the path according to your directory names)
import tensorflow as tf
import time
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils

PATH_TO_SAVED_MODEL="/content/gdrive/My Drive/TensorFlow/workspace/training_demo/exported-models/my_model/saved_model"

print('Loading model...', end='')

# Load saved model and build the detection function
detect_fn=tf.saved_model.load(PATH_TO_SAVED_MODEL)

print('Done!')

Step 18- Testing the model (Loading label_map)

#Loading the label_map
category_index=label_map_util.create_category_index_from_labelmap("/content/gdrive/My Drive/TensorFlow/workspace/training_demo/annotations/label_map.pbtxt",use_display_name=True)

#category_index=label_map_util.create_category_index_from_labelmap([path_to_label_map],use_display_name=True)

Step 19- Testing the model (Loading images)

#Loading the image
#list containing paths of all the images
img=['/content/img1.jpg','/content/img2.jpg']
print(img)
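
If you have many test images, you can build the list from a folder instead of typing every path. A minimal sketch follows. The helper names and the set of accepted extensions are assumptions for illustration.

```python
from pathlib import Path

# Extensions treated as images (an assumption; adjust to your files).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_image(path):
    """True if the file extension looks like an image (case-insensitive)."""
    return Path(path).suffix.lower() in IMAGE_EXTENSIONS


def list_images(folder):
    """Return the sorted paths of all image files directly inside folder."""
    return sorted(
        str(p) for p in Path(folder).iterdir() if p.is_file() and is_image(p)
    )


# Hypothetical usage, equivalent to the hand-written list above:
# img = list_images('/content')
```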

Step 20- Running the Inference

import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline

def load_image_into_numpy_array(path):
    return np.array(Image.open(path))

for image_path in img:
    print('Running inference for {}... '.format(image_path), end='')
    image_np=load_image_into_numpy_array(image_path)

    #the model expects a batch of images, so add a leading axis
    input_tensor=tf.convert_to_tensor(image_np)
    input_tensor=input_tensor[tf.newaxis, ...]

    detections=detect_fn(input_tensor)

    #drop the batch dimension and keep only the first num_detections entries
    num_detections=int(detections.pop('num_detections'))
    detections={key:value[0,:num_detections].numpy()
                   for key,value in detections.items()}
    detections['num_detections']=num_detections
    detections['detection_classes']=detections['detection_classes'].astype(np.int64)

    image_np_with_detections=image_np.copy()

    viz_utils.visualize_boxes_and_labels_on_image_array(
          image_np_with_detections,
          detections['detection_boxes'],
          detections['detection_classes'],
          detections['detection_scores'],
          category_index,
          use_normalized_coordinates=True,
          max_boxes_to_draw=100,
          min_score_thresh=.5,
          agnostic_mode=False)

    plt.figure()
    plt.imshow(image_np_with_detections)
    print('Done')
    plt.show()

If everything is successful, you should see your loaded images with bounding boxes, labels, and confidence scores!

Output with Bounding Boxes, Labels, and Confidence Scores!
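
If you want to use the detections yourself instead of only drawing them, two small helpers are usually enough: one to keep the detections above the score threshold and one to turn a normalized box into pixel coordinates. This is a sketch that assumes the boxes are in normalized [ymin, xmin, ymax, xmax] order, as the use_normalized_coordinates=True flag above suggests. The function names are made up for illustration.

```python
def confident_detections(boxes, classes, scores, min_score=0.5):
    """Keep (box, class, score) triples whose score exceeds min_score."""
    return [
        (box, cls, score)
        for box, cls, score in zip(boxes, classes, scores)
        if score > min_score
    ]


def to_pixel_box(box, image_width, image_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to (left, top, right, bottom) pixels."""
    ymin, xmin, ymax, xmax = (float(value) for value in box)
    return (
        round(xmin * image_width),
        round(ymin * image_height),
        round(xmax * image_width),
        round(ymax * image_height),
    )


# Hypothetical usage with the detections dictionary from Step 20:
# height, width = image_np.shape[:2]
# for box, cls, score in confident_detections(
#         detections['detection_boxes'],
#         detections['detection_classes'],
#         detections['detection_scores']):
#     print(category_index[cls]['name'], score, to_pixel_box(box, width, height))
```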

Acknowledgments and References:

Huge Thanks to Lyudmil Vladimirov for allowing me to use some of the content from their amazing TensorFlow 2 Object Detection API for Local Machines!
Link to their GitHub Repository.