Training custom data using TensorFlow for object detection

Ashish Gusain · Analytics Vidhya · Apr 29, 2020

Object detection on custom data is always fun to work on. Today, let's get our hands dirty detecting five different sports balls: cricket ball, tennis ball, rugby ball, volleyball and golf ball.

Get the repository:

git clone https://github.com/tensorflow/models

Install dependencies:

pip install --user Cython
pip install --user contextlib2
pip install --user pillow
pip install --user lxml
pip install --user jupyter
pip install --user matplotlib

Environment setup:

Download the protoc zip file from https://github.com/protocolbuffers/protobuf/releases and extract it; you will get a bin folder with the protoc executable (protoc.exe on Windows) inside.

Now, run the commands below from the terminal:

cd <path_to_tensorflow_cloned_repo>/models/research/
python setup.py build
python setup.py install
protoc object_detection/protos/*.proto --python_out=.
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
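At this point the Object Detection API should be importable from Python. Optionally, as a sanity check, many versions of the repo ship a model-builder test script; assuming it is still present at this path (it may differ between versions of the repository), you can run it from /models/research:

python object_detection/builders/model_builder_test.py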

Creating a labeled dataset:

Collect images of all five classes of balls; around 500 images per class should give pretty good results. After collecting, save all the images to /research/object_detection/images.

Now, there are various ways to label your images. One of the most widely used applications is labelImg. It has a user-friendly GUI and it won't take much time to get used to it. You have to annotate all your images, and the annotations will be stored as XML files in the Pascal VOC format.

Annotate all the images, keeping the class names consistent across similar images. Here we are annotating sports balls; labelImg saves one XML annotation file per image.
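For reference, a single annotation produced by labelImg looks roughly like the snippet below (the file name and coordinate values are purely illustrative): the class name goes in the <name> tag and the box corners in <bndbox>.

<annotation>
    <folder>images</folder>
    <filename>cricket_001.jpg</filename>
    <size>
        <width>640</width>
        <height>480</height>
        <depth>3</depth>
    </size>
    <object>
        <name>Cricketball</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>120</xmin>
            <ymin>85</ymin>
            <xmax>230</xmax>
            <ymax>195</ymax>
        </bndbox>
    </object>
</annotation>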

Now, convert all the XML annotation files into a single CSV file using the code given below:

import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET

def xml_to_csv(path):
    # Collect one row per annotated object across all Pascal VOC XML files
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),   # width
                     int(root.find('size')[1].text),   # height
                     member[0].text,                    # class name
                     int(member[4][0].text),            # xmin
                     int(member[4][1].text),            # ymin
                     int(member[4][2].text),            # xmax
                     int(member[4][3].text))            # ymax
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df

def main():
    # Folder containing the .xml annotation files
    image_path = os.path.join(os.getcwd(), 'annotations')
    xml_df = xml_to_csv(image_path)
    xml_df.to_csv('sports_balls.csv', index=None)
    print('Successfully converted xml to csv.')

main()
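The training and evaluation configs later expect separate train.record and test.record files, so the data has to be split at some point. A minimal sketch of one way to do it from the combined CSV, assuming an 80/20 split and the illustrative file names train_labels.csv / test_labels.csv:

import numpy as np
import pandas as pd

# Split the combined annotation CSV into train/test CSVs (illustrative 80/20 split)
df = pd.read_csv('sports_balls.csv')
filenames = df['filename'].unique()

np.random.seed(42)           # reproducible shuffle
np.random.shuffle(filenames)

cut = int(0.8 * len(filenames))
train_files, test_files = set(filenames[:cut]), set(filenames[cut:])

df[df['filename'].isin(train_files)].to_csv('train_labels.csv', index=None)
df[df['filename'].isin(test_files)].to_csv('test_labels.csv', index=None)
print(len(train_files), 'train images,', len(test_files), 'test images')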

Finally, each CSV file should be converted into a TFRecord file by running the script below (generate_tfrecord.py):

python generate_tfrecord.py --csv_input=<path to csv file> --output_path=images/train.record --image_dir=<path to images>

from __future__ import division
from __future__ import print_function
from __future__ import absolute_import

import os
import io
import pandas as pd
import tensorflow as tf

from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict

flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
flags.DEFINE_string('image_dir', '', 'Path to images')
FLAGS = flags.FLAGS

# TO-DO replace this with label map
def class_text_to_int(row_label):
    if row_label == 'Rugbyball':
        return 1
    elif row_label == 'Cricketball':
        return 2
    elif row_label == 'Golfball':
        return 3
    elif row_label == 'Volleyball':
        return 4
    elif row_label == 'Tennisball':
        return 5
    else:
        return 0

def split(df, group):
    # Group the CSV rows by image so each TFRecord example holds all boxes of one image
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]

def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    # Normalize box coordinates to [0, 1] as the API expects
    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example

def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(FLAGS.image_dir)
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())
    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))

if __name__ == '__main__':
    tf.app.run()
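Run the script once for each split so that both record files exist. The CSV names below follow the illustrative split shown earlier; adjust the paths to your own layout:

python generate_tfrecord.py --csv_input=train_labels.csv --output_path=images/train.record --image_dir=images
python generate_tfrecord.py --csv_input=test_labels.csv --output_path=images/test.record --image_dir=images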

Creating the labelmap.pbtxt file:

Create a file named labelmap.pbtxt with the following content:

item {
  id: 1
  name: 'Rugbyball'
}
item {
  id: 2
  name: 'Cricketball'
}
item {
  id: 3
  name: 'Golfball'
}
item {
  id: 4
  name: 'Volleyball'
}
item {
  id: 5
  name: 'Tennisball'
}

Save this file inside the /research/object_detection/training folder.
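As a quick check that the label map parses and that the ids line up with class_text_to_int in generate_tfrecord.py, you can load it with the API's own utility (a small sketch; adjust the path to wherever you saved the file):

from object_detection.utils import label_map_util

# Parse the label map and build an id -> name index
category_index = label_map_util.create_category_index_from_labelmap(
    'training/labelmap.pbtxt', use_display_name=True)
print(category_index)
# Should map ids 1..5 to Rugbyball, Cricketball, Golfball, Volleyball, Tennisball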

Using a pre-trained model:

Download a pre-trained model from the detection model zoo at https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md and extract it in /research/object_detection.

Save your TFRecord files (train.record and test.record) in the /research/object_detection/images folder; labelmap.pbtxt should already be in the /research/object_detection/training folder.

Changes in config file:

From /research/object_detection/samples/configs, copy the config file matching your chosen model to the /research/object_detection/training folder.

Make the following changes in this config file:

num_classes: 5

fine_tune_checkpoint: "../faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28/model.ckpt"

train_input_reader: {
  tf_record_input_reader {
    input_path: "../images/train.record"
  }
  label_map_path: "../training/labelmap.pbtxt"
}

eval_config: {
  # Set num_examples to the number of images in your test set
  num_examples: <no. of test images>
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "../images/test.record"
  }
  label_map_path: "../training/labelmap.pbtxt"
  shuffle: false
  num_readers: 1
}

Training:

From the /research/object_detection/legacy folder (which contains train.py), run the command below:

python3 train.py --logtostderr --train_dir=../training --pipeline_config_path=../training/faster_rcnn_inception_resnet_v2_atrous_coco.config

I have used a model with high accuracy but lower speed, faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28; in the command above, use the config name that matches the model you downloaded.

Tensorboard:

Visualize your results now. To launch TensorBoard, run the command below from /research/object_detection:

tensorboard --logdir=training

Create inference graph:

python export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path training/faster_rcnn_inception_resnet_v2_atrous_coco.config \
    --trained_checkpoint_prefix training/model.ckpt-280 \
    --output_directory inference-graph

Results:

Now, we are all done. This inference graph can be used to detect objects in new images. Let's see the final results of the detection:
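For reference, here is a minimal sketch of running the exported frozen graph on a single image with the TF1-style API. The paths, the test image name and the 0.5 score threshold are assumptions for illustration, not part of the original post:

import numpy as np
import tensorflow as tf
from PIL import Image

PATH_TO_FROZEN_GRAPH = 'inference-graph/frozen_inference_graph.pb'  # produced by export_inference_graph.py
IMAGE_PATH = 'test_ball.jpg'  # any image you want to test (illustrative name)

# Load the frozen detection graph
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        od_graph_def.ParseFromString(fid.read())
        tf.import_graph_def(od_graph_def, name='')

with detection_graph.as_default(), tf.Session(graph=detection_graph) as sess:
    image = np.array(Image.open(IMAGE_PATH))
    image_expanded = np.expand_dims(image, axis=0)

    # Standard tensor names exported by the Object Detection API
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
    scores = detection_graph.get_tensor_by_name('detection_scores:0')
    classes = detection_graph.get_tensor_by_name('detection_classes:0')
    num_detections = detection_graph.get_tensor_by_name('num_detections:0')

    (boxes, scores, classes, num_detections) = sess.run(
        [boxes, scores, classes, num_detections],
        feed_dict={image_tensor: image_expanded})

    # Print detections above a 0.5 score threshold
    # (boxes are [ymin, xmin, ymax, xmax], normalized to [0, 1])
    for box, score, cls in zip(boxes[0], scores[0], classes[0]):
        if score > 0.5:
            print(int(cls), float(score), box)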

The entire work for this object detection can be seen at https://github.com/AshishGusain17/via_google_colab/blob/master/sports_balls_detection_using_tf.ipynb

For any queries, you can reach me via:

Email : ashishgusain12345@gmail.com

Github : https://github.com/AshishGusain17

LinkedIn : https://www.linkedin.com/in/ashish-gusain-257b841a2/
