Training custom data using TensorFlow for object detection

Ashish Gusain · Analytics Vidhya · Apr 29, 2020

Object detection on custom data is always fun to work on. Today, let's get our hands dirty detecting five different sports balls: cricket ball, tennis ball, rugby ball, volleyball and golf ball.

Get the repository:

git clone https://github.com/tensorflow/models

Install dependencies:

pip install --user Cython
pip install --user contextlib2
pip install --user pillow
pip install --user lxml
pip install --user jupyter
pip install --user matplotlib

Environment setup:

Download the protoc zip file from https://github.com/protocolbuffers/protobuf/releases and extract it; you will get a bin folder with the protoc executable (protoc.exe on Windows) inside.

Now, run the commands below from the terminal:

cd <path_to_tensorflow_cloned_repo>/models/research/
python setup.py build
python setup.py install
protoc object_detection/protos/*.proto --python_out=.
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
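At this point the Object Detection API should be importable from Python. Optionally, as a sanity check, many versions of the repo ship a model-builder test script; assuming it is still present at this path (it may differ between versions of the repository), you can run it from /models/research:

python object_detection/builders/model_builder_test.py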

Creating a labeled dataset:

Collect images of all five classes of balls; around 500 images per class should give pretty good results. After collecting, save all the images to /research/object_detection/images.

Now, there are various ways to label your images. One of the most widely used applications is labelImg. It has a user-friendly GUI and it won't take much time to get used to it. You have to annotate all your images, and the annotations will be stored as XML files in the Pascal VOC format.

Annotate all the images, keeping the class names consistent across similar images. Here we are annotating sports balls; labelImg saves one XML annotation file per image.
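For reference, a single annotation produced by labelImg looks roughly like the snippet below (the file name and coordinate values are purely illustrative): the class name goes in the <name> tag and the box corners in <bndbox>.

<annotation>
    <folder>images</folder>
    <filename>cricket_001.jpg</filename>
    <size>
        <width>640</width>
        <height>480</height>
        <depth>3</depth>
    </size>
    <object>
        <name>Cricketball</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>120</xmin>
            <ymin>85</ymin>
            <xmax>230</xmax>
            <ymax>195</ymax>
        </bndbox>
    </object>
</annotation>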

Now, convert all the XML annotation files into a single CSV file using the code given below:

import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET

def xml_to_csv(path):
    # Collect one row per annotated object across all Pascal VOC XML files
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),   # width
                     int(root.find('size')[1].text),   # height
                     member[0].text,                    # class name
                     int(member[4][0].text),            # xmin
                     int(member[4][1].text),            # ymin
                     int(member[4][2].text),            # xmax
                     int(member[4][3].text))            # ymax
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df

def main():
    # Folder containing the .xml annotation files
    image_path = os.path.join(os.getcwd(), 'annotations')
    xml_df = xml_to_csv(image_path)
    xml_df.to_csv('sports_balls.csv', index=None)
    print('Successfully converted xml to csv.')

main()
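The training and evaluation configs later expect separate train.record and test.record files, so the data has to be split at some point. A minimal sketch of one way to do it from the combined CSV, assuming an 80/20 split and the illustrative file names train_labels.csv / test_labels.csv:

import numpy as np
import pandas as pd

# Split the combined annotation CSV into train/test CSVs (illustrative 80/20 split)
df = pd.read_csv('sports_balls.csv')
filenames = df['filename'].unique()

np.random.seed(42)           # reproducible shuffle
np.random.shuffle(filenames)

cut = int(0.8 * len(filenames))
train_files, test_files = set(filenames[:cut]), set(filenames[cut:])

df[df['filename'].isin(train_files)].to_csv('train_labels.csv', index=None)
df[df['filename'].isin(test_files)].to_csv('test_labels.csv', index=None)
print(len(train_files), 'train images,', len(test_files), 'test images')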

Finally, each CSV file should be converted into a TFRecord file by running the script below (generate_tfrecord.py):

python generate_tfrecord.py --csv_input=<path to csv file> --output_path=images/train.record --image_dir=<path to images>

from __future__ import division
from __future__ import print_function
from __future__ import absolute_import

import os
import io
import pandas as pd
import tensorflow as tf

from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict

flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
flags.DEFINE_string('image_dir', '', 'Path to images')
FLAGS = flags.FLAGS

# TO-DO replace this with label map
def class_text_to_int(row_label):
    if row_label == 'Rugbyball':
        return 1
    elif row_label == 'Cricketball':
        return 2
    elif row_label == 'Golfball':
        return 3
    elif row_label == 'Volleyball':
        return 4
    elif row_label == 'Tennisball':
        return 5
    else:
        return 0

def split(df, group):
    # Group the CSV rows by image so each TFRecord example holds all boxes of one image
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]

def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    # Normalize box coordinates to [0, 1] as the API expects
    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example

def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(FLAGS.image_dir)
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())
    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))

if __name__ == '__main__':
    tf.app.run()
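Run the script once for each split so that both record files exist. The CSV names below follow the illustrative split shown earlier; adjust the paths to your own layout:

python generate_tfrecord.py --csv_input=train_labels.csv --output_path=images/train.record --image_dir=images
python generate_tfrecord.py --csv_input=test_labels.csv --output_path=images/test.record --image_dir=images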

Creating the labelmap.pbtxt file:

Create a file named labelmap.pbtxt with the following content:

item {
  id: 1
  name: 'Rugbyball'
}
item {
  id: 2
  name: 'Cricketball'
}
item {
  id: 3
  name: 'Golfball'
}
item {
  id: 4
  name: 'Volleyball'
}
item {
  id: 5
  name: 'Tennisball'
}

Save this file inside the /research/object_detection/training folder.
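As a quick check that the label map parses and that the ids line up with class_text_to_int in generate_tfrecord.py, you can load it with the API's own utility (a small sketch; adjust the path to wherever you saved the file):

from object_detection.utils import label_map_util

# Parse the label map and build an id -> name index
category_index = label_map_util.create_category_index_from_labelmap(
    'training/labelmap.pbtxt', use_display_name=True)
print(category_index)
# Should map ids 1..5 to Rugbyball, Cricketball, Golfball, Volleyball, Tennisball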

Using a pre-trained model:

Download a pre-trained model from the detection model zoo at https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md and extract it in /research/object_detection.

Save your TFRecord files (train.record and test.record) in the /research/object_detection/images folder; labelmap.pbtxt should already be in the /research/object_detection/training folder.

Changes in config file:

From /research/object_detection/samples/configs, copy the config file matching your chosen model to the /research/object_detection/training folder.

Make the following changes in this config file:

num_classes: 5

fine_tune_checkpoint: "../faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28/model.ckpt"

train_input_reader: {
  tf_record_input_reader {
    input_path: "../images/train.record"
  }
  label_map_path: "../training/labelmap.pbtxt"
}

eval_config: {
  # Set num_examples to the number of images in your test set
  num_examples: <no. of test images>
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "../images/test.record"
  }
  label_map_path: "../training/labelmap.pbtxt"
  shuffle: false
  num_readers: 1
}

Training:

From the /research/object_detection/legacy folder (which contains train.py), run the command below:

python3 train.py --logtostderr --train_dir=../training --pipeline_config_path=../training/faster_rcnn_inception_resnet_v2_atrous_coco.config

I have used a model with high accuracy but lower speed, faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28; in the command above, use the config name that matches the model you downloaded.

Tensorboard:

Visualize your results now. To launch TensorBoard, run the command below from /research/object_detection:

tensorboard --logdir=training

Create inference graph:

python export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path training/faster_rcnn_inception_resnet_v2_atrous_coco.config \
    --trained_checkpoint_prefix training/model.ckpt-280 \
    --output_directory inference-graph

Results:

Now, we are all done. This inference graph can be used to detect objects in new images. Let's see the final results of the detection:
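For reference, here is a minimal sketch of running the exported frozen graph on a single image with the TF1-style API. The paths, the test image name and the 0.5 score threshold are assumptions for illustration, not part of the original post:

import numpy as np
import tensorflow as tf
from PIL import Image

PATH_TO_FROZEN_GRAPH = 'inference-graph/frozen_inference_graph.pb'  # produced by export_inference_graph.py
IMAGE_PATH = 'test_ball.jpg'  # any image you want to test (illustrative name)

# Load the frozen detection graph
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        od_graph_def.ParseFromString(fid.read())
        tf.import_graph_def(od_graph_def, name='')

with detection_graph.as_default(), tf.Session(graph=detection_graph) as sess:
    image = np.array(Image.open(IMAGE_PATH))
    image_expanded = np.expand_dims(image, axis=0)

    # Standard tensor names exported by the Object Detection API
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
    scores = detection_graph.get_tensor_by_name('detection_scores:0')
    classes = detection_graph.get_tensor_by_name('detection_classes:0')
    num_detections = detection_graph.get_tensor_by_name('num_detections:0')

    (boxes, scores, classes, num_detections) = sess.run(
        [boxes, scores, classes, num_detections],
        feed_dict={image_tensor: image_expanded})

    # Print detections above a 0.5 score threshold
    # (boxes are [ymin, xmin, ymax, xmax], normalized to [0, 1])
    for box, score, cls in zip(boxes[0], scores[0], classes[0]):
        if score > 0.5:
            print(int(cls), float(score), box)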

The entire work for this object detection can be seen at https://github.com/AshishGusain17/via_google_colab/blob/master/sports_balls_detection_using_tf.ipynb

For any queries, you can reach me via:

Email : ashishgusain12345@gmail.com

Github : https://github.com/AshishGusain17

LinkedIn : https://www.linkedin.com/in/ashish-gusain-257b841a2/
