Step by Step TensorFlow Object Detection API Tutorial — Part 2: Converting Existing Dataset to TFRecord

At this point you have selected a pre-trained model that you want to adapt to a new object detection task. In this post I’ll show you how to convert a dataset into a TFRecord file so you can fine-tune the model. This is one of the trickiest parts of the entire process, and it will require you to write some code unless the dataset you choose is already in a supported format.

In this tutorial I’m creating a traffic light classifier that can identify the state of a traffic light. The pre-trained model is capable of identifying a traffic light in an image, but not its state (green, yellow, red, etc.). I have decided to use the Bosch Small Traffic Light Dataset, which seems to be ideal for what I’m trying to accomplish.


Dataset Labels

The TensorFlow Object Detection API requires all the labeled training data to be in TFRecord file format. If your dataset comes with labels stored in individual .xml files, like the PASCAL VOC dataset, there is a script called create_pascal_tf_record.py which you can use (it might require slight modifications) to convert your dataset into a TFRecord file.

However, if you aren’t so lucky, you will have to write your own script to create a TFRecord file from your dataset. The Bosch dataset labels are all stored in a single .yaml file, a snippet of which is shown below.

The image 720654.png contains two green lights and 720932.png contains none.
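
To make that structure concrete, here is a minimal sketch of how those two entries might look once the .yaml file has been parsed into Python objects (for example with PyYAML's yaml.safe_load). The key names (path, boxes, label, x_min and so on) and every coordinate value are assumptions made up for illustration, not values copied from the dataset, and the image paths are shortened to bare file names.

```python
# Hypothetical parsed form of two label entries. Key names and coordinate
# values are illustrative assumptions, not real dataset values.
entries = [
    {
        "path": "720654.png",
        "boxes": [
            {"label": "Green", "x_min": 100.0, "x_max": 110.0,
             "y_min": 50.0, "y_max": 70.0},
            {"label": "Green", "x_min": 300.0, "x_max": 312.0,
             "y_min": 40.0, "y_max": 64.0},
        ],
    },
    # An image with no traffic lights simply has an empty box list.
    {"path": "720932.png", "boxes": []},
]


def count_labels(entry):
    """Return a {label: count} dict for one image entry."""
    counts = {}
    for box in entry["boxes"]:
        counts[box["label"]] = counts.get(box["label"], 0) + 1
    return counts


for entry in entries:
    print(entry["path"], count_labels(entry))
```

Note that images without any boxes still appear in the label file, so your conversion code has to handle an empty list.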

A TFRecord combines all the labels (bounding boxes) and images for the entire dataset into one file. While it’s a bit of a pain to create a TFRecord file, it’s very convenient to use once it’s created.


Creating a Single TFRecord Entry

TensorFlow provides us with a sample script in the file using_your_own_dataset.md which I’ll go over now.

The function above is given the label and data information of a single image that is pulled from the .yaml file. Using this information, you need to write code to populate all of the given variables. Note that in addition to the bounding box and class information, the encoded image data must also be supplied. You can read it with tensorflow.gfile.GFile() (tf.io.gfile.GFile in TensorFlow 2). With all those variables populated, you are ready to move to the second part of the script.
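
As a rough sketch of what populating those variables can involve, the helper below turns one parsed label entry into the per-image values. It is written in plain Python so it runs without TensorFlow; in the real script these values would be wrapped in the feature helpers of a tf.train.Example. The entry layout, the function name populate_fields, the class IDs in LABEL_MAP, and the 1280x720 image size are all assumptions for illustration. The one thing taken from the sample script is that box coordinates are expected to be normalized, which here means dividing pixel values by the image width or height.

```python
# Hypothetical label map. IDs start at 1 because 0 is reserved for background.
LABEL_MAP = {"Green": 1, "Yellow": 2, "Red": 3}


def populate_fields(entry, encoded_image_data, width, height, label_map=LABEL_MAP):
    """Build the per-image values for one label entry.

    `entry` is assumed to look like
    {"path": str, "boxes": [{"label", "x_min", "x_max", "y_min", "y_max"}]}
    with box coordinates given in pixels.
    """
    xmins, xmaxs, ymins, ymaxs = [], [], [], []
    classes_text, classes = [], []
    for box in entry["boxes"]:
        # Normalize pixel coordinates to the [0, 1] range.
        xmins.append(box["x_min"] / width)
        xmaxs.append(box["x_max"] / width)
        ymins.append(box["y_min"] / height)
        ymaxs.append(box["y_max"] / height)
        classes_text.append(box["label"].encode("utf8"))
        classes.append(label_map[box["label"]])
    return {
        "height": height,
        "width": width,
        "filename": entry["path"].encode("utf8"),
        "encoded_image_data": encoded_image_data,
        "image_format": b"png",
        "xmins": xmins, "xmaxs": xmaxs,
        "ymins": ymins, "ymaxs": ymaxs,
        "classes_text": classes_text,
        "classes": classes,
    }


# Usage with a made-up entry; the bytes stand in for the real encoded image.
entry = {
    "path": "720654.png",
    "boxes": [{"label": "Green", "x_min": 128.0, "x_max": 256.0,
               "y_min": 72.0, "y_max": 144.0}],
}
fields = populate_fields(entry, b"<png bytes>", width=1280, height=720)
print(fields["xmins"], fields["classes"])
```

Passing the image bytes in as an argument keeps the file reading separate from the label handling, which makes the label logic easy to check on its own.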


Creating the Entire TFRecord File

With the create_tf_record function completed, you just need to write a loop that calls that function for every labeled image in your dataset. The TensorFlow sample script provides the following place to do so.
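
The shape of that loop can be sketched as follows. This is a dry-run version under stated assumptions: write_all, FakeExample, and ListWriter are hypothetical names introduced here so the loop runs without TensorFlow installed. In the real script the writer would be a TFRecord writer (tf.python_io.TFRecordWriter in TensorFlow 1.x, tf.io.TFRecordWriter in TensorFlow 2) and make_example would be your function that returns a tf.train.Example.

```python
def write_all(entries, make_example, writer):
    """Convert each entry, serialize it, and pass it to the writer."""
    written = 0
    for entry in entries:
        example = make_example(entry)
        writer.write(example.SerializeToString())
        written += 1
    return written


# Stand-ins so the loop can be dry-run without TensorFlow installed.
class FakeExample:
    def __init__(self, entry):
        self.entry = entry

    def SerializeToString(self):
        # A real tf.train.Example returns its protobuf bytes here.
        return repr(self.entry).encode("utf8")


class ListWriter:
    def __init__(self):
        self.records = []

    def write(self, record):
        self.records.append(record)


writer = ListWriter()
count = write_all(
    [{"path": "720654.png"}, {"path": "720932.png"}], FakeExample, writer
)
print(count, "records written")
```

Remember to close the real writer once the loop finishes so that all records are flushed to disk.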

With that finished, you are all set to run your script. Anthony Sarkis has a very clean implementation of the TFRecord script for the Bosch dataset if you want to see a completed example.

If you didn’t modify your .bashrc file previously, ensure that you run the export PYTHONPATH line (located in Part 1) in your terminal window before you run this script. With your terminal in the folder containing your TFRecord script and your data (images) in the same location listed in your .yaml (or other file that contains the image paths), run the following command.

python tf_record.py --output_path training.record

To check that everything worked, compare the size of the generated training.record file with the total size of the folder containing all of the training images. If the two are almost exactly the same, you’re done!
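
If you would rather not eyeball the numbers, the size comparison can be scripted. The sketch below is one way to do it with the standard library; the function names and the 5% tolerance are assumptions of mine, and it only counts files directly inside the image folder, so adjust it if your images are spread across subfolders.

```python
import os


def folder_size(path):
    """Total size in bytes of all regular files directly inside `path`."""
    return sum(
        os.path.getsize(os.path.join(path, name))
        for name in os.listdir(path)
        if os.path.isfile(os.path.join(path, name))
    )


def sizes_roughly_match(record_path, image_dir, tolerance=0.05):
    """Return True if the record file is within `tolerance` of the image folder size."""
    record_size = os.path.getsize(record_path)
    images_size = folder_size(image_dir)
    if images_size == 0:
        return False
    return abs(record_size - images_size) / images_size <= tolerance
```

A call such as sizes_roughly_match("training.record", "path/to/images") would then return True when the two sizes are close, where the image folder path is a placeholder for your own.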

Your dataset will likely be split into separate training and evaluation sets. Make sure that you create a separate TFRecord file for each.

In the next post I’ll show you how to create your own dataset so you can squeeze that last bit of performance out of your model!