Step by Step TensorFlow Object Detection API Tutorial — Part 1: Selecting a Model

TensorFlow’s Object Detection API is a powerful tool that lets anyone (especially those with no real machine learning background, like myself) quickly build and deploy image recognition software. However, since it’s so new and the documentation is pretty sparse, it can be tough to get up and running quickly.

Reading other guides and tutorials, I found that they glossed over specific details, which took me a few hours to figure out on my own. I’m creating this tutorial to hopefully save you some time by explicitly showing you every step of the process.

I’ll be creating a traffic light classifier that tries to determine whether the light is green, yellow, or red. Currently, the pre-trained models only try to detect whether there is a traffic light in the image, not the state of the traffic light.

This series of posts will cover selecting a model, adapting an existing data set, creating and annotating your own data set, modifying the model config file, training the model, saving the model, and finally deploying the model in another piece of software. I do this entire tutorial in Linux, but its information can be used on other OSes if they can install and use TensorFlow.


git clone 

somewhere easy to access, as we will be coming back to this folder routinely. Open up and follow the instructions to install TensorFlow and all the required dependencies. Note that even if you already have TensorFlow installed, you still need to follow the “Add Libraries to PYTHONPATH” instructions. If you aren’t familiar with modifying your .bashrc file, open a terminal, navigate to the models/research/ folder, and enter the command

export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim 

into your terminal window. You will have to redo this if you close your terminal window.
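If you mostly work inside the notebook, a rough alternative to the export command is to add the same two folders to Python's search path from code. This is only a sketch under my own assumptions: the helper name add_object_detection_paths is made up for illustration, and it only affects the Python process it runs in, so it does not replace the export for command-line tools.

```python
import os
import sys


def add_object_detection_paths(research_dir):
    """Append <research_dir> and <research_dir>/slim to sys.path if missing.

    Hypothetical helper: mirrors the PYTHONPATH export for the current
    Python process only. Returns the list of paths that were added.
    """
    research_dir = os.path.abspath(research_dir)
    added = []
    for path in (research_dir, os.path.join(research_dir, "slim")):
        if path not in sys.path:
            sys.path.append(path)
            added.append(path)
    return added


# Example call (the location of your clone is an assumption):
# add_object_detection_paths("models/research")
```

Like the export, this has to be repeated whenever you restart the notebook kernel.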

In the models/research/object_detection/ folder, open up the Jupyter notebook object_detection_tutorial.ipynb and run the entire notebook. Do not move this file outside of this folder, or else some of the visualization import statements will fail.

At this point you should have a few sample images of what you are trying to classify. Place them in the test_images folder and name them image3.jpg, image4.jpg, and so on up to imageN.jpg. In the notebook, modify the line under the Detection heading to

TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, N+1) ]

where N is the number of the last image you placed in the folder. When you re-run the notebook you will find that your images have been classified. When I did this with 3 sample traffic light images I got the following result.

As shown in the images, the model is able to classify the light in the first image but not the second image.
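If you would rather not keep N in sync by hand, the image list from the detection step can also be built by scanning the folder. The sketch below is my own variation, not something from the notebook: numbered_image_paths is a hypothetical helper that picks up every imageN.jpg file and sorts it by N, so image10.jpg comes after image2.jpg.

```python
import glob
import os
import re


def numbered_image_paths(image_dir):
    """Return the imageN.jpg paths found in image_dir, sorted by N."""
    pattern = re.compile(r"image(\d+)\.jpg$")
    numbered = []
    for path in glob.glob(os.path.join(image_dir, "image*.jpg")):
        match = pattern.match(os.path.basename(path))
        if match:  # skips names such as imagex.jpg
            numbered.append((int(match.group(1)), path))
    return [path for _, path in sorted(numbered)]


# In the notebook this could stand in for the list comprehension:
# TEST_IMAGE_PATHS = numbered_image_paths(PATH_TO_TEST_IMAGES_DIR)
```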

The default model in the notebook is the simplest (and fastest) pre-trained model offered by TensorFlow. Looking at the table below, you can see there are many other models available. mAP stands for mean average precision, which indicates how well the model performed on the COCO dataset. Generally, models that take longer to compute perform better. However, these models also have a number of subtle differences (such as performance on small objects), and if you want to understand their strengths and weaknesses, you need to read the accompanying papers.
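To make the mAP number a bit more concrete, here is a simplified sketch of how an average precision can be computed for one class and then averaged over classes. It assumes each detection has already been marked as a correct match or not, and it uses plain (non-interpolated) average precision. The real COCO evaluation is more involved (for example, it also averages over several overlap thresholds), so treat this as an illustration of the idea rather than the official metric. The function names and the sample data are made up.

```python
def average_precision(ranked_hits, num_ground_truth):
    """Non-interpolated AP for one class.

    ranked_hits: booleans for detections sorted by descending confidence,
                 True if the detection matched a ground-truth object.
    num_ground_truth: number of ground-truth objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    true_positives = 0
    precision_sum = 0.0
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            true_positives += 1
            precision_sum += true_positives / rank  # precision at this rank
    return precision_sum / num_ground_truth


def mean_average_precision(per_class):
    """per_class maps class name -> (ranked_hits, num_ground_truth)."""
    aps = [average_precision(hits, n_gt) for hits, n_gt in per_class.values()]
    return sum(aps) / len(aps)


# Made-up example data for two classes.
example = {
    "traffic light": ([True, False, True], 2),  # AP = (1/1 + 2/3) / 2
    "car": ([False, True], 1),                  # AP = (1/2) / 1
}
print(round(mean_average_precision(example), 3))  # prints 0.667
```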

To get a rough approximation for performance just try each model out on a few sample images. If the item you are trying to detect is not one of the 90 COCO classes, find a similar item (if you are trying to classify a squirrel, use images of small cats) and test each model’s performance on that.
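The try-each-model approach above can be organized as a small loop. This is only a sketch: compare_models is a hypothetical helper, and the detect callable is something you would have to write yourself (for example by wrapping the notebook's detection code) so that it returns a list of (class_name, score) pairs for one image. Nothing here calls TensorFlow directly.

```python
import time


def compare_models(model_names, image_paths, detect, min_score=0.5):
    """Run every model on every image and collect confident detections.

    detect(model_name, image_path) is a hypothetical callable you supply;
    it should return a list of (class_name, score) pairs.
    """
    summary = {}
    for model_name in model_names:
        start = time.perf_counter()
        hits_per_image = {}
        for image_path in image_paths:
            detections = detect(model_name, image_path)
            hits_per_image[image_path] = [
                (name, score) for name, score in detections if score >= min_score
            ]
        summary[model_name] = {
            "seconds": time.perf_counter() - start,  # rough wall-clock time
            "detections": hits_per_image,
        }
    return summary
```

Comparing the kept detections side by side with the elapsed time gives a quick feel for the speed versus accuracy trade-off on your own images.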

To test a new model, just replace the MODEL_NAME in the Jupyter notebook with the specific model download location found in the detection_model_zoo.md file located in the g3doc folder. I ended up settling on the R-FCN model, which produced the following results on my sample images.
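As a side note on the MODEL_NAME swap: the reason changing one variable is enough is that, in the notebook version I used, the archive name, download URL, and frozen graph path are all derived from it. The sketch below reproduces that derivation from memory, so the variable names and the base URL are assumptions to check against your copy of the notebook, and the model name in the example is a made-up placeholder, not a real entry from the model zoo.

```python
def model_settings(
    model_name,
    download_base="http://download.tensorflow.org/models/object_detection/",
):
    """Derive the notebook-style settings that depend on MODEL_NAME.

    Assumption: the notebook builds these values by simple string
    concatenation, as it did in the version I worked with.
    """
    archive = model_name + ".tar.gz"
    return {
        "MODEL_NAME": model_name,
        "MODEL_FILE": archive,
        "DOWNLOAD_URL": download_base + archive,
        "PATH_TO_CKPT": model_name + "/frozen_inference_graph.pb",
    }


# "example_model_2017" is a placeholder; use a name from the model zoo file.
for key, value in model_settings("example_model_2017").items():
    print(key, "=", value)
```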

In the next post, I’ll show you how to turn an existing database into a TensorFlow record file so that you can use it to fine-tune your model for the problem you wish to solve!