Building an Image Classifier using TensorFlow

Photo by Arif Wahid on Unsplash

The prospect of combining computer vision with machine learning gives me chills! It’s fascinating how we can build and train models that let machines distinguish between images, such as a picture of a dog and a picture of a cat, with remarkable accuracy. The potential is endless. In this article I am going to explain how you can build an image classifier yourself with the help of TensorFlow for Poets, created by Google, to recognize just about anything in the world!

Before we begin, I feel obligated to provide some background information. TensorFlow is an open-source library created by Google for machine learning applications. For anybody getting started with computer vision and machine learning, it is a great starting point for understanding the elaborate process of image classification.

Building an image classifier from scratch is a colossal and daunting task, with a huge number of design decisions to get right. Luckily for us, Google has open-sourced one of its best image classifier models, called Inception, which was trained on a staggering 1.2 million images from a thousand different categories for two weeks at a stretch on some of the fastest machines in the world.

The Inception model: convolutional neural network with multiple layers of abstractions

We are going to use this existing model and build our own on top of it. This approach brings numerous advantages: it saves us a lot of time, it reuses parameters that Inception has already learned, and it lets us build a pretty accurate classifier with far less training data. This process of reusing a pre-trained model on a different but related task is known as Transfer Learning in the world of Deep Learning.

1. Download Training Images

The first step is to download the training images for your classifier. These are the images that you want your classifier to learn to recognize. Keep them neatly divided into separate folders, one per category. The folder names are used as the labels for the photos they contain.

For this example, we will download images of 5 types of flowers with several hundred images for each type. Download the images here. You can choose to classify something else, but make sure your directory is divided into one folder per category, as described above. Ideally, you should have over a hundred images for each category (e.g. >100 images of cats, >100 images of dogs, etc.). The more photos you provide, and the more diverse they are, the more accurate your classifier will become.
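Before training, it can help to check that every category folder really has enough images. The snippet below is a minimal sketch of such a check, not part of the TensorFlow for Poets scripts: the function names, the list of image extensions and the default minimum of 100 are my own assumptions for illustration.

```python
from pathlib import Path

# Assumed set of image file types; adjust to match your own data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_label(image_dir):
    """Map each sub-folder name (the label) to the number of images in it."""
    counts = {}
    for folder in sorted(Path(image_dir).iterdir()):
        if folder.is_dir():
            counts[folder.name] = sum(
                1
                for f in folder.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts


def underpopulated_labels(counts, minimum=100):
    """Return the labels that have fewer images than the suggested minimum."""
    return sorted(label for label, n in counts.items() if n < minimum)
```

Calling count_images_per_label on your image folder and passing the result to underpopulated_labels gives you the categories that probably need more photos.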

Tip: There is a really cool Chrome extension called Fatkun Batch Download Image that bulk-downloads images from Google.

2. Download TensorFlow scripts

All the scripts that we need are kept in the googlecodelabs repository on GitHub. We will need to clone the repository to our computer. Note that you need to have git installed on your computer. Open a terminal and use the following command to clone the repo:

git clone https://github.com/googlecodelabs/tensorflow-for-poets-2

The repo contains the scripts we will run in the following steps, scripts.retrain and scripts.label_image.

Copy the flower_photos folder that contains all your training images into the tf_files folder of the repository. To confirm the contents of your working directory, run the ls command from the root of the repository:

ls tf_files/flower_photos

This should display the folders of flowers that you are about to retrain your classifier on.

3. Retrain the network

As I have mentioned before, these image classification models contain millions of parameters. We are simply building our classifier on top of one of them. In other words, we will train only the final layer of the network. Although we are not required to write any part of the script ourselves, it is highly recommended that we understand some of the parameters it uses.
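To make the idea of training only the final layer concrete, here is a toy sketch in plain Python. It is not how the retrain script is implemented: frozen_features is a hypothetical stand-in for the pre-trained layers, the "images" are short lists of made-up brightness values, and the final layer is an ordinary softmax classifier fitted with gradient descent.

```python
import math


def frozen_features(pixel_values):
    """Stand-in for the pre-trained layers: fixed, never updated during training."""
    mean = sum(pixel_values) / len(pixel_values)
    spread = max(pixel_values) - min(pixel_values)
    return [mean, spread, 1.0]  # the trailing 1.0 acts as a bias input


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def layer_scores(weights, x):
    """Raw score of each class for one feature vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def train_final_layer(features, labels, num_classes, steps=500, lr=0.5):
    """Fit only the final layer's weights; the feature extractor stays frozen."""
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    for _ in range(steps):
        for x, y in zip(features, labels):
            probs = softmax(layer_scores(weights, x))
            for c in range(num_classes):
                error = probs[c] - (1.0 if c == y else 0.0)
                for j in range(dim):
                    weights[c][j] -= lr * error * x[j]
    return weights


def predict(weights, x):
    """Index of the class with the highest score."""
    scores = layer_scores(weights, x)
    return max(range(len(scores)), key=scores.__getitem__)


# Made-up training data: class 0 is "dark" images, class 1 is "bright" images.
images = [[0.1, 0.2, 0.1, 0.0], [0.0, 0.1, 0.2, 0.1],
          [0.9, 0.8, 1.0, 0.9], [1.0, 0.9, 0.8, 0.9]]
labels = [0, 0, 1, 1]
features = [frozen_features(img) for img in images]
weights = train_final_layer(features, labels, num_classes=2)
print([predict(weights, f) for f in features])
```

The point of the sketch is the split of responsibilities: the feature function never changes, and only the small weight table of the final layer is learned from our own images.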

A Little Theory

According to TensorFlow’s Image Retraining documentation, ‘bottleneck’ is an informal term for the layer just before the final layer that performs the actual classification. The bottleneck layer has been trained to produce a set of values that represents each image well enough for the classifier to differentiate between the classes it has been asked to recognize. [Read the documentation for a more detailed explanation.]

The following flag sets the directory that caches all the bottleneck values, so that they don’t have to be recalculated, which saves precious time:

--bottleneck_dir=tf_files/bottlenecks
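The caching idea itself is simple enough to sketch in a few lines. The code below is only an illustration under my own assumptions: expensive_bottleneck is a hypothetical placeholder for running an image through the frozen layers, and the JSON files and folder layout are invented for this example rather than taken from the retrain script.

```python
import json
from pathlib import Path


def expensive_bottleneck(image_bytes):
    """Hypothetical placeholder for the slow pass through the frozen layers."""
    return [len(image_bytes) / 1000.0, (sum(image_bytes) % 251) / 251.0]


def cached_bottleneck(image_path, cache_dir, compute=expensive_bottleneck):
    """Return the bottleneck values, computing them only on the first request."""
    image_path = Path(image_path)
    # One cache file per image, grouped by label folder (assumed layout).
    cache_file = Path(cache_dir) / image_path.parent.name / (image_path.name + ".json")
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    values = compute(image_path.read_bytes())
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(values))
    return values
```

Because every training step reuses the same images, the second and later requests for an image are answered from disk instead of running the network again.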

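The following flags tell the script where to store the downloaded model, the training summaries, the retrained graph and the label file. ${ARCHITECTURE} is a shell variable that is used here only to name sub-directories; if it is unset, it expands to an empty string and the files land directly in tf_files/models/ and tf_files/training_summaries/: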

--model_dir=tf_files/models/"${ARCHITECTURE}" \
--summaries_dir=tf_files/training_summaries/"${ARCHITECTURE}" \
--output_graph=tf_files/retrained_graph.pb \
--output_labels=tf_files/retrained_labels.txt \

And finally, the flag that points to the directory of our training images:

--image_dir=tf_files/flower_photos

Implementation

Combined, these flags give us the following command. Run it in the terminal to start the retraining process: it downloads the pre-trained model, adds a new final layer, and trains that layer on the flower photos we downloaded.

python -m scripts.retrain \
--bottleneck_dir=tf_files/bottlenecks \
--model_dir=tf_files/models/"${ARCHITECTURE}" \
--summaries_dir=tf_files/training_summaries/"${ARCHITECTURE}" \
--output_graph=tf_files/retrained_graph.pb \
--output_labels=tf_files/retrained_labels.txt \
--image_dir=tf_files/flower_photos
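If you would rather launch the retraining from Python than type the command by hand, the same flags can be assembled into an argument list. This is a minimal sketch: build_retrain_command is a hypothetical helper of my own, and the only flags it uses are the ones shown in the command above.

```python
import os


def build_retrain_command(architecture=None):
    """Build the retrain command from this article as an argument list."""
    if architecture is None:
        # Mirror the shell behaviour: an unset variable becomes an empty string.
        architecture = os.environ.get("ARCHITECTURE", "")
    return [
        "python", "-m", "scripts.retrain",
        "--bottleneck_dir=tf_files/bottlenecks",
        f"--model_dir=tf_files/models/{architecture}",
        f"--summaries_dir=tf_files/training_summaries/{architecture}",
        "--output_graph=tf_files/retrained_graph.pb",
        "--output_labels=tf_files/retrained_labels.txt",
        "--image_dir=tf_files/flower_photos",
    ]
```

The returned list can be passed to subprocess.run(command, check=True) from the root of the cloned repository, which avoids any shell quoting issues.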

It will take around 30 minutes to train the classifier on all the images. The time will vary depending on the number of training images that you have provided. [Occasionally, you may be overwhelmed with errors. Don’t be intimidated. I have included solutions to two common errors in the references at the end (items 7 and 8). Solutions to the rest can be found on Stack Overflow and Quora.]

4. Classify Images

Once you have the trained classifier, it is ready to be tested. You could download a new image from one of the categories of flowers that we trained our classifier on, or select a picture from our existing set of training images (a new image is the fairer test, since the classifier has already seen the training images). This time we will call the label_image script. Run the following command to classify the image:

python -m scripts.label_image \
--image=tf_files/flower_photos/daisy/test_image.jpg

Result

You will get a list of all the categories with their corresponding confidence scores. In my run, the classifier reported that test_image is a daisy with ~99% confidence, which means that it predicted pretty accurately.
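For the curious, a ranked list of confidence scores like this can be produced from a model's raw outputs in a few lines. The sketch below is my own illustration, not the code of label_image: the raw scores are made up, and the label names other than daisy are assumed for the example.

```python
import math


def softmax(scores):
    """Turn raw scores into confidence values that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_predictions(labels, raw_scores, top_k=5):
    """Pair each label with its confidence and sort from most to least likely."""
    probs = softmax(raw_scores)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Illustrative values only, not real model output.
labels = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]
raw_scores = [6.0, 0.5, 0.2, 0.1, 0.3]
for label, confidence in rank_predictions(labels, raw_scores):
    print(f"{label}: {confidence:.4f}")
```

The category at the top of the list is the classifier's prediction, and the gap between it and the runner-up tells you how sure the classifier is.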

You have now successfully built a pretty accurate image classifier. Good luck.

References and useful resources:

  1. TensorFlow: How to Retrain an Image Classifier for New Categories
  2. TensorFlow for Poets
  3. Josh Gordon’s ML Recipes
  4. Siraj Raval’s Image Classifier tutorials
  5. tensorflow-for-poets-2 repository
  6. A poet does TensorFlow
  7. TensorFlow ValueError solution
  8. ‘import/input’ refers to an Operation not in the graph: solution
  9. Image Classification Walk-through, Chris Dahms