TUTORIAL: How to Classify Pothole Images Using Transfer Learning

Photo by Marc-Olivier Jodoin on Unsplash

In this tutorial, we will learn how to complete an image classification challenge with the open-source TensorFlow and TensorFlow Hub libraries without writing any new code of our own. This is the first in a series on pothole detection techniques for autonomous vehicles; subsequent articles will extend the task to instance detection and pixel-level segmentation. Along the way, you’ll learn how to use Google’s TensorFlow framework and the TensorFlow Hub platform for transfer learning.

Overview

  1. Introduction to Image classification
  2. Introduction to pre-trained models on TensorFlow Hub
  3. Transfer learning with TensorFlow
  4. Image Classification (Pothole)

Prerequisites

  1. Basic knowledge of Python for deep learning development (Anaconda distribution is recommended).
  2. Basic knowledge of the TensorFlow library, as the scripts are written with it. This is recommended if you want to read and understand the code.
  3. Basic knowledge of Convolutional Neural Networks.

Introduction to Transfer Learning

If you are a compulsive ML learner like myself, you will be used to iterating on a model until it gives a desirable result. Now, if your initial model takes 10 hours to train and you then need to change some hyper-parameters, perform some data augmentation, or even change the architecture of the model, that is another 10 hours, say, for the second iteration, 10 more for the third, and so on. This quickly becomes very frustrating.

Another frustrating experience for ML engineers is the need for a large enough image dataset to train the model on, as Convolutional Neural Networks only do a good job when there are a lot of images to learn from (many thousands, or even millions). This is where transfer learning comes in.

In practice, researchers typically prefer not to train a Convolutional Neural Network from scratch, especially when the challenges stated above would be encountered. Instead, they adopt a transfer learning approach: a CNN already trained on a huge dataset such as ImageNet (for example Inception, AlexNet, VGGNet, or GoogLeNet) is used either as an initialization or as a feature extractor.

A major approach to transfer learning is to fine-tune a pre-trained CNN by continuing to train the weights of selected layers on the new dataset. Yosinski et al. (2014) observed that the earlier layers of a CNN learn more general features that can be applied to different tasks, while layers towards the end of the architecture become increasingly specific to the categories defined in the original training dataset.

As for feature extraction, the last fully-connected layer of the CNN, which outputs the scores of the various classes, is removed, and the remaining layers are treated as a fixed feature extractor for the new dataset.
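The feature-extraction idea can be illustrated with a toy sketch in plain Python. This is not how TensorFlow implements it: frozen_features is a hypothetical stand-in for a pre-trained network with its last layer removed, the tiny "images" are made-up lists of pixel values, and the new final layer is a simple logistic classifier. The point is only that the extractor is never updated while the new layer is trained on top of it.

```python
import math


def frozen_features(image):
    """Hypothetical stand-in for a pre-trained CNN without its last layer.

    Maps an 'image' (a list of pixel values) to a fixed feature vector.
    Nothing in this function is changed by training.
    """
    mean = sum(image) / len(image)
    spread = max(image) - min(image)
    return [mean, spread, 1.0]  # the constant 1.0 acts as a bias input


def predict(weights, features):
    """New final layer: logistic score that the image shows a pothole."""
    z = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train_head(examples, steps=2000, learning_rate=0.5):
    """Train only the new final layer on features from the frozen extractor."""
    data = [(frozen_features(image), label) for image, label in examples]
    weights = [0.0, 0.0, 0.0]
    for _ in range(steps):
        gradient = [0.0, 0.0, 0.0]
        for features, label in data:
            error = predict(weights, features) - label
            for i, f in enumerate(features):
                gradient[i] += error * f / len(data)
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


# Made-up toy data: label 1 = pothole (high contrast), 0 = smooth road.
EXAMPLES = [
    ([0.9, 0.1, 0.8, 0.2], 1),
    ([0.85, 0.15, 0.9, 0.2], 1),
    ([0.5, 0.55, 0.5, 0.45], 0),
    ([0.6, 0.62, 0.58, 0.6], 0),
]

if __name__ == "__main__":
    head = train_head(EXAMPLES)
    new_image = [1.0, 0.0, 0.9, 0.1]
    score = predict(head, frozen_features(new_image))
    print("pothole" if score > 0.5 else "no pothole")
```

Fine-tuning differs from this only in that some of the layers inside the extractor would also be updated during training, rather than just the new final layer.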

TensorFlow Hub contains modules (pre-trained models) that can be used for transfer learning. A module contains the weights of a particular network and its TensorFlow graph, along with its assets. Follow the TensorFlow Hub installation guide to get TensorFlow Hub on your computer.

Inception v3

Inception v3 is a Convolutional Neural Network from the Google Brain team that was trained on over a million images belonging to about 1,000 different categories from the ImageNet database. The model has learnt a wide range of feature representations, which can be extended to new classification challenges outside the original 1,000 classes. We will use this model for our classification challenge.

Pothole Image Classification with Tensorflow

Training Dataset

Since the intention is to classify images that contain potholes, the ImageNet "pothole, chuckhole" category is used. This category contains about 1,300 images of potholes and serves as the positive set, while the negative set consists of images from random categories.

Training Procedures

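We create a folder for the training code alongside the folder that holds the training dataset (~/training_data in the commands below). The retrain.py script expects the dataset folder to contain one subfolder per class and uses the subfolder names as labels. With this in place, no additional code has to be written.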

mkdir ~/training_code
cd ~/training_code
curl -LO https://github.com/tensorflow/hub/raw/r0.1/examples/image_retraining/retrain.py
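Before training, it can help to confirm that the dataset folder is laid out with one subfolder of images per class. The following is a minimal sketch of such a check in plain Python; the function names, the class names pothole and no_pothole, and the minimum of 20 images per class are illustrative assumptions, not part of retrain.py.

```python
from pathlib import Path
import tempfile

# Assumption: only JPEG files are counted as training images.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def count_images_per_class(image_dir):
    """Return {class_name: number_of_images} for each subfolder of image_dir."""
    counts = {}
    for class_dir in sorted(Path(image_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for path in class_dir.iterdir()
                if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def check_layout(image_dir, min_images=20):
    """Return a list of human-readable problems; an empty list means OK."""
    counts = count_images_per_class(image_dir)
    problems = []
    if len(counts) < 2:
        problems.append("need at least two class subfolders")
    for name, count in counts.items():
        if count < min_images:
            problems.append(f"class '{name}' has only {count} images")
    return problems


if __name__ == "__main__":
    # Build a throwaway folder with placeholder files to show the check.
    with tempfile.TemporaryDirectory() as tmp:
        for class_name in ("pothole", "no_pothole"):
            class_dir = Path(tmp) / class_name
            class_dir.mkdir()
            for i in range(3):
                (class_dir / f"img{i}.jpg").write_bytes(b"placeholder")
        print(count_images_per_class(tmp))
        print(check_layout(tmp, min_images=2))
```

To check the real dataset, call check_layout on the expanded path of ~/training_data (for example via Path.home() / "training_data").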

To train the model, simply run the following command:

python retrain.py --image_dir ~/training_data

Analysis (What happens under the hood)

First, the script goes through all the images in the folder and creates bottlenecks (extracted feature vectors) for them, which are then cached on disk, by default in /tmp/bottleneck. A bottleneck is the output of the layer just before the final classification layer: a compact summary of the image that carries enough information for the new final layer to classify the new categories.
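The reason for caching is that the feature extractor never changes during training, so each image only needs to pass through it once. Below is a minimal sketch of that caching pattern in plain Python. It is not the code in retrain.py: toy_extractor is a hypothetical stand-in for the expensive pass through the network, and the JSON cache file format is an assumption made for the example.

```python
import json
import tempfile
from pathlib import Path


def toy_extractor(image_path):
    """Hypothetical stand-in for the expensive pass through the frozen network.

    It only derives two numbers from the file's bytes.
    """
    data = Path(image_path).read_bytes()
    return [float(len(data)), float(sum(data) % 256)]


def get_or_create_bottleneck(image_path, cache_dir, extractor=toy_extractor):
    """Return the feature vector for an image, computing it at most once."""
    cache_file = Path(cache_dir) / (Path(image_path).name + ".json")
    if cache_file.exists():
        # Cache hit: reuse the stored vector and skip the extractor.
        return json.loads(cache_file.read_text())
    features = extractor(image_path)
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(features))
    return features


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        image = Path(tmp) / "road.jpg"
        image.write_bytes(b"not a real image")
        cache = Path(tmp) / "bottleneck"
        first = get_or_create_bottleneck(image, cache)   # computed and stored
        second = get_or_create_bottleneck(image, cache)  # read back from disk
        print(first == second)
```

Because every training step reuses the same images many times, reading a cached vector instead of re-running the network is what keeps the later training phase fast.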

After the bottlenecks are created, the actual training begins. The training consists of 4000 steps; each step takes 10 images at random from the training set and feeds their bottlenecks to the final layer for classification. As training progresses, the script reports the training accuracy, the cross entropy (the loss, which monitors the progress of training), and the validation accuracy (measured on data not used for training). Training takes about 30 minutes, depending on the size of the training data and the capability of the system.
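To make the two reported quantities concrete, here is a small sketch in plain Python of how cross entropy and accuracy can be computed for one batch. The function names and the example scores are made up for illustration; the training script computes these with TensorFlow operations rather than with this code.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    largest = max(logits)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, true_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[true_index])


def batch_metrics(batch_logits, batch_labels):
    """Return (mean cross entropy, accuracy) for one batch."""
    losses = [
        cross_entropy(logits, label)
        for logits, label in zip(batch_logits, batch_labels)
    ]
    correct = sum(
        1
        for logits, label in zip(batch_logits, batch_labels)
        if logits.index(max(logits)) == label
    )
    return sum(losses) / len(losses), correct / len(batch_labels)


if __name__ == "__main__":
    # Two classes: index 0 = pothole, index 1 = no pothole (made-up scores).
    logits = [[2.0, 0.0], [0.0, 1.0], [3.0, -1.0]]
    labels = [0, 0, 0]
    loss, accuracy = batch_metrics(logits, labels)
    print(f"cross entropy: {loss:.3f}, accuracy: {accuracy:.2f}")
```

Cross entropy falls as the model puts more probability on the correct class, so it keeps improving even when the accuracy, which only counts right or wrong, stays flat.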

Results

To use the trained model to classify a new image, use the following commands:

curl -LO https://github.com/tensorflow/tensorflow/raw/master/tensorflow/examples/label_image/label_image.py
python label_image.py \
--graph=/tmp/output_graph.pb --labels=/tmp/output_labels.txt \
--input_layer=Placeholder \
--output_layer=final_result \
--image=$HOME/image_folder/image_filename.jpg

On a pothole image supplied to the model, the following output was given:

To visualize the training process, the following command launches TensorBoard, which can then be viewed in the browser at localhost:6006:

tensorboard --logdir /tmp/retrain_logs

To achieve better results, you can adjust some of the hyper-parameters in the training script, as suggested in the literature. These include the batch size, the number of iterations, the learning rate, and the use of regularization. Readers are advised to consult the literature for effective ways to increase performance.
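Several of these hyper-parameters can be passed to retrain.py on the command line. The sketch below assembles such a command in Python without running it. The helper build_retrain_command and the values used are illustrative assumptions, and the flag names are given as I recall them from the script's argument parser, so confirm them with python retrain.py --help before relying on them.

```python
def build_retrain_command(image_dir, training_steps, learning_rate, batch_size):
    """Assemble a retrain.py command line as a list of arguments.

    Hypothetical helper: it only builds the list and does not run training.
    """
    return [
        "python", "retrain.py",
        "--image_dir", image_dir,
        "--how_many_training_steps", str(training_steps),
        "--learning_rate", str(learning_rate),
        "--train_batch_size", str(batch_size),
    ]


if __name__ == "__main__":
    # Illustrative values only; suitable settings depend on the dataset.
    command = build_retrain_command(
        image_dir="~/training_data",
        training_steps=8000,
        learning_rate=0.005,
        batch_size=32,
    )
    # None of the parts contain spaces, so a plain join is safe to paste.
    print(" ".join(command))
```

Changing one setting at a time and comparing the validation accuracy between runs makes it easier to see which adjustment actually helped.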

Conclusion

The aim of this short tutorial is to show deep learning beginners what they can achieve with convolutional neural networks without investing much time and energy. While it serves as an introduction to deep learning through transfer learning, I hope it will encourage readers to dive into different machine learning techniques to achieve better results. Readers should watch out for subsequent articles on simplified deep learning techniques.

Resources

  1. https://www.tensorflow.org/tutorials/image_retraining
  2. Yosinski, J., Clune, J., Bengio, Y. and Lipson, H. (2014). How transferable are features in deep neural networks? Advances in Neural Information Processing Systems, 27, pp. 3320–3328.
  3. https://arxiv.org/pdf/1512.00567.pdf