Published in Nerd For Tech
Transfer Learning with ResNet50 — Lucy (French Bulldog) Classifier

Disclaimer: this article isn’t going to spit out code and steps for how to fine-tune an existing neural network — there are already plenty of such articles, and you can find the code for this project on GitHub. Instead, this is about the thought process behind my project and what I learned from it.

Lucy. Image by author.

Motivation for the Project

Everybody who knows me knows how much I love French Bulldogs. Even those who don’t know me would be able to tell just from looking at my website. I’ve become increasingly interested in computer vision and wanted to do a personal project, but didn’t want to toss yet another MNIST digit, fashion, or Iris classifier onto the internet.

Computer vision tasks vary in level of difficulty, with image classification at the “easy” end and full-blown object detection (which entails both classifying and localizing an object within an image) toward the “hard” end, so I thought it best to start with classification and work my way up. Note that by “level of difficulty” I mean the practical work of building such a classifier or detector, not the conceptual difficulty of the computer vision task itself.

Data and Model Selection

I decided I would need a custom dataset if I were to make a truly unique image classifier. Naturally, I have loads of images of my Frenchie, Lucy, at my fingertips. Despite having many, I certainly don’t have enough to train my own neural network from scratch.

Transfer learning is perfect for such a circumstance. For those who aren’t familiar, transfer learning reuses a network trained on one task as the starting point for another: in this case, by freezing the weights of a pre-trained network and training only a custom classification head on top of it. ResNet50, an existing neural network, was trained on over one million images from the ImageNet database, whose classes include French Bulldogs.
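As a rough illustration, the freeze-and-replace setup can be sketched in tf.keras like this (the input size, optimizer, and single-layer head here are my assumptions for the sketch, not necessarily what the project uses):

```python
import tensorflow as tf

IMG_SHAPE = (224, 224, 3)  # ResNet50's standard input resolution

# Load ResNet50 pre-trained on ImageNet, without its 1000-class head
base = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, input_shape=IMG_SHAPE)
base.trainable = False  # freeze the pre-trained weights

# Attach a small trainable head for the binary Lucy-vs-other task
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

Only the pooling and Dense layers learn anything during training; the frozen base simply extracts the features ImageNet taught it.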

I wanted the network to do more than classify an image of a French Bulldog (as it already could). More specifically, I wanted it to discriminate between Lucy and any other random Frenchie. In addition to labeling my own images of Lucy, I sourced images of random French Bulldogs from the Stanford Dogs Dataset. My dataset consisted of training, validation, and test images for two classes — Lucy and “other frenchie.”

Classification Head

It took quite a bit of experimentation to arrive at the classification head of my final model. In an effort to heed Occam’s Razor, I started with the most parsimonious head possible: a Dense layer of 64 neurons, Dropout and Normalization layers to combat overfitting, and a final classification layer.

I wanted the final validation accuracy to be greater than 0.9. No matter how I tuned the hyperparameters, I simply could not hit this target without adding more Dense layers. As you can see in the Jupyter notebook, I ended up adding three more Dense layers, with corresponding Dropout and Normalization layers after each.
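For concreteness, a head along those lines might look like this in tf.keras (the layer widths and dropout rate are illustrative guesses; the actual values are in the notebook):

```python
import tensorflow as tf
from tensorflow.keras import layers

head = tf.keras.Sequential(name="classification_head")
# Four Dense blocks: the original one plus the three added later,
# each followed by Normalization and Dropout to combat overfitting
for units in (64, 64, 64, 64):  # widths are an assumption
    head.add(layers.Dense(units, activation="relu"))
    head.add(layers.BatchNormalization())
    head.add(layers.Dropout(0.3))
head.add(layers.Dense(1, activation="sigmoid"))  # final classification layer
```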

Yes, I know I could have used cross validation, but I wanted to experiment with the behavior of the network firsthand.

Results

After fifteen epochs of training, my model reaches a training accuracy of 0.95, a validation accuracy of 0.92, and a test accuracy of 0.96 (the higher test accuracy is likely a result of the test set being smaller than the training and validation sets).

Loss decreases steadily with each epoch and accuracy increases incrementally, as can be seen in Figures 1 and 2 below.

Figure 1. Validation loss decreases with each epoch.
Figure 2. Accuracy on the validation set increases to a peak value of 0.95 and a final value of 0.92.

Learnings

Here is what I learned from what I thought would be a relatively simple project:

  1. Data processing can take more time than actual model training and tuning. Finding, labeling, splitting, rescaling, and resizing images, and organizing them into the correct folder structure (at least for ImageDataGenerators, which I used here), can be time-consuming and challenging. Moving forward, I will never underestimate the complexity this can add to a project.
  2. Training and validation datasets need to be relatively similar. I knew this before the project, but I experienced it firsthand when I began training with images scraped from Unsplash (before I switched to the Stanford dataset). The network’s training and validation accuracies were way too high to be believable. On closer examination, I found that the Unsplash pictures were very high quality, with the Frenchie front and center in each and very consistent lighting, which is not representative of my images of Lucy.
  3. Image classification (and machine learning tasks generally) is easier to understand than to implement. Having a deep understanding of how a neural network like ResNet50 classifies images is one thing, but actually building a classifier requires an entirely different skill set. Doing it takes more than just knowing how it works; that’s why I did this project. I think I’m ready for the next one (object detection).
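To make the folder-structure point in item 1 concrete, here is a minimal sketch of the layout that ImageDataGenerator’s flow_from_directory expects (the paths, class-folder names, and generated dummy images are purely illustrative):

```python
import os
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# flow_from_directory infers labels from one sub-folder per class:
#   data/train/lucy/...   data/train/other_frenchie/...
# (and similarly for data/val and data/test)
for split in ("train", "val", "test"):
    for cls in ("lucy", "other_frenchie"):
        folder = os.path.join("data", split, cls)
        os.makedirs(folder, exist_ok=True)
        dummy = (np.random.rand(224, 224, 3) * 255).astype("uint8")
        tf.keras.preprocessing.image.save_img(
            os.path.join(folder, "example.jpg"), dummy)

gen = ImageDataGenerator(rescale=1.0 / 255)  # rescale pixels to [0, 1]
train = gen.flow_from_directory(
    "data/train", target_size=(224, 224),
    class_mode="binary", batch_size=32)
```

Rescaling and resizing then happen on the fly as the generator yields batches, which is why getting the folder layout right up front matters so much.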

Stay tuned.

NFT is an Educational Media House. Our mission is to bring the invaluable knowledge and experiences of experts from all over the world to the novice. To know more about us, visit https://www.nerdfortech.org/.

Jgalvin

Former consultant | Emotion AI | Analytics Software | Skiing, Golf, Mountain Biking, and French Bulldogs
