Photo by Hunter Harritt on Unsplash

Transfer Learning for Robust Image Classification

Learn how to use Google Colab with VScode and submit your solution to Kaggle

Marcello Politi
Towards Data Science
5 min read · Nov 19, 2021


It is well known that Google Colab is a very useful tool for developing machine learning projects using the hardware made available by Google. It is also true that it is sometimes complicated to structure a project in a single Colab notebook, while it would be much easier to manage everything in an IDE like VScode. Well, there is a solution!

Open a new Colab notebook and enter the following commands:
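
The original snippet is not reproduced here, so below is a minimal sketch of one common way to do this, assuming the colabcode package (a wrapper around code-server) is the launcher used:

```python
# Sketch (assumption): launch VS Code on top of Colab via the colabcode package
!pip install colabcode

from colabcode import ColabCode
# Starts a code-server instance and prints a URL where the editor is served
ColabCode(port=10000)
```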

Now, in the editor that has been launched in the web page, select and open the content folder.

Image by Author

You’re officially using VScode on top of Google Colab!

Dataset

We will use the dogs vs cats dataset (which has a free license) that you can find at the following link: https://www.kaggle.com/c/dogs-vs-cats-redux-kernels-edition/overview. I will show you how to create a model to solve this binary classification task and submit your solution.

The first thing to do in order to download this dataset is to log in to Kaggle with your credentials and then download the kaggle.json file, which you can get by clicking on the Create New API Token button.

Image by Author

Now upload the kaggle.json file you just downloaded into your /content working directory. Then create the .py files that you will need for this project: utils.py, deeplearning_models.py, model_training.py and predictions.py.

Image by Author
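
Before the Kaggle CLI can download anything, the credentials file must sit where the CLI looks for it. A minimal sketch, assuming the kaggle package is pre-installed (as it is on Colab) and kaggle.json was uploaded to /content:

```python
# Move kaggle.json to ~/.kaggle, where the Kaggle CLI expects it
import os
import shutil

kaggle_dir = os.path.expanduser("~/.kaggle")
os.makedirs(kaggle_dir, exist_ok=True)
shutil.copy("/content/kaggle.json", os.path.join(kaggle_dir, "kaggle.json"))
# The CLI refuses world-readable credentials, so restrict the permissions
os.chmod(os.path.join(kaggle_dir, "kaggle.json"), 0o600)
```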

Utils

Let’s start with the utils.py file. We first define a function to download the compressed folders containing the images and extract them appropriately.
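
A possible sketch of this helper; the archive names (train.zip and test.zip inside the competition zip) follow the Kaggle competition page:

```python
# utils.py -- download the competition data and extract the image folders
import os
import zipfile

COMPETITION = "dogs-vs-cats-redux-kernels-edition"

def download_and_extract(dest="/content"):
    # Download the single competition archive via the Kaggle CLI
    os.system(f"kaggle competitions download -c {COMPETITION} -p {dest}")
    with zipfile.ZipFile(os.path.join(dest, f"{COMPETITION}.zip")) as z:
        z.extractall(dest)
    # The archive contains train.zip and test.zip; extract both
    for inner in ("train.zip", "test.zip"):
        with zipfile.ZipFile(os.path.join(dest, inner)) as z:
            z.extractall(dest)
```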

You should now see a directory structure like this:

Image by Author

All the images that you will find in the training folder must be split between the subfolders train and val, which contain the images of our training and validation sets respectively.
We write a function that performs this split, based on the sklearn library, as sketched below.
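
A sketch of such a split function, assuming an 80/20 split and relying on the fact that each file name encodes its label (cat.0.jpg, dog.0.jpg, ...):

```python
# utils.py -- split /content/train into train/ and val/ subfolders per class
import os
import shutil
from sklearn.model_selection import train_test_split

def split_train_val(src="/content/train", dst="/content", val_size=0.2):
    files = os.listdir(src)
    train_files, val_files = train_test_split(
        files, test_size=val_size, random_state=42)
    for subset, names in (("train", train_files), ("val", val_files)):
        for name in names:
            # The label is the file-name prefix: cat.0.jpg or dog.0.jpg
            label = "cats" if name.startswith("cat") else "dogs"
            out_dir = os.path.join(dst, subset, label)
            os.makedirs(out_dir, exist_ok=True)
            shutil.move(os.path.join(src, name), os.path.join(out_dir, name))
```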

In the same file, we need two more functions. The first one is get_avg_size, which iterates over all the images of the training set to compute their average height and width. This is needed because we must specify an input size for our convolutional network; since each image has a different size, taking the average seems a good compromise.
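
A minimal sketch of get_avg_size, assuming Pillow is used to read the image dimensions:

```python
# utils.py -- average height and width of the training images
import os
from PIL import Image

def get_avg_size(folder="/content/train"):
    heights, widths = [], []
    for root, _, files in os.walk(folder):
        for name in files:
            if name.lower().endswith((".jpg", ".jpeg", ".png")):
                with Image.open(os.path.join(root, name)) as img:
                    w, h = img.size  # PIL returns (width, height)
                    widths.append(w)
                    heights.append(h)
    return sum(heights) // len(heights), sum(widths) // len(widths)
```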

Now that we have cleaned up our file system and we know the average size of our pictures, we can use our data to actually create the training, validation and test sets to feed to our network. TensorFlow provides us with the ImageDataGenerator class to write basic data processing in a very simple way.

Both the training and validation preprocessors in the sketch below rescale the input image pixels by dividing them by 255.

The other input parameters of ImageDataGenerator let us modify the images and increase the amount of data by adding slightly modified copies of the existing ones (Data Augmentation). For example, rotation_range rotates images by a random angle of up to the given number of degrees.

You can get an idea of all the arguments used for Data Augmentation by reading the Tensorflow documentation: https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator

Notice that the preprocessor for the validation data has no data augmentation features, because we want to leave the validation set unchanged to better validate our model.

The same thing should naturally be done for the test set generator.
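
A sketch of the three generators, assuming the directory layout produced above; the batch size and augmentation values are illustrative choices, and avg_h and avg_w come from get_avg_size:

```python
# utils.py -- data pipelines built on ImageDataGenerator
from tensorflow.keras.preprocessing.image import ImageDataGenerator

avg_h, avg_w = get_avg_size("/content/train")

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,      # scale pixel values to [0, 1]
    rotation_range=40,      # random rotation of up to 40 degrees
    width_shift_range=0.2,
    height_shift_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
)
# Validation and test preprocessors: rescaling only, no augmentation
val_datagen = ImageDataGenerator(rescale=1.0 / 255)
test_datagen = ImageDataGenerator(rescale=1.0 / 255)

train_gen = train_datagen.flow_from_directory(
    "/content/train", target_size=(avg_h, avg_w),
    batch_size=32, class_mode="binary")
val_gen = val_datagen.flow_from_directory(
    "/content/val", target_size=(avg_h, avg_w),
    batch_size=32, class_mode="binary")
test_gen = test_datagen.flow_from_directory(
    "/content", classes=["test"],  # test images extracted to /content/test
    target_size=(avg_h, avg_w), batch_size=32,
    class_mode=None, shuffle=False)  # keep order for the submission file
```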

Deep Learning Model

We now move on to creating our model in the deeplearning_models.py file.
In solving most Kaggle tasks you don’t write a network from scratch: you use a pre-trained model, called base_model, and adapt it to the task at hand. Think of base_model as a model that has already learned to recognize important features in images. What we want to do is adapt it by adding a head composed of further dense layers. In our case, the last dense layer will be composed of a single neuron with a sigmoid activation function, so that the output is a probability between 0 and 1 (cat or dog).

Image by Author

We must be careful not to retrain the base model, which has already been trained.
The base_model that we will import is MobileNetV2 (https://www.tensorflow.org/api_docs/python/tf/keras/applications/mobilenet_v2/MobileNetV2), which is widely used in image classification tasks.
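
A minimal sketch of such a builder function; the pooling layer and the 128-unit hidden layer are illustrative assumptions, not necessarily the exact head used in the original gist:

```python
# deeplearning_models.py -- pre-trained MobileNetV2 base plus a small head
import tensorflow as tf

def build_model(input_shape):
    base_model = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights="imagenet")
    base_model.trainable = False  # freeze the already-trained base

    model = tf.keras.Sequential([
        base_model,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # output probability
    ])
    return model
```

The input shape can be built from the average sizes computed earlier, e.g. build_model((avg_h, avg_w, 3)).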

Training the model

Let’s now work on the model_training.py file.

In the training step, we are going to use the ModelCheckpoint callback, which saves the best model found so far (evaluated on the validation loss) at the end of each epoch. The EarlyStopping callback is instead used to interrupt the training phase if there has been no improvement for patience=x consecutive epochs. We compile and fit the model as usual. Remember to include the two callbacks.
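
A sketch of the training script; the optimizer, loss and epoch count are illustrative assumptions:

```python
# model_training.py -- compile and fit with the two callbacks
import tensorflow as tf

model = build_model((avg_h, avg_w, 3))

checkpoint = tf.keras.callbacks.ModelCheckpoint(
    "best_model.h5", monitor="val_loss", save_best_only=True)
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True)

model.compile(optimizer="adam",
              loss="binary_crossentropy",  # log loss, as scored by Kaggle
              metrics=["accuracy"])

history = model.fit(train_gen,
                    validation_data=val_gen,
                    epochs=20,
                    callbacks=[checkpoint, early_stopping])
```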

Predictions

Let's write our prediction procedure in predictions.py.
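
A sketch of the prediction script, reusing the test generator defined earlier. The submission format for this competition is a CSV with an id column and a label column holding the predicted probability that the image is a dog:

```python
# predictions.py -- predict the test set and write submission.csv
import os
import pandas as pd
import tensorflow as tf

model = tf.keras.models.load_model("best_model.h5")  # best checkpoint

probs = model.predict(test_gen).ravel()
# File names look like test/1.jpg; the numeric part is the submission id
ids = [int(os.path.splitext(os.path.basename(f))[0])
       for f in test_gen.filenames]

submission = pd.DataFrame({"id": ids, "label": probs}).sort_values("id")
submission.to_csv("submission.csv", index=False)
```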

After predicting the test images, download the generated submission.csv file and upload it to Kaggle to complete the challenge!

What position did you get on the Leaderboard?

The End

Marcello Politi

LinkedIn, Twitter, CV
