The Art of Transfer Learning (Part-II)

Harish Dasari · Published in Analytics Vidhya · 11 min read · Sep 5, 2020

Hello, fellow readers, it’s very good to see you all back. Today’s blog post is the second installment of my previous post on transfer learning; in case you missed it, I recommend going through it first, which you can do by clicking on the following link here. In this post, I will implement the first type of transfer learning technique explained in my previous blog, “Network as an arbitrary feature extractor”, in Keras and the Python programming language.

After completing this blog, you will know how to:

  • Write a custom class for serializing features to the HDF5 dataset format.
  • Extract features from a new image dataset using a pre-trained model that was not originally trained on it.
  • Train an image classifier on the extracted features and display the classifier’s performance on the different classes.

So let’s get started right away. If you want to skip the reading and go straight to the code, you can follow this link.


The directory structure for the following project is:

The outer directory structure.
Inner directory structure

In the above two images, you can see the directory structure of our feature-extraction project; we will explore each file as we move through this blog. A rough sketch of the layout is shown below.
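Based on the directory screenshots and the descriptions in this post, the layout looks roughly like this (folder and file names are illustrative and may differ slightly in the repository):

```
transfer-learning/
├── Dataset/
│   └── animals/
│       ├── cats/
│       ├── dogs/
│       ├── panda/
│       └── hdf5/
│           └── features.hdf5
├── utils/
│   └── DatasetWriter/
│       └── HDF5DatasetWriter.py
├── feature_extractor.py
└── training.py
```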

  1. Dataset collection:
Cat, dog, and panda sample images from animal dataset.

For solving any AI-related task, we first have to gather relevant data, so create a new folder named Dataset. I used the Animals dataset from Kaggle, which has three classes: dog, cat, and panda. I kept the images under a folder named animals, which you can see in the above image; it also contains a folder named hdf5, which will hold the corresponding features.hdf5 file. Following this layout, you can keep multiple category datasets.

2. HDF5 dataset creation script:

Next, there is a utils folder that will contain all the utility files needed for the project. Inside it, under the DatasetWriter folder, is our first script, HDF5DatasetWriter.py. Open a code editor of your choice and create a script named HDF5DatasetWriter.py (or anything similar).

HDF5DatasetWriter.py code snippet 2.1

Throughout the blog, I will explain the main gist and core functionality of each method or function, because a line-by-line code explanation would run very long and you could easily get lost, so please bear with me for a while. As you can see from the above code snippet, I have created a class named HDF5DatasetWriter whose constructor accepts four parameters, two of which are optional.

dims: The dims parameter controls the dimension or shape of the data we will be storing in the dataset. Think of dims as the .shape of a NumPy array. If we were storing the (flattened) raw pixel intensities of the 28 × 28 = 784 MNIST images, then dims=(70000, 784), as there are 70,000 examples in MNIST, each with a dimensionality of 784. In the upcoming scripts, we will use the VGG16 network for feature extraction; taking the output of its final POOL layer, each image yields a 512 × 7 × 7 = 25,088-dimensional feature vector when flattened, so dims=(N, 25088), where N is the total number of images in our dataset.

outputPath: This is a required parameter specifying the path where our output HDF5 file will be stored on disk.

dataKey: An optional parameter that specifies the name of the dataset inside the HDF5 file that will store the data; it defaults to “images”, as we are mainly dealing with images.

bufSize: Another optional parameter; bufSize controls the size of our in-memory buffer, which defaults to 1,000 feature vectors/images. Once we reach bufSize, we flush the buffer to the HDF5 dataset.

Further on, we open the HDF5 database and create two datasets, one (named by dataKey) to store the images/features and another to store the class labels, and then initialize the buffers.

add method: The add method requires two parameters: the rows we are adding to the dataset and their corresponding class labels. Both the rows and labels are appended to their respective buffers, and once the buffer is full, the flush method is called to write the buffers to the file and reset them.

flush method: The flush method keeps track of the current index into the next available row where we can store data (without overwriting existing data), applies NumPy array slicing to write the buffered data and labels into the datasets, and then resets the buffers.
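The class itself needs nothing beyond h5py. Below is a minimal sketch of what snippet 2.1 implements, reconstructed from the description above; the exact code lives in the repository linked at the end of this post:

```python
# HDF5DatasetWriter.py (sketch of snippet 2.1)
import os
import h5py

class HDF5DatasetWriter:
    def __init__(self, dims, outputPath, dataKey="images", bufSize=1000):
        # refuse to overwrite an existing database
        if os.path.exists(outputPath):
            raise ValueError("The supplied outputPath already exists: "
                + outputPath)

        # open the HDF5 database and create two datasets: one to hold
        # the images/features and another to hold the integer labels
        self.db = h5py.File(outputPath, "w")
        self.data = self.db.create_dataset(dataKey, dims, dtype="float")
        self.labels = self.db.create_dataset("labels", (dims[0],), dtype="int")

        # initialize the in-memory buffer and the index of the next
        # free row in the datasets
        self.bufSize = bufSize
        self.buffer = {"data": [], "labels": []}
        self.idx = 0

    def add(self, rows, labels):
        # add the rows and labels to the buffer, flushing to disk
        # once the buffer is full
        self.buffer["data"].extend(rows)
        self.buffer["labels"].extend(labels)
        if len(self.buffer["data"]) >= self.bufSize:
            self.flush()

    def flush(self):
        # write the buffered rows to the next available slice of the
        # datasets, advance the index, and reset the buffer
        i = self.idx + len(self.buffer["data"])
        self.data[self.idx:i] = self.buffer["data"]
        self.labels[self.idx:i] = self.buffer["labels"]
        self.idx = i
        self.buffer = {"data": [], "labels": []}
```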

HDF5DatasetWriter.py code snippet 2.2

The above code snippet contains the remaining two methods:

storeClassLabels method: Calling this method stores the raw string names of the class labels in a separate dataset.

close method: This method checks whether any entries remaining in the buffer need to be flushed to disk and then closes the database.
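A sketch of these two remaining methods (snippet 2.2), continuing the class from the previous sketch:

```python
    # (continuing the HDF5DatasetWriter class sketched above)
    def storeClassLabels(self, classLabels):
        # create a separate dataset holding the raw label strings
        dt = h5py.special_dtype(vlen=str)
        labelSet = self.db.create_dataset("label_names",
            (len(classLabels),), dtype=dt)
        labelSet[:] = classLabels

    def close(self):
        # flush any entries still sitting in the buffer, then close
        # the database
        if len(self.buffer["data"]) > 0:
            self.flush()
        self.db.close()
```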

As you can see, these methods do not require any fancy deep-learning library and perform no deep-learning-specific function; this is simply a class that stores data in the HDF5 format.

3. The feature extraction process:

How a deep learning system builds its understanding of images: from edges and textures in the first layers to patterns, parts, and objects in deeper layers.

Now that our HDF5DatasetWriter is implemented, we can move on to actually extracting features using a pre-trained Convolutional Neural Network. Fire up the editor, define a Python script named feature_extractor.py, and follow along.

3.1 feature_extractor.py code snippet

In the above code snippet, I first imported the necessary libraries and then declared an argparse parser to take the different input parameters: the path to the input dataset, the path to the HDF5 file, the batch size for images, and the buffer size for the writer from the previous script. The list of image paths is then put into random order by shuffling, which makes it easy to create the training and testing splits later. The class labels are extracted from the image paths; for example, if a cat image is chosen after shuffling, its path will look like “Dataset\animals\cats\cats_00026”, so the second-to-last component separated by “\”, which is “cats”, is encoded as an integer (we’ll perform one-hot encoding during the training process) using LabelEncoder() and stored in the labels variable.
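A condensed sketch of this first part of the script is shown below. I am assuming the imutils package for gathering the image paths and deriving the HDF5DatasetWriter import path from the directory layout above; adjust both to match your setup:

```python
# feature_extractor.py (sketch, part 1: arguments, shuffling, labels)
from tensorflow.keras.applications import VGG16, imagenet_utils
from tensorflow.keras.preprocessing.image import img_to_array, load_img
from sklearn.preprocessing import LabelEncoder
from utils.DatasetWriter.HDF5DatasetWriter import HDF5DatasetWriter
from imutils import paths
import numpy as np
import argparse
import random
import os

ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True,
    help="path to the input dataset")
ap.add_argument("-o", "--output", required=True,
    help="path to the output HDF5 file")
ap.add_argument("-b", "--batch-size", type=int, default=32,
    help="batch size of images passed through the network")
ap.add_argument("-s", "--buffer-size", type=int, default=1000,
    help="size of the in-memory feature buffer")
args = vars(ap.parse_args())

# grab the image paths and shuffle them so the 75/25 split made
# later in training.py is random
imagePaths = list(paths.list_images(args["dataset"]))
random.shuffle(imagePaths)

# the class label is the parent folder name, e.g.
# Dataset/animals/cats/cats_00026.jpg -> "cats"
labels = [p.split(os.path.sep)[-2] for p in imagePaths]
le = LabelEncoder()
labels = le.fit_transform(labels)
```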

3.2 feature_extractor.py code snippet

Now that the labels are arranged, we load the pre-trained VGG16 model from the Keras library up to the final POOL layer, ignoring the final FC layers, by writing VGG16(weights="imagenet", include_top=False). The ImageNet weights are used here, so we are doing exactly what was explained in my previous blog and treating VGG16 as an arbitrary feature extractor. An HDF5DatasetWriter object named “dataset” is created with the parameter values (dims=(len(imagePaths), 512 * 7 * 7), args["output"], dataKey="features", bufSize=args["buffer_size"]); please refer to the HDF5DatasetWriter section above for an in-depth explanation of each parameter. The string names of the class labels, in the label encoder’s order, are then stored in the “dataset” object.
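Continuing the same script, loading the network and preparing the writer might look like this:

```python
# feature_extractor.py (sketch, part 2: model and dataset writer)
# load VGG16 with ImageNet weights, chopping off the FC head
model = VGG16(weights="imagenet", include_top=False)

# the final POOL output flattens to 512 * 7 * 7 = 25,088 values
dataset = HDF5DatasetWriter((len(imagePaths), 512 * 7 * 7),
    args["output"], dataKey="features", bufSize=args["buffer_size"])
dataset.storeClassLabels(le.classes_)
```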

Now it’s time to perform the actual feature extraction. Lines 57 to 77 of the snippet are basically a pair of nested for loops: in the outer loop, we iterate over our imagePaths in batches of --batch-size. Lines 59 and 60 extract the image paths and labels for the corresponding batch, while Line 61 initializes a list to store the images about to be loaded and fed into VGG16. Preparing an image for feature extraction is exactly the same as preparing an image for classification via a CNN: in the inner loop, we iterate over each image path in the batch. Each image is loaded from disk and converted to a Keras-compatible array (Lines 67 and 68), preprocessed (Lines 73 and 74), and added to batchImages (Line 77). To obtain the feature vectors for the images in batchImages, all we need to do is call the .predict method of the model.

3.3 feature_extractor.py code snippet

We use NumPy’s .vstack method on Line 81 to “vertically stack” our images so that they have the shape (N, 224, 224, 3), where N is the size of the batch. Passing batchImages through our network yields our actual feature vectors; remember, we chopped off the fully connected layers at the head of VGG16, so we are left with the values after the final max-pooling operation (Line 82). That POOL output contains 512 filters, each of size 7 × 7 (shape (N, 7, 7, 512) under Keras’s default channels-last ordering). To treat these values as a feature vector, we need to flatten them into an array with shape (N, 25088), which is exactly what Line 86 accomplishes. Line 89 adds our features and batchLabels to our HDF5 dataset, and the final lines handle closing it.
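Putting the extraction loop together, a sketch of this final part of the script:

```python
# feature_extractor.py (sketch, part 3: the extraction loop)
bs = args["batch_size"]

for i in range(0, len(imagePaths), bs):
    # extract the paths and labels for the current batch
    batchPaths = imagePaths[i:i + bs]
    batchLabels = labels[i:i + bs]
    batchImages = []

    for imagePath in batchPaths:
        # load the image, resize it to VGG16's 224x224 input size,
        # and preprocess it exactly as we would for classification
        image = load_img(imagePath, target_size=(224, 224))
        image = img_to_array(image)
        image = np.expand_dims(image, axis=0)
        image = imagenet_utils.preprocess_input(image)
        batchImages.append(image)

    # stack into (N, 224, 224, 3), run the forward pass, then
    # flatten the POOL output into (N, 25088) feature vectors
    batchImages = np.vstack(batchImages)
    features = model.predict(batchImages, batch_size=bs)
    features = features.reshape((features.shape[0], 512 * 7 * 7))

    # write the batch of features and labels to the HDF5 dataset
    dataset.add(features, batchLabels)

dataset.close()
```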

3.4 CMD_Output snapshot.

The above snippet shows the usage command for feature_extractor.py, “python feature_extractor.py -d Dataset\animals -o Dataset\animals\hdf”, i.e., the name of our script followed by the input parameters -d (path to the input dataset) and -o (output path for storing the extracted features in HDF5 format). Please ignore the warnings. Below them we observe “Extracting Features: 100% |#############| Time: 0:04:46”; the progress bar gives a nice visual feed of the extraction’s progress. It took me around 4 minutes 46 seconds to extract features from all 3,000 images, and it can be faster if you have a GPU.

4. Training a classifier on the extracted features:

Since we have used a pre-trained CNN model to successfully extract features from our target dataset, let’s see how discriminative these features really are, especially given that VGG16 was trained on ImageNet and not on Animals. How well will a linear model do on top of these extracted features, and how high can we get in terms of model accuracy? Anything more than 90% will be insanely good, so let’s create a new Python file named training.py and start writing the following code.

4.1 training.py code snippet.

In the above code snippet, you can see that Lines 2–7 import the required Python packages at the beginning of the script. We use the sklearn package’s LogisticRegression as the linear model for our image classifier, which will classify the three class labels (dog, cat, panda). GridSearchCV is used for tuning the hyperparameters, and the classification report displays the performance of our new classifier in terms of metrics such as precision, recall, and accuracy. We’ll use pickle to serialize our LogisticRegression model to disk after training. Finally, h5py will be used so we can interface with our HDF5 dataset of features.

Next, we declare the argument parser for taking input from the user:

  • -d: path to the HDF5 dataset containing the extracted features and class labels.
  • -m: path for storing the trained linear model’s output file.
  • -j: the number of concurrent jobs to run while tuning the hyperparameters of the linear model using the grid-search technique.

In the previous script, while creating the HDF5 database of features, we intentionally shuffled the image paths; the reason becomes clear in Lines 20 to 21. Given that our dataset may be too large to fit into memory, we need an efficient method to determine our training and testing split. Since we know how many entries there are in the HDF5 dataset (and we know we want to use 75% of the data for training and 25% for evaluation), we can simply compute the 75% index i into the database: any data before index i is considered training data, and anything after i is testing data.
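A sketch of the top of training.py, assuming the HDF5 file produced by feature_extractor.py:

```python
# training.py (sketch, part 1: arguments and the 75/25 split)
import argparse
import h5py

ap = argparse.ArgumentParser()
ap.add_argument("-d", "--db", required=True,
    help="path to the HDF5 database of features")
ap.add_argument("-m", "--model", required=True,
    help="path to the output model file")
ap.add_argument("-j", "--jobs", type=int, default=-1,
    help="# of concurrent jobs to run when tuning hyperparameters")
args = vars(ap.parse_args())

# open the database read-only and compute the index of the 75%
# training split; the rows were shuffled at extraction time, so
# slicing at i gives a random split without loading data into memory
db = h5py.File(args["db"], "r")
i = int(db["labels"].shape[0] * 0.75)
```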

4.2 training.py code snippet.

Since we have our training and testing splits, we can now train our logistic regression classifier, starting with tuning the hyperparameter “C” using GridSearchCV, where you can also specify various parameters for efficient training of our model. The .fit method is called with (X, y), where X is db["features"][:i] and y is db["labels"][:i]; once the best hyperparameters are found, we evaluate the classifier on the testing data in Lines 32 to 34. Notice here that our testing data and testing labels are accessed via array slices:

Anything after index i is part of our testing set. Even though our HDF5 dataset resides on disk (and may be too large to fit into memory), we can still treat it as if it were a NumPy array, which is one of the huge advantages of using HDF5 and h5py together for deep learning and machine learning tasks. Finally, we save our LogisticRegression model to disk and close the database.
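Continuing the script, here is a sketch of the tuning, evaluation, and serialization steps. The grid of C values below is an assumption on my part; pick whatever range suits your problem:

```python
# training.py (sketch, part 2: tuning, evaluation, serialization)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
import pickle

# tune C via grid search, training only on rows before index i
params = {"C": [0.1, 1.0, 10.0, 100.0, 1000.0]}
model = GridSearchCV(LogisticRegression(max_iter=1000), params,
    cv=3, n_jobs=args["jobs"])
model.fit(db["features"][:i], db["labels"][:i])
print("best hyperparameters: {}".format(model.best_params_))

# evaluate on everything after index i; decode the stored label
# names in case h5py returns them as bytes
preds = model.predict(db["features"][i:])
labelNames = [n.decode("utf-8") if isinstance(n, bytes) else n
    for n in db["label_names"][:]]
print(classification_report(db["labels"][i:], preds,
    target_names=labelNames))

# serialize the best model to disk and close the database
with open(args["model"], "wb") as f:
    f.write(pickle.dumps(model.best_estimator_))
db.close()
```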

4.2 Training the classifier on the Animals dataset:

To train a Logistic Regression classifier on the features extracted via the VGG16 network from the Animals dataset, simply execute the command shown in the output snapshot below.

4.2.1 Output snapshot of training.py

In the above snapshot, you can see that the best hyperparameter found is “C”: 10.0, and after evaluating the model on the test data, the macro and weighted averages of precision, recall, and F1-score are all around 98%, which is excellent for an image classifier built with so little effort.

You can find the codes used in the blog from my Github repository: https://github.com/harishdasari1595/Personal_projects/tree/master/transfer-learning

Summary:

  1. We can use both feature_extractor.py and training.py to rapidly build robust image classifiers based on features extracted from a pre-trained CNN.
  2. Clearly, networks such as VGG are capable of transfer learning, encoding their discriminative features into output activations that we can use to train our own custom image classifiers.
  3. Here we focused strictly on the feature-extraction side of transfer learning, demonstrating that deep pre-trained CNNs are capable of acting as powerful feature extraction machines, even more powerful than hand-designed descriptors such as HOG, SIFT, and Local Binary Patterns.
  4. Whenever you approach a new problem with deep learning and Convolutional Neural Networks, always consider whether feature extraction alone will obtain reasonable accuracy; if so, you can skip the network training process entirely, saving a ton of time, effort, and headache.
