Data Augmentation in Image Classification Models

(A Practical Guide to Augmenting Data for Image Classification)

Yash Chauhan
Accredian
Aug 24, 2022

Introduction

Image classification is the classic deep learning problem everyone tries their hand at when first learning about Convolutional Neural Networks (CNNs). More often than not, though, we use a standardized dataset with enough samples (images) to get a highly accurate model, which sends our expectations sky-high.

Then comes the reality check: we try the same model on a real problem and are left with a confused mind and an average model. To build a practically usable CNN model we need enough data, but how much is enough? The answer is that it’s rarely enough: as a rule of thumb, the more good data we feed into training, the better the results.

In such a situation, what if I told you there is a simple technique that increases the number of images in your dataset without collecting any more data? Sounds interesting? That’s because it is. Let me introduce the concept of Image Data Augmentation.

Example of Image Augmentation

What is Data Augmentation

Data Augmentation is one of the most widely used techniques in deep learning projects that work with image data. It means transforming the images you already have with some simple operations (we will look at examples later) and adding those altered images to the dataset. By doing this, we effectively increase the size of the dataset used for training.

Depending on the augmentation techniques, we can create synthetic images from all the training images we have and effectively get a dataset 10 or even 100 times larger. The best part is that this can be done batch-wise while the model is training, so you don’t have to save the transformed images on your local system, which saves a lot of disk space.
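To make the batch-wise idea concrete, here is a minimal sketch using the Keras ImageDataGenerator class that the examples later in this article rely on. The random arrays, the labels, and the batch size are stand-ins made up for illustration; in a real project they would be your own training images.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Hypothetical toy dataset: 8 random 32x32 RGB "images" with binary labels.
rng = np.random.default_rng(0)
x_train = rng.integers(0, 256, size=(8, 32, 32, 3)).astype("float32")
y_train = np.array([0, 1, 0, 1, 0, 1, 0, 1])

# The flip is decided at random every time an image is requested.
datagen = ImageDataGenerator(horizontal_flip=True)

# flow() returns an iterator that transforms images in memory, batch by batch;
# nothing is written to disk unless save_to_dir is set.
batches = datagen.flow(x_train, y_train, batch_size=4, shuffle=False)

x_batch, y_batch = next(batches)
print(x_batch.shape, y_batch.shape)  # (4, 32, 32, 3) (4,)
```

An iterator like this can typically be passed straight to model.fit, so each epoch sees freshly transformed copies of the same source images.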

Apart from increasing the size of the data, data augmentation also helps reduce overfitting, which improves the model’s performance on unseen data, and it requires no additional labeling effort because each transformed image keeps the label of its source image.

Augmentation Examples

So we can conclude that Data Augmentation is incredible. Let’s look at some examples of transformations that can be used in an augmentation pipeline.

We will use the TensorFlow ImageDataGenerator API for augmentation and a sample image to test and showcase the different transformations.

Let’s first import the required libraries.
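A plausible set of imports, inferred from the libraries this article names (PIL, NumPy, Matplotlib, and ImageDataGenerator); treat the exact import paths as an assumption about your TensorFlow installation. Note that recent TensorFlow releases mark ImageDataGenerator as deprecated in favor of Keras preprocessing layers, although it remains available.

```python
import numpy as np                 # images are handled as NumPy arrays
import matplotlib.pyplot as plt    # used to display results in a grid
from PIL import Image              # used to load the sample image from disk
from tensorflow.keras.preprocessing.image import ImageDataGenerator

print(ImageDataGenerator.__name__)
```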

The Sample Image

Sample Image
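If you do not have a photo at hand, a synthetic image works just as well for trying out the transformations. The sketch below builds one with NumPy only; the function name make_sample_image and the pattern it draws are made up for illustration and are not the sample image used in this article.

```python
import numpy as np

def make_sample_image(size=128):
    """Return a synthetic RGB test image as a uint8 array (size, size, 3)."""
    ys, xs = np.mgrid[0:size, 0:size]
    img = np.zeros((size, size, 3), dtype=np.uint8)
    img[..., 0] = (255 * xs / (size - 1)).astype(np.uint8)  # red ramps left to right
    img[..., 1] = (255 * ys / (size - 1)).astype(np.uint8)  # green ramps top to bottom
    # Off-center blue square, so flips and shifts are easy to spot.
    img[size // 4: size // 2, size // 4: size // 2, 2] = 255
    return img

image = make_sample_image()

# ImageDataGenerator.flow() expects a batch, so add a leading batch axis.
samples = np.expand_dims(image, axis=0)
print(samples.shape)  # (1, 128, 128, 3)
```

With a real file, the usual route is np.array(Image.open(path)) using PIL, followed by the same np.expand_dims call.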

1. Random Rotation

We can define a maximum value (max_rot) for rotation in the ImageDataGenerator API, and it will pick a random angle in degrees from the range [-max_rot, max_rot] and apply a rotation transformation to our image.

The Code
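A minimal sketch of the steps described in this section. It assumes a synthetic stand-in image instead of a photo loaded with PIL, an illustrative rotation_range of 40 degrees, and a hypothetical helper show_grid for the display step.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Stand-in for the sample photo (assumption). With a real file you would use
# something like: image = np.array(PIL.Image.open("your_image.jpg"))
image = np.zeros((128, 128, 3), dtype=np.uint8)
image[32:96, 48:80] = (255, 128, 0)      # orange rectangle on black

samples = np.expand_dims(image, axis=0)  # add batch axis: (1, 128, 128, 3)

# rotation_range=40 is an illustrative value, in degrees.
datagen = ImageDataGenerator(rotation_range=40)
aug_iter = datagen.flow(samples, batch_size=1)

# Each next() call yields a batch holding one freshly rotated image.
rotated = [next(aug_iter)[0].astype("uint8") for _ in range(6)]

def show_grid(images, rows=2, cols=3):
    """Display the images in a rows x cols grid with Matplotlib."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it
    fig, axes = plt.subplots(rows, cols, figsize=(9, 6))
    for ax, img in zip(axes.ravel(), images):
        ax.imshow(img)
        ax.axis("off")
    plt.show()
```

Calling show_grid(rotated) displays the six rotated copies in a 2 x 3 grid.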

So the code is straightforward. We use PIL and NumPy to load and convert the image to a NumPy array. Then we specify the maximum rotation in the ImageDataGenerator API using the “rotation_range” argument.

Then we create an iterator object “aug_iter” with a batch size of 1, because we only have one image (our sample image). Each time we advance the iterator, we get a newly transformed image.

Then we simply display 6 transformed images in a 2 x 3 grid using Matplotlib.

The Output

Random Rotation Augmentation

2. Random Brightness

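As in the previous example, we can specify the range of brightness adjustment in ImageDataGenerator using the “brightness_range” argument, given as a (min, max) pair. For each image, the API picks a random factor from that range and scales the brightness by it: values below 1 darken the image and values above 1 brighten it.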

The Code
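A minimal sketch of random brightness, assuming a synthetic stand-in image and an illustrative range of (0.4, 1.6); neither value comes from the original listing. This transformation relies on Pillow being installed.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Stand-in image (assumption): mid-gray background with a lighter square.
image = np.full((128, 128, 3), 100, dtype=np.uint8)
image[32:96, 32:96] = 180
samples = np.expand_dims(image, axis=0)  # (1, 128, 128, 3)

# Factors below 1 darken, factors above 1 brighten; the range is illustrative.
datagen = ImageDataGenerator(brightness_range=(0.4, 1.6))
aug_iter = datagen.flow(samples, batch_size=1)

# Six copies of the same image, each with a randomly drawn brightness factor.
brightened = [next(aug_iter)[0] for _ in range(6)]
print([round(float(b.mean()), 1) for b in brightened])
```

The printed mean pixel values differ from run to run, which is the point: each draw uses a different brightness factor.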

The Output

Random Brightness Augmentation

3. Random Shifts

We can also shift the image’s content in the horizontal and vertical direction by a specified range of translation. We can specify the maximum allowed value for shift using the “width_shift_range” and “height_shift_range” arguments of ImageDataGenerator.

Note that values below 1 are treated as fractions, so if we give 0.45 as input, we allow the API to shift the image’s content by a maximum of 45% of the width/height of the image, in either direction. Values of 1 or more are interpreted as a number of pixels.

The Code
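A minimal sketch of random shifts, assuming a synthetic stand-in image. The 0.45 fraction matches the example in the text; everything else is illustrative.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Stand-in image (assumption): white square in the middle of a black canvas.
image = np.zeros((128, 128, 3), dtype=np.uint8)
image[48:80, 48:80] = 255
samples = np.expand_dims(image, axis=0)  # (1, 128, 128, 3)

# Shift by up to 45% of the width and height, in either direction.
datagen = ImageDataGenerator(width_shift_range=0.45, height_shift_range=0.45)
aug_iter = datagen.flow(samples, batch_size=1)

shifted = [next(aug_iter)[0].astype("uint8") for _ in range(6)]
print(shifted[0].shape)  # (128, 128, 3)
```

The output keeps the original size; pixels uncovered by the shift are filled according to the fill_mode argument, which defaults to repeating the nearest edge pixels.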

The Output

Random Shift Augmentation

4. Random Flips

Random flips are the most straightforward transformation to understand. We simply set the “horizontal_flip” and “vertical_flip” arguments to “True”, and the API will randomly flip the image horizontally (left to right), vertically (top to bottom), or both.

The Code
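A minimal sketch of random flips, assuming a synthetic stand-in image with an off-center marker so the flips are visible.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Stand-in image (assumption): red square in the top-left corner.
image = np.zeros((64, 64, 3), dtype=np.uint8)
image[4:20, 4:20, 0] = 255
samples = np.expand_dims(image, axis=0)  # (1, 64, 64, 3)

# Each flip is applied independently, at random, for every generated image.
datagen = ImageDataGenerator(horizontal_flip=True, vertical_flip=True)
aug_iter = datagen.flow(samples, batch_size=1)

flipped = [next(aug_iter)[0].astype("uint8") for _ in range(6)]

# Every result is the original or one of its three flipped variants.
variants = [image, image[:, ::-1], image[::-1], image[::-1, ::-1]]
print(all(any(np.array_equal(f, v) for v in variants) for f in flipped))
```

Because flips only reorder pixels, no interpolation is involved and the pixel values stay exactly as they were.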

The Output

Random Flip Augmentation

5. Random Zoom

The most helpful transformation, in my opinion, is the Random Zoom transformation. We can specify a [min, max] range of zoom factors using the “zoom_range” argument, and the API will randomly pick a value from that range and give us a transformed image. Note that in this API, factors below 1 zoom in (objects appear larger) and factors above 1 zoom out.

The Code
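A minimal sketch of random zoom, assuming a synthetic stand-in image and an illustrative zoom_range of [0.5, 1.5].

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Stand-in image (assumption): green square in the middle of a black canvas.
image = np.zeros((128, 128, 3), dtype=np.uint8)
image[40:88, 40:88, 1] = 255
samples = np.expand_dims(image, axis=0)  # (1, 128, 128, 3)

# [lower, upper] bounds for the zoom factor; the values are illustrative.
datagen = ImageDataGenerator(zoom_range=[0.5, 1.5])
aug_iter = datagen.flow(samples, batch_size=1)

zoomed = [next(aug_iter)[0].astype("uint8") for _ in range(6)]
print(zoomed[0].shape)  # (128, 128, 3)
```

Passing a single float z instead of a pair is shorthand for the range [1 - z, 1 + z].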

The Output

Random Zoom Augmentation

Conclusion

  • So, we covered the basic transformations in the TensorFlow ImageDataGenerator API. You can combine these transformations into a pipeline that generates new images for your model to train on.
  • Congratulations, the power of Image Data Augmentation is now in your hands. Use it to give your old 80% accurate CNN models new life and witness the change first-hand.
  • Next, we will dive deeper into Data Augmentation and look at how we can create our own custom transformation functions using computer vision libraries like OpenCV. So stay tuned if you’re interested in improving your Computer Vision skills and Deep Learning models.
  • Follow me for more upcoming Data Science, Machine Learning, and artificial intelligence articles.
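The combined pipeline mentioned in the first point above can be sketched as follows. All parameter values and the stand-in image are illustrative assumptions; in practice you would tune them to transformations that make sense for your data.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Stand-in image (assumption): orange rectangle on a gray canvas.
image = np.full((128, 128, 3), 60, dtype=np.uint8)
image[32:96, 48:80] = (255, 128, 0)
samples = np.expand_dims(image, axis=0)  # (1, 128, 128, 3)

# One generator that applies every transformation covered in this article.
datagen = ImageDataGenerator(
    rotation_range=30,             # degrees, sampled from [-30, 30]
    brightness_range=(0.6, 1.4),   # below 1 darkens, above 1 brightens
    width_shift_range=0.2,         # fraction of the image width
    height_shift_range=0.2,        # fraction of the image height
    horizontal_flip=True,
    vertical_flip=True,
    zoom_range=[0.8, 1.2],         # zoom factor bounds
)
aug_iter = datagen.flow(samples, batch_size=1)

# Every draw combines a fresh random choice for each transformation.
augmented = [next(aug_iter)[0].astype("uint8") for _ in range(6)]
print(len(augmented), augmented[0].shape)  # 6 (128, 128, 3)
```

Not every transformation suits every dataset; vertical flips, for example, rarely make sense for images of handwritten digits.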

Final Thoughts and Closing Comments

There are some vital points many people fail to understand while they pursue their Data Science or AI journey. If you are one of them and are looking for a way to fill these gaps, check out the certification programs provided by INSAID on their website. If you liked this story, I recommend the Global Certificate in Data Science & AI, because it covers your foundations, machine learning algorithms, and deep neural networks (basic to advanced).
