Data Augmentation — Is it really necessary?

Amey Gondhalekar
Published in Analytics Vidhya
4 min read · Mar 24, 2020
Photo by Kyle Glenn on Unsplash

In this article, we introduce data augmentation, explain why it is needed, and walk through several techniques for applying it.

The performance of a deep learning model depends on two things:

  • Neural Network model
  • Quality & Quantity of Data

Often, even with a suitable model, we do not get satisfactory results. The problem then usually lies in the data used to train the network. A large dataset is crucial for the performance of a deep learning model. For a custom requirement, however, we may lack the quantity and diversity of data needed to train the model, which hampers its performance.

So you might be wondering: how do I get more data if I don’t have “more data”? This is where data augmentation comes into the picture.

Data Augmentation

Data augmentation is a technique for artificially creating new training data from existing training data. It works by applying domain-specific transformations to examples from the training set to produce new, different training examples.

It lets us increase the size of the dataset and introduce variability without actually collecting new data. The neural network treats the augmented images as distinct images anyway. Data augmentation also helps reduce over-fitting. Our dataset may contain images taken under a limited set of conditions, while the model will later encounter conditions we did not account for. The modified (augmented) data helps the model deal with such scenarios.

So, to get more data, we need to make minor alterations to our existing training data. Here we are specifically talking about image data augmentation. These alterations include flipping the image horizontally or vertically, padding, cropping, rotating, scaling and a few other transformations.

Image Data Augmentation techniques

We will use the Keras deep learning library to demonstrate various data augmentation techniques. Keras has an ImageDataGenerator class that does pretty much the entire job without much hassle. Similarly, TFLearn (a library built on top of TensorFlow) has a DataAugmentation class, and MXNet has Augmenter classes. imgaug is another powerful package for image augmentation, containing over 60 image augmenters and augmentation techniques, complex augmentation pipelines and many helper functions for visualization. With ImageDataGenerator, the images in the dataset are not used directly. Instead, only augmented images are provided to the model, and the augmentations are applied randomly.
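To make the idea of "only augmented images reach the model" concrete, here is a minimal NumPy sketch of a generator that augments on the fly. It is not how Keras implements ImageDataGenerator; the function name augmented_batches and the choice of a random left-right flip as the only augmentation are assumptions made for illustration.

```python
import numpy as np

def augmented_batches(images, batch_size, rng):
    """Yield batches of randomly augmented copies; the originals are never modified."""
    n = len(images)
    while True:
        # Pick a random batch of source images and copy them.
        idx = rng.choice(n, size=batch_size, replace=False)
        batch = images[idx].copy()
        # Flip each image left-right with probability 0.5
        # (a stand-in for any random augmentation).
        flip = rng.random(batch_size) < 0.5
        batch[flip] = batch[flip][:, :, ::-1]
        yield batch

rng = np.random.default_rng(seed=0)
images = np.arange(4 * 2 * 3).reshape(4, 2, 3)  # four tiny 2x3 grayscale "images"
gen = augmented_batches(images, batch_size=2, rng=rng)
first_batch = next(gen)
```

Because every batch is produced fresh, the model rarely sees exactly the same set of pixels twice, while the dataset on disk stays untouched.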

Horizontal Shift Augmentation

All the pixels of the image are shifted horizontally while the image dimensions stay the same. This causes some pixels to be clipped off the image. The shift may be positive or negative. We tune the width_shift_range parameter, given either as a number of pixels or as a fraction of the image width.

Vertical Shift Augmentation

All the pixels of the image are shifted vertically while the image dimensions stay the same. Here we tune the height_shift_range argument, given as a fraction of the image height.
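The horizontal and vertical shifts can be sketched in a few lines of NumPy. This is an illustration rather than the Keras implementation: the helper names shift_image and random_shift are hypothetical, the ranges are treated as fractions of the image size, and vacated pixels are filled with a constant value (a library may fill them differently).

```python
import numpy as np

def shift_image(image, dx=0, dy=0, fill=0):
    """Shift an H x W (or H x W x C) image dx pixels right and dy pixels down."""
    h, w = image.shape[:2]
    shifted = np.full_like(image, fill)
    # Source and destination windows that stay inside the frame;
    # anything shifted past the border is clipped off.
    src_x = slice(max(0, -dx), min(w, max(0, w - dx)))
    dst_x = slice(max(0, dx), min(w, max(0, w + dx)))
    src_y = slice(max(0, -dy), min(h, max(0, h - dy)))
    dst_y = slice(max(0, dy), min(h, max(0, h + dy)))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    return shifted

def random_shift(image, width_shift_range, height_shift_range, rng):
    """Shift by a random fraction of the width and height (positive or negative)."""
    h, w = image.shape[:2]
    dx = int(round(rng.uniform(-width_shift_range, width_shift_range) * w))
    dy = int(round(rng.uniform(-height_shift_range, height_shift_range) * h))
    return shift_image(image, dx, dy)

image = np.array([[1, 2, 3],
                  [4, 5, 6]])
right = shift_image(image, dx=1)   # [[0, 1, 2], [0, 4, 5]]
up = shift_image(image, dy=-1)     # [[4, 5, 6], [0, 0, 0]]
```

Note how the output keeps the 2x3 shape: the column or row pushed past the edge is lost, which is the clipping described above.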

Horizontal and Vertical Flip Augmentation

An image flip reverses the rows of pixels for a vertical flip, or the columns of pixels for a horizontal flip. For photographs of animals or birds, horizontal flips may make sense, but vertical flips would not. For other types of images, such as aerial photographs, cosmology photographs, and microscopic photographs, vertical flips may well make sense.
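In array terms, a flip is just a reversed slice. The sketch below assumes images stored as NumPy arrays with rows first; the function names and the 50% flip probability are illustrative assumptions.

```python
import numpy as np

def flip_horizontal(image):
    """Reverse the column order (mirror left-right)."""
    return image[:, ::-1]

def flip_vertical(image):
    """Reverse the row order (mirror top-bottom)."""
    return image[::-1, :]

def random_flip(image, rng, allow_horizontal=True, allow_vertical=False):
    """Apply each permitted flip with probability 0.5."""
    if allow_horizontal and rng.random() < 0.5:
        image = flip_horizontal(image)
    if allow_vertical and rng.random() < 0.5:
        image = flip_vertical(image)
    return image

image = np.array([[1, 2, 3],
                  [4, 5, 6]])
mirrored = flip_horizontal(image)    # [[3, 2, 1], [6, 5, 4]]
upside_down = flip_vertical(image)   # [[4, 5, 6], [1, 2, 3]]
```

Keeping vertical flips off by default mirrors the advice above: enable them only when an upside-down image is still a plausible input.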

Random Rotation Augmentation

A rotation augmentation rotates the image clockwise by a random angle between 0 and a given maximum, which can be anywhere from 0 to 360 degrees.
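A rough NumPy sketch of rotation is shown below. It uses nearest-neighbour sampling and fills uncovered corners with a constant, both of which are simplifying assumptions; real libraries usually interpolate and offer several fill modes. The names rotate_image and random_rotation are hypothetical.

```python
import numpy as np

def rotate_image(image, angle_degrees, fill=0):
    """Rotate about the image centre; positive angles turn the picture clockwise on screen."""
    h, w = image.shape[:2]
    theta = np.deg2rad(angle_degrees)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.indices((h, w))
    x0, y0 = xs - cx, ys - cy
    # Inverse mapping: for every output pixel, find the source pixel it comes from.
    src_x = np.rint(cos_t * x0 + sin_t * y0 + cx).astype(int)
    src_y = np.rint(-sin_t * x0 + cos_t * y0 + cy).astype(int)
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    rotated = np.full_like(image, fill)
    rotated[inside] = image[src_y[inside], src_x[inside]]
    return rotated

def random_rotation(image, max_degrees, rng):
    """Rotate clockwise by an angle drawn uniformly from [0, max_degrees]."""
    return rotate_image(image, rng.uniform(0, max_degrees))

image = np.arange(9).reshape(3, 3)
quarter_turn = rotate_image(image, 90)
```

For angles that are not multiples of 90 degrees, the corners of the original image rotate out of the frame and the corners of the output are filled, so large rotation ranges change more than just orientation.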

Random Brightness Augmentation

The brightness of the image can be augmented by randomly darkening images, brightening images, or both. The intent is to help the model generalize across images captured under different lighting levels.
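One simple way to picture brightness augmentation is as a random multiplier on the pixel values. The sketch below assumes 8-bit images (values 0 to 255) and a (low, high) range in which factors below 1 darken and factors above 1 brighten; the function names are illustrative, not a library API.

```python
import numpy as np

def adjust_brightness(image, factor):
    """Scale pixel intensities by `factor`, clipping to the valid 8-bit range."""
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)

def random_brightness(image, brightness_range, rng):
    """Pick a factor uniformly from (low, high) and apply it."""
    low, high = brightness_range
    return adjust_brightness(image, rng.uniform(low, high))

image = np.array([[100, 200]], dtype=np.uint8)
darker = adjust_brightness(image, 0.5)    # [[50, 100]]
brighter = adjust_brightness(image, 2.0)  # [[200, 255]] (200 * 2 is clipped to 255)
```

A range such as (0.5, 1.0) only darkens, (1.0, 1.5) only brightens, and (0.5, 1.5) does both.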

Random Zoom Augmentation

A zoom augmentation randomly zooms the image in or out, either interpolating pixel values (when zooming in) or adding new pixel values around the image (when zooming out). A zoom range of [1, 1], which is the default, produces no zoom effect.

Rescaling is another technique: images of various sizes are scaled to a fixed size before being fed to the neural network.
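The zoom behaviour can be sketched with the same inverse-mapping trick used for other geometric transforms. In this illustration a factor above 1 magnifies the centre of the image and a factor below 1 shrinks it, padding the border with a constant. That convention, the nearest-neighbour sampling and the name zoom_image are assumptions of the sketch, and libraries may define the zoom factor the other way round.

```python
import numpy as np

def zoom_image(image, factor, fill=0):
    """Zoom about the image centre while keeping the output size unchanged."""
    h, w = image.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.indices((h, w))
    # For each output pixel, look up the source pixel it should show.
    src_y = np.rint(cy + (ys - cy) / factor).astype(int)
    src_x = np.rint(cx + (xs - cx) / factor).astype(int)
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    zoomed = np.full_like(image, fill)
    zoomed[inside] = image[src_y[inside], src_x[inside]]
    return zoomed

image = np.arange(1, 17).reshape(4, 4)
unchanged = zoom_image(image, 1.0)    # a factor of 1 leaves the image as it is
zoomed_in = zoom_image(image, 2.0)    # the central 2x2 block fills the frame
zoomed_out = zoom_image(image, 0.5)   # the border is padded with the fill value
```

With a factor of exactly 1 every pixel maps to itself, which is why a zoom range of [1, 1] has no visible effect.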

The above list of augmentation techniques is not exhaustive. You can use any single technique or a combination of them, depending on the problem you are trying to solve and the type of dataset you have.

To conclude: even though augmentation is computationally expensive, it is worth trying!
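As a sketch of how several techniques might be combined, the dictionary below collects the ImageDataGenerator arguments discussed in this article. The specific values are illustrative assumptions, not recommendations, and the import path in build_generator assumes a TensorFlow 2.x release that still ships this class.

```python
# Keyword arguments for Keras' ImageDataGenerator; the values are illustrative only.
augmentation_args = {
    "width_shift_range": 0.2,         # horizontal shift, as a fraction of the width
    "height_shift_range": 0.2,        # vertical shift, as a fraction of the height
    "horizontal_flip": True,          # random left-right flips
    "vertical_flip": False,           # off: upside-down images are implausible here
    "rotation_range": 30,             # rotation range in degrees
    "brightness_range": (0.5, 1.5),   # darken or brighten
    "zoom_range": (0.8, 1.2),         # zoom in or out
}

def build_generator(args):
    """Create the generator; imported lazily so the settings can be inspected without TensorFlow."""
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    return ImageDataGenerator(**args)
```

Calling build_generator(augmentation_args) would return a generator that applies all of these transformations at random to every image it serves. Start with a small set of transformations that match your data and widen the ranges only if the model still over-fits.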

Feel free to make any suggestions!

You can reach out to me on LinkedIn.

References:

🔗 https://arxiv.org/abs/1905.05393

🔗 http://cs231n.stanford.edu/reports/2017/pdfs/300.pdf
