Image Augmentation

Improving Deep Learning Models

Sanchit Tanwar

Published in

Analytics Vidhya

5 min read · Aug 4, 2021

https://miro.medium.com/max/1400/0*0Je9h2iT9m7ribFJ.png

This post is Part A of a two-part series on image augmentation. The series is divided as follows:

Part A: Introduction to image augmentation, various augmentation techniques, and their implementation through available libraries.

Part B: Building and training a PyTorch model and analyzing the effect of the use of image augmentation on the performance.

If you are new to the field of deep learning, you have probably come across the term image augmentation at some point. This article discusses what image augmentation is and implements it with three different Python libraries: Keras, PyTorch, and Albumentations (a library built specifically for image augmentation). The first question, then, is what image augmentation, or more generally data augmentation, actually is.

What is image augmentation?

Augmentation is the action or process of making or becoming greater in size or amount.

Deep networks require a large amount of training data to generalize well and achieve good accuracy, but in some cases the available image dataset is not large enough. In such cases we use augmentation to increase the amount of training data: new training samples are created artificially by processing the existing data with techniques such as random rotation, shifts, shear, and flips (we will discuss some of them later).

Image Augmentation is the process of generating new images for training our deep learning model. These new images are generated using the existing training images and hence we don’t have to collect them manually.

Different Image Augmentation Techniques

Various techniques can be used to augment the images fed to the model, such as:

Spatial augmentation

Scaling
Cropping
Flipping
Rotation
Translation

Pixel augmentation

Brightness
Contrast
Saturation
Hue
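
To make the two families concrete, here is a minimal NumPy sketch (not from the original article) that applies a few of these operations to a small random array standing in for a photo. The array size, the crop window, the shift amount, and the 1.3 brightness factor are arbitrary values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a real photo: height 4, width 6, 3 color channels.
image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

# Spatial augmentations move pixels around without changing their values.
flipped = image[:, ::-1]                   # horizontal flip (mirror left-right)
cropped = image[1:3, 2:5]                  # crop a 2x3 window
rotated = np.rot90(image)                  # rotate by 90 degrees
shifted = np.roll(image, shift=2, axis=1)  # translate 2 px right (wraps around)

# Pixel augmentations keep pixels in place but change their values.
brighter = np.clip(image.astype(np.float32) * 1.3, 0, 255).astype(np.uint8)

print(flipped.shape, cropped.shape, rotated.shape, brighter.dtype)
```

Note that np.roll wraps pixels around the border, which is a simplification; augmentation libraries usually pad or interpolate instead, and they draw the parameters at random for every image.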

Data Augmentation in Deep Learning

In deep learning, data augmentation is a common practice, so most major frameworks ship their own augmentation utilities, and dedicated augmentation libraries exist as well. Let’s see how to apply image augmentations using Keras, PyTorch, and Albumentations.

https://www.kreedon.com/wp-content/uploads/2018/07/sunil.jpg

1. Keras

The Keras ImageDataGenerator class provides a quick and easy way to augment your images. It offers a host of augmentation techniques such as standardization, rotation, shifts, flips, brightness changes, and many more. You can find more on its official documentation page.
The main benefit of the Keras ImageDataGenerator class, however, is that it is designed for real-time data augmentation: it generates augmented images on the fly while your model is training.

Image Augmentation With ImageDataGenerator
The ImageDataGenerator class ensures that the model receives new variations of the images at each epoch. It only returns the transformed images and does not add them to the original corpus of images. If it did, the model would see the original images many times over, which could contribute to overfitting.
Another advantage of ImageDataGenerator is its lower memory usage. Without it, we would typically load all the images into memory at once; with it, the images are loaded in batches, which saves a lot of memory.
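
The idea behind real-time augmentation can be sketched with a plain Python generator. This is an illustration of the mechanism only, not how Keras implements it; the function name augmented_batches, the random flip, and the array sizes are assumptions made for the example. Here the whole array already sits in memory, whereas a real generator would read each batch from disk.

```python
import numpy as np

def augmented_batches(images, batch_size, rng):
    """Yield randomly flipped batches; the `images` array itself is never modified."""
    order = rng.permutation(len(images))
    for start in range(0, len(images), batch_size):
        # Fancy indexing returns a copy, so edits below do not touch `images`.
        batch = images[order[start:start + batch_size]]
        # Flip each image in the batch horizontally with probability 0.5.
        flip = rng.random(len(batch)) < 0.5
        batch[flip] = batch[flip][:, :, ::-1]
        yield batch

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(10, 8, 8, 3), dtype=np.uint8)
original = images.copy()

for epoch in range(2):
    sizes = [len(b) for b in augmented_batches(images, batch_size=4, rng=rng)]
    print(f"epoch {epoch}: batch sizes {sizes}")
```

Each epoch draws fresh random flips, so the model sees different variations every time, while the original dataset stays the same size and is left unchanged.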

A range of techniques are supported, as well as pixel scaling methods. We will focus on five main types of data augmentation techniques for image data; specifically:

Image shifts via the width_shift_range and height_shift_range arguments.
Image flips via the horizontal_flip and vertical_flip arguments.
Image rotations via the rotation_range argument.
Image brightness via the brightness_range argument.
Image zoom via the zoom_range argument.

For example, an instance of the ImageDataGenerator class can be constructed.
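
The original code for this step is not available here, so the following is a minimal sketch of what such a construction might look like. The argument names are the ones listed above; the values, the random input array, and the batch size are illustrative assumptions. It also assumes a TensorFlow release that still ships ImageDataGenerator (recent releases mark it as deprecated), with SciPy and Pillow installed for the geometric and brightness transforms.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# One generator covering the five augmentation types discussed above.
# All values are illustrative, not recommendations.
datagen = ImageDataGenerator(
    width_shift_range=0.2,        # shift horizontally by up to 20% of the width
    height_shift_range=0.2,       # shift vertically by up to 20% of the height
    horizontal_flip=True,         # randomly mirror left-right
    vertical_flip=False,          # no top-bottom flips
    rotation_range=30,            # rotate by up to +/- 30 degrees
    brightness_range=(0.5, 1.5),  # darken or brighten
    zoom_range=0.2,               # zoom in or out by up to 20%
)

# Random pixels standing in for 8 RGB training images of size 64x64.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(8, 64, 64, 3)).astype("float32")

# flow() returns an iterator that augments one batch at a time.
batches = datagen.flow(images, batch_size=4, shuffle=False, seed=0)
batch = next(batches)
print(batch.shape)
```

In a training script the iterator would typically be passed to model.fit, so that every epoch draws freshly augmented batches.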

Randomly augmented images like the ones shown below are generated and then passed to the model.

Random image augmentation generated using ImageDataGenerator

2. PyTorch

PyTorch is a Python-based library that facilitates building deep learning models and using them in various applications. It is more than just another deep learning library: it is also a scientific computing package. Its image augmentation tools live in the companion torchvision package, under torchvision.transforms.

A key advantage of this approach is that each transform is a separate callable, so we can apply individual augmentation techniques to selected images.

We start by importing an image and defining an imshow() function to visualize the original and the transformed image.
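
The author's helper is not reproduced here; one possible version is sketched below. The signature of imshow(), the synthetic gradient image used in place of a photo loaded from disk, and the mirrored copy used as the "transformed" image are all assumptions for the sake of a self-contained example.

```python
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image, ImageOps

def imshow(original, transformed, titles=("Original", "Transformed")):
    """Plot two images side by side and return the matplotlib figure."""
    fig, axes = plt.subplots(1, 2, figsize=(8, 4))
    for ax, img, title in zip(axes, (original, transformed), titles):
        ax.imshow(np.asarray(img))
        ax.set_title(title)
        ax.axis("off")
    return fig

# Synthetic 64x48 RGB gradient standing in for a real photo.
gradient = np.tile(np.arange(64, dtype=np.uint8) * 4, (48, 1))
image = Image.fromarray(np.stack([gradient, gradient[:, ::-1], gradient], axis=-1))

# Compare the image with a mirrored copy of itself.
fig = imshow(image, ImageOps.mirror(image))
```

In a notebook the figure renders automatically; in a script, call plt.show() afterwards.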

Scaling: In scaling or resizing, the image is resized to a given size (e.g., the width of the image can be doubled).

Cropping: In cropping, a portion of the image is selected; for example, a center crop returns the central region of the image.

Flipping: In flipping, the image is flipped horizontally or vertically.

Pixel augmentation

Pixel augmentation or color jittering deals with altering the color properties of an image by changing its pixel values.

Compositions of Transforms

transforms.Compose chains several transforms together. It does not support TorchScript; it simply groups the transforms passed to it, so all of them are applied to the input one after another, in the order given.

3. Albumentations

Albumentations is a Python library for fast and flexible image augmentation, aimed at improving the performance of deep convolutional neural networks. It efficiently implements a rich variety of image transform operations that are optimized for performance, and it provides a concise yet powerful augmentation interface for different computer vision tasks, including object classification, segmentation, and detection.

Flipping: In flipping, the image can be flipped horizontally or vertically.

ShiftScaleRotate: Here, the image is randomly shifted, rescaled, and rotated within given ranges.

Compose Augmentation in Albumentations

Compose receives a list of augmentations, e.g. A.RandomCrop, A.HorizontalFlip, A.RandomBrightnessContrast, and applies them in a single call, so several augmentation techniques can be performed in one pass. Let’s see an example of composed augmentation using the Albumentations library.

In the above code, we composed a RandomCrop with random color jitter.

Image created using compose augmentation

Summary

In this tutorial, we saw how to use image data augmentation when training deep neural networks.

Specifically, we learned:

Image data augmentation is used to expand the training dataset to improve the model’s performance and ability to generalize.
Image data augmentation is supported by Keras, PyTorch (through torchvision), and the dedicated Albumentations library.

Thanks to Himanshu Wagh for the contribution.