Image Data Augmentation- Image Processing In TensorFlow- Part 2

Paras Patidar

Published in MLAIT · 8 min read · Jan 5, 2020

What You Will Learn?

What is Data Augmentation and why is it needed?
Exploring Techniques to do Image Data Augmentation
Implementation In Tensorflow

Data Augmentation and Its Use Cases

Data augmentation is a technique for enlarging a dataset by generating new samples from the data it already contains. A model trained on a small dataset tends to overfit: it memorizes the training examples instead of learning patterns that carry over to new data. To improve performance and help the model generalize, we need a larger and more varied dataset, and data augmentation helps us get one.

Image augmentation applies this idea to an image dataset: we expand it with transformed copies of the existing images, which reduces overfitting and helps the model generalize.

So, if you have a relatively small dataset, use this technique to expand it and help your model generalize.

Hopefully you now see why we do data augmentation. We will explore some of the techniques and implement them with TensorFlow, which turns out to be pretty easy.

I got one meme to explain overfitting…

Image Augmentation Techniques and Implementation

Rotation
Width Shifting
Height Shifting
Brightness
Shear Intensity
Zoom
Channel Shift
Horizontal Flip
Vertical Flip

I will be using the ImageDataGenerator class, which generates batches of tensor image data with real-time data augmentation, and its flow_from_directory method to show you different outputs for an image. ImageDataGenerator is a very powerful tool for image processing and image augmentation; I will be writing a separate blog post on it to explain it in more detail. Let’s start understanding and implementing the image augmentation techniques.

Loading the necessary libraries

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import os
import matplotlib.image as mpimg
from tensorflow.keras.preprocessing.image import ImageDataGenerator

Downloading the filtered Cats vs Dogs dataset, which is split into train and validation directories; the train directory contains 2,000 images in 2 categories.

_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)
PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')

Defining the train and validation directories. We will not use the validation directory, because we are not building a model; we are only experimenting with augmentation techniques. We also define the batch size and the height and width of our images, which together form the target size.

train_dir = os.path.join(PATH, 'train')
validation_dir = os.path.join(PATH, 'validation')
batch_size = 128
IMG_HEIGHT = 150
IMG_WIDTH = 150

We will use a helper function to plot 5 images side by side.

def plotImages(images_arr):
    fig, axes = plt.subplots(1, 5, figsize=(20,20))
    axes = axes.flatten()
    for img, ax in zip( images_arr, axes):
        ax.imshow(img)
        ax.axis('off')
    plt.tight_layout()
    plt.show()

Let’s start applying the techniques of Image Augmentation…

1. Rotation

The rotation_range parameter takes an angle in degrees. Each image is then rotated by a random angle drawn from the range -rotation_range to +rotation_range, so one setting applies to the whole dataset.

image_generator = ImageDataGenerator(rescale=1./255, rotation_range=135)
train_data_gen = image_generator.flow_from_directory(
batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH)
)
augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)
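
To make the sampling concrete, here is a minimal sketch in plain Python that runs without TensorFlow. It assumes the angle is drawn uniformly from -rotation_range to +rotation_range, as described above. The helpers sample_rotation_angle and rotate_point are hypothetical names used for illustration, not part of Keras.

```python
import math
import random

def sample_rotation_angle(rotation_range, rng):
    """Draw an angle in degrees from [-rotation_range, +rotation_range]."""
    return rng.uniform(-rotation_range, rotation_range)

def rotate_point(x, y, degrees):
    """Rotate the point (x, y) counter-clockwise about the origin."""
    theta = math.radians(degrees)
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

rng = random.Random(0)  # seeded so the run is repeatable
angles = [sample_rotation_angle(135, rng) for _ in range(5)]
print([round(a, 1) for a in angles])

# A quarter turn moves the point (1, 0) onto (0, 1), up to rounding error.
qx, qy = rotate_point(1.0, 0.0, 90)
print(round(qx, 6), round(qy, 6))
```

Every sampled angle stays inside the interval, which is why a large value such as 135 can turn some images almost upside down.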

2. Width Shifting

The width_shift_range parameter shifts the image along the x-direction. We pass a floating-point number between 0.0 and 1.0, which is the upper bound on the fraction of the total width by which the image is randomly shifted to the left or right.

image_generator = ImageDataGenerator(rescale=1./255, width_shift_range=.15)
train_data_gen = image_generator.flow_from_directory(
batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH)
)
augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)

Notice the smeared band on the right side of the shifted image. Why does it appear? What values fill the area left empty when we shift an image left, right, up, or down, and how do we control them? Stay till the end and I'll explain.

3. Height Shifting

The height_shift_range parameter shifts the image along the y-direction. We pass a floating-point number between 0.0 and 1.0, which is the upper bound on the fraction of the total height by which the image is randomly shifted up or down.

image_generator = ImageDataGenerator(rescale=1./255, height_shift_range=.15)
train_data_gen = image_generator.flow_from_directory(
batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH)
)
augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)
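
The fraction used for width and height shifting can be turned into pixels with a little arithmetic. The sketch below is plain Python and assumes the shift is drawn uniformly between minus and plus the given fraction of the image size. The helpers max_shift_pixels and sample_shift_pixels are hypothetical names, not Keras functions.

```python
import random

def max_shift_pixels(shift_range, size):
    """Upper bound of the shift in pixels for a fractional shift_range."""
    return shift_range * size

def sample_shift_pixels(shift_range, size, rng):
    """Draw a signed shift in pixels; the sign picks the direction."""
    return rng.uniform(-shift_range, shift_range) * size

IMG_WIDTH = 150
IMG_HEIGHT = 150
rng = random.Random(0)  # seeded so the run is repeatable

bound_x = max_shift_pixels(0.15, IMG_WIDTH)   # widest left/right shift
bound_y = max_shift_pixels(0.15, IMG_HEIGHT)  # widest up/down shift
shifts_x = [sample_shift_pixels(0.15, IMG_WIDTH, rng) for _ in range(5)]

print(bound_x, bound_y)
print([round(s, 1) for s in shifts_x])
```

With a 150-pixel image and a range of .15, no image moves by more than 22.5 pixels in either direction.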

4. Brightness

The brightness_range parameter takes a pair of floats, and a brightness factor is picked at random from that range for each image. A factor of 0.0 gives a completely black image, 1.0 leaves the image unchanged, and values above 1.0 make it brighter.

image_generator = ImageDataGenerator(rescale=1./255, brightness_range=(0.1, 0.9))
train_data_gen = image_generator.flow_from_directory(
batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH)
)
augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)
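
As a rough illustration of what a brightness factor does, the sketch below scales a few pixel values in plain Python. It assumes the factor acts as a simple multiplier on 0-255 pixel values followed by clipping; apply_brightness is a hypothetical helper, not the routine Keras uses internally.

```python
import random

def apply_brightness(pixels, factor):
    """Scale 0-255 pixel values by factor and clip back into range."""
    return [min(255, max(0, round(p * factor))) for p in pixels]

row = [0, 64, 128, 255]
black = apply_brightness(row, 0.0)      # every pixel becomes 0
unchanged = apply_brightness(row, 1.0)  # identical to the input
brighter = apply_brightness(row, 1.5)   # brighter, clipped at 255
print(black, unchanged, brighter)

# brightness_range=(0.1, 0.9) only samples factors below 1.0,
# so every augmented image comes out darker than the original.
rng = random.Random(0)
factor = rng.uniform(0.1, 0.9)
print(round(factor, 3), apply_brightness(row, factor))
```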

5. Shear Intensity

Shear is sometimes also referred to as transvection. A transvection shifts every point along one basis direction (x or y) by a distance proportional to its coordinate along the other axis, which slants the shape of the image. In effect we fix one axis and stretch the image at a certain angle, known as the shear angle. This stretching is what makes shear different from rotation. We specify shear_range in degrees.

image_generator = ImageDataGenerator(rescale=1./255, shear_range=45.0)
train_data_gen = image_generator.flow_from_directory(
batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH)
)
augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)
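
To see how shear slants a shape, the sketch below applies a textbook horizontal shear to the corners of a unit square. This is the standard geometric definition, with x moved by tan(angle) times y; the exact matrix Keras builds internally may differ in sign and form, and shear_point is a hypothetical helper.

```python
import math

def shear_point(x, y, degrees):
    """Horizontal shear: slide x in proportion to y and keep y fixed."""
    return (x + math.tan(math.radians(degrees)) * y, y)

# Corners of a unit square, listed counter-clockwise from the origin.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
sheared = [shear_point(x, y, 45) for x, y in square]

for before, after in zip(square, sheared):
    print(before, "->", (round(after[0], 3), after[1]))
```

The bottom edge (y = 0) stays where it is while the top edge slides sideways, so the square becomes a parallelogram. A rotation would instead move every corner and keep the right angles.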

6. Zoom

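We use the zoom_range argument to control zooming. A zoom factor below 1.0 magnifies the image, and a factor above 1.0 zooms out. When zoom_range is a single float, as in zoom_range=0.5 below, the factor is drawn from [1 - zoom_range, 1 + zoom_range], here [0.5, 1.5]; a pair [lower, upper] can be passed instead.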

image_generator = ImageDataGenerator(rescale=1./255, zoom_range=0.5)
train_data_gen = image_generator.flow_from_directory(
batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH)
)
augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)
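
The zoom_range value can be read as an interval of zoom factors. The plain Python sketch below follows the documented Keras rule that a single float f means the interval [1 - f, 1 + f]; zoom_bounds and describe are hypothetical helpers written for this illustration.

```python
import random

def zoom_bounds(zoom_range):
    """Turn zoom_range into (lower, upper) bounds for the zoom factor."""
    if isinstance(zoom_range, (int, float)):
        return (1 - zoom_range, 1 + zoom_range)
    lower, upper = zoom_range
    return (lower, upper)

def describe(factor):
    """Say what a given zoom factor does to the image."""
    if factor < 1.0:
        return "zoom in"
    if factor > 1.0:
        return "zoom out"
    return "unchanged"

lower, upper = zoom_bounds(0.5)
print(lower, upper)

rng = random.Random(0)  # seeded so the run is repeatable
factors = [rng.uniform(lower, upper) for _ in range(5)]
for f in factors:
    print(round(f, 2), describe(f))
```

So zoom_range=0.5 produces a mix of zoomed-in and zoomed-out images, not only magnified ones.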

7. Channel Shift

It shifts the channel values by a random amount chosen from the range given by the channel_shift_range parameter.

image_generator = ImageDataGenerator(rescale=1./255, channel_shift_range=150)
train_data_gen = image_generator.flow_from_directory(
batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH)
)
augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)
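
A channel shift can be pictured as adding one random offset to the color values of a pixel. The sketch below is a simplification in plain Python: it assumes the offset is drawn from -channel_shift_range to +channel_shift_range and that results are clipped to 0-255, which may not match the exact clipping bounds Keras uses. shift_channels is a hypothetical helper.

```python
import random

def shift_channels(pixel, offset, low=0, high=255):
    """Add the same offset to every channel and clip to [low, high]."""
    return tuple(min(high, max(low, c + offset)) for c in pixel)

pixel = (200, 120, 30)  # one RGB pixel
lighter = shift_channels(pixel, 100)
darker = shift_channels(pixel, -100)
print(lighter, darker)

# channel_shift_range=150 would sample an offset from this interval.
rng = random.Random(0)
offset = rng.uniform(-150, 150)
print(round(offset, 1), shift_channels(pixel, round(offset)))
```

Because clipping hits the channels unevenly, a large offset changes the color balance as well as the overall intensity.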

8. Horizontal Flip

Setting the boolean horizontal_flip parameter to True randomly flips images horizontally.

image_generator = ImageDataGenerator(rescale=1./255, horizontal_flip=True)
train_data_gen = image_generator.flow_from_directory(
batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH)
)
augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)

9. Vertical Flip

Setting the boolean vertical_flip parameter to True randomly flips images vertically.

image_generator = ImageDataGenerator(rescale=1./255, vertical_flip=True)
train_data_gen = image_generator.flow_from_directory(
batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH)
)
augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)
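
The two flips are easy to confuse, so here is a tiny sketch on a 2x3 grid of numbers standing in for pixels. It is plain Python with nested lists. The helper names are hypothetical, and the 50% chance in maybe_flip is an assumption about how a random flip is decided.

```python
import random

def horizontal_flip(image):
    """Mirror left to right: reverse the pixels inside each row."""
    return [row[::-1] for row in image]

def vertical_flip(image):
    """Mirror top to bottom: reverse the order of the rows."""
    return image[::-1]

def maybe_flip(image, flip, rng):
    """Apply flip to roughly half of the calls, leave the rest as-is."""
    return flip(image) if rng.random() < 0.5 else image

image = [[1, 2, 3],
         [4, 5, 6]]

h_flipped = horizontal_flip(image)
v_flipped = vertical_flip(image)
print(h_flipped)
print(v_flipped)

rng = random.Random(0)  # seeded so the run is repeatable
outputs = [maybe_flip(image, horizontal_flip, rng) for _ in range(5)]
print(outputs)
```

This is also why the plotted outputs contain a mix of flipped and unflipped copies of the same picture.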

Putting it all together

image_generator = ImageDataGenerator(rescale=1./255,
rotation_range=45,
width_shift_range=.15,
height_shift_range=.15,
horizontal_flip=True,
zoom_range=0.5)
train_data_gen = image_generator.flow_from_directory(
batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH)
)
augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)

But Why this?

The part marked under the red area. Why this?

Take width_shift_range as an example. When the image is shifted to the left or right, the pixels along one edge move out of the frame and the opposite edge is left with empty space. That space is filled with pixel values so that the output keeps its full size. There are several options for how these regions are filled, and we choose one with the fill_mode parameter.

1. Nearest

This is the default option where the closest pixel value is chosen and repeated for all the empty values. (E.g. aaaaaaaa|abcd|dddddddd)

image_generator = ImageDataGenerator(rescale=1./255, width_shift_range=.15, fill_mode='nearest')
train_data_gen = image_generator.flow_from_directory(
batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH)
)
augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)

2. Reflect

This mode creates a ‘reflection’ and fills the empty values in reverse order of the known values. (E.g. abcddcba|abcd|dcbaabcd)

image_generator = ImageDataGenerator(rescale=1./255, width_shift_range=.15, fill_mode='reflect')
train_data_gen = image_generator.flow_from_directory(
batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH)
)
augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)

3. Constant

If we want to fill all the points lying outside the boundaries of the input with a constant value, this mode helps us achieve exactly that. The constant value is specified by the cval argument. (E.g. kkkkkkkk|abcd|kkkkkkkk with cval=k)

image_generator = ImageDataGenerator(rescale=1./255, width_shift_range=.15, fill_mode='constant', cval=100)
train_data_gen = image_generator.flow_from_directory(
batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH)
)
augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)

4. Wrap

Instead of a reflect effect, we can also create a ‘wrap’ effect by copying the values of the known points into the unknown points, keeping the order unchanged. (E.g. abcdabcd|abcd|abcdabcd)

image_generator = ImageDataGenerator(rescale=1./255, width_shift_range=.15, fill_mode='wrap')
train_data_gen = image_generator.flow_from_directory(
batch_size=batch_size,
directory=train_dir,
shuffle=True,
target_size=(IMG_HEIGHT, IMG_WIDTH)
)
augmented_images = [train_data_gen[0][0][0] for i in range(5)]
plotImages(augmented_images)
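
The letter patterns used for the four fill modes can be reproduced with a short one-dimensional sketch. The code below is plain Python and treats the string abcd as a row of four pixels. fill_1d and show are hypothetical helpers written for this illustration; they only imitate the patterns and are not the interpolation code Keras runs.

```python
def fill_1d(values, pad, mode, cval="k"):
    """Extend values by pad items on each side using a fill mode."""
    n = len(values)
    out = []
    for i in range(-pad, n + pad):
        if 0 <= i < n:
            out.append(values[i])                      # inside the image
        elif mode == "nearest":
            out.append(values[min(max(i, 0), n - 1)])  # repeat the edge value
        elif mode == "reflect":
            j = i % (2 * n)                            # mirror with period 2n
            out.append(values[j if j < n else 2 * n - 1 - j])
        elif mode == "wrap":
            out.append(values[i % n])                  # tile the values
        elif mode == "constant":
            out.append(cval)                           # fixed fill value
        else:
            raise ValueError("unknown fill mode: " + mode)
    return "".join(out)

def show(mode):
    """Format the result as left fill | original | right fill."""
    filled = fill_1d("abcd", 8, mode)
    return filled[:8] + "|" + filled[8:12] + "|" + filled[12:]

for mode in ("nearest", "reflect", "constant", "wrap"):
    print(mode.ljust(8), show(mode))
```

In a real image the same rule is applied in two dimensions to whatever area the shift, rotation, shear, or zoom leaves empty.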

Add Ons to the ImageDataGenerator Class

We can also use some of these parameters to make our augmentation even better.

featurewise_center: Boolean. Set input mean to 0 over the dataset, feature-wise.
samplewise_center: Boolean. Set each sample mean to 0.
featurewise_std_normalization: Boolean. Divide inputs by std of the dataset, feature-wise.
samplewise_std_normalization: Boolean. Divide each input by its std.
zca_epsilon: epsilon for ZCA whitening. Default is 1e-6.
zca_whitening: Boolean. Apply ZCA whitening.
rescale: rescaling factor. Defaults to None. If None or 0, no rescaling is applied, otherwise we multiply the data by the value provided (after applying all other transformations).
preprocessing_function: function that will be applied to each input. The function will run after the image is resized and augmented. The function should take one argument: one image (Numpy tensor with rank 3), and should output a Numpy tensor with the same shape.

Basically, with preprocessing_function, you can make your own custom function to perform preprocessing of the image.
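
As a minimal sketch of such a custom function, the example below inverts the colors of an image. invert_colors is a hypothetical function made up for this illustration, and it assumes the incoming image still holds values in the 0-255 range. The block only needs NumPy to run; the commented line shows how the function would be passed to ImageDataGenerator.

```python
import numpy as np

def invert_colors(image):
    """Invert a 0-255 image; the output keeps the input's shape."""
    return 255.0 - image

# A stand-in for one decoded image: rank 3 (height, width, channels).
sample = np.zeros((150, 150, 3), dtype="float32")
sample[..., 0] = 255.0  # pure red picture

result = invert_colors(sample)
print(result.shape, result[0, 0])

# Usage with the generator from this tutorial (requires TensorFlow):
# image_generator = ImageDataGenerator(rescale=1./255,
#                                      preprocessing_function=invert_colors)
```

The function receives one image at a time, so it should not expect a batch dimension.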

If you like this tutorial follow us for more interesting tutorials.

Previous Part

Affine Transformation- Image Processing In TensorFlow- Part 1


Additional Resources

Check out this blog post to learn more: https://neptune.ai/blog/data-augmentation-in-python