Data Augmentation in Image Processing

Sep 5, 2024

Data augmentation is a technique used in machine learning and deep learning to artificially increase the size and diversity of a training dataset by applying various transformations to the existing data. This helps improve the generalization capability of a model, as it sees a more diverse range of examples during training.

Key Concepts of Data Augmentation

Purpose:

Increase Dataset Size: By generating new, transformed versions of existing data, augmentation helps in creating a larger and more diverse dataset.
Improve Model Generalization: Exposing the model to varied versions of the training data makes it more robust to changes in the input and reduces overfitting.

Types of Augmentations:

Geometric Transformations: Modify the spatial properties of images.
  Rotation: Rotates images by a specified angle.
  Translation: Shifts images horizontally or vertically.
  Scaling: Zooms in or out of images.
  Flipping: Mirrors images horizontally or vertically.
  Shearing: Skews images in a specified direction.
Color Adjustments: Modify the color properties of images.
  Brightness: Adjusts the brightness level.
  Contrast: Changes the contrast between light and dark areas.
  Saturation: Alters the color saturation.
Noise Addition: Adds random noise to images to make the model more robust to noisy data.
Cropping: Randomly crops portions of images to simulate different focal points.
Elastic Deformation: Warps images to simulate realistic deformations.
Cutout: Randomly masks out sections of the image to simulate occlusions.
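
A few of the transformations listed above can be sketched directly in NumPy. This is a minimal illustration rather than a library implementation: the helper names, the random 32x32 placeholder image, and the assumption that pixel values lie in [0, 1] are all chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable


def horizontal_flip(image):
    """Mirror an (H, W, C) image left to right."""
    return image[:, ::-1, :]


def adjust_brightness(image, factor):
    """Scale intensities; factor > 1 brightens, factor < 1 darkens."""
    return np.clip(image * factor, 0.0, 1.0)


def add_gaussian_noise(image, std=0.05):
    """Add zero-mean Gaussian noise and keep values inside [0, 1]."""
    noise = rng.normal(0.0, std, size=image.shape)
    return np.clip(image + noise, 0.0, 1.0)


def random_crop(image, crop_h, crop_w):
    """Cut a crop_h x crop_w window at a random position."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - crop_h + 1)
    left = rng.integers(0, w - crop_w + 1)
    return image[top:top + crop_h, left:left + crop_w]


def cutout(image, size):
    """Zero out a random size x size square to imitate an occlusion."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    masked = image.copy()  # leave the input untouched
    masked[top:top + size, left:left + size] = 0.0
    return masked


# Placeholder image with values in [0, 1); a real pipeline would load actual data.
image = rng.random((32, 32, 3))

flipped = horizontal_flip(image)
brightened = adjust_brightness(image, 1.5)
noisy = add_gaussian_noise(image)
cropped = random_crop(image, 24, 24)
masked = cutout(image, 8)
```

Each helper returns a new array, so several of them can be chained on the same image to produce a different variant on every training pass.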

Benefits

Enhanced Model Performance:

Augmentation improves model performance by making the model more resilient to variations and perturbations in the input data.

Reduced Overfitting:

By exposing the model to a more diverse range of examples, augmentation helps prevent overfitting to the limited training data.

Simulation of Real-world Variations:

Augmentation mimics real-world variations in the data, which helps the model perform better on unseen data.

Using ImageDataGenerator in TensorFlow/Keras

ImageDataGenerator

The ImageDataGenerator class from Keras (bundled with TensorFlow) performs real-time data augmentation and preprocessing of images. The snippet below uses it only for preprocessing: it normalizes each image before the image is fed into a neural network for training. Note that recent TensorFlow releases mark this class as deprecated and recommend Keras preprocessing layers for new code.

Python code:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Create an ImageDataGenerator that centers and standardizes each image individually
image_generator = ImageDataGenerator(
    samplewise_center=True,
    samplewise_std_normalization=True,
)
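
As a usage sketch, the generator's standardize method can be applied to a single image to see the effect of these two settings, and the same constructor also accepts augmentation arguments. The placeholder image, the name augmenting_generator, and the specific argument values below are assumptions made for illustration. The geometric arguments rely on SciPy being installed once batches are actually generated.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Same normalization settings as in the snippet above
image_generator = ImageDataGenerator(
    samplewise_center=True,
    samplewise_std_normalization=True,
)

# Placeholder image (H, W, C); a real pipeline would load actual data.
image = np.random.rand(32, 32, 3).astype("float32")

# standardize works in place, so pass a copy to keep the original intact.
normalized = image_generator.standardize(image.copy())

# Hypothetical variant that also augments; the values are illustrative only.
augmenting_generator = ImageDataGenerator(
    samplewise_center=True,
    samplewise_std_normalization=True,
    rotation_range=15,        # random rotations, in degrees
    width_shift_range=0.1,    # horizontal shift as a fraction of width
    height_shift_range=0.1,   # vertical shift as a fraction of height
    zoom_range=0.1,           # random zoom in or out
    horizontal_flip=True,     # random left-right mirroring
)
```

During training, a generator like this is typically consumed through its flow method, which yields batches with the random transformations and the normalization applied on the fly.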

Parameters used in the snippet above:

samplewise_center=True
samplewise_std_normalization=True

samplewise_center=True

Purpose: Sets the mean pixel value of each individual image to zero.
How it works: For each image, the mean of all its pixel values is computed and subtracted from every pixel, so the image's values are centered around zero.
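
The centering step can be reproduced in NumPy to make the arithmetic concrete. This is a sketch of the idea, not the Keras source: the function name and the tiny 2x2 example image are assumptions for illustration.

```python
import numpy as np


def center_sample(image):
    """Subtract the image's own mean so its pixel values average to zero."""
    image = image.astype("float32")  # avoid unsigned-integer wraparound
    return image - image.mean()


# Tiny example image whose mean is (0 + 50 + 100 + 250) / 4 = 100
image = np.array([[0, 50], [100, 250]], dtype="uint8")
centered = center_sample(image)
# centered is [[-100, -50], [0, 150]], which averages to zero
```

Because the mean is computed per image, a dark photo and a bright photo both end up centered around zero, whatever the statistics of the rest of the dataset.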

samplewise_std_normalization=True

Purpose: Scales the pixel values of each individual image so that their standard deviation is 1.
How it works: The standard deviation of the image's pixel values is computed, and every pixel is divided by it. When samplewise_center=True is also set, centering is applied first, so each image ends up with zero mean and unit standard deviation.
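
The scaling step can be sketched in NumPy as well. The function name and the small EPSILON added to the divisor (to avoid dividing by zero on a constant image) are assumptions of this sketch; treat it as an approximation of the library's behavior, not a copy of it.

```python
import numpy as np

EPSILON = 1e-6  # assumed guard against division by zero on flat images


def standardize_sample(image, center=True):
    """Optionally center an image, then divide by its standard deviation."""
    image = image.astype("float32")
    if center:
        image = image - image.mean()
    return image / (image.std() + EPSILON)


image = np.array([[0, 50], [100, 250]], dtype="uint8")

# With centering: zero mean and unit standard deviation
standardized = standardize_sample(image)

# Without centering: unit standard deviation, but the mean stays positive
scaled_only = standardize_sample(image, center=False)
```

The second call shows why the two parameters are usually enabled together: scaling alone fixes the spread of the values but leaves each image's mean offset in place.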

Written by Deepa S