Data Augmentation using Keras Preprocessing Layers.

Fabian Christopher
Published in featurepreneur
May 31, 2021

Introduction

Hey there! Data augmentation is a really cool technique for increasing the diversity of your training set. It works by applying random but realistic transformations to the data, such as image rotation. In this article, we will discuss how to perform data augmentation using Keras preprocessing layers.

With that said, let’s get started!

Setup

Let’s start by importing some basic libraries that we’ll need:

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds

from tensorflow.keras import layers

Downloading the dataset

  • I will be using the tf_flowers dataset for this demonstration. You can download the dataset using TensorFlow Datasets.

Install the library with pip install tensorflow-datasets; the tfds.load call below then downloads the dataset the first time it runs.

(train_ds, val_ds, test_ds), metadata = tfds.load(
    'tf_flowers',
    split=['train[:80%]', 'train[80%:90%]', 'train[90%:]'],
    with_info=True,
    as_supervised=True,
)
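To get a feel for what those split strings do, here is a small sketch in plain Python that works out how many images land in each slice. It assumes the tf_flowers train split holds 3,670 images (check metadata.splits['train'].num_examples on your own install), and it uses simple integer arithmetic, so TFDS may round slightly differently for totals that do not divide evenly.

```python
# Assumption: the tf_flowers "train" split holds 3,670 images; confirm with
# metadata.splits['train'].num_examples on your own install.
total = 3670

# Percent boundaries used in the split strings: [:80%], [80%:90%], [90%:]
boundaries = [0, 80, 90, 100]
cut_points = [total * pct // 100 for pct in boundaries]

# The size of each slice is the gap between consecutive cut points.
sizes = [end - start for start, end in zip(cut_points, cut_points[1:])]
print(dict(zip(["train", "val", "test"], sizes)))
```

Under that assumption, roughly 80% of the images go to training and the remaining 20% are shared equally between validation and testing.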
  • The flowers dataset has five classes. To confirm the count:
num_classes = metadata.features['label'].num_classes
print(num_classes)
  • Now, let’s get an image from the dataset to demonstrate data augmentation.
get_label_name = metadata.features['label'].int2str

image, label = next(iter(train_ds))
_ = plt.imshow(image)
_ = plt.title(get_label_name(label))

Resizing and rescaling images.

  • You can now use Keras preprocessing layers to resize your images to a consistent shape or to rescale pixel values.
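IMG_SIZE = 180

# In TensorFlow 2.6 and later, these layers are also available
# directly as layers.Resizing and layers.Rescaling.
resize_and_rescale = tf.keras.Sequential([
    layers.experimental.preprocessing.Resizing(IMG_SIZE, IMG_SIZE),
    layers.experimental.preprocessing.Rescaling(1./255)
])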
  • To view the resulting image:
result = resize_and_rescale(image)
_ = plt.imshow(result)
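If you want to see what the rescaling step does to the pixel values, here is a minimal NumPy sketch. It is not the Keras layer itself, only the same multiplication by 1/255 applied to a tiny made-up image (the pixel values are hypothetical), so you can see that 8-bit values end up between 0 and 1.

```python
import numpy as np

# A synthetic 2x2 RGB image with uint8 pixels stands in for a flower photo.
pixels = np.array([[[0, 128, 255], [64, 64, 64]],
                   [[255, 255, 255], [10, 20, 30]]], dtype=np.uint8)

scale = 1.0 / 255  # the same factor passed to the Rescaling layer
rescaled = pixels.astype(np.float32) * scale

# The smallest and largest values should now sit at the ends of [0, 1].
print(rescaled.min(), rescaled.max())
```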

Data augmentation using preprocessing layers

  • You can also use preprocessing layers for data augmentation. Let’s start by creating a few preprocessing layers and applying them to the same image.
data_augmentation = tf.keras.Sequential([
    layers.experimental.preprocessing.RandomFlip("horizontal_and_vertical"),
    layers.experimental.preprocessing.RandomRotation(0.2),
])
  • Now add the image to a batch:
image = tf.expand_dims(image, 0)
  • Now let’s view the images:
plt.figure(figsize=(10, 10))
for i in range(9):
    augmented_image = data_augmentation(image)
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(augmented_image[0])
    plt.axis("off")
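  • There are several other preprocessing layers you can use for data augmentation, including RandomContrast, RandomCrop, and RandomZoom. In TensorFlow 2.6 and later they are available directly as layers.RandomContrast, layers.RandomCrop, and layers.RandomZoom; in earlier 2.x releases they sit under layers.experimental.preprocessing, like the layers used above.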
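To make the two layers used above a little less magical, here is a rough NumPy sketch of the idea behind them. It is not how Keras implements them; it only assumes that "horizontal_and_vertical" means each flip direction is applied with probability 0.5, and it uses the documented meaning of the RandomRotation factor as a fraction of a full turn. The helper name random_flip is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_flip(image, rng):
    """Flip left-right and up-down, each with probability 0.5 (illustrative)."""
    if rng.random() < 0.5:
        image = np.fliplr(image)  # mirror along the width axis
    if rng.random() < 0.5:
        image = np.flipud(image)  # mirror along the height axis
    return image

# RandomRotation(0.2) draws an angle from [-0.2 * 2*pi, 0.2 * 2*pi],
# which is a fraction of a full 360-degree turn in either direction.
factor = 0.2
max_angle_deg = factor * 360.0

image = np.arange(12).reshape(2, 2, 3)  # tiny 2x2 "RGB image"
flipped = random_flip(image, rng)

print(max_angle_deg)
print(flipped.shape)
```

A flip only rearranges pixels, so the shape and the set of pixel values stay the same, which is why the label of the image is still valid after augmentation.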

Two options to use the preprocessing layers

Option 1: Make the preprocessing layers part of your model

model = tf.keras.Sequential([
    resize_and_rescale,
    data_augmentation,
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    # Rest of your model
])
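One thing worth knowing about Option 1: random augmentation layers only act when they are called with training=True, which Keras does for you during model.fit. During model.evaluate and model.predict they receive training=False and pass the images through unchanged. The sketch below checks this on a tiny synthetic batch; the getattr fallback is only there because the layer lives in a different namespace depending on your TensorFlow version.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# RandomFlip lives directly under `layers` in TensorFlow 2.6+ and under
# `layers.experimental.preprocessing` in earlier 2.x releases.
RandomFlip = getattr(layers, "RandomFlip", None)
if RandomFlip is None:
    RandomFlip = layers.experimental.preprocessing.RandomFlip

flip = RandomFlip("horizontal")

# A tiny synthetic batch: 2 images of 4x4 pixels with 3 channels.
batch = tf.reshape(tf.range(2 * 4 * 4 * 3, dtype=tf.float32), (2, 4, 4, 3))

# With training=False the layer is a pass-through, as it is during
# model.evaluate() and model.predict().
inference_out = np.asarray(flip(batch, training=False))
unchanged = np.array_equal(inference_out, batch.numpy())
print(unchanged)
```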

Option 2: Apply the preprocessing layers to your dataset

aug_ds = train_ds.map(
    lambda x, y: (resize_and_rescale(x, training=True), y))

Apply the preprocessing layers to the datasets

batch_size = 32
AUTOTUNE = tf.data.AUTOTUNE

def prepare(ds, shuffle=False, augment=False):
    # Resize and rescale all datasets
    ds = ds.map(lambda x, y: (resize_and_rescale(x), y),
                num_parallel_calls=AUTOTUNE)

    if shuffle:
        ds = ds.shuffle(1000)

    # Batch all datasets
    ds = ds.batch(batch_size)

    # Use data augmentation only on the training set
    if augment:
        ds = ds.map(lambda x, y: (data_augmentation(x, training=True), y),
                    num_parallel_calls=AUTOTUNE)

    # Use buffered prefetching on all datasets
    return ds.prefetch(buffer_size=AUTOTUNE)

train_ds = prepare(train_ds, shuffle=True, augment=True)
val_ds = prepare(val_ds)
test_ds = prepare(test_ds)
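If you would like to try the same map, shuffle, batch, augment, prefetch ordering without downloading anything, here is a small sketch on synthetic data. The random tensors, the helper names rescale and flip_batch, and the use of tf.image.random_flip_left_right in place of the Keras augmentation layers are all assumptions made to keep the example short; the point is only to show how the stages chain together and what batch shapes come out.

```python
import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE

# Ten random 8x8 RGB "images" with integer labels stand in for the flowers.
images = tf.random.uniform((10, 8, 8, 3), maxval=255.0)
labels = tf.range(10) % 5
ds = tf.data.Dataset.from_tensor_slices((images, labels))

def rescale(x, y):
    # Per-image step, applied before batching.
    return x / 255.0, y

def flip_batch(x, y):
    # Batch-level augmentation, applied after batching.
    return tf.image.random_flip_left_right(x), y

pipeline = (
    ds.map(rescale, num_parallel_calls=AUTOTUNE)
    .shuffle(10)
    .batch(4)
    .map(flip_batch, num_parallel_calls=AUTOTUNE)
    .prefetch(AUTOTUNE)
)

batch_shapes = [tuple(x.shape) for x, _ in pipeline]
print(batch_shapes)
```

Ten images with a batch size of 4 give two full batches and one smaller final batch, so the last batch in your own pipeline may be smaller than batch_size too.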

Train a model

model = tf.keras.Sequential([
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(num_classes)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

epochs = 5
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)

loss, acc = model.evaluate(test_ds)
print("Accuracy", acc)
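A small detail in the model above: the last Dense layer has no activation, so it outputs raw scores (logits), and from_logits=True tells the loss to apply softmax itself. Here is a NumPy sketch of that calculation for a single example. The logit values are made up, and this is only an illustration of the math, not the Keras implementation.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum first for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical logits for one image over the five flower classes.
logits = np.array([[2.0, 0.5, -1.0, 0.0, 1.0]])
true_label = np.array([0])

probs = softmax(logits)

# Sparse categorical cross-entropy: negative log-probability of the true class.
loss = -np.log(probs[np.arange(len(true_label)), true_label])

print(probs.round(3), loss.round(3))
```

To turn the trained model's predictions into probabilities, you would apply the same softmax step to the output of model.predict.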

Custom data augmentation

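You can also create custom data augmentation layers. One way is to wrap a function in a layers.Lambda layer; here the function randomly inverts the colors of an image.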

def random_invert_img(x, p=0.5):
    # Invert the pixel values with probability p; otherwise return x unchanged
    if tf.random.uniform([]) < p:
        x = 255 - x
    return x

def random_invert(factor=0.5):
    return layers.Lambda(lambda x: random_invert_img(x, factor))

random_invert_layer = random_invert()

plt.figure(figsize=(10, 10))
for i in range(9):
    augmented_image = random_invert_layer(image)
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(augmented_image[0].numpy().astype("uint8"))
    plt.axis("off")

Next, implement a custom layer by subclassing.

class RandomInvert(layers.Layer):
    def __init__(self, factor=0.5, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, x):
        # Pass the stored factor so the constructor argument takes effect
        return random_invert_img(x, self.factor)

_ = plt.imshow(RandomInvert()(image)[0])
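A quick way to convince yourself that a subclassed layer like this respects its factor is to try the two extremes: a factor of 0 should never invert and a factor of 1 should always invert. The sketch below is self-contained, so it repeats the helper and the layer, written on the assumption that call forwards self.factor to the helper. The small float batch is synthetic.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def random_invert_img(x, p=0.5):
    # Invert with probability p, otherwise return the input unchanged.
    if tf.random.uniform([]) < p:
        x = 255 - x
    return x

class RandomInvert(layers.Layer):
    def __init__(self, factor=0.5, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor

    def call(self, x):
        return random_invert_img(x, self.factor)

# Synthetic batch: one 2x2 RGB image with float pixel values.
batch = tf.constant(np.arange(12, dtype=np.float32).reshape(1, 2, 2, 3))

never = np.asarray(RandomInvert(factor=0.0)(batch))   # uniform sample is never < 0
always = np.asarray(RandomInvert(factor=1.0)(batch))  # uniform sample is always < 1

print(np.array_equal(never, batch.numpy()))
print(np.array_equal(always, 255 - batch.numpy()))
```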

Conclusion

Hope you had fun working with Data Augmentation!

Do check out my other articles where I cover topics such as deep learning and other trending technologies.

Thanks for stopping by! Happy Learning!

You can check out my LinkedIn at https://www.linkedin.com/in/fabchris10/
