Tutorial 7: Visualizing Data Augmentation with Animations

Published in Fenwicks · 4 min read · Apr 25, 2019

Prerequisite: None.

We have encountered several image augmentation methods in previous tutorials, such as random_pad_crop, random_flip and cutout in Tutorial 2 and Tutorial 3. In Tutorial 4, we use the data augmentation scheme in Google’s InceptionV3 example, which includes two data augmentations: distorted_bbox_crop and distort_color. What exactly are these? In this tutorial, we show them using GIF animations.

To build GIF animations, we need to install a software package called ImageMagick:

!apt -qq install imagemagick

Let’s first download our test image, “Lena”:

fn = './Lenna_(test_image).png'
url = f'https://upload.wikimedia.org/wikipedia/en/7/7d/{fn[2:]}'
fw.io.download(url, fn)

First, let’s see what random_flip does:

fw.anim.show_transform(fw.transform.random_flip, fn)

By default, random_flip only flips the input image horizontally, not vertically. To enable vertical flipping, we can set flip_vert to True:

tfm = fw.transform.tfm_random_flip(flip_vert=True)
fw.anim.show_transform(tfm, fn)

In the above example, we used the tfm_random_flip function to obtain a parameterized random_flip. In general, a function tfm_XYZ changes the default parameters of the underlying data augmentation method XYZ, and we’ll see more examples below.
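To make the tfm_XYZ pattern concrete, here is a minimal pure-Python sketch. It is not the Fenwicks implementation: images are plain nested lists of rows, and the names random_flip_sketch and tfm_random_flip_sketch are hypothetical stand-ins. The factory simply binds a new default with functools.partial.

```python
import random
from functools import partial

def random_flip_sketch(image, flip_vert=False, rng=random):
    """Randomly mirror an image given as a list of rows (hypothetical helper)."""
    if rng.random() < 0.5:
        image = [row[::-1] for row in image]  # horizontal flip: reverse each row
    if flip_vert and rng.random() < 0.5:
        image = image[::-1]                   # vertical flip: reverse the row order
    return image

def tfm_random_flip_sketch(flip_vert=False):
    """Return random_flip_sketch with a different default for flip_vert."""
    return partial(random_flip_sketch, flip_vert=flip_vert)

image = [[1, 2, 3],
         [4, 5, 6]]
tfm = tfm_random_flip_sketch(flip_vert=True)
flipped = tfm(image, rng=random.Random(0))
```

The returned tfm takes just an image, so it can be passed around in the same way as the unparameterized function.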

In some applications, horizontal flipping makes sense, and vertical flipping does not. For example, a car photo flipped horizontally is still a car, just facing another direction; a car photo flipped vertically is, well, a disaster. In Tutorials 2–3, we applied only horizontal flipping, which is also the default behavior of random_flip as mentioned before.

Next, let’s look at cutout:

tfm = fw.transform.tfm_cutout(h=64, w=64)
fw.anim.show_transform(tfm, fn)

Cutout (proposed in a research paper) is a regularization technique that helps us fight overfitting. The idea is similar to dropout, except that cutout drops a whole square of pixels instead of random ones as in dropout.
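The idea fits in a few lines. Below is a minimal sketch on a nested-list image, with a hypothetical helper name; it keeps the masked rectangle fully inside the image for simplicity, which may differ from what Fenwicks does.

```python
import random

def cutout_sketch(image, h, w, rng=random):
    """Zero out one randomly placed h-by-w rectangle (hypothetical helper)."""
    rows, cols = len(image), len(image[0])
    top = rng.randint(0, rows - h)    # randint is inclusive at both ends
    left = rng.randint(0, cols - w)
    out = [row[:] for row in image]   # copy so the input stays untouched
    for r in range(top, top + h):
        for c in range(left, left + w):
            out[r][c] = 0
    return out

image = [[1] * 6 for _ in range(6)]
masked = cutout_sketch(image, h=2, w=3, rng=random.Random(0))
```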

Next is distort_color:

fw.anim.show_transform(fw.transform.distort_color, fn)

There are many ways to distort colors in an image. The main advantage of distort_color is speed: it uses only simple arithmetic operations rather than transforming the colors to Hue-Saturation-Value space. As we see above, the default setting for distort_color modifies colors rather aggressively.
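As an illustration of what "simple arithmetic only" can look like, the sketch below shifts brightness by adding a constant and changes saturation by blending each pixel with its own gray value. This is an assumption about the general approach, not the Fenwicks code: the function name, the parameter ranges, and the RGB-in-[0, 1] pixel format are all chosen for the example.

```python
import random

def clamp01(x):
    """Clip a value into the valid range [0, 1]."""
    return min(1.0, max(0.0, x))

def distort_color_sketch(pixels, max_delta=0.125, sat_range=(0.5, 1.5), rng=random):
    """Shift brightness and scale saturation of RGB pixels in [0, 1]."""
    delta = rng.uniform(-max_delta, max_delta)   # same shift for every pixel
    sat = rng.uniform(*sat_range)                # 1.0 keeps saturation as is
    out = []
    for r, g, b in pixels:
        gray = 0.299 * r + 0.587 * g + 0.114 * b  # Rec. 601 luma weights
        # Blend between gray (sat=0) and the original color (sat=1), then
        # add the brightness shift and clamp back into the valid range.
        out.append(tuple(clamp01(gray + sat * (ch - gray) + delta)
                         for ch in (r, g, b)))
    return out

pixels = [(0.9, 0.2, 0.1), (0.5, 0.5, 0.5)]
distorted = distort_color_sketch(pixels, rng=random.Random(0))
```

No conversion to Hue-Saturation-Value space is needed: every step is a multiply, an add, or a clamp.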

Let’s also visualize the other image augmentation method in Inception V3, distorted_bbox_crop:

fw.anim.show_transform(fw.transform.distorted_bbox_crop, fn)

As we can see, the default setting of distorted_bbox_crop magnifies a part of the image very aggressively. This is because in the ImageNet data (on which Inception V3 is trained), there are often tiny objects that need to be recognized. This probably doesn’t make sense in other applications, such as medical imaging.
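To see where the aggressive magnification comes from, here is a rough sketch of how such a crop box can be sampled. The area and aspect-ratio ranges are assumed values for illustration, and sample_crop_sketch is a hypothetical name; check the Fenwicks source for the actual defaults.

```python
import math
import random

def sample_crop_sketch(height, width, area_range=(0.05, 1.0),
                       aspect_range=(0.75, 1.33), max_tries=100, rng=random):
    """Sample (top, left, crop_h, crop_w) covering a random fraction of the image."""
    for _ in range(max_tries):
        area = rng.uniform(*area_range) * height * width
        aspect = rng.uniform(*aspect_range)          # crop width / crop height
        crop_w = int(round(math.sqrt(area * aspect)))
        crop_h = int(round(math.sqrt(area / aspect)))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            top = rng.randint(0, height - crop_h)
            left = rng.randint(0, width - crop_w)
            return top, left, crop_h, crop_w
    return 0, 0, height, width                       # fall back to the full image

top, left, crop_h, crop_w = sample_crop_sketch(512, 512, rng=random.Random(0))
```

With a lower area bound of 0.05, a crop may cover as little as 5% of the image. Once that crop is resized to the network's input size, it looks like a strong zoom.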

A similar and much gentler transform is random_pad_crop, used in Tutorials 2–3:

tfm = fw.transform.tfm_pad_crop(32)
fw.anim.show_transform(tfm, fn)

The Fenwicks implementation of random_pad_crop uses the reflect setting for padding, which fills the border by mirroring the pixels next to each edge. This is usually a better idea than padding with zeros, which correspond to black pixels.
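Here is a small sketch of pad-and-crop with reflect padding, again on nested lists and with hypothetical helper names. It assumes the reflect mode that mirrors about the edge pixel without repeating it, and it uses the fact that padding followed by a random crop equals reading the image through a randomly shifted window.

```python
import random

def reflect_index(i, n):
    """Map an out-of-range index into [0, n) by mirroring about the edges."""
    if n == 1:
        return 0
    period = 2 * (n - 1)
    i = abs(i) % period
    return i if i < n else period - i

def random_pad_crop_sketch(image, pad, rng=random):
    """Reflect-pad by `pad` pixels per side, then crop back to the original size."""
    rows, cols = len(image), len(image[0])
    # Picking a window offset in [-pad, pad] equals padding first, then cropping.
    dy = rng.randint(-pad, pad)
    dx = rng.randint(-pad, pad)
    return [[image[reflect_index(r + dy, rows)][reflect_index(c + dx, cols)]
             for c in range(cols)]
            for r in range(rows)]

image = [[10 * r + c for c in range(4)] for r in range(4)]
shifted = random_pad_crop_sketch(image, pad=2, rng=random.Random(0))
```

Every output pixel comes from the original image, so no black border appears.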

Coming up: common image augmentations that we haven’t used in previous tutorials. First, random rotation:

fw.anim.show_transform(fw.transform.random_rotate, fn)

By default, random_rotate rotates the input image by up to 10 degrees in either direction. Its implementation is rather tricky, since we can’t use common image manipulation libraries such as PIL or NVIDIA’s DALI, which are not available for TPUs. TensorFlow’s own tf.contrib.image package has issues on TPUs too. Fenwicks contains a pure TensorFlow implementation of affine transforms, which include rotation, shift, zoom, and shear.
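To give a feel for what an affine warp involves, here is a toy rotation in plain Python. It is only a sketch of the usual inverse-mapping approach with nearest-neighbor sampling, not the TensorFlow code in Fenwicks, and rotate_sketch is a hypothetical name.

```python
import math
import random

def rotate_sketch(image, degrees):
    """Rotate about the image center with nearest-neighbor sampling."""
    rows, cols = len(image), len(image[0])
    cy, cx = (rows - 1) / 2, (cols - 1) / 2
    cos_t = math.cos(math.radians(degrees))
    sin_t = math.sin(math.radians(degrees))
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Inverse mapping: find the source pixel that lands on (r, c).
            y, x = r - cy, c - cx
            src_x = cos_t * x + sin_t * y + cx
            src_y = -sin_t * x + cos_t * y + cy
            sr, sc = int(round(src_y)), int(round(src_x))
            if 0 <= sr < rows and 0 <= sc < cols:
                out[r][c] = image[sr][sc]   # pixels from outside stay 0
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
rotated = rotate_sketch(image, random.uniform(-10, 10))
```

Looping over output pixels and looking up their source, rather than pushing source pixels forward, guarantees that every output pixel gets a value and leaves no holes.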

Let’s look at another affine transform, zoom:

fw.anim.show_transform(fw.transform.random_zoom, fn)

By default, random_zoom magnifies the center of the image by up to 1.1x, which is a very gentle transformation.

This is shear, one more type of affine transform:

fw.anim.show_transform(fw.transform.random_shear, fn)

And random shifts, which have a similar effect to random_pad_crop:

fw.anim.show_transform(fw.transform.random_shift, fn)

Multiple affine transforms can be applied together using random_affine_combo. By default, it does rotation and zooming.
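Affine transforms combine nicely because each one is a 3x3 matrix in homogeneous coordinates, and applying several in a row is the same as applying their product once. The sketch below shows this with hand-written helpers; all names are hypothetical, and how random_affine_combo does it internally is not shown here.

```python
import math

def matmul3(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation(degrees):
    t = math.radians(degrees)
    return [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t), math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]

def zoom(factor):
    return [[factor, 0.0, 0.0], [0.0, factor, 0.0], [0.0, 0.0, 1.0]]

def shear(amount):
    return [[1.0, amount, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def shift(dx, dy):
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]

def apply(m, x, y):
    """Transform the point (x, y) using homogeneous coordinates."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

# Rotate by 10 degrees, then zoom by 1.1: the rightmost matrix acts first.
combo = matmul3(zoom(1.1), rotation(10))
x, y = apply(combo, 1.0, 0.0)
```

With a single combined matrix, the image only needs to be resampled once, however many transforms are chained.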

fw.anim.show_transform(fw.transform.random_affine_combo, fn)

Let’s look at a color transform, random_lighting:

fw.anim.show_transform(fw.transform.random_lighting, fn)

Lastly: a combination of transforms similar to the default one in fast.ai. This one combines affine and lighting transforms:

tfms = fw.transform.get_fastai_transforms(299, 299, True,
  normalizer=lambda x:x)
fw.anim.show_transform(tfms, fn)
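Conceptually, a combined transform is just several single-image functions applied one after another. The sketch below shows that chaining with toy transforms; compose and the helper names are hypothetical, and this is not how get_fastai_transforms is necessarily implemented.

```python
def compose(*transforms):
    """Chain single-argument transforms into one callable, applied left to right."""
    def pipeline(image):
        for tfm in transforms:
            image = tfm(image)
        return image
    return pipeline

def flip_horizontal(image):
    return [row[::-1] for row in image]

def brighten(image, delta=10):
    return [[min(255, v + delta) for v in row] for row in image]

identity = lambda x: x   # stands in for a normalizer that changes nothing

tfms = compose(flip_horizontal, brighten, identity)
result = tfms([[0, 100, 250]])
```

In this sketch, an identity function in the last slot leaves pixel values untouched, so the result stays in the displayable 0 to 255 range.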

Here’s the complete Jupyter notebook:

fenwickslab/fenwicks (github.com)

All tutorials:

Deep Learning on a Free TPU (medium.com)

Written by David Yang