Presizing Techniques for Image Classification Using fastai

Ijaz Khan · Published in unpack · 3 min read · Jan 4, 2021

Header image: Data Augmentation: How to use Deep Learning when you have Limited Data (Source: KDnuggets)

Presizing is the name given to a set of techniques for resizing and augmenting data before feeding it to a machine learning model. Image classification is one of the important research areas in computer vision and machine learning. Before we classify any kind of image-based data, we need to look at several aspects of that data: for example, mapping the dependent variable (the label) to the independent variables (the images), and resizing and augmenting the images for better classification.

Specifically, the images in a dataset need to have the same dimensions so that they collate cleanly into tensors before being passed to the GPU. We also want to perform as few distinct augmentation operations as possible. The goal is to reduce the number of lossy operations and the amount of computation, for more efficient processing on the GPU.
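To see why uniform dimensions matter, here is a minimal sketch in plain PyTorch (not from the book): stacking images into a batch tensor only works when every image has the same shape.

    import torch

    a = torch.rand(3, 224, 224)  # a 3-channel 224x224 "image"
    b = torch.rand(3, 200, 200)  # a differently sized image

    batch = torch.stack([a, a])  # works: shapes match, result is [2, 3, 224, 224]
    # torch.stack([a, b])        # RuntimeError: stack expects each tensor to be equal size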

How do resizing and augmentation work in fastai? Below are a few examples from the book “Deep Learning for Coders with fastai and PyTorch”.

DataBlock in fastai (Source: Deep Learning for Coders with fastai and PyTorch, the book)

In fastai we pass data to the model through “DataBlock” objects. Describing DataBlocks in detail is beyond the scope of this article, but this is what a DataBlock looks like in fastai.
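Here is a sketch of the DataBlock shown in the figure above, following the book's Oxford-IIIT Pet example (the dataset path and the filename-regex labeller come from the book):

    from fastai.vision.all import *

    path = untar_data(URLs.PETS)  # Oxford-IIIT Pet dataset, as used in the book

    pets = DataBlock(
        blocks=(ImageBlock, CategoryBlock),   # inputs are images, targets are categories
        get_items=get_image_files,            # collect every image file under the path
        splitter=RandomSplitter(seed=42),     # random train/validation split
        get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'),  # label from the filename
        item_tfms=Resize(460),                # resize each image individually (on the CPU)
        batch_tfms=aug_transforms(size=224, min_scale=0.75))  # augment whole batches (on the GPU)
    dls = pets.dataloaders(path/"images")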

Resizing

This technique resizes an image to dimensions different from its original ones. Along with the resize, we can pad, squish, or crop the image. Given below are some examples from the above-mentioned book, followed by a short code sketch.

Original Images (Source: Deep Learning for Coders with fastai and PyTorch, the book)
Resize Method Squish (Source: Deep Learning for Coders with fastai and PyTorch, the book)
Zero Padding (Source: Deep Learning for Coders with fastai and PyTorch, the book)
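In fastai, these options are chosen through the method argument of the “Resize” transform. A minimal sketch, continuing from the “pets” DataBlock defined above:

    # Squish: distort the aspect ratio so the whole image fits into 128x128
    dblock = pets.new(item_tfms=Resize(128, ResizeMethod.Squish))
    dls = dblock.dataloaders(path/"images")
    dls.show_batch(max_n=4)

    # Pad: keep the aspect ratio and fill the borders with zeros (black)
    dblock = pets.new(item_tfms=Resize(128, ResizeMethod.Pad, pad_mode='zeros'))
    dls = dblock.dataloaders(path/"images")
    dls.show_batch(max_n=4)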

Data Augmentation

It refers to creating different variations of the given data without changing its meaning. Examples include rotation, flipping, perspective warping, brightness changes, and contrast changes. In practice, we first resize all the images to the same size and then apply the augmentations batch-wise.
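In fastai, “aug_transforms” bundles these augmentations and is passed as batch_tfms, so they run on the GPU. A short sketch, again continuing from the “pets” block above (mult=2 exaggerates the effects to make them easier to see, as the book does):

    dblock = pets.new(
        item_tfms=Resize(128),
        batch_tfms=aug_transforms(mult=2))  # rotation, flip, warp, brightness, contrast, zoom...
    dls = dblock.dataloaders(path/"images")
    dls.train.show_batch(max_n=8, nrows=2, unique=True)  # unique=True shows variations of one image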

Challenges

If the resize and augmentation operations are applied one at a time to individual images, the augmentations can create empty zones around the image, degrade the data, or both. For example, rotating an image by 45 degrees fills its corners with empty space, from which the model can learn nothing useful. Besides this, operations such as zooming and rotation require interpolation to create new pixel values, and every interpolation degrades the image a little further.
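A quick illustration of the empty-corner problem using plain PIL (the file path here is only a placeholder):

    from PIL import Image

    img = Image.open('images/cat.jpg')  # hypothetical path to any image
    rotated = img.rotate(45)            # the corners are filled with black (empty) pixels
    rotated.show()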

Solution

Presizing adopts two basic strategies to work around the challenges mentioned above.

  1. First, resize the images to dimensions significantly larger than the final target training dimensions.
  2. Then, compose all of the common augmentation operations (including the resize to the final target size) into one combined operation, and perform it on the GPU only once at the end of processing, rather than applying the operations to individual images and interpolating multiple times.
Source: Deep Learning for Coders with fastai and PyTorch, the book

The image above shows two steps.

  1. Crop full width or height: this happens in the first step, with “item_tfms”, applied to each individual image before it is copied to the GPU.
  2. Random crop and augment: this happens in the second step, with “batch_tfms”, applied to a whole batch at once on the GPU, which makes it fast. The sketch below puts the two steps together.
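Continuing from the “pets” block above, the two steps can be expressed directly, and we can check that the batch really comes out at the final target size:

    dblock = pets.new(
        item_tfms=Resize(460),  # step 1: large resize, per image, on the CPU
        batch_tfms=aug_transforms(size=224, min_scale=0.75))  # step 2: augment + crop, per batch, on the GPU
    dls = dblock.dataloaders(path/"images", bs=64)

    xb, yb = dls.train.one_batch()
    print(xb.shape)   # torch.Size([64, 3, 224, 224])
    print(xb.device)  # cuda:0 if a GPU is available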

Thanks for reading.

References:

  1. Deep Learning for Coders with fastai and PyTorch, the book
  2. https://colab.research.google.com/github/fastai/fastbook/
