Image Augmentation in Numpy. The spell is simple but quite unbreakable.

Igor Kuznetsov
Apr 29, 2019


This story is about implementing image augmentation techniques with the NumPy library. I'll discuss the purpose, pros and cons of data augmentation and show very basic examples of augmentation in NumPy.

Why do we need data augmentation?

The first thought that may cross our minds is increasing dataset size. But it's important to understand what problem we are solving, and the problem is a famous one, among the most common in ML: overfitting. If we have only a few examples of the data we want to build a classifier or regressor on, we are prone to overfitting before the model is even built. It is well known that increasing dataset size can reduce overfitting, which is why we are never satisfied with the amount of data given to us from the outside. Now, increasing our dataset means collecting more samples, but that almost always incurs expenses we can't afford: it is either too costly or too time-consuming. As a result we want a way to increase our data fast and cheaply. Here comes data augmentation.

Imagine you have to build a model to classify some rare species from image data, and you aren't satisfied with the number of photos you can acquire. Just flipping the images horizontally doubles the size of your dataset. If the objects you want to classify are more or less symmetric, flip the data again vertically and you get 4x the images you had initially. Think for a minute about additional transformations and you can increase the dataset size 10x, 100x, 1000x.

When can we try to apply data augmentation?

There are several cases in which it is worth trying data augmentation:

  • We have to build a predictor and the data is scarce. Increasing the dataset size may help the model learn patterns from the data and predict better, although the resulting artificial dataset will be somewhat monotonous.
  • The model we are working with does not require labels, so the only thing we are interested in is getting samples as similar to the existing ones as we can. An example is the Generative Adversarial Network family, where we can try to enlarge the dataset to stabilise the training process a little.
  • The data we are predicting is highly imbalanced. If the label distribution of the dataset is highly imbalanced, e.g. the number of positive samples is ten times smaller than the number of negative ones, the model will certainly be biased towards predicting most unseen samples as negative. The most famous approach to this case is the SMOTE family of algorithms.
  • Reducing the adversarial attack issue. The robustness of DNN models has recently been challenged by adversarial attacks, where a small perturbation of an input sample can result in misclassification. Augmenting the data can be a fast and cheap way to reduce the problem.
  • Increasing the regularisation of our model. We can try to merge several samples from our dataset and mix their labels on purpose to make the model less prone to overfitting (see the sketch after this list).
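
Here is a rough sketch of that last idea, in the spirit of the mixup technique; the function name and the beta parameter are just illustrative choices, and labels are assumed to be one-hot vectors:

```python
import numpy as np

def mix_samples(image_a, label_a, image_b, label_b, alpha=0.2):
    """Blend two images and their one-hot labels with a random weight."""
    lam = np.random.beta(alpha, alpha)          # blending weight in (0, 1)
    mixed_image = lam * image_a + (1 - lam) * image_b
    mixed_label = lam * label_a + (1 - lam) * label_b
    return mixed_image, mixed_label
```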

When should we not use data augmentation?

  • We have really few samples. Data augmentation can help a bit to increase the robustness of your predictor, but with a really small amount of data I recommend thinking about different approaches.
  • The training process is too expensive in terms of time and computation and we already have enough data. That scenario, to be honest, is quite a rare one, as even very deep models give their best performance on millions of samples. But in some applications we already have an ocean of raw data, and we want our samples to be as representative as they can be. In this case data augmentation makes no sense. On the contrary, it is recommended to build “prototype” samples of the data and train the model on them.

A word of caution

Data augmentation should not be seen as a silver bullet for overfitting. Of course the ability to increase the size of a dataset a thousandfold is highly tempting. Also, if validation is performed on the same augmented data we are training on, the metrics will rise continuously even in a cross-validation setup. You should always remember that the final goal of designing any ML system is to be good in production. With that in mind, excessive use of data augmentation can separate your optimisation environment from the production environment. If your dataset is really small you should first think about the type of model you are going to use. It is pointless to apply a classical heavy neural network to a handful of samples, as its optimizer was not designed for a low-data setup. In such a case I recommend looking towards either simple models (nearest neighbours-based) or modern neural network approaches designed specifically to solve low-data problems (few-shot learning and meta-learning models).

Data Augmentation in Numpy

In the following sections I’ll show simple use cases of image augmentation using only general tools. The three things we need are the NumPy library, matplotlib + seaborn for image visualisation, and SciPy for image rotation (you may use the Pillow library as an alternative).

As a sample image I’ll use the famous Lenna picture, a classic image from image processing papers since the 1970s. The picture is RGB and has a size of 100x100 in the following examples.

Original Lenna

First, let’s import the packages and have a look at the color distribution of Lenna.
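
A minimal version of that setup could look like the following sketch; the file name lenna.png and the use of plt.imread and seaborn’s kdeplot are illustrative assumptions:

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import ndimage

img = plt.imread('lenna.png')     # RGBA image, float values in [0, 1]
print(img.shape)                  # (100, 100, 4)

# Plot the distribution of each colour channel.
for channel, colour in zip(range(3), ['red', 'green', 'blue']):
    sns.kdeplot(img[:, :, channel].ravel(), color=colour, label=colour)
plt.legend()
plt.show()
```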

Note that the loaded image has 4 channels (3 for colours and one for transparency) and the values in each channel lie in the range [0, 1]. The next couple of functions are for simpler image display in a Jupyter notebook and for displaying pictures in a grid.
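
The helpers could be as simple as this sketch (the names show_image and show_grid are mine):

```python
def show_image(image, figsize=(4, 4)):
    """Display a single image without axes."""
    plt.figure(figsize=figsize)
    plt.imshow(image)
    plt.axis('off')
    plt.show()

def show_grid(images, rows, cols, figsize=(10, 10)):
    """Display a list of images in a rows x cols grid."""
    fig, axes = plt.subplots(rows, cols, figsize=figsize)
    for ax, image in zip(axes.ravel(), images):
        ax.imshow(image)
        ax.axis('off')
    plt.show()
```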

Translations

A translation is a simple shift of the picture in some direction. Let’s create a function that accepts the desired direction of movement, the amount of pixels to shift by, and the behaviour of the patch that is left empty after the image has been shifted. I prefer to roll the patch that disappears off one edge to the other side of the image.
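
A sketch of such a function, using np.roll so the patch that leaves one edge reappears on the other; the signature is my own interpretation of the description above:

```python
def translate(image, direction='right', shift=10, roll=True):
    """Shift the image by `shift` pixels in the given direction.

    If roll=False the vacated strip is zeroed out instead of being
    filled with the wrapped-around content.
    """
    axis = 1 if direction in ('left', 'right') else 0
    sign = 1 if direction in ('right', 'down') else -1
    shifted = np.roll(image, sign * shift, axis=axis)
    if not roll:
        if direction == 'right':
            shifted[:, :shift] = 0
        elif direction == 'left':
            shifted[:, -shift:] = 0
        elif direction == 'down':
            shifted[:shift, :] = 0
        elif direction == 'up':
            shifted[-shift:, :] = 0
    return shifted
```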

A similar technique to translate an image is to crop a random patch from it and then resize the patch to the desired format. As a result you can obtain several slightly different random patches from the same image. The following snippet does exactly that.
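
A sketch of the crop-and-resize idea, using scipy.ndimage.zoom to bring the patch back to the original size (Pillow’s resize would work just as well):

```python
def random_crop_resize(image, crop_size=80):
    """Crop a random crop_size x crop_size patch and zoom it back up."""
    h, w = image.shape[:2]
    top = np.random.randint(0, h - crop_size)
    left = np.random.randint(0, w - crop_size)
    patch = image[top:top + crop_size, left:left + crop_size]
    # Zoom the spatial dimensions back to (h, w), keep the channels untouched.
    factors = (h / crop_size, w / crop_size, 1)
    return np.clip(ndimage.zoom(patch, factors, order=1), 0, 1)
```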

Rotations

A quite effective way to augment an image is to rotate it by a random number of degrees. A small detail to keep in mind is that we have to replace the “empty” space in the corners with some content to make the image look more natural. In the following snippet I simply filled it with the mean of the colours from a corner patch.
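
A sketch with scipy.ndimage.rotate; marking the empty corners with a sentinel value and filling them with the mean colour of a small corner patch is my approximation of the idea:

```python
def rotate(image, angle=None, corner_size=10):
    """Rotate the image by a (random) angle and fill the empty corners."""
    if angle is None:
        angle = np.random.uniform(-30, 30)
    # Mean colour of the top-left corner patch, one value per channel.
    fill = image[:corner_size, :corner_size].mean(axis=(0, 1))
    # cval=-1 marks pixels that came from outside the original image.
    rotated = ndimage.rotate(image, angle, reshape=False, order=1, cval=-1)
    empty = (rotated < 0).any(axis=-1)
    rotated[empty] = fill
    return np.clip(rotated, 0, 1)
```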

Random Noise

After applying translations and rotations it is helpful to add additional randomness to the augmented images by applying Gaussian noise. We can use np.random.normal as a simple way to change our sample.
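
A sketch of it; sigma is a parameter you would tune, and here the noise is added to the colour channels only so the alpha channel stays intact:

```python
def gaussian_noise(image, sigma=0.03):
    """Add zero-mean Gaussian noise to the colour channels."""
    noisy = image.copy()
    noise = np.random.normal(0, sigma, size=noisy[..., :3].shape)
    noisy[..., :3] = np.clip(noisy[..., :3] + noise, 0, 1)
    return noisy
```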

Distortions

Another interesting way to change the original sample is to distort it somehow. As a simple example we can apply a continuous shift of the rows or columns of the image guided by trigonometric functions (cosine or sine). The resulting image will be “wavy” in the horizontal or vertical direction. By tuning the function parameters we can achieve the required distortion strength and produce a different image with the same content.
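
A sketch of such a distortion: every row (or column) is shifted by a sine of its index; amplitude and period are the parameters to tune:

```python
def wave_distortion(image, amplitude=5, period=30, vertical=False):
    """Shift rows (or columns) by a sine wave to produce a 'wavy' image."""
    distorted = image.copy()
    n = image.shape[1] if vertical else image.shape[0]
    for i in range(n):
        shift = int(amplitude * np.sin(2 * np.pi * i / period))
        if vertical:   # shift the i-th column up/down
            distorted[:, i] = np.roll(image[:, i], shift, axis=0)
        else:          # shift the i-th row left/right
            distorted[i, :] = np.roll(image[i, :], shift, axis=0)
    return distorted
```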

Color Channels

The last snippet concerns changing a single image channel to produce a slightly different color theme from the original one. The simplest thing to do is just to multiply some channel by a given ratio.
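
For example (the channel index and ratio here are arbitrary):

```python
def scale_channel(image, channel=0, ratio=0.8):
    """Multiply one colour channel by a constant ratio."""
    scaled = image.copy()
    scaled[..., channel] = np.clip(scaled[..., channel] * ratio, 0, 1)
    return scaled
```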

A more sophisticated way is to change the channel values with some random process.
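
For instance, each colour channel can get its own random gain and bias drawn once per call; the normal-distribution parameters below are just a starting point:

```python
def random_channel_shift(image, sigma=0.1):
    """Apply a random gain and bias to every colour channel."""
    shifted = image.copy()
    for channel in range(3):
        gain = np.random.normal(1.0, sigma)
        bias = np.random.normal(0.0, sigma / 2)
        shifted[..., channel] = np.clip(shifted[..., channel] * gain + bias, 0, 1)
    return shifted
```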

The basic modifications above can come in handy in the process of training machine learning models. The notebook containing the snippets can be found on GitHub. As a recent personal use case I can give an example of augmenting the dataset of flowers from the Visual Geometry Group at Oxford (Flowers). The original dataset contained 8k images. However, the model that I wanted to train (DCGAN) usually requires a lot of data. I translated each image several times, then rotated each translation by 6 angles and put some Gaussian noise on top. As a result, the dataset size increased to 172k and the model showed acceptable results.

Results of DCGAN based on augmented data

Thanks for reading, and I hope you can apply the aforementioned transformations in your own cases!
