A surprisingly simple way to make DNNs robust against many types of image corruptions

Evgenia Rusak
Bethgelab
Jan 31, 2020

In this post, we describe how a very simple training strategy, data augmentation with Gaussian noise, can lead to surprisingly high robustness against common image corruptions. This result contrasts with previous work that reported little to no generalization from Gaussian noise to other corruptions. We show, however, that generalization crucially relies on a well-tuned variance of the Gaussian noise: choosing the variance too small or too large breaks generalization and explains the negative results of previous work. We further explore an adversarially trained noise model to increase robustness against common image corruptions even further. See our preprint [1] for all results.

A neural network predicts the wrong class when snow is added to a query image. We humans can still clearly recognize the snow leopard. The snow corruption was generated with the Python package imagecorruptions.
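As an aside, corruptions like the snow in this example can be reproduced with the imagecorruptions package. A minimal sketch (the file names are placeholders; severities range from 1 to 5):

```python
import numpy as np
from PIL import Image
from imagecorruptions import corrupt

# load an image as an H x W x 3 uint8 array
image = np.asarray(Image.open('snow_leopard.jpg').convert('RGB'))

# apply the 'snow' corruption at severity 3
corrupted = corrupt(image, corruption_name='snow', severity=3)

Image.fromarray(np.uint8(corrupted)).save('snow_leopard_snow.jpg')
```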

Deep Learning is a tremendous success story: Deep Neural Networks have achieved breakthroughs in Computer Vision, Speech Recognition, and even complex games such as Go and Dota 2. While neural networks seem able to surpass humans in some tasks, they still struggle when the data distribution changes between training and testing. For example, if the training dataset does not contain images with snow, the network will not have learned how to deal with it and might mistake a snow leopard for an alligator. In the real world, this is the reason why autonomous cars still can’t deal with bad weather. This lack of generalization beyond the conditions seen during training is a huge issue, since training datasets are limited in size and variety.

On the success of data augmentation with Gaussian noise

Facing the complex problem of model robustness to image corruptions, we are interested in seeing how far we can get with possibly the simplest baseline: data augmentation with additive Gaussian noise. We use the recently published Common Corruptions benchmark (ImageNet-C) [2] for evaluation. The benchmark contains 15 different image corruptions with 5 severity levels each; the corruptions are organized into four categories: Noise, Blur, Weather, and Digital. There is also a small holdout set with one corruption per category. We find that carefully scaled Gaussian noise works surprisingly well to increase the robustness of a ResNet50 architecture, the workhorse of Deep Learning. Gaussian noise has two parameters: the mean μ and the standard deviation σ. We set μ=0 and treat σ as a hyper-parameter. We observe that for a good choice of σ, data augmentation with Gaussian noise not only strongly increases robustness on the Noise category of ImageNet-C, but also generalizes to the other three categories. One can also tune σ on the holdout corruptions and find the same optimal value.
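For illustration, here is a minimal sketch of such an augmentation as a PyTorch transform for image tensors in [0, 1]; the clamping and the exact placement in the training pipeline are assumptions of this sketch rather than details from the paper:

```python
import torch

class AddGaussianNoise:
    """Additive Gaussian noise augmentation with a fixed standard deviation.

    Expects image tensors scaled to [0, 1], e.g. after transforms.ToTensor().
    Clamping back to the valid range is an implementation choice of this sketch.
    """

    def __init__(self, sigma=0.5):
        self.sigma = sigma  # a well-tuned sigma; sigma = 0.5 works well in our experiments

    def __call__(self, img):
        noise = torch.randn_like(img) * self.sigma
        return torch.clamp(img + noise, 0.0, 1.0)
```

The transform can simply be appended to the usual training transforms, e.g. transforms.Compose([..., transforms.ToTensor(), AddGaussianNoise(sigma=0.5)]).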

Training the classifier on data augmented with additive Gaussian noise can strongly increase the accuracy on ImageNet-C, even when the Noise corruptions are excluded from the evaluation, if the standard deviation σ is chosen well.

Previously, there have been several attempts to use noise-based data augmentation to improve corruption robustness, with either small [3, 4] or no success [5]. Lopes et al. [3] use Gaussian noise augmentation like us, but they pick a very small σ regime in which our approach also fails to produce strong robustness improvements (they sample σ between 0 and 0.1). Ford et al. [4] also use Gaussian data augmentation and observe only a small increase in corruption robustness. They train their model from scratch, whereas we only fine-tune a pretrained one. Like Lopes et al., they sample σ uniformly between 0 and one specific value for each image; in our experiments, we find that setting σ to a specific value instead of sampling it from an interval works better (see the sketch below). Since the approach of Ford et al. is conceptually very similar to ours, we have included a detailed comparison between their method and ours in our paper. Geirhos et al. [5] train their model against a fixed set of corruptions but find no generalized robustness against unseen corruptions. For example, their model trained with uniform-noise data augmentation performs at chance level when tested against Salt-and-Pepper noise. The reason for this apparent discrepancy is the vastly higher standard deviations that Geirhos et al. use in their work, both during training and testing. In that sense, the ImageNet-C corruptions are much more benign than their setting.
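The difference between the two σ schedules boils down to a single line; a paraphrase, with the concrete numbers taken from the text above:

```python
import random

# Lopes et al. / Ford et al.: draw a fresh sigma per image, uniformly from [0, sigma_max]
sigma = random.uniform(0.0, 0.1)  # e.g. sigma_max = 0.1 as in Lopes et al.

# ours: one fixed, well-tuned sigma for every image
sigma = 0.5
```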

Adversarial Noise

Building on the success of our simple Gaussian noise-based strategy, we wonder: what if adding Gaussian noise to images is the simplest, but not the best, strategy for data augmentation? Thus, as our next step, we learn the shape of the noise distribution adversarially and create the Noise Generator: a simple generative neural network, trained with backpropagation, that learns a per-pixel i.i.d. noise distribution. It is trained with the goal of producing a noise distribution that most successfully confuses a given classification model (for a fixed perturbation size). We call this worst-case noise Adversarial Noise.
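The paper spells out the exact architecture and objective; the sketch below only illustrates the idea of a learned, per-pixel i.i.d. noise distribution with a fixed perturbation size ε. The 1x1-convolution architecture and the per-image L2 norm are assumptions of this sketch, not necessarily the paper's choices:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoiseGenerator(nn.Module):
    """Sketch of a learned per-pixel i.i.d. noise distribution.

    Every pixel's noise is produced independently by the same small pointwise
    (1x1-convolution) network applied to an i.i.d. random seed; the result is
    rescaled to a fixed perturbation size eps (per-image L2 norm here).
    """

    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, hidden, kernel_size=1), nn.ReLU(),
            nn.Conv2d(hidden, 3, kernel_size=1),
        )

    def forward(self, batch_shape, eps, device):
        z = torch.rand(batch_shape, device=device) * 2 - 1  # i.i.d. seed noise in [-1, 1]
        delta = self.net(z).flatten(1)
        delta = eps * delta / (delta.norm(dim=1, keepdim=True) + 1e-12)
        return delta.view(batch_shape)


def generator_step(generator, classifier, images, labels, eps, opt_g):
    """Update the generator so that its noise confuses the classifier more."""
    delta = generator(images.shape, eps, images.device)
    logits = classifier(torch.clamp(images + delta, 0.0, 1.0))
    loss = -F.cross_entropy(logits, labels)  # gradient ascent on the classifier loss
    opt_g.zero_grad()
    loss.backward()
    opt_g.step()
```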

We use a Noise Generator to produce Adversarial Noise

Game of Noise

We can now retrain our vanilla classifier on images augmented with samples from the learned Adversarial Noise distribution. As a result, the classifier becomes robust against this noise type, but it may be vulnerable to a new noise distribution: a new Adversarial Noise pattern can be found against the retrained model. In a GAN-like fashion, we therefore play the Game of Noise and train the classifier and the Noise Generator simultaneously. This way, the Noise Generator must keep finding new corner cases to fool the classifier, while the classifier must adjust its features to account for the new quirks of the Noise Generator. After playing the Game of Noise for many epochs, the Noise Generator has a hard time finding new attacks to surprise the seasoned and sophisticated classifier.
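A minimal sketch of this joint training, building on the NoiseGenerator sketched above; the update schedule and the handling of clean images are simplified here, and the paper has the exact recipe:

```python
import torch
import torch.nn.functional as F

def game_of_noise_epoch(classifier, generator, loader, opt_c, opt_g, eps):
    """Alternate generator and classifier updates on each batch (simplified)."""
    for images, labels in loader:
        # 1) generator step: search for noise that fools the current classifier
        delta = generator(images.shape, eps, images.device)
        logits = classifier(torch.clamp(images + delta, 0.0, 1.0))
        loss_g = -F.cross_entropy(logits, labels)  # generator maximizes the loss
        opt_g.zero_grad()
        loss_g.backward()
        opt_g.step()

        # 2) classifier step: train on images augmented with the current noise
        with torch.no_grad():
            delta = generator(images.shape, eps, images.device)
        logits = classifier(torch.clamp(images + delta, 0.0, 1.0))
        loss_c = F.cross_entropy(logits, labels)  # classifier minimizes the loss
        opt_c.zero_grad()
        loss_c.backward()
        opt_c.step()
```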

In the course of the Game of Noise, the classifier is exposed to many different Adversarial Noise patterns.

We find that as the classifier learns to defend itself against Adversarial Noise, it simultaneously becomes more robust against most image corruptions. In fact, our results beat the previous state of the art on ImageNet-C for a ResNet50 architecture. Combining the Game of Noise with training on Stylized ImageNet [7] enables us to boost the performance on image corruptions even further.

Our models trained with Gaussian data augmentation (GNTσ0.5) and by playing the Game of Noise (ANT, ANT+SIN) achieve state-of-the-art accuracy on ImageNet-C.

What have we learned from this?

It is still a difficult problem to make neural networks generalize to scenarios they have not seen during training. Humans suffer much less from this problem: we do not need to be trained on a multitude of distortions from the set of all possible noise patterns to still be able to recognize objects; we are simply very good at it. How to train similarly robust and powerful neural networks remains an open question. Adversarial training is currently the most successful strategy to increase the robustness of classifiers to input modifications, at least when it comes to defending against regular and universal adversarial attacks.

When thinking about how to increase robustness against image corruptions, we first tried the simplest strategy: data augmentation with Gaussian noise. We were surprised to discover that this simple method, when trained until convergence, already beats many much more sophisticated approaches. Motivated by these strong results, we hypothesized that the Gaussian distribution might not be the best one to sample noise from and devised the Noise Generator. Its task was to capture many possible distortion types and to evolve alongside the robustified classifier. Playing the Game of Noise gave us promising robustness improvements against natural image corruptions, going part of the way towards human-like robustness in machines.

Weights of the trained models are available here.

References

[1] E. Rusak, L. Schott, R. S. Zimmermann, J. Bitterwolf, O. Bringmann, M. Bethge, W. Brendel. Increasing the robustness of DNNs against image corruptions by playing the Game of Noise

[2] D. Hendrycks, T. Dietterich. Benchmarking neural network robustness to common corruptions and perturbations, ICLR (2019)

[3] R. G. Lopes, D. Yin, B. Poole, J. Gilmer, E. D. Cubuk. Improving robustness without sacrificing accuracy with Patch Gaussian augmentation, arXiv preprint (2019)

[4] N. Ford, J. Gilmer, N. Carlini, D. Cubuk. Adversarial examples are a natural consequence of test error in noise, ICML (2019)

[5] R. Geirhos, C. R. M. Temme, J. Rauber, H. H. Schütt, M. Bethge, F. A. Wichmann. Generalisation in humans and deep neural networks, NeurIPS (2018)

[6] R. Zhang. Making convolutional networks shift-invariant again, ICML (2019)

[7] R. Geirhos, P. Rubisch, C. Michaelis, M. Bethge, F. A. Wichmann, W. Brendel. ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness, ICLR (2019)
