Adversarial Training

Animesh Seemendra
DSC JSS Noida
Apr 19, 2020 · 3 min read

In recent years, machine learning has gained tremendous popularity. The number of research papers published per year has increased drastically, bringing with it various interesting techniques and algorithms that make the field more and more capable. Adversarial Training is one such technique: it makes your model more robust and better at generalisation. It’s all about good training, you know ;)

In 2015, Ian J. Goodfellow, Jonathon Shlens and Christian Szegedy published a paper titled “Explaining and Harnessing Adversarial Examples”. This paper showed how vulnerable machine learning models are and also how that vulnerability can be reduced.

In 2014, Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow and Rob Fergus made an interesting discovery. In their paper “Intriguing Properties of Neural Networks”, they explained how easily machine learning models can be fooled: add a little noise to an image, invisible to the naked eye, and the model will misclassify it. Ian J. Goodfellow also mentioned in the 2015 paper that a wide variety of models with different architectures, trained on different subsets of the training data, misclassify the same slightly perturbed image. This shows that our training method has a blind spot.

Finding it difficult to understand?

Let me make this clear to you with an example published by Ian Goodfellow in his 2015 paper.

Here is our first image, taken from an image dataset.
Here is our second image.

Can you tell the difference between the two? Try it.

I don’t think you can. But what if I told you they are not the same image? Yes, you heard me right. Slight noise was added to the first image to create the second. When the first image was submitted to GoogLeNet (trained on ImageNet), the model predicted it is a panda with a 57.7% score, and when the second image was submitted, the model predicted it is a gibbon, and that too with 99.3% confidence.

I know right.

Here is the complete figure from the paper:

Adversarial Example
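How is that noise crafted? In the 2015 paper, the perturbation comes from the “fast gradient sign method”: take the sign of the gradient of the loss with respect to the input pixels and add a tiny multiple of it to the image. Here is a minimal PyTorch-style sketch of the idea (the function name, the 0.007 step size and the use of cross-entropy here are illustrative assumptions, not the paper’s exact code):

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, image, label, epsilon=0.007):
    """Fast gradient sign method: nudge every pixel a tiny step (epsilon)
    in the direction that increases the model's loss the most."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)  # loss for the true label
    loss.backward()                              # gradient w.r.t. the pixels
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0, 1).detach()        # keep pixels in [0, 1]
```

The perturbation is tiny per pixel, which is why the two images look identical to us while the model’s prediction changes completely.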

This proves that even our highly trained models can be fooled easily and that they are not learning the true underlying concepts. Hence it is important for us to build models that can’t be fooled so easily, generalise better and are more robust. That is where Adversarial Training comes in.

One approach is to generate adversarial examples for our model during training and, since we know the labels of all our data, keep the same labels for the adversarial examples. This way the model gets trained on the noisy images too, is pushed to learn the underlying concepts, and attains better generalisation. A rough sketch of this idea in code follows below.
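As a sketch of what one training step might look like (again PyTorch-style; the 50/50 weighting between the clean and adversarial loss, the ε value and the function name are illustrative assumptions, not a prescription from the papers):

```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, images, labels,
                              epsilon=0.007, alpha=0.5):
    """One training step: train on the clean batch and on an adversarially
    perturbed copy of it, using the same (true) labels for both."""
    # Craft adversarial copies of the batch with the gradient-sign trick.
    images_adv = images.clone().detach().requires_grad_(True)
    F.cross_entropy(model(images_adv), labels).backward()
    images_adv = (images_adv + epsilon * images_adv.grad.sign()).clamp(0, 1).detach()

    # Mix the loss on the clean images with the loss on the adversarial ones.
    optimizer.zero_grad()
    loss = (alpha * F.cross_entropy(model(images), labels)
            + (1 - alpha) * F.cross_entropy(model(images_adv), labels))
    loss.backward()
    optimizer.step()
    return loss.item()
```

Calling this once per batch inside a normal training loop is all it takes: the model sees both the original and the perturbed images with their true labels, so the blind spot gets trained away rather than left untouched.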

The purpose of this blog was to tell ML beginners that yes, there are blind spots in our traditional training methods, and to show how we are trying to overcome them. It gives only a basic idea of adversarial training, apart from the small code sketches above, without going deep into the maths. I would highly recommend reading the two papers mentioned in the references.

References

[1] Ian J. Goodfellow, Jonathon Shlens, Christian Szegedy, “Explaining and Harnessing Adversarial Examples”, 2015. https://arxiv.org/pdf/1412.6572.pdf
[2] Christian Szegedy et al., “Intriguing Properties of Neural Networks”, 2014. https://arxiv.org/pdf/1312.6199.pdf

If you liked this blog, do follow DSC JSSATEN and Animesh Seemendra for more. If you have any suggestions, feel free to contact us at admin@dscjss.in. You can also connect with us on our Instagram, Twitter or Facebook page.
