What are GANs?

Vidyashree K S
8 min read · Sep 23, 2021


“You can have data without information, but you cannot have information without data.” ~ Daniel Keys Moran

GAN stands for Generative Adversarial Network. “GAN” might sound complex, but it actually isn’t. Ian Goodfellow et al. published “Generative Adversarial Networks” in 2014, the first study to describe GANs. Since then, GANs have received a lot of attention, as they are perhaps one of the most effective techniques for generating large, high-quality synthetic images.

What exactly does the term “generative” signify in the name “Generative Adversarial Network”? “Generative” describes a class of statistical models that contrasts with discriminative models. Informally, generative models can generate new data instances, while discriminative models discriminate between different kinds of data instances.

Generative adversarial networks (GANs) are an exciting recent innovation in machine learning. GANs are generative models: they create new data instances that resemble your training data. For example, GANs can create images that look like photographs of human faces, even though the faces don’t belong to any real person.

A visual depiction of the battle between the generator and the discriminator

Let me give you some insight into the revolutionary architectures that have evolved and added new variations over the years. There are lots of different generative architectures, such as:

Timeline of GAN variants (image source)

However, we’ll focus on GANs, which have been the most popular and successful thus far.

Generator and Discriminator:

Fig. 1: GAN network

This is literally a battle. On one hand, the generator takes random noise as input and tries to invent a fake image, or fake data, that is as similar as possible to the original training data. On the other hand, the discriminator receives two types of input: data originating from the real dataset (real x) and data generated by the generator (fake x).

Now the discriminator has to predict whether the images/data it receives are real or fake, thereby producing two types of losses.

What are the loss values, then? A loss value is the result of a mathematical function that evaluates how far we are from the perfect result of the network; it measures the difference between our ideal result and our current result. Here we have two losses, one for the generator and one for the discriminator, as they each have a different objective.

The reason is that the generator tries to fool the discriminator: it tries to convince the discriminator that the fake image is real. Conversely, the discriminator has a different objective: it always tries its best to predict fake as fake and real as real. So they have different objectives, and different loss values as well.

GANs Training

The Math behind GANs

Training a GAN involves training both the generator and the discriminator. So how do we train the generator?

So, let’s begin with z (fig. 1), the noise vector, a mathematical vector made up of random numbers. We give it as input to the generator, and the generator produces an output that we call fake x, because it is a fake image: it does not belong to the original, real training data, and it is used to deceive the discriminator. Fake x is then passed as input to the discriminator, and the discriminator produces an output Y.
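To make that pipeline concrete, here is a minimal sketch in PyTorch (my framework choice, not the article’s; the layer widths and the sizes noise_dim = 100 and data_dim = 784 are illustrative assumptions):

```python
import torch
import torch.nn as nn

noise_dim, data_dim = 100, 784  # e.g. flattened 28x28 images (assumed sizes)

# Generator: maps a random noise vector z to a fake sample ("fake x").
generator = nn.Sequential(
    nn.Linear(noise_dim, 256),
    nn.ReLU(),
    nn.Linear(256, data_dim),
    nn.Tanh(),  # outputs scaled to [-1, 1]
)

# Discriminator: maps a sample to a single score Y in (0, 1),
# where 1 means "real" and 0 means "fake".
discriminator = nn.Sequential(
    nn.Linear(data_dim, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
    nn.Sigmoid(),
)

z = torch.randn(16, noise_dim)  # a batch of noise vectors
fake_x = generator(z)           # fake samples
y = discriminator(fake_x)       # the discriminator's output Y
```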

Now we use a mathematical function to calculate the loss value, a value that expresses how good the generator’s performance is: how close are the generator’s results to the ideal results we want to obtain?

Well, remember that the generator wants to fool the discriminator: it wants to convince the discriminator that the fake x it emits is real. Here the value 1 represents real image/data and 0 fake. So when we pass fake x to the discriminator as input, we obtain some decimal number between 0 and 1, say 0.3, 0.7, or 0.9; that is the value of Y, the output of the discriminator. Next we feed Y into the mathematical loss function, which compares two things: the output Y and the real label (1). Once the loss value is calculated, it is used by the chain rule and the backpropagation algorithm, which in turn update the internal parameters of the generator network. This is repeated over and over; the difference between the output of the discriminator and the real label becomes smaller and smaller, and the performance of the model gradually increases.
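Here is a hedged sketch of that single generator update in PyTorch; the one-layer networks and the learning rate are stand-in assumptions, but the logic (compare Y with the real label 1, then backpropagate and step the generator only) follows the text:

```python
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(100, 784), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(784, 1), nn.Sigmoid())
bce = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)

z = torch.randn(16, 100)          # noise vectors
fake_x = generator(z)             # fake samples
y = discriminator(fake_x)         # discriminator output Y

real_label = torch.ones_like(y)   # the generator's target is "real" (1)
g_loss = bce(y, real_label)       # how far Y is from 1

g_opt.zero_grad()
g_loss.backward()                 # backpropagate through D into G
g_opt.step()                      # but step only the generator's parameters
```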

So that is the training of the generator. Now what about the training of the discriminator? For the discriminator, we again take a noise vector z and pass it to the generator, and again the generator produces fake samples. Along with the generated samples, we also take real samples from the original dataset, so the discriminator receives a set of both fake and real images. The outputs (fake Y and real Y) could again be 0.9, 0.7, 0.4, whatever; we then pass them to the mathematical loss function to calculate the loss.

Further, we compare the output for the fake inputs with zero and the output for the real inputs with one (fake Y vs. the fake label and real Y vs. the real label). Why? Because the discriminator wants to be able to predict that the fake inputs are fake, therefore zero, and that the real inputs are real, therefore one. Once the loss is calculated, we use backpropagation to update the parameters of the discriminator only.
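A matching sketch of one discriminator update; the random real_x tensor below is just a stand-in for a batch from a real dataset:

```python
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(100, 784), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(784, 1), nn.Sigmoid())
bce = nn.BCELoss()
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

real_x = torch.randn(16, 784)     # stand-in for real training samples
z = torch.randn(16, 100)
fake_x = generator(z).detach()    # detach: don't backpropagate into G here

real_y = discriminator(real_x)    # compared against the real label (1)
fake_y = discriminator(fake_x)    # compared against the fake label (0)

d_loss = bce(real_y, torch.ones_like(real_y)) + \
         bce(fake_y, torch.zeros_like(fake_y))

d_opt.zero_grad()
d_loss.backward()
d_opt.step()                      # update the discriminator's parameters only
```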

So how are the loss, or performance, values of the discriminator and the generator calculated?

In order to calculate the loss we use something called cross-entropy, which is a measure of the difference between two probability distributions. Let’s begin with entropy, which is the number of bits required to encode a randomly selected event from a probability distribution. Entropy is represented as:

H(X) = −Σᵢ p(xᵢ) · log_b(p(xᵢ))

where X is a random variable that has xᵢ possible outcomes, p is the probability, and b is the logarithmic base value.
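A tiny numeric check of this formula, with base b = 2 and a made-up distribution:

```python
import math

p = [0.5, 0.25, 0.25]  # a made-up probability distribution
entropy = -sum(p_i * math.log2(p_i) for p_i in p)
print(entropy)         # 1.5 bits
```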

Cross-entropy is called “cross” because we compare two different distributions, and entropy is employed to compare them. Here goes the equation of cross-entropy:

H(y, ŷ) = −Σᵢ yᵢ · log(ŷᵢ)

Cross-entropy loss measures the dissimilarity between the true label distribution y and the predicted label distribution ŷ.
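A small worked example with an assumed one-hot true distribution y and a made-up prediction ŷ (natural log here; the base only rescales the value):

```python
import math

y     = [1.0, 0.0, 0.0]  # true (one-hot) label distribution
y_hat = [0.7, 0.2, 0.1]  # predicted distribution

# Terms with y_i = 0 contribute nothing, so only -log(0.7) remains.
ce = -sum(y_i * math.log(yh_i) for y_i, yh_i in zip(y, y_hat) if y_i > 0)
print(ce)                # ~0.357
```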

Discriminator Loss:

Here we use binary cross-entropy (BCE) to calculate the loss. Why BCE? If you are training a binary classifier, chances are you are using binary cross-entropy as your loss function, and of course our discriminator does exactly that. Because the discriminator wants to predict real as real and fake as fake, the equation has two parts. With y the target probability/label and ŷ the prediction, i.e. the output of the discriminator:

BCE(y, ŷ) = −[y · log(ŷ) + (1 − y) · log(1 − ŷ)]

One part of the equation, log(ŷ), focuses on real data (y = 1) and the other, log(1 − ŷ), on fake data (y = 0). Combining both gives the final minmax equation, which averages, over all samples, the logarithm of the output of the discriminator, log(D(x)), and the logarithm of one minus the discriminator’s output on generated data, log(1 − D(G(z))):

min_G max_D V(D, G) = E_x[log(D(x))] + E_z[log(1 − D(G(z)))]

log(D(x)) is calculated w.r.t. the training data samples, and log(1 − D(G(z))) w.r.t. the samples the generator produces from z. Ideally, the output of the discriminator should be 1 when the data comes from the training data (real samples) and 0 when the data comes from the generator, so maximizing the minmax equation over the discriminator makes sense: it leads to the discriminator performing optimally.
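To see the two parts of BCE in action, here is a small numeric sketch with made-up prediction values:

```python
import math

def bce(y, y_hat):
    """Binary cross-entropy for a single sample."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

print(bce(1, 0.9))  # real sample, confidently "real" -> small loss (~0.105)
print(bce(0, 0.9))  # fake sample, wrongly "real"     -> large loss (~2.303)
print(bce(0, 0.1))  # fake sample, correctly "fake"   -> small loss (~0.105)
```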

This can be called a minmax game. The discriminator wants the first part of the equation to output one, because we want the real images to be predicted as real, so it pushes ŷ toward 1 for real samples; and it wants the second part to be predicted as 0, because we want the fake images to be predicted as fake, so it pushes ŷ toward 0 for fake samples. The generator, on the other hand, wants to maximize D(G(z)) in order to maximize the error of the discriminator. So, it’s a battle!! And from this battle we get a beautifully trained model.

Generator Loss:

Calculating the generator loss is much easier than calculating the discriminator’s, because in this case we don’t have to deal with both real and fake data; we deal only with fake data. The generator deals with only one label, real, as it tries to deceive the discriminator into believing that the fake data is real. Consequently, the second part of the BCE equation is mapped to 0 (with y = 1, the (1 − y) term vanishes) and the whole equation is just the logarithm of the prediction. The final equation is the average, over all samples, of the logarithm of the discriminator’s output applied to the generator’s output, log(D(G(z))). So, ideally, the generator tries to push D(G(z)) close to 1 rather than close to 0, in order to maximize the discriminator network’s error. When real and fake data are fed into the discriminator, if the discriminator correctly classifies the generator’s output as close to 0, the generator loss becomes a very large negative number, showing that the discriminator is classifying the data correctly. However, when the discriminator fails and scores the generator’s output as belonging to class 1, log(D(G(z))) approaches 0, increasing the discriminator’s loss while decreasing the generator’s.
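A quick numeric illustration of how log(D(G(z))) behaves for a few made-up discriminator scores; as an aside, the original GAN paper notes that in practice the generator can instead minimize -log(D(G(z))), which gives stronger gradients early in training:

```python
import math

for d_of_g_z in (0.05, 0.5, 0.95):       # discriminator's score on a fake sample
    print(d_of_g_z, math.log(d_of_g_z))  # -3.00, -0.69, -0.05: the generator
                                         # wants this value as high as possible
```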

Mind-Blowing Applications of GANs:

  1. Data augmentation.
  2. Image-to-image and text-to-image translation.
  3. Face swapping.
  4. Video prediction.
  5. Privacy preservation.
  6. Domain adaptation.
  7. Drug discovery.

Conclusion

If all these processes are understood in detail, it will be much easier in the future to understand other variations of generative structures and other architectures as well. The mathematical calculations involved are the most important part of these processes: the calculation of the loss value, the measure of performance, drives the backpropagation algorithm, which changes the parameters of the networks. So the most important things are the way the loss is calculated, the way the network is specified, and the way the network is told how to tweak its parameters through backpropagation.

P.S: Please leave a comment if you find any mistakes/inaccuracies!


Vidyashree K S

Junior year undergraduate student, majoring in computer science and engineering.