Ideas Behind GANs

Salih Talha Akgün
Published in CARBON CONSULTING · Mar 23, 2021

GANs are neural network architectures that can generate new images matching the statistics of the images we already have.

Generative = relating to or capable of production

Adversarial = characterized by conflict or opposition

To understand GANs, we first have to understand the two types of neural networks a GAN is built from. A GAN learns from the conflict between two neural networks, called the Discriminator and the Generator.

The Discriminator’s purpose in a GAN is to decide whether an image is real or fake. It looks at the image (looks = extracts information from the image data, typically with the help of CNNs) and outputs 1 if it thinks the data is real and 0 otherwise.

Discriminator = characteristic which enables people or things to be distinguished from one another (outputs 0 or 1)

Generator = a person or thing that generates something (outputs new image)

As you can tell from the name, the Generator generates new images. These are the purposes of the Discriminator and the Generator.

You can think of the Generator as an inexperienced artist (who creates images) and the Discriminator as an inexperienced art buyer (who tells whether an image is real or fake). As they learn and build experience, they become better at their jobs.

At the end of training, we put the Discriminator aside and use the Generator (our now-experienced artist) to accomplish our task: generating new images.
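To make the two roles concrete, here is a minimal sketch of the pair in PyTorch (my choice of framework; the article doesn’t specify one). The layer sizes, the 100-dimensional noise vector, and the use of simple fully connected layers instead of the CNNs mentioned above are all simplifying assumptions for illustration.

```python
import torch
import torch.nn as nn

LATENT_DIM = 100   # size of the random input z (assumed)
IMG_DIM = 28 * 28  # flattened 28x28 MNIST image (assumed)

class Generator(nn.Module):
    """The inexperienced artist: maps random noise z to a fake image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 256), nn.ReLU(),
            nn.Linear(256, IMG_DIM), nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """The art buyer: maps an image to one probability, 1 = real, 0 = fake."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(IMG_DIM, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)
```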

How Do GANs Work?

I’ll assume you’ve understood these two types of neural networks and move on to how GANs work.

Sidenote: please don’t worry about why these networks work for now; just try to understand how they work. I’ll explain why the system works at the end.

Figure: Generator outputs over training epochs on MNIST (Grokking Deep Learning for Computer Vision, Chapter 8, Page 90, Livebook)

These photos are the outputs of a GAN’s Generator trained on the MNIST dataset. As you can see, at the first epoch the Generator outputs just random noise. This is not a problem, since our Discriminator will also output 0 or 1 randomly at first. The important thing is that both will gradually learn from our real data and race each other to get better. As the race continues, the learning continues. After many epochs, both the Discriminator and the Generator will be better at their jobs. Our goal is to make the Generator good enough to use on its own, taking the Discriminator out of the game.

What I call a race here actually has a precise mathematical definition. As you may have already noticed, it can be formulated as a minimax problem:

$$\min_G \max_D \; V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]$$

Minimax Loss

The loss functions used in GANs are a long topic on their own. Understanding the concept of the race is enough for now; I’ll dig into the losses in a moment.

Why Use a Discriminator?

The real question here is: why do we measure the Generator’s success with the Discriminator’s output instead of comparing against the real data directly?

Adversarial = characterized by conflict or opposition

Because if we used the real data directly, we would simply approximate it, but we want to approximate it while keeping some randomness in the system. That randomness can be added at the start by giving the Discriminator random weights and waiting for it to discover its own ways of telling real data from generated data. This also gives a more meaningful measure of the Generator’s success.

How Do GANs Learn?

Here is the structure of a Generative Adversarial Network:

https://developers.google.com/machine-learning/gan

As you can see, besides the Generator’s and Discriminator’s weights there is another source of randomness in the system: the random input to the Generator. This ensures that the Generator does not always produce the same output, but produces a different output for each random input.
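Continuing the PyTorch sketch above: a common choice (an assumption here, not something the diagram specifies) is to draw that random input from a standard normal distribution.

```python
batch_size = 64                          # assumed batch size
generator = Generator()

z = torch.randn(batch_size, LATENT_DIM)  # fresh noise every time
fake_images = generator(z)               # different z -> different images
```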

The system works like this:

1- The Generator gets a random input and creates a random output (an image).

2- The Discriminator gets two images, one real and one generated by the Generator. The Discriminator’s job is to label them as real (1) or generated (0).

3- The Discriminator Loss and the Generator Loss are calculated from the Discriminator’s output.

If the Discriminator labels the photos correctly, the Generator must learn from its mistakes.

If the Discriminator labels the photos wrongly, the Generator has tricked the Discriminator into believing the generated photo looks real.

The main flow is always the same, but of course there are plenty of different ways to train this system. Normally, people alternate between training the Discriminator and training the Generator, as in the sketch below. If one of them falls behind or runs ahead of the other in terms of loss, the system will not learn; it would be like teaching quantum mechanics to a four-year-old child. The important thing is that these two networks must learn together. If one of them falls behind, you can change the ratio at which you train them.
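Here is a sketch of one alternating training iteration, continuing the PyTorch example above. The 1:1 update ratio, the Adam optimizer, the learning rate, and the small `eps` constant are all my assumptions for illustration, not values from the article.

```python
G, D = Generator(), Discriminator()
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
eps = 1e-8  # keeps log(0) from occurring numerically

def train_step(real_images):  # real_images: (batch, IMG_DIM)
    batch = real_images.size(0)

    # --- Discriminator step: push D(real) -> 1 and D(fake) -> 0 ---
    z = torch.randn(batch, LATENT_DIM)
    fake = G(z).detach()  # don't backprop into G during the D step
    d_loss = -(torch.log(D(real_images) + eps)
               + torch.log(1 - D(fake) + eps)).mean()
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # --- Generator step: push D(G(z)) -> 1 ---
    z = torch.randn(batch, LATENT_DIM)
    g_loss = torch.log(1 - D(G(z)) + eps).mean()
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()

    return d_loss.item(), g_loss.item()
```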

Also note that, as the diagram shows, both networks learn from the discriminator loss and the generator loss via backpropagation.

The system looks good, but how are the discriminator loss and generator loss computed?

Discriminator Loss

Here is the discriminator loss.

$$\frac{1}{m}\sum_{i=1}^{m}\Big[\log D\big(x^{(i)}\big) + \log\Big(1 - D\big(G\big(z^{(i)}\big)\big)\Big)\Big]$$

Discriminator Loss

1- The summation from one to m represents applying this process to all m images in our batch.

2- The left term inside the square brackets represents the Discriminator’s output on a real image. If the Discriminator outputs one for a real image, log(1) will be zero. If it outputs zero for a real image, log(0) goes to minus infinity.

3- The right term represents the Discriminator’s output on a generated image. The z inside the G function represents the noise we feed to the Generator. If the Discriminator outputs one for a generated image, log(1 - 1) goes to minus infinity. If it outputs zero for a generated image, log(1) will be zero.

So if the Discriminator makes mistakes, the result goes to minus infinity, but if all of its predictions are correct, we get zero.

Since this quantity gets smaller as the Discriminator makes mistakes, we train the Discriminator with gradient ascent, as in the snippet below.
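Zooming in on the Discriminator term from the training sketch above: minimizing the negated batch average with a standard optimizer is equivalent to the gradient ascent described in the text. The `eps` constant is my addition for numerical safety, not part of the formula.

```python
d_real = D(real_images)           # want these close to 1
d_fake = D(G(z).detach())         # want these close to 0
objective = (torch.log(d_real + eps)
             + torch.log(1 - d_fake + eps)).mean()  # 0 when D is perfect
d_loss = -objective               # minimizing -objective == ascending objective
```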

Generator Loss

Here is the generator loss.

$$\frac{1}{m}\sum_{i=1}^{m}\log\Big(1 - D\big(G\big(z^{(i)}\big)\big)\Big)$$

Generator Loss

1- The summation from one to m represents applying this process to all m images in our batch.

2- There is only one log term in this equation, and it represents the Discriminator’s prediction on the generated image. If the Discriminator outputs one for the generated image, we get what we want, and log(1 - 1) goes to minus infinity. But if the Discriminator correctly predicts zero, log(1) is zero.

So if the Generator makes mistakes, the result gets closer and closer to zero. If the Generator makes really convincing images, the result gets closer and closer to minus infinity.

Since this loss gets smaller as the Generator performs better, we train the Generator with gradient descent, as in the snippet below.
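The Generator term in isolation, continuing the same sketch. The line below is the loss exactly as in the formula above; as a side note (mine, not the article’s), many practical implementations instead minimize -log(D(G(z))), the so-called non-saturating variant, because it gives stronger gradients early in training.

```python
g_loss = torch.log(1 - D(G(z)) + eps).mean()  # -> -inf as fakes fool D
```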

Why Do GANs Learn?

Answering why any neural network learns something is difficult, because we don’t even know exactly how humans learn. What I mean by this question is: why do GANs learn in this unique way?

This minimax game played between the Discriminator and the Generator is the real thing we need to talk about. In ordinary neural networks, we define our loss with respect to the data and measure the quality of training by the loss values. GANs are different: as I wrote above, we define the loss with respect to the Discriminator’s predictions on real and generated data. This also means we can train a Generator without ever showing it any real data.

Conveniently, the Discriminator starts out predicting randomly because of its random initial weights. So the Discriminator plays a randomizing role for the Generator, and since the Generator has no feedback besides the Discriminator’s output, it has to find ever more creative ways to trick the Discriminator. Bear in mind that since the Discriminator has access to the real data, it learns to discriminate based on that data, which ensures the Generator eventually approximates the real data.

So this minimax game gives us great approximations to real data. Here are some applications built with GANs. Two of my favourites are https://thisartworkdoesnotexist.com/ and https://thisurldoesnotexist.com/. You can look at this website for more: https://thisxdoesnotexist.com/

For more information about GANs, please check these websites:

https://neptune.ai/blog/gan-loss-functions
https://developers.google.com/machine-learning/gan
