Generative Adversarial Network (GAN)

Mesut
Published in Teknopar Akademi
5 min read · Oct 18, 2023

A Generative Adversarial Network (GAN) is a deep learning architecture that consists of two neural networks competing against each other in a zero-sum game framework. The goal of GANs is to generate new, synthetic data that resembles some known data distribution.

Why were GANs developed in the first place?

Most common neural networks can be made to misclassify an input when even a small amount of carefully chosen noise is added to it, and, surprisingly, they often make these errors with high confidence. This happens because many machine learning models are trained on limited data, which makes them prone to overfitting. It might seem as if these models draw a single, straightforward linear boundary between classes, but in reality that boundary is pieced together from many linear components, so even a small change to a point in the input space can cause it to be classified incorrectly.

How do GANs work?

Generative Adversarial Networks (GANs) consist of three parts:

· Generative: A generative model is learned, describing how the data is generated in terms of a probabilistic model.

· Adversarial: The model is trained in an adversarial setting, pitted against an opponent network.

· Networks: Deep neural networks are used as the models to be trained.

In GANs, there is a Generator and a Discriminator. The Generator produces fake samples of data (be it images, audio, etc.) and tries to fool the Discriminator, while the Discriminator tries to distinguish the real samples from the fake ones. Both are neural networks, and they compete with each other during the training phase. These steps are repeated many times, and with each repetition the Generator and the Discriminator get better at their respective jobs. The process is described in more detail below.

In a Generative Adversarial Network (GAN), there are two key components: the generative model and the discriminator. The generative model aims to learn and replicate the data distribution, maximizing the chances of fooling the discriminator. Meanwhile, the discriminator assesses whether a given sample is from the training data or the generator.

GANs are structured as a minimax game over a value function V(D, G): the Discriminator tries to maximize V(D, G) by correctly classifying real and generated samples, while the Generator tries to minimize it by fooling the Discriminator. This relationship can be mathematically described as follows (a mini-batch sketch of this value function is given after the list of symbols below):

min_G max_D V(D, G) = E_{x ~ Pdata(x)}[ log D(x) ] + E_{z ~ P(z)}[ log(1 - D(G(z))) ]

where,

· G = Generator

· D = Discriminator

· Pdata(x) = distribution of real data

· P(z) = distribution of the noise input z that is fed to the Generator

· x = sample from Pdata(x)

· z = sample from P(z)

· D(x) = the Discriminator's output: the estimated probability that x comes from the real data

· G(z) = the Generator's output: a fake sample produced from the noise z
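
As a concrete illustration (a sketch, not taken from the original article; value_function, real_batch, and noise_batch are hypothetical names), the value function can be estimated on a mini-batch by averaging the two logarithmic terms:

import torch

def value_function(D, G, real_batch, noise_batch):
    # E_x[log D(x)]: how confidently the Discriminator recognizes real samples
    real_term = torch.log(D(real_batch)).mean()
    # E_z[log(1 - D(G(z)))]: how confidently it rejects generated samples
    fake_term = torch.log(1.0 - D(G(noise_batch))).mean()
    # The Discriminator tries to maximize this value, the Generator to minimize it.
    return real_term + fake_term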

Generator Model:

The Generator is trained while the Discriminator is kept fixed. The Generator produces fake samples, the Discriminator scores them, and the Discriminator's predictions are back-propagated through to the Generator, updating only the Generator's weights so that it becomes more adept at deceiving the Discriminator.

Discriminator Model:

The Discriminator is trained while the Generator is kept fixed; during this phase the Generator only performs forward propagation to produce fake samples and receives no back-propagated updates. The Discriminator goes through several training passes on real data to learn to classify it as real, and it is also trained on the fake data generated by the Generator to learn to correctly identify it as fake.

Here are various types of GAN models:

Vanilla GAN: The simplest GAN, using basic multi-layer perceptrons for the Generator and Discriminator. It optimizes a mathematical equation with stochastic gradient descent.

Conditional GAN (CGAN): Introduces conditional parameters. The Generator receives an additional parameter ‘y’ to generate specific data, and the Discriminator takes labels into account to distinguish real data from fakes.

Deep Convolutional GAN (DCGAN): Utilizes convolutional neural networks instead of multi-layer perceptrons, replacing max pooling with strided convolutions and largely avoiding fully connected layers.

Laplacian Pyramid GAN (LAPGAN): Utilizes multiple Generators and Discriminators, one at each level of a Laplacian Pyramid. The image is repeatedly down-sampled to build the pyramid, and during generation detail is added back level by level as the image is up-sampled again, which is why this approach is known for producing high-quality images.

Super Resolution GAN (SRGAN): Combines deep neural networks with adversarial networks to upscale low-resolution images, enhancing details and minimizing errors.

These various GAN models cater to different applications and have distinct architectures for generating and discriminating data.

GAN Implementation Using PyTorch

Step 1: Importing the required libraries
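
A minimal sketch of the imports such an implementation typically needs (the exact modules used in the original listing are an assumption):

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader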

Step 2: Loading the Dataset
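
The sketches below assume CIFAR-10 with a batch size of 32, which is consistent with the 1563 batches per epoch visible in the output at the end; any image dataset could be substituted.

# Normalize images to [-1, 1] so they match the Tanh output of the Generator
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
train_dataset = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform
)
dataloader = DataLoader(train_dataset, batch_size=32, shuffle=True)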

Step 3: Defining parameters to be used in later processes
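
Illustrative hyperparameters (the names and values are assumptions, chosen as common defaults for GAN training):

latent_dim = 100            # size of the random noise vector fed to the Generator
lr = 0.0002                 # learning rate for both optimizers
beta1, beta2 = 0.5, 0.999   # Adam momentum terms commonly used for GANs
num_epochs = 10
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")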

Step 4: Defining a Utility Class to Build the Generator
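
One possible DCGAN-style Generator for 32x32 RGB images (the architecture details are an assumption): it projects the noise vector onto a small feature map and up-samples it with transposed convolutions.

class Generator(nn.Module):
    def __init__(self, latent_dim):
        super().__init__()
        self.model = nn.Sequential(
            # Project the noise vector to a 128 x 8 x 8 feature map
            nn.Linear(latent_dim, 128 * 8 * 8),
            nn.ReLU(),
            nn.Unflatten(1, (128, 8, 8)),
            # Up-sample 8x8 -> 16x16
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            # Up-sample 16x16 -> 32x32 and map to 3 color channels
            nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1),
            nn.Tanh(),  # outputs in [-1, 1], matching the normalized data
        )

    def forward(self, z):
        return self.model(z)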

Step 5: Defining a Utility Class to Build the Discriminator
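
A matching Discriminator sketch: a small convolutional classifier whose output is the probability that its input image is real.

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            # Down-sample 32x32 -> 16x16
            nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            # Down-sample 16x16 -> 8x8
            nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2),
            nn.Flatten(),
            nn.Linear(128 * 8 * 8, 1),
            nn.Sigmoid(),  # probability that the input image is real
        )

    def forward(self, img):
        return self.model(img)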

Step 6: Building the Generative Adversarial Network
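
Building the GAN here amounts to instantiating the two networks, the loss, and one optimizer per network; binary cross-entropy is the standard choice for the minimax objective described earlier (again a sketch, not the original listing).

generator = Generator(latent_dim).to(device)
discriminator = Discriminator().to(device)

adversarial_loss = nn.BCELoss()
optimizer_G = optim.Adam(generator.parameters(), lr=lr, betas=(beta1, beta2))
optimizer_D = optim.Adam(discriminator.parameters(), lr=lr, betas=(beta1, beta2))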

Step 7: Training the Generative Adversarial Network
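
One possible training loop, alternating the two phases described above: first the Discriminator is updated on real images and detached fake images, then the Generator is updated so that the Discriminator labels its fakes as real. The logging format mirrors the output shown below.

for epoch in range(num_epochs):
    for i, (real_images, _) in enumerate(dataloader):
        real_images = real_images.to(device)
        batch_size = real_images.size(0)
        real_labels = torch.ones(batch_size, 1, device=device)
        fake_labels = torch.zeros(batch_size, 1, device=device)

        # Phase 1: train the Discriminator (Generator frozen via detach)
        optimizer_D.zero_grad()
        z = torch.randn(batch_size, latent_dim, device=device)
        fake_images = generator(z)
        d_loss_real = adversarial_loss(discriminator(real_images), real_labels)
        d_loss_fake = adversarial_loss(discriminator(fake_images.detach()), fake_labels)
        d_loss = (d_loss_real + d_loss_fake) / 2
        d_loss.backward()
        optimizer_D.step()

        # Phase 2: train the Generator to make the Discriminator answer "real"
        optimizer_G.zero_grad()
        g_loss = adversarial_loss(discriminator(fake_images), real_labels)
        g_loss.backward()
        optimizer_G.step()

        if (i + 1) % 100 == 0:
            print(f"Epoch [{epoch+1}/{num_epochs}] "
                  f"Batch {i+1}/{len(dataloader)} "
                  f"Discriminator Loss: {d_loss.item():.4f} "
                  f"Generator Loss: {g_loss.item():.4f}")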

Output:

Epoch [10/10] Batch 1500/1563 Discriminator Loss: 0.5253 Generator Loss: 1.3269
