What is Generative Adversarial Network (GAN)?

Chingiz Seyidbayli
5 min readMar 30, 2023

--

GAN, or Generative Adversarial Network, is a type of neural network architecture that consists of two main components: a generator network and a discriminator network. The purpose of a GAN is to generate realistic data that mimics the input data distribution.

The generator network takes in a random noise vector as input and generates a new data point that is intended to resemble the input data distribution. The discriminator network takes in both the generated data point and real data points from the input distribution and predicts whether each input is real or generated.

During training, the generator network generates a data point and the discriminator network predicts whether it is real or generated. The generator network then receives feedback on how realistic its generated data is based on the discriminator’s output. This process is repeated until the generator network is able to produce realistic data that the discriminator network cannot differentiate from real data.

The training process of a GAN can be described as a two-player game, where the generator and discriminator networks are constantly trying to outsmart each other. The generator network aims to generate data that is realistic enough to fool the discriminator network, while the discriminator network tries to correctly identify whether a given data point is real or generated.

Once trained, the generator network can be used to generate new data that is similar to the input data distribution. GANs have been successfully used in a variety of applications, including image and video generation, text generation, and music generation. However, GANs can also be challenging to train and prone to issues such as mode collapse, where the generator network produces a limited range of outputs.

One example of a GAN application is image generation. In this scenario, the generator network takes in a random noise vector and generates a new image that resembles the input image distribution. The discriminator network takes in both the generated image and real images from the input distribution and predicts whether each image is real or generated.

During training, the generator network generates an image and the discriminator network predicts whether it is real or generated. The generator network then receives feedback on how realistic its generated image is based on the discriminator’s output. This process is repeated until the generator network is able to produce realistic images that the discriminator network cannot differentiate from real images.

Once trained, the generator network can be used to generate new images that are similar to the input image distribution. For example, a GAN could be trained on a dataset of celebrity faces and then used to generate new, realistic-looking celebrity faces. GANs have also been used for other image-related tasks such as image-to-image translation, where a GAN is used to translate an image from one domain (e.g., day-time) to another domain (e.g., night-time) while maintaining the image’s content.

Let’s write a pseudo code for GAN Network

Initialize the generator network G with random weights
Initialize the discriminator network D with random weights
Set the learning rate for both networks
Set the number of training epochs
Set the batch size

for epoch in range(num_epochs):
for batch in data:
# Train the discriminator network
Sample a batch of real images from the training data
Generate a batch of fake images from the generator network
Train the discriminator network on the real and fake images
Compute the discriminator loss

# Train the generator network
Generate a new batch of fake images from the generator network
Compute the generator loss based on the discriminator's output
Backpropagate the loss and update the generator's weights

# Update the discriminator's weights
Backpropagate the loss and update the discriminator's weights

# Generate a sample of fake images from the generator
Save the generator's weights

GAN Coding with Python

To write a complete Python code for a GAN, it would require significant amount of time and resources. However, I can provide a brief outline of the steps involved in training a GAN using the PyTorch library:

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.datasets as datasets
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

Define the generator and discriminator networks using PyTorch’s nn.Module:

class Generator(nn.Module):
def __init__(self):
super(Generator, self).__init__()
# Define the layers of the generator network

def forward(self, z):
# Define the forward pass of the generator network

class Discriminator(nn.Module):
def __init__(self):
super(Discriminator, self).__init__()
# Define the layers of the discriminator network

def forward(self, x):
# Define the forward pass of the discriminator network

Define the hyperparameters:

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
batch_size = 128
num_epochs = 100
learning_rate = 2e-4
latent_size = 100
image_size = 28*28

Load the MNIST dataset and create data loaders:

train_dataset = datasets.MNIST(root='data/', train=True, transform=transforms.ToTensor(), download=True)
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)

Define the loss functions and optimizers:

criterion = nn.BCELoss()
d_optimizer = optim.Adam(discriminator.parameters(), lr=learning_rate, betas=(0.5, 0.999))
g_optimizer = optim.Adam(generator.parameters(), lr=learning_rate, betas=(0.5, 0.999))

Train the GAN:

for epoch in range(num_epochs):
for batch_idx, (real_images, _) in enumerate(train_loader):
# Train discriminator with real images
real_images = real_images.view(-1, image_size).to(device)
real_labels = torch.ones(batch_size, 1).to(device)
fake_labels = torch.zeros(batch_size, 1).to(device)

# Train discriminator with fake images
z = torch.randn(batch_size, latent_size).to(device)
fake_images = generator(z)
d_real_loss = criterion(discriminator(real_images), real_labels)
d_fake_loss = criterion(discriminator(fake_images), fake_labels)
d_loss = d_real_loss + d_fake_loss
d_optimizer.zero_grad()
d_loss.backward()
d_optimizer.step()

# Train generator
z = torch.randn(batch_size, latent_size).to(device)
fake_images = generator(z)
g_loss = criterion(discriminator(fake_images), real_labels)
g_optimizer.zero_grad()
g_loss.backward()
g_optimizer.step()

Generate new images using the trained generator:

z = torch.randn(64, latent_size).to(device)
generated_images = generator(z)

Note that the above code is just a brief outline, and additional steps and modifications may be required for specific use cases of GANs.

Let we fill the blanks in the code :)

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.datasets as datasets
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

# Define the generator network
class Generator(nn.Module):
def __init__(self, input_size=100, output_size=784):
super(Generator, self).__init__()
self.input_size = input_size
self.output_size = output_size

self.fc1 = nn.Linear(input_size, 256)
self.bn1 = nn.BatchNorm1d(256)
self.fc2 = nn.Linear(256, 512)
self.bn2 = nn.BatchNorm1d(512)
self.fc3 = nn.Linear(512, 1024)
self.bn3 = nn.BatchNorm1d(1024)
self.fc4 = nn.Linear(1024, output_size)
self.activation = nn.Tanh()

def forward(self, x):
x = self.fc1(x)
x = self.bn1(x)
x = self.activation(x)
x = self.fc2(x)
x = self.bn2(x)
x = self.activation(x)
x = self.fc3(x)
x = self.bn3(x)
x = self.activation(x)
x = self.fc4(x)
x = self.activation(x)
return x

# Define the discriminator network
class Discriminator(nn.Module):
def __init__(self, input_size=784, output_size=1):
super(Discriminator, self).__init__()
self.input_size = input_size
self.output_size = output_size

self.fc1 = nn.Linear(input_size, 1024)
self.activation = nn.LeakyReLU(0.2)
self.fc2 = nn.Linear(1024, 512)
self.fc3 = nn.Linear(512, 256)
self.fc4 = nn.Linear(256, output_size)
self.sigmoid = nn.Sigmoid()

def forward(self, x):
x = self.fc1(x)
x = self.activation(x)
x = self.fc2(x)
x = self.activation(x)
x = self.fc3(x)
x = self.activation(x)
x = self.fc4(x)
x = self.sigmoid(x)
return x

# Define the hyperparameters
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
batch_size = 128
num_epochs = 50
learning_rate = 0.0002
input_size = 100
image_size = 28 * 28

# Load the MNIST dataset
train_dataset = datasets.MNIST(root="./data", train=True, transform=transforms.ToTensor(), download=True)
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)

# Initialize the generator and discriminator networks
generator = Generator(input_size).to(device)
discriminator = Discriminator().to(device)

# Define the loss functions and optimizers
criterion = nn.BCELoss()
g_optimizer = optim.Adam(generator.parameters(), lr=learning_rate)
d_optimizer = optim.Adam(discriminator.parameters(), lr=learning_rate)

# Train the GAN
for epoch in range(num_epochs):
for batch_idx, (real_images, _) in enumerate(train_loader):
real_images = real_images.view(-1, image_size).to(device)
batch_size = real_images.shape[0]

# Train the discriminator network
d_optimizer.zero_grad()

# Train on real images
real_labels = torch.ones(batch

--

--