Creating People That Never Existed — Generative Adversarial Networks

Bagavan Marakathalingasivam
Published in The Startup
9 min read · Jan 17, 2021


An introduction to Generative Adversarial Networks and how you can implement one yourself

Do you know who any of these people are?

Images of people you might know

I’m going to take a wild guess and say that you don’t know who any of these people are. How do I know this? Because these “people” do not exist! These images were created by artificial intelligence! 😳

Now I don’t know about you, but I find it a little scary to see how far AI has come. Not just because it can create these images, but because drawing faces this realistic from scratch is almost impossible for a human to do!

Now you’re probably wondering how AI was able to do this.

Well, the answer is in Generative Adversarial Networks (GANs)!

Generative Adversarial Networks, or GANs for short, are a type of machine learning framework used in AI. You give the model a set of training images, and it learns to generate new images that look like they came from the same set.

This is done with two neural networks that fight with each other in a game. These networks are called the generator and the discriminator.

How do these networks work? And why are they fighting?

Generator: This is what takes in random noise and tries to turn it into images that look like the training data. It never sees the real images directly; it only learns from the discriminator’s feedback. (The Artist)

Discriminator: This is what looks at both real images and the generator’s output and tries to tell which ones are real and which ones are fake. (The Critic)

This is a basic illustration of how a GAN learns to create realistic images

When you’re training a GAN, the generator slowly starts to become better at creating images that look real, while the discriminator gradually becomes better at telling them apart. In theory, the GAN is done training once it reaches equilibrium. This is when the generator is so good at its job that the discriminator can no longer tell the difference between real images and fakes, so its guesses are no better than a coin flip.
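To put a number on that equilibrium, here is a minimal sketch using binary cross-entropy, the loss this article uses later on. The discriminator scores below are made-up values for illustration, not the output of a trained model.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1.0 - y_true) * np.log(1.0 - y_pred))))

# Hypothetical discriminator scores for 4 real images (label 1) and 4 fakes (label 0).
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=float)
confident_d = np.array([0.9, 0.8, 0.95, 0.9, 0.1, 0.2, 0.05, 0.1])
confused_d = np.full(8, 0.5)  # a discriminator that can only guess

loss_confident = binary_crossentropy(labels, confident_d)
loss_confused = binary_crossentropy(labels, confused_d)
print(loss_confident, loss_confused)
```

A discriminator that answers 0.5 for everything gets a loss of ln(2), roughly 0.693, so a discriminator loss hovering around that value is one rough sign that it is being fooled.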

Now, both networks need a way to work with the features in an image: the discriminator to judge them, and the generator to build them.

But how do they do this?

Introducing Convolutional Neural Networks (CNNs)

Convolutional Neural Networks, or CNNs for short, are neural networks specifically designed for analyzing visual imagery. These networks can recognize the different complex patterns within an image.

How a CNN extracts features
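Here is a small sketch of what a single convolution filter does, written in plain NumPy. The image and the filter values are hand-picked for illustration; in a real CNN the filter values are learned during training. (Like CNN layers, this slides the filter without flipping it.)

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 6x6 grayscale image: dark left half (0), bright right half (1).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked vertical-edge filter (an assumption for this demo).
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # every row is [0. 3. 3. 0.]
```

The output is large only where the window covers the boundary between dark and bright, which is how a filter "detects" a feature such as an edge.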

This makes convolutional layers a great fit for GANs that work with images: the discriminator can use them to pick out the features that separate real images from fakes, and the generator can use them to build up those same features in its own images.

This form of GAN is called a Deep Convolutional GAN (DCGAN): a GAN whose generator and discriminator are both deep networks built from convolutional layers.

Now that you know the basics of GANs, let’s implement our own!

How to implement your first GAN!

Before we start running our program, I would recommend using Google Colab and connecting to its free cloud GPU!

This will save a lot of time and is easy to set up.

Now, in this project, we will be building a GAN that will generate images from the CIFAR-10 dataset. This is perfect for building your first GAN because you won’t need to download any data manually: the library that we will be using (Keras) can load this dataset for you.

Now let’s get on with the code!

Import Libraries

from keras.layers import Input, Dense, Reshape, Flatten, Dropout
from keras.layers import BatchNormalization, Activation, ZeroPadding2D
from keras.layers import LeakyReLU, UpSampling2D, Conv2D
from keras.models import Sequential, Model
from keras.optimizers import Adam, SGD
import keras
import matplotlib.pyplot as plt
import numpy as np

First, we are going to import the layers that we will be using, from the keras.layers module. As you can see, we will be using Convolutional layers, which means that we will be building a DCGAN.

We will then use matplotlib.pyplot for visualizing our data.

Finally, we will import NumPy because we will be working with arrays.

Load Dataset

# Load CIFAR10 data
(X_train, y_train), (_, _) = keras.datasets.cifar10.load_data()

# Select the images of a single class (label 2 = birds)
X_train = X_train[y_train.flatten() == 2]

The CIFAR-10 dataset contains 32×32 color images from 10 different classes, each with its own label. We are only going to choose one class for our GAN to generate.

In our case, we are going to be generating birds, which have the label 2 in this dataset.
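If the filtering line looks a bit cryptic, here is a minimal sketch of the same boolean-mask trick on a tiny synthetic array. X_demo and y_demo are made-up stand-ins with the same layout as the CIFAR-10 arrays, so nothing needs to be downloaded.

```python
import numpy as np

# Synthetic stand-ins for the CIFAR-10 arrays (hypothetical data, not the real dataset).
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 256, size=(10, 32, 32, 3), dtype=np.uint8)
y_demo = np.array([[2], [0], [2], [5], [9], [2], [1], [3], [2], [7]])  # shape (10, 1)

BIRD = 2  # CIFAR-10 class index for "bird"
mask = y_demo.flatten() == BIRD  # boolean vector, True where the label is 2
birds = X_demo[mask]             # keeps only the images whose mask entry is True

print(mask.sum(), birds.shape)  # 4 (4, 32, 32, 3)
```

The labels come as a column of shape (n, 1), which is why flatten() is needed before the mask can be used to index the images.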

Setting Variables

# Input shape
img_rows = 32
img_cols = 32
channels = 3

img_shape = (img_rows, img_cols, channels)
latent_dim = 100

These are the dimensions of the images our GAN works with: 32×32 pixels with 3 color channels (RGB).

latent_dim is the length of the random noise vector that we feed into our generator.
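To make these sizes concrete, here is a small sketch that samples a batch of noise vectors and counts how many numbers one image holds. The batch size of 16 and the seed are arbitrary choices for this demo.

```python
import numpy as np

img_shape = (32, 32, 3)
latent_dim = 100

rng = np.random.default_rng(42)  # seed chosen arbitrarily for the demo
noise = rng.normal(0.0, 1.0, size=(16, latent_dim))  # 16 noise vectors

values_per_image = int(np.prod(img_shape))
print(noise.shape, values_per_image)  # (16, 100) 3072
```

So the generator's job is to turn 100 random numbers into the 3072 pixel values of one image.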

Now that we’ve done that, let's start building our generator!

The Generator

def build_generator():
    model = Sequential()

    model.add(Dense(128 * 8 * 8, activation="relu", input_dim=latent_dim))
    model.add(Reshape((8, 8, 128)))

    model.add(UpSampling2D())  # upsamples to 16*16*128

    model.add(Conv2D(128, kernel_size=3, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Activation("relu"))

    model.add(UpSampling2D())  # upsamples to 32*32*128

    model.add(Conv2D(64, kernel_size=3, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Activation("relu"))

    model.add(Conv2D(channels, kernel_size=3, padding="same"))
    model.add(Activation("tanh"))

    # outputs an image of 32*32*3

    noise = Input(shape=(latent_dim,))
    img = model(noise)

    return Model(noise, img)

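This is a function that we are going to use to build our generator.

We use Sequential() because it lets us build the network by adding layers one after another.

The model starts with a Dense layer that turns the noise vector into 128 * 8 * 8 values, which are then reshaped into an 8×8 feature map with 128 channels. After that, we use Convolutional layers to refine the features.

To grow this small feature map into a full image, we use UpSampling2D, which doubles the height and width each time it is applied (from 8 to 16 to 32). The final tanh activation keeps every output value between -1 and 1.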
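As a rough illustration of the upsampling step, here is a NumPy sketch of nearest-neighbour upsampling, which is what UpSampling2D does with its default settings. The helper name upsample_nearest is made up for this example.

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Repeat every row and every column `factor` times."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

# Same shapes as in the generator: 8x8x128 -> 16x16x128 -> 32x32x128.
feature_map = np.zeros((8, 8, 128))
step1 = upsample_nearest(feature_map)
step2 = upsample_nearest(step1)
print(feature_map.shape, step1.shape, step2.shape)

# A tiny example to see what actually happens to the values.
tiny = np.array([[1, 2],
                 [3, 4]])
print(upsample_nearest(tiny))
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]
```

Upsampling on its own only makes the feature map bigger and blockier, which is why each upsampling step in the generator is followed by a convolution that smooths and refines it.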

The Discriminator

def build_discriminator():
    model = Sequential()

    model.add(Conv2D(32, kernel_size=3, strides=2, input_shape=img_shape, padding="same"))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    # no normalization for the first layer

    model.add(Conv2D(64, kernel_size=3, strides=2, padding="same"))
    model.add(ZeroPadding2D(padding=((0, 1), (0, 1))))
    model.add(BatchNormalization(momentum=0.8))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))

    model.add(Conv2D(128, kernel_size=3, strides=2, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))

    model.add(Conv2D(256, kernel_size=3, strides=1, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))

    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))

    img = Input(shape=img_shape)
    validity = model(img)

    return Model(img, validity)

This function will build our discriminator.
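To see how the image shrinks on its way through the discriminator, here is a small sketch that traces the height/width after each layer. It assumes the usual rule for padding="same", where the output size is the input size divided by the stride, rounded up; conv_same_out is a helper made up for this example.

```python
import math

def conv_same_out(size, stride):
    """Output height/width of a convolution with padding="same"."""
    return math.ceil(size / stride)

size = 32                      # input image is 32x32
size = conv_same_out(size, 2)  # Conv2D(32, strides=2)          -> 16
size = conv_same_out(size, 2)  # Conv2D(64, strides=2)          -> 8
size = size + 1                # ZeroPadding2D(((0,1),(0,1)))   -> 9
size = conv_same_out(size, 2)  # Conv2D(128, strides=2)         -> 5
size = conv_same_out(size, 1)  # Conv2D(256, strides=1)         -> 5

flattened = size * size * 256  # number of values fed into the final Dense layer
print(size, flattened)  # 5 6400
```

The strided convolutions do the downsampling here, so the discriminator ends up squeezing a 32×32 image into a 5×5 map with 256 channels before making its decision.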

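The Discriminator is also a CNN, this time with LeakyReLU activations. Many activation functions will work fine with this basic GAN architecture. However, LeakyReLU is very popular because it keeps a small, non-zero slope for negative inputs, which helps the gradients flow more easily through the architecture.

Finally, the Discriminator needs to output a single number between 0 and 1 that says how likely it thinks the image is real. We use the Sigmoid Activation for that.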
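Here is a minimal NumPy sketch of the two activation functions used in the discriminator, just to show what they do to a few sample numbers. These are hand-written versions for illustration, not the Keras implementations.

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    """Pass positive values through; scale negative values by `alpha`."""
    return np.where(x > 0, x, alpha * x)

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(leaky_relu(x))  # negative inputs shrink to -0.4 and -0.1 instead of becoming 0

scores = np.array([-4.0, 0.0, 4.0])
print(sigmoid(scores))  # low, exactly 0.5, high
```

With a plain ReLU, the negative inputs would all become 0 and pass no gradient back; the small slope of LeakyReLU is what keeps them "alive".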

Build and Compile our model

# Build and compile the discriminator
discriminator = build_discriminator()
discriminator.compile(loss='binary_crossentropy',
                      optimizer=Adam(0.0002, 0.5),
                      metrics=['accuracy'])

# Build the generator
generator = build_generator()

# The generator takes noise as input and generates imgs
z = Input(shape=(latent_dim,))
img = generator(z)

# For the combined model we will only train the generator
discriminator.trainable = False

# The discriminator takes generated images as input and determines validity
valid = discriminator(img)

# The combined model (stacked generator and discriminator)
# Trains the generator to fool the discriminator
combined = Model(z, valid)
combined.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5))

We are going to store the models returned by our functions in the variables discriminator and generator.

In order for the generator to generate images, it needs to have something to start off with. This is why we are creating noise.

This will just go into our generator like a blank canvas for it to create images on.

In the combined model, we only want the generator to learn, so we set discriminator.trainable = False before compiling it. This way, when the generator is trained through the combined model, the discriminator’s weights stay fixed. The discriminator still learns on its own, because it was compiled separately before we froze it, and Keras is designed to use the trainable setting that was in place when a model was compiled.

Next, we are using the discriminator to determine whether the generator’s image is real or fake.

Finally, we are going to compile our combined model by using 'binary_crossentropy' as our loss, and Adam as our optimizer with a learning rate of 0.0002 and a beta_1 of 0.5.
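To see why this setup pushes the generator to "fool" the discriminator, here is a minimal sketch of the loss the combined model works with: the discriminator's scores on fake images, compared against the label 1 ("real"). The scores are made-up values, and generator_loss is a helper written for this example.

```python
import numpy as np

def generator_loss(d_scores_on_fakes, eps=1e-7):
    """Binary cross-entropy of discriminator scores against the target label 1."""
    p = np.clip(d_scores_on_fakes, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-np.log(p)))

caught = np.array([0.05, 0.10, 0.20])  # discriminator spots the fakes
fooled = np.array([0.80, 0.90, 0.95])  # discriminator thinks the fakes are real

loss_caught = generator_loss(caught)
loss_fooled = generator_loss(fooled)
print(loss_caught, loss_fooled)
```

The loss is high when the fakes are caught and low when they pass as real, so lowering this loss means making images the discriminator scores closer to 1.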

Create functions to show images

def show_imgs(epoch):
    r, c = 4, 4
    noise = np.random.normal(0, 1, (r * c, latent_dim))
    gen_imgs = generator.predict(noise)
    # Rescale images 0 - 1
    gen_imgs = 0.5 * gen_imgs + 0.5
    fig, axs = plt.subplots(r, c)
    cnt = 0
    for i in range(r):
        for j in range(c):
            axs[i, j].imshow(gen_imgs[cnt, :, :, ])
            axs[i, j].axis('off')
            cnt += 1
    plt.show()
    plt.close()

When this function is called, it will show us the images that the generator creates as it trains.

def show_losses(losses):
    losses = np.array(losses)

    fig, ax = plt.subplots()
    plt.plot(losses.T[0], label='Discriminator')
    plt.plot(losses.T[1], label='Generator')
    plt.title("Training Losses")
    plt.legend()
    plt.show()

When this function is called, it will plot a graph that shows the discriminator’s and the generator’s losses.

Training our model

Now it’s time to train our model!

epochs = 15000
batch_size = 32
display_interval = 1000
losses = []

# Normalize the input from [0, 255] to [-1, 1]
X_train = X_train / 127.5 - 1.

# Adversarial ground truths
valid = np.ones((batch_size, 1))
# Let's add some noise to the labels
valid += 0.05 * np.random.random(valid.shape)
fake = np.zeros((batch_size, 1))
fake += 0.05 * np.random.random(fake.shape)

for epoch in range(epochs):
    # Train Discriminator
    # Select a random batch of real images
    idx = np.random.randint(0, X_train.shape[0], batch_size)
    imgs = X_train[idx]

    # Sample noise and generate a batch of new images
    noise = np.random.normal(0, 1, (batch_size, latent_dim))
    gen_imgs = generator.predict(noise)

    # Train the discriminator (real classified as ones and generated as zeros)
    d_loss_real = discriminator.train_on_batch(imgs, valid)
    d_loss_fake = discriminator.train_on_batch(gen_imgs, fake)
    d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

    # Train Generator
    # Train the generator (wants discriminator to mistake images as real)
    g_loss = combined.train_on_batch(noise, valid)

    # Plot the progress
    if epoch % display_interval == 0:
        print("%d [D loss: %f] [G loss: %f]" % (epoch, d_loss[0], g_loss))
        show_imgs(epoch)
    if epoch % 1000 == 0:
        losses.append((d_loss[0], g_loss))

First, we are going to set some of our variables.

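We are going to train for 15000 epochs. (Note that each “epoch” in this code is a single training step on one batch, not a full pass over the dataset.) Next, our batch size will be 32. The display interval is going to be 1000, which means that it will show images every 1000 epochs.

We will then normalize our input, scaling the pixel values from the range 0 to 255 down to the range -1 to 1 so that they match the generator’s tanh output.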
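Here is a quick sketch of that normalization on a handful of sample pixel values, together with the reverse step that the plotting code uses to get back to a displayable range. The pixel values are picked by hand for illustration.

```python
import numpy as np

pixels = np.array([0.0, 64.0, 127.5, 255.0])  # sample raw pixel values

normalized = pixels / 127.5 - 1.0  # maps [0, 255] to [-1, 1]
restored = 0.5 * normalized + 0.5  # maps [-1, 1] to [0, 1] for plotting

print(normalized)
print(restored)
```

The restored values are the original pixels divided by 255, which is the 0 to 1 range that imshow expects for floating-point images.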

Finally, we will iterate through each epoch and do the following:

  1. Select a random batch of real images
  2. Sample noise and generate a new batch of images
  3. Train the discriminator. (The real images are labeled as ones, and the generated images are labeled as zeros.)
  4. Train the generator to try and “fool” the discriminator
  5. Plot the progress (print the losses of the generator and discriminator and show the images every 1000 epochs)

After this is finished running, your output should look something like this:
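The batch selection and the slightly noisy labels from the training code can be sketched on their own like this. It uses NumPy's newer Generator API instead of np.random.randint, and num_images is an assumed dataset size for the demo rather than a value read from the data.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for the demo
batch_size = 32
num_images = 5000  # assumed size of the filtered training set

# Step 1: pick random image indices (drawn with replacement).
idx = rng.integers(0, num_images, size=batch_size)

# Labels for step 3: "real" is about 1 and "fake" is about 0, plus a little noise.
valid = np.ones((batch_size, 1)) + 0.05 * rng.random((batch_size, 1))
fake = np.zeros((batch_size, 1)) + 0.05 * rng.random((batch_size, 1))

print(idx[:5])
print(valid.min(), valid.max(), fake.min(), fake.max())
```

Because the indices are drawn at random on every step, the same image can show up more than once, and no step covers the whole dataset.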

Generator’s output images with the loss

Show Loss

show_losses(losses)

This code calls the show_losses function that we defined previously.

The output should look something like this:

Graph output of our function

And we’re done!

We did it! You just created your first DCGAN, which can generate pictures of birds. Now, if you want to see what your birds look like, you can use the following code:

noise = np.random.normal(size=(40, latent_dim))
generated_images = generator.predict(noise)
generated_images = 0.5 * generated_images + 0.5
f, ax = plt.subplots(5, 8, figsize=(12, 8))
for i, img in enumerate(generated_images):
    ax[i // 8, i % 8].imshow(img)
    ax[i // 8, i % 8].axis('off')

plt.show()

This code will show you the images that the generator output.
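If the i // 8 and i % 8 indexing looks odd, here is a tiny sketch of how it spreads 40 images over a grid with 5 rows and 8 columns. It only computes the grid positions and does not plot anything.

```python
rows, cols = 5, 8

# Image number i goes to row i // cols and column i % cols.
positions = [(i // cols, i % cols) for i in range(rows * cols)]

print(positions[0], positions[7], positions[8], positions[39])
# (0, 0) (0, 7) (1, 0) (4, 7)
```

Integer division picks the row and the remainder picks the column, so the images fill the grid left to right, one row at a time.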

Play around with the code and try running it for more epochs to train it better!

After running this code for 30000 epochs, the generator generates images like this:

Generator after 30000 epochs

As I mentioned before, creating realistic images is a hard task, which is why our images don’t look perfect yet. However, if you keep training your model, the images should gradually start to look more like birds!

After that, you can experiment with other datasets such as the CelebA dataset. This one has over 200k images of celebrity faces, which means that you can train your own GAN to generate images of faces.

We’re at the end already?

We are sadly already at the end of the article. I hope you learned at least one thing about GANs and I hope you enjoyed making your own!

Thank you for reading this article! :)
