What the hack is GAN?

Pratik Parmar
6 min read · Dec 15, 2018


Robot painting a self-portrait, Johan Scherft

Generative Adversarial Networks, aka GANs, are a very hot (and controversial) topic in the tech world. If you Google "GAN", you'll come across a Wikipedia article about a language spoken in China. We're not gonna talk about that, of course. In this post, I'll try to explain GANs in simple terms.

Even if you're not working in machine learning, you must have heard about a portrait that sold for $432,500 and was actually generated by a GAN, based on open-source code (art-DCGAN) written by Robbie Barrat.

Art created by GAN

Now, you must be wondering: what the hack is GAN? Well, as per Wikipedia:

GANs are a class of artificial intelligence used in unsupervised machine learning, implemented by a system of two neural networks contesting with each other in a zero-sum game framework.

It doesn't make much sense to me (not sure about you, though, intelligent readers 😎). So, let's take a real-life example.

You're a fan of Queen (the band), like me. Let's say you have thousands of Queen songs with lyrics. Now you want to create an artificial singer that can compose Queen-style songs from given lyrics using artificial intelligence.

Well, the kind of algorithm used to generate those songs is a GAN. It's the brainchild of Ian Goodfellow. Yes, he's also an author of the famous book, aka the bible of deep learning: the "Deep Learning" book.

In order to learn about GANs, we have to understand what generative models are. There are two main types of models in machine learning: generative and discriminative.

A discriminative model is a classifier: it discriminates between two (or more) different classes of given data.

A generative model is a model that generates data to fool the discriminator.

The minimax game alternates between training the discriminator and the generator.
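Formally, this minimax game is written in the original GAN paper as a value function V(D, G), where D is the discriminator, G the generator, x a real sample, and z the noise vector:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator tries to maximize V (be right about both real and fake samples), while the generator tries to minimize it (make its fakes indistinguishable).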

In other words, discriminative models map features to labels; they're focused solely on that correlation. Generative models, on the other hand, do the opposite: instead of predicting a label from given features, they predict features for a given label.
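Here's a toy sketch of that difference in plain NumPy (the 1-D data and the per-class Gaussians are purely hypothetical): the discriminative model maps a feature to a label, while the generative model produces a plausible feature for a given label.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes of 1-D "data": class 0 clusters near 0, class 1 near 5.
x0 = rng.normal(0.0, 1.0, 500)
x1 = rng.normal(5.0, 1.0, 500)

# Discriminative model: feature -> label (here, a simple threshold
# halfway between the two class means).
def discriminate(x):
    return 1 if x > 2.5 else 0

# Generative model: label -> feature (sample from the per-class
# Gaussian fitted to the training data).
def generate(label):
    xs = x1 if label == 1 else x0
    return rng.normal(xs.mean(), xs.std())

print(discriminate(4.7))  # → 1 (a point near class 1)
print(generate(1))        # a plausible "class 1" feature, somewhere near 5
```

The discriminator only needs the boundary between the classes; the generator has to model what each class actually looks like.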

Let's take one more example to make things clearer.

Source: Getty

Let's imagine that you're sipping wine on a beach, wearing Ray-Bans. After the vacation is over, you brag about it on Instagram, adding a garnish that it was the Maldives or the Bahamas (how's my choice? 😉). One of your eagle-eyed Sherlock-fan friends makes it clear in the comments that the beach and the wine were photoshopped. In machine learning, that friend plays the role of a discriminative model. This model determines whether an image is fake or natural; in machine learning, being natural means it belongs to the training dataset.

Coming back to the main point: another friend of yours has your back. He helps you generate another similar image of yourself, this time sipping a Virgin Bloody Mary. In machine learning, this friend of yours is the generative model. This model tries to create natural-looking images similar to the original dataset. For nerds like me, get this straight: the generative model is North Korea (see this article) and the discriminative model is the FBI.

Wanna see the results produced by recent GAN models? Check out the image below and figure out how many of them were generated by a GAN.

Source

Sorry guys, all of these images were created by a GAN. I call them real-fake images. 😎

How does a GAN learn?

Okay, it’s time to see under the hood of this shiny luxury car.

The generator model takes a vector of random numbers (z) as input and transforms it into data resembling what it intends to copy, while the discriminator takes a set of data as input and outputs the probability that it's natural (i.e., real).
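At the shape level, that looks like this (a NumPy sketch with untrained random weights; the latent size of 100 and the 28×28 "image" are hypothetical choices, not fixed by GANs themselves):

```python
import numpy as np

rng = np.random.default_rng(42)
latent_dim, data_dim = 100, 28 * 28  # e.g. a flattened 28x28 image

# Generator: noise vector z -> data-shaped output in (-1, 1).
W_g = rng.normal(0, 0.01, (latent_dim, data_dim))
def generator(z):
    return np.tanh(z @ W_g)

# Discriminator: data -> probability that it is "natural" (real).
W_d = rng.normal(0, 0.01, (data_dim, 1))
def discriminator(x):
    return 1.0 / (1.0 + np.exp(-(x @ W_d)))  # sigmoid

z = rng.normal(size=(1, latent_dim))
fake = generator(z)
p_real = discriminator(fake)
print(fake.shape)  # → (1, 784): data-shaped, ready to be judged
```

In a real GAN both functions would be deep networks, but the interface is exactly this: noise in, data out; data in, probability out.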

Ideally, by the end of training, the discriminator can't do anything better than predicting real or fake with an accuracy of 50%. It's similar to flipping a coin and guessing whether it's heads or tails. Both models are optimized on batches of real and generated data alternately, until the GAN slowly converges to producing realistic data.
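The alternating update can be sketched end to end on a toy 1-D problem (a hypothetical minimal example with hand-derived gradients, not a real image GAN): the real data comes from N(4, 0.5), the generator is an affine map g(z) = w·z + b, and the discriminator is logistic regression d(x) = sigmoid(c1·x + c0).

```python
import numpy as np

rng = np.random.default_rng(0)

w, b = 1.0, 0.0      # generator parameters
c1, c0 = 0.0, 0.0    # discriminator parameters
lr = 0.02

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

for step in range(2000):
    real = rng.normal(4.0, 0.5, 32)       # batch of real data
    z = rng.normal(0.0, 1.0, 32)          # batch of noise
    fake = w * z + b                      # generated data

    # --- Discriminator step: push D(real) -> 1 and D(fake) -> 0 ---
    d_real, d_fake = sigmoid(c1 * real + c0), sigmoid(c1 * fake + c0)
    c1 += lr * np.mean((1 - d_real) * real - d_fake * fake)
    c0 += lr * np.mean((1 - d_real) - d_fake)

    # --- Generator step: push D(fake) -> 1 (non-saturating loss) ---
    d_fake = sigmoid(c1 * fake + c0)
    grad = (1 - d_fake) * c1              # d(log D(fake)) / d(fake)
    w += lr * np.mean(grad * z)
    b += lr * np.mean(grad)

print(round(b, 2))  # the generator's offset drifts toward the real mean (≈4)
```

Notice the structure: one gradient step for the discriminator on a mixed batch, then one for the generator against the freshly updated discriminator, over and over. That alternation is the whole training loop; real GANs just swap in deep networks and an optimizer like Adam.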

Training

The image above (from the original paper) shows the progress of training as the models are updated. GANs are trained by simultaneously updating the discriminative distribution (blue dashed line) so that it discriminates between samples from the data-generating distribution (black dotted line) and those from the generative distribution (green solid line).

Challenges

— Soft and noisy labels

— Large kernels and more filters

— One class at a time

These and many other challenges (and training tricks) come up when working with GANs. But that's a topic for another article.

It takes a long time to train a GAN: it might take hours on a decent single GPU, and more than a day on a CPU. Although GANs are difficult to tune and hence to use, they have catapulted a lot of interesting research projects.

Applications

What's the point of all this nerdy talk if we can't put it into something useful? Here are some of the most interesting use cases of GANs.

1. GANPaint
Ever faced a situation where your Photoshop-master friend isn't around and you need to remove an object from a photograph? Well, the savior is here: GANPaint. Watch the video below to understand more.
Created by researchers at MIT, GANPaint helps visualize and understand GANs.

Source-code: https://github.com/CSAILVision/GANDissect

GAN Dissection using GANPaint

2. Image-to-Image translation
While GANs were initially used to create fake images, we can now use CycleGAN to convert one object into another in videos too.

A horse converted into a zebra using GAN (Source)

Source-code: https://github.com/junyanz/CycleGAN
Related projects: pix2pix, iGAN
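The trick that makes CycleGAN work without paired training data is a cycle-consistency loss: translating a horse into a zebra and back should recover the original horse. With translation generators G: X→Y and F: Y→X, the paper's loss is:

```latex
\mathcal{L}_{\text{cyc}}(G, F) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\lVert F(G(x)) - x \rVert_1\big]
+ \mathbb{E}_{y \sim p_{\text{data}}(y)}\big[\lVert G(F(y)) - y \rVert_1\big]
```

This term is added to the usual adversarial losses for both directions, so each translation stays faithful to its input.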

3. PixelDTGAN
Based on DCGAN, this GAN model creates clothing images and styles from an image.

Source-code: https://github.com/fxia22/PixelDTGAN

4. NeuralFace
NeuralFace uses a Deep Convolutional Generative Adversarial Network (DCGAN), co-developed at FAIR (Facebook AI Research). It creates face images, just like the ones we saw earlier.

Source-code: https://github.com/carpedm20/DCGAN-tensorflow
Demo: https://carpedm20.github.io/faces/

Check out this GitHub repo for more GAN applications.

Hope you found this post interesting and easy to understand.

Signing off, Pratik Parmar

Resources and recommended reading:

Original GAN paper by Ian Goodfellow et al., NIPS 2014 (now NeurIPS)

List of Papers published on GANs

From GAN to WGAN (covers math behind GAN and why it’s hard to train)

DCGAN (Deep Convolutional Generative Adversarial Network)

Deep Learning Book by Ian Goodfellow
