I recently went to an Algorithmic Art talk organised by Tariq Rashid, who gave a very well-structured introduction to GANs (generative adversarial networks) and how they can be used to fool the human eye, in this case by creating very realistic faces.
Have a go at guessing which face is real and which was created by a GAN at this website: http://www.whichfaceisreal.com/
Very difficult, it turns out.
This article covers only a couple of the key concepts behind GANs. If you are interested in learning more, you can find the entire presentation here, including the PyTorch* code: Algorithmic Art GAN presentation
Let’s start with a basic description of how machine learning works.
Neural networks are trained by feeding them data and comparing their actual output to the known output. In this case, they are fed pictures of faces. The difference between the actual output and the known output is used to adjust the link weights inside the network so that it subsequently produces a slightly better output.
Neural networks can be trained to perform tasks like classifying images. A common tutorial example is learning to classify images of handwritten digits from the MNIST dataset.
A machine learning model has its internal parameters adjusted in response to the error it produces:
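To make the idea of error-driven adjustment concrete, here is a minimal sketch in plain Python. It is not the code from the talk: the single-weight model, the made-up data (where the target is always twice the input) and the names `train`, `learning_rate` and `samples` are all assumptions chosen for illustration.

```python
# Minimal sketch: a "network" with a single adjustable weight.
# Everything here (data, names, learning rate) is illustrative only.

def train(samples, learning_rate=0.1, epochs=50):
    weight = 0.0  # arbitrary starting value
    for _ in range(epochs):
        for x, target in samples:
            output = weight * x       # actual output
            error = output - target   # difference from the known output
            # Gradient of 0.5 * error**2 with respect to the weight is error * x,
            # so stepping against it makes the next output slightly better.
            weight -= learning_rate * error * x
    return weight


# Made-up training data that follows target = 2 * x.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
weight = train(samples)
print(round(weight, 3))
```

A real network has many weights and non-linear layers, but each weight is nudged by the same kind of rule.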
The following picture describes the GAN architecture:
The picture above shows a learning model, a discriminator, which is trained to separate real data from fake data — just like a typical machine learning system. We then have a second machine learning model, a generator, which learns to generate data with the aim of getting it past the discriminator.
If this architecture works well, the discriminator and the generator compete to out-perform each other.
As the discriminator gets better and better at telling real from fake data, the generator also gets better and better at generating fake data that can pass as real.
A common analogy is of an art forger learning to get better at fooling a detective, and the detective learning to get better at telling a real painting from a forged one.
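The alternating training of the two models can be sketched with a deliberately tiny example. This is not the PyTorch code from the presentation. It is a toy written in plain Python under several assumptions: the "real data" are single numbers drawn from a normal distribution around 4, the discriminator is one logistic unit, the generator is a linear function of random noise, and all names and hyperparameters are hypothetical. A toy like this may oscillate rather than settle, which is part of why GANs are considered hard to train.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    rng = random.Random(seed)
    w, c = 0.0, 0.0  # discriminator: D(x) = sigmoid(w * x + c)
    a, b = 1.0, 0.0  # generator:     G(z) = a * z + b

    for _ in range(steps):
        # 1. Show the discriminator a real sample (target output 1).
        real = rng.gauss(4.0, 0.5)  # assumed "real" data distribution
        grad = sigmoid(w * real + c) - 1.0  # cross-entropy gradient: output - target
        w -= lr * grad * real
        c -= lr * grad

        # 2. Show the discriminator a generated sample (target output 0).
        fake = a * rng.gauss(0.0, 1.0) + b
        grad = sigmoid(w * fake + c) - 0.0
        w -= lr * grad * fake
        c -= lr * grad

        # 3. Update the generator so that D(G(z)) moves towards 1,
        #    passing the error back through the discriminator's weight w.
        z = rng.gauss(0.0, 1.0)
        grad = (sigmoid(w * (a * z + b) + c) - 1.0) * w
        a -= lr * grad * z
        b -= lr * grad

    return {"w": w, "c": c, "a": a, "b": b}


params = train_toy_gan()
print(params)
```

The three numbered steps are the point of the sketch: the discriminator is trained on real and generated samples, and the generator is then trained using only the discriminator's verdict on its output.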
GANs are hard to train, with common pitfalls like mode collapse where only one of many possible solutions is found by the generator.
In a normal network the error is expected to fall towards zero. With a GAN, however, the discriminator should find it harder and harder to tell the real data apart from the generated data, and so its error should approach 1/2 (or 1/4 if we’re using mean squared error).
This is the fine balance you need for a GAN to work: a successful GAN is an equilibrium between the generator and the discriminator.
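The 1/2 and 1/4 figures follow from simple arithmetic, assuming the discriminator's targets are 1 for real and 0 for generated images and that a completely unsure discriminator outputs 0.5 for everything. The short check below uses hypothetical helper names and a made-up batch of four images.

```python
def mean_absolute_error(outputs, targets):
    return sum(abs(o - t) for o, t in zip(outputs, targets)) / len(outputs)


def mean_squared_error(outputs, targets):
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)


# Two real images (target 1) and two generated images (target 0);
# the discriminator cannot tell them apart, so it outputs 0.5 each time.
targets = [1.0, 1.0, 0.0, 0.0]
outputs = [0.5, 0.5, 0.5, 0.5]

mae = mean_absolute_error(outputs, targets)
mse = mean_squared_error(outputs, targets)
print(mae)  # 0.5
print(mse)  # 0.25
```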
In this experiment, after many hours of training, this is how the generated faces appeared:
Training much further led to a degradation of image quality, which suggests we’ve reached the limits of the intentionally simple architecture of this experiment.
To get to the high quality of the generated faces at www.whichfaceisreal.com you need more sophisticated equipment, including expensive hardware. But this experiment was an excellent way of showing someone not familiar with GANs what they can do and how it is done.
If you want to have a go, this blog post describes how to build your own neural network step-by-step including the code used: Make Your Own Neural Network.
Thanks again to Tariq Rashid for organising this fascinating and insightful talk! Algorithmic Art is a Meetup organising talks in London and Cornwall open for anyone to join.
*PyTorch is a Python-based scientific computing package targeted at two sets of audiences:
- a replacement for NumPy to use the power of GPUs
- a deep learning research platform that provides maximum flexibility and speed