How AI got its imagination

Dave Tai
Jul 3, 2018

Picture a bird. Completely red with black wings and a pointy beak. What do you see?

The image above is how an AI pictured it, generated from the very same text you just read, almost as if it were capable of imagination.

Imagination in AI: An Unimaginable Task

Imagination seems intuitively simple to us. A pink elephant dancing on a boat: close your eyes and there it is in your head. But training an AI to think like us can be frustratingly time-consuming.

The usual method of training a neural network (a computer system loosely modelled on our nervous system) is shockingly labour-intensive. Take, for instance, when Microsoft wanted to train a neural network to recognize 91 kinds of object that a four-year-old could easily identify. They had to build a database of 328,000 images, each one painstakingly gathered and annotated by humans, resulting in a grand total of 2.5 million labels. The human effort needed just to make that happen: roughly 70,000 hours.

70,000 hours just for a neural network to recognize simple objects. What if, as in the red bird example, you wanted it not only to recognize but also to create?

Oobah vs Paris Fashion Week on Repeat

Enter the Generative Adversarial Network (GAN), a new way of training neural networks introduced by Ian Goodfellow in 2014. Instead of relying on an army of humans to build an extensive database, a GAN pits two neural networks against each other.

To understand how a GAN works, let's take a look at Paris Fashion Week 2018. Oobah Butler, a writer based in the UK, faked his way into closed exhibitions and after-parties. He mingled with the top names in the industry, got influencers to try a brand he had picked up at a street market, and even had them endorse it.

Oobah 1 : Fashion Week 0

This could be seen as a game of Oobah vs Fashion Week. Oobah’s goal is to dress, talk and walk like a fashion designer even though he isn’t one, while Fashion Week’s goal is to ensure that only actual fashion designers are allowed in.

Imagine if Oobah and Fashion Week went up against each other a million more times. Each time Oobah infiltrates, Fashion Week tightens its security, and each time Oobah fails to pass himself off as a fashion designer, he learns how to emulate one better. We would then end up with a very sophisticated Oobah who can fool most people, and a very strict Fashion Week that knows exactly what to look out for.

A GAN pits two neural networks against each other in the same kind of game: one plays the Generator (Oobah) and the other plays the Discriminator (Fashion Week). Letting the networks train each other requires a much smaller database and far less human supervision, and it opened the door to things that were not previously possible.
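To make the game concrete, here is a minimal sketch of a GAN training loop in PyTorch. It is not the setup behind any of the systems mentioned in this article; the tiny networks and the toy one-dimensional "real" data (samples from a Gaussian) are illustrative assumptions. What matters is the two alternating steps, which are the essence of the Generator-vs-Discriminator game.

```python
# Minimal GAN training loop sketch (illustrative assumptions: toy 1-D Gaussian
# "real" data, tiny fully connected networks, arbitrary hyperparameters).
import torch
import torch.nn as nn

# Generator (the "Oobah"): turns random noise into a fake sample.
generator = nn.Sequential(
    nn.Linear(8, 32), nn.ReLU(),
    nn.Linear(32, 1),
)

# Discriminator (the "Fashion Week"): scores how real a sample looks (0 to 1).
discriminator = nn.Sequential(
    nn.Linear(1, 32), nn.ReLU(),
    nn.Linear(32, 1), nn.Sigmoid(),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

for step in range(5000):
    # "Real" data: samples from a Gaussian with mean 4 and std 1.25.
    real = torch.randn(64, 1) * 1.25 + 4.0
    fake = generator(torch.randn(64, 8))

    # 1) Train the Discriminator: label real samples 1 and fake samples 0.
    opt_d.zero_grad()
    d_loss = loss_fn(discriminator(real), torch.ones(64, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(64, 1))
    d_loss.backward()
    opt_d.step()

    # 2) Train the Generator: try to make the Discriminator output "real" (1).
    opt_g.zero_grad()
    g_loss = loss_fn(discriminator(fake), torch.ones(64, 1))
    g_loss.backward()
    opt_g.step()
```

After enough rounds of this back-and-forth, the Generator's samples cluster around the real data distribution, like a well-practised Oobah, while the Discriminator has had to become correspondingly harder to fool.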

Pictured: Generator partying with Discriminators

A Post-GAN World

Just as GANs have created red birds from a line of text, they have many other applications in AI wherever a spark of imagination is required.

We now have AI capable of turning a blurry image into a high-resolution one by making smart guesses about what the missing pixels should be.

AI that even created entire galleries of fake celebrities.

And recently at Facebook, AI that can 'open' your eyes in photos where they are closed.

This is only the beginning for GANs. With new ways of applying adversarial networks still being developed, we can expect the coming decade to be an interesting one for AI.

It is no wonder that Facebook's AI research director, Yann LeCun, called GANs “the most interesting idea in the last 10 years of Machine Learning.”