Will DeepMind use GANs to write the next Harry Potter?

Eric Martin
Jun 18, 2018 · 6 min read

First, what are some of the things that Google’s DeepMind has already done?

They’ve created an artificial intelligence (AI) that can beat Atari games, winning some of them more handily than human experts. But they’ve done much more than that since then. In this article, I will try to explain AI as I understand it, in layman’s, non-technical terms. The specifics may not be accurate, but I do think the generalities are. To the techies, please feel free to correct me if I’m wrong.

Some of DeepMind’s AI magic works using GANs, or generative adversarial networks, or something similar. This type of AI pits two neural nets against each other. They compete at the same task, and when one shows itself to be better than the other, the loser is forgotten and the winner keeps competing. A copy of the winner is then tweaked, automatically, and another competition ensues between the previous winner and its tweaked copy. Then another winner is declared and replicated, and the loser forgotten. In this way the neural network improves (and possibly grows or shrinks) so that eventually we have essentially the most effective network that the setup and the hardware can muster. This effective neural network can then be great at playing a particular game, or it can do other things that it might have been trained to do, all according to how its human creators designed it.
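
For readers who code, the winner-versus-tweaked-copy loop described above can be sketched in a few lines of Python. This is only a toy under loud assumptions: the “network” is a short list of numbers, the “game” is a made-up scoring function, and the names `play_score`, `tweak`, and `champion_challenger` are invented here for illustration. It is also closer to a simple champion-versus-challenger search than to a textbook GAN, which trains a generator network against a discriminator network, so treat it as a picture of the loop as described and not of DeepMind’s actual methods.

```python
import random

def play_score(params, target):
    """Toy 'game' (an assumption): higher is better, the best possible score is 0."""
    return -sum((p - t) ** 2 for p, t in zip(params, target))

def tweak(params, rng, scale=0.1):
    """Return a slightly mutated copy of the current winner."""
    return [p + rng.gauss(0.0, scale) for p in params]

def champion_challenger(target, rounds=2000, seed=0):
    rng = random.Random(seed)
    champion = [0.0] * len(target)  # untrained starting point
    for _ in range(rounds):
        challenger = tweak(champion, rng)
        # The better player survives; the loser is simply forgotten.
        if play_score(challenger, target) > play_score(champion, target):
            champion = challenger
    return champion

target = [1.0, -2.0, 0.5]  # hypothetical "perfect play" for the toy game
best = champion_challenger(target)
print(play_score([0.0, 0.0, 0.0], target), play_score(best, target))
```

The champion can never get worse from one round to the next, which is the whole appeal of the scheme, but it can also stall once no small tweak helps.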

Using GANs or similar methods, DeepMind has now:

  • Mastered Go

When DeepMind mastered Go, experts in the field had thought that feat might still be eight or ten years away. DeepMind’s AIs that have mastered these games are now better than the best humans (as far as I know this is still true), and its best Go and chess programs are probably better at playing these games than any other programs as well.

Now, DeepMind has taken one or three 2D images of a scene and accurately recreated the 3D world that those images were derived from, using an AI’s “imagination”. Google’s DeepMind did this with GQNs, or Generative Query Networks, which, given the name, presumably have similarities to GANs. In other words, DeepMind is pushing the limits of what AI can do, and now AI can visualize and understand the world in a way that modern science thought was limited to humans and perhaps some very smart animals.

Second, what are some of the things that DeepMind could do?

Perhaps an even more important question would be, “What can’t Google’s DeepMind division do?” What are the limits? I do not have an answer to that question, but it’s one I’m wrestling with. Maybe there are no limits for DeepMind with regard to “intelligence”.

DeepMind could hypothetically create a program to beat almost any board game and perhaps any card game... almost any game at all.

It seemingly can do almost anything with human input, meaning if it has humans to help it train its neural nets, it can improve the performance of those networks. An example of this is where humans label all pictures containing cats as such and those not containing cats as such, and from many of these examples the network can then start labeling cats without any more human input.
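
To make the cat-labeling idea concrete, here is a deliberately tiny Python sketch of learning from human labels. Everything in it is an assumption for illustration: real systems learn from raw pixels with deep neural networks, whereas this toy uses two made-up numeric features per picture and a nearest-centroid rule, and the names `train` and `predict` are invented here.

```python
def centroid(rows):
    """Average each feature across a group of examples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def train(examples):
    """examples: (features, label) pairs, where the labels come from humans."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict(model, features):
    """Label a new picture by the closest group average; no human needed."""
    def dist2(center):
        return sum((f - c) ** 2 for f, c in zip(features, center))
    return min(model, key=lambda label: dist2(model[label]))

# Hypothetical features per picture, e.g. [whisker-ness, ear-pointiness].
labelled = [
    ([0.9, 0.8], "cat"), ([0.8, 0.9], "cat"), ([1.0, 0.7], "cat"),
    ([0.1, 0.2], "not cat"), ([0.2, 0.1], "not cat"), ([0.0, 0.3], "not cat"),
]
model = train(labelled)
print(predict(model, [0.85, 0.75]))  # prints "cat"
```

Once `train` has seen the human-labeled examples, `predict` keeps labeling new pictures on its own, which is the point of the paragraph above.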

But what can DeepMind do without human input? With GANs, a neural network competes against itself, and in these cases there might be no human input beyond an initial set of constraints, rules, laws, and/or goals. That question is important because without human input an AI can scale well beyond the capabilities of a human, as seen with DeepMind’s AlphaGo AI... and it can get to human-level capabilities much faster than if human input is required, as seen with DeepMind’s AlphaZero.

At what point will a DeepMind created AI be able to understand human language? At that point it can do almost anything.

Will a new breakthrough in AI, beyond GANs, GQNs, and the like, enable stuff like natural language understanding and Artificial General Intelligence (AGI)? AGI is when AI is so smart it “thinks” in a way that is analogous to how humans think, being able to learn organically, such as through experience, trial and error, or observation.

Or can DeepMind get to natural language understanding and AGI without another breakthrough, doing it with techniques that are already in use?

I think GANs may be enough. Google can feed a dictionary (every word and its definition), then millions of books, then billions of internet articles, into one of its AIs. With that baseline, an AI can have a rudimentary “understanding” of language, where it can spit out words, phrases, and sentences that may or may not make much sense to a human.
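
A very rough picture of that “spit out words that may or may not make sense” baseline is a bigram model, which only remembers which word has followed which. The Python sketch below is a stand-in chosen here for illustration, not something the article or DeepMind specifies; the one-sentence corpus and the names `build_bigrams` and `babble` are made up.

```python
import random
from collections import defaultdict

def build_bigrams(text):
    """Record, for every word, the words that have followed it in the text."""
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return table

def babble(table, start, length, seed=0):
    """Generate text by repeatedly picking a word that has followed the last one."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = table.get(out[-1])
        if not followers:  # dead end: nothing ever followed this word
            break
        out.append(rng.choice(followers))
    return " ".join(out)

# Hypothetical stand-in for "millions of books".
corpus = "the boy found the wand and the wand chose the boy"
table = build_bigrams(corpus)
sentence = babble(table, "the", 8)
print(sentence)
```

Every pair of neighboring words in the output has appeared in the training text, yet the sentence as a whole need not mean anything, which is about the level of “understanding” described above.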

Then the GAN can get busy. There would be some human ranking involved in this hypothetical, but hopefully realistic, example. The end product would be a GAN that generates an AI that writes an amazing novel.

Here’s how it might be achievable:

A group of expert human rankers would pick one hundred novels ranked as classics (the best rank), one hundred ranked as mediocre (the middle rank), and one hundred ranked as bad (the worst rank). The GANs would start competing, creating novels that are unique, and those novels would then be compared to the classics, the mediocre novels, and the bad novels. But the novel-writing GAN wouldn’t be able to compare these novels directly, and if it could, we wouldn’t want it to, because we wouldn’t want it to be the judge of its own novels (that’s like grading your own test). We’d use another AI for that.

There would be a separate AI that uses a GAN to rank novels into one of three tiers: classic, mediocre, and bad. That GAN would need to be very accurate and granular in its ranking... perhaps it would train until it agreed 95% of the time with the tiers the human rankers put the 300 books into.
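
The 95% target is just an agreement rate between the judge and the human rankers over the 300 books, and that part is easy to sketch in Python. The judge itself is not modeled here: the twelve misfiled books are invented data for the example, and `agreement` is a name made up for this sketch.

```python
def agreement(judge_labels, human_labels):
    """Fraction of books where the judge picks the same group as the humans."""
    if len(judge_labels) != len(human_labels):
        raise ValueError("need one judge label per human label")
    matches = sum(j == h for j, h in zip(judge_labels, human_labels))
    return matches / len(human_labels)

# The human rankers' verdicts on the 300 books.
human = ["classic"] * 100 + ["mediocre"] * 100 + ["bad"] * 100

# Hypothetical judge output: identical to the humans except for 12 misfiled books.
judge = list(human)
for i in range(0, 120, 10):
    judge[i] = "mediocre" if human[i] != "mediocre" else "bad"

rate = agreement(judge, human)
good_enough = rate >= 0.95  # the stopping rule suggested above
print(rate, good_enough)    # prints 0.96 True
```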

The book ranking GAN would give out a score for each novel: bad novels would score from 0 to 50, mediocre novels would score from 50+ to 100, and classic novels would score from 100+ to 139, and there might be 7 outliers scoring in the 140’s.
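
Those score bands translate directly into a small lookup. In this Python sketch the boundaries come from the paragraph above, while the function name `tier_for_score` and the choice to reject negative scores are assumptions made here.

```python
def tier_for_score(score):
    """Map a judge score to the three ranks: 0-50 bad, 50+ to 100 mediocre, 100+ classic."""
    if score < 0:
        raise ValueError("scores start at 0")
    if score <= 50:
        return "bad"
    if score <= 100:
        return "mediocre"
    return "classic"  # includes the handful of outliers in the 140s and anything beyond

for s in (0.0001, 50, 50.5, 100, 139, 145):
    print(s, tier_for_score(s))
```

Keeping the top band open-ended matters for what follows, since the whole hope is that machine-written novels eventually score above the highest-ranked human classic.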

Using this ranking GAN as the judge for the initial GAN that is writing novels, we would be pitting two neural networks against each other. Each would come up with a unique novel and they would be judged by the static, pre-trained ranking GAN that gives the novel a numerical value ranking on the bad to classic scale. The network with the higher score wins, and that network is the new baseline. In time, novels should be able to rank higher and higher, eventually surpassing the highest ranked human classic.
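
Put together, the contest with a frozen judge might look like the Python sketch below. It is a toy under heavy assumptions: a “writer” is three numbers instead of a neural network, `write_novel` simply returns them as a stand-in “novel”, and `judge_score` is a made-up formula with a built-in ceiling of 150 where a trained ranking model would go. None of these names or numbers come from DeepMind.

```python
import math
import random

# Hypothetical: what the frozen judge happens to reward most.
HIDDEN_TASTE = [0.3, -0.7, 0.5]

def write_novel(writer):
    """Stand-in 'novel': in this toy it is just a copy of the writer's settings."""
    return list(writer)

def judge_score(novel):
    """Frozen judge stand-in: never retrained, scores fall between 0 and 150."""
    dist2 = sum((x - t) ** 2 for x, t in zip(novel, HIDDEN_TASTE))
    return 150.0 * math.exp(-dist2)

def tournament(rounds=2000, seed=1):
    rng = random.Random(seed)
    baseline = [3.0, 3.0, 3.0]  # an untrained writer, far from the judge's taste
    history = []
    for _ in range(rounds):
        rival = [w + rng.gauss(0.0, 0.1) for w in baseline]
        # Both writers submit a novel; the higher score becomes the new baseline.
        if judge_score(write_novel(rival)) > judge_score(write_novel(baseline)):
            baseline = rival
        history.append(judge_score(baseline))
    return baseline, history

winner, history = tournament()
print(history[0], history[-1])
```

The first scores are vanishingly small and then climb round by round, which mirrors the expectation that early novels score near zero. In this toy the ceiling is baked into the formula; how high a real judge’s scores could go is exactly the open question.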

At this point it will be interesting to see what the AI is writing at the various levels. I assume that human “bad” novels will constrain themselves to scores along the lines of 20 to 50, and the GAN’s first novel may score at a 0.0001. But in time it should improve. Seeing novels at various ranks, such as 10, 20, 30, 40, etc., and then into the outlier human-level ranks of 140, 145, and 149, will all be interesting.

It may be even more interesting to review novels in the 150’s, 160’s, or higher. I think there will be diminishing returns at some point, where the AI can achieve no higher rank without unrealistic additional computation. Perhaps the GAN can only score into the 160’s and not improve any more. Or perhaps anything above 165 will be incomprehensible, silly, or too abstract or inhuman. It may be a balance. Perhaps 155 will be the rank at which the AI produces its best work. 155 may produce 1,000 of the next Harry Potters, or perhaps the lack of a real human will mean that all of these AI creations ultimately fall flat. But I’d still like to see them try.

A GAN from DeepMind that creates a great book could perhaps help AI creators create an AI that can talk gracefully with humans.

If it works, this method of creating a GAN-based AI that can write an amazing book should be usable in other fields. It could create great music, great art, great movie scripts, and perhaps eventually great movies and great video games.

p.s. If you’re curious, I wrote an article on Moore’s Law, now with a crazy chart containing the new, gargantuan Cerebras processor. It blows Moore’s Law away. Enjoy!

