Learning on its own

Matt Gross
Published in Blackbelt.ai
Dec 21, 2017 · 3 min read

If you’re tracking the rise of artificial intelligence, you should mark December 7th, 2017 on your calendars. It’s the day when machines started learning on their own.

Of course, machine learning and artificial intelligence have been steadily advancing for years now, in impressive and increasingly useful ways. On most tasks, however, artificial intelligence has only been approaching human intelligence. In general, the only algorithms that have bested human analytical methods have had heavy human assistance, with humans manually specifying which indicators are important or how different kinds of data should be evaluated.

That’s certainly been true of teaching computers to play chess. Computers have been better at chess than people for a while now, ever since Deep Blue beat Garry Kasparov in 1997. It’s interesting to note what algorithms those computers use, however. The strongest chess programs have been built on the chess knowledge humans amassed over hundreds of years, combined with the brute force of a machine that can examine millions of positions per second. Humans were still better at recognizing chess patterns, but at some point, encoding basic human patterns and combining them with a machine’s attention to detail became better than a human alone. Computers have only gotten stronger since 1997, of course. Today, the free chess program on my iPhone could easily defeat the world chess champion.

It’s important to note, however, that those chess programs all share the same basic structure as the one that proved its dominance in 1997. That is, humans design the methods for evaluating positions, and for selecting which positions to evaluate, and the computer only brings its superior speed to bear in running through those methods.
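That human-plus-brute-force recipe is easy to sketch. The following Python is purely illustrative, not any real engine's code: the human supplies the evaluation (here, just the classical piece values), and the machine supplies the exhaustive search. The helper names and the tiny test game are my own inventions.

```python
# Human knowledge: the classical piece values every chess book teaches.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

def material(pieces):
    """Hand-written evaluation: sum piece values.
    pieces is a list like ["P", "N", "q"]; uppercase = ours, lowercase = theirs."""
    score = 0
    for p in pieces:
        value = PIECE_VALUES.get(p.upper(), 0)
        score += value if p.isupper() else -value
    return score

def negamax(state, depth, side, children, evaluate):
    """Machine brute force: plain fixed-depth negamax search.
    The computer enumerates positions; the human-written evaluate()
    does the judging at the leaves. side is +1 or -1."""
    kids = children(state, side)
    if depth == 0 or not kids:
        return side * evaluate(state)
    return max(-negamax(kid, depth - 1, -side, children, evaluate) for kid in kids)
```

The division of labor is the point: every line of chess understanding lives in `material` (or its far more elaborate real-world cousins), while `negamax` knows nothing about chess at all.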

On December 7th, Google DeepMind achieved what is, then, a truly remarkable result. They designed a chess-playing algorithm, AlphaZero, running on a gigantic supercomputer, that convincingly defeated the strongest existing chess program, Stockfish, without any human help at all. Basically, the computer was given only the rules of chess; the algorithm then figured out how to evaluate positions on its own, and got good enough to beat not only humans but the strongest computers that humans had taught to play chess. Oh, and it learned this in only four hours. As if that weren’t enough, it also learned Shogi and Go from scratch, and defeated the strongest programs for each of those games, all within a 24-hour period, using the same generalized learning algorithm for every game.
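To make the self-play idea concrete, here is a toy sketch, emphatically not DeepMind's method, just the general shape of it: given only the rules of a game (tic-tac-toe here, to keep it tiny, with a simple lookup table standing in for AlphaZero's neural network), play against yourself over and over and nudge a value estimate for each position toward the outcomes you observe. All names and parameters are illustrative.

```python
import random

# The rules, and nothing but the rules.
LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def moves(board):
    return [i for i, s in enumerate(board) if s == "."]

def place(board, i, p):
    return board[:i] + p + board[i + 1:]

# board-after-a-move -> estimated outcome for the player who just moved
values = {}
ALPHA, EPS = 0.2, 0.1  # learning rate and exploration rate (arbitrary choices)

def play_one_game():
    board, player, history = "." * 9, "X", []
    while winner(board) is None and moves(board):
        legal = moves(board)
        if random.random() < EPS:
            move = random.choice(legal)  # occasionally explore a random move
        else:
            # otherwise pick the move whose resulting position we currently rate best
            move = max(legal, key=lambda i: values.get(place(board, i, player), 0.0))
        board = place(board, move, player)
        history.append((board, player))
        player = "O" if player == "X" else "X"
    w = winner(board)
    # Learn: nudge every visited position's value toward the final outcome.
    for pos, p in history:
        target = 0.0 if w is None else (1.0 if w == p else -1.0)
        v = values.get(pos, 0.0)
        values[pos] = v + ALPHA * (target - v)

random.seed(0)
for _ in range(5000):
    play_one_game()
```

Nobody tells this program which squares matter or what a fork is; after a few thousand self-play games, positions that reliably lead to wins carry high values and losing ones carry low values. AlphaZero's version replaces the lookup table with a deep network and the greedy move choice with a guided tree search, but the loop of play yourself, observe, update is the same idea.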

Google published this result, along with ten games showing AlphaZero beating Stockfish, and the games were fascinating. AlphaZero often violated principles that humans use to play chess well, but over the course of each game demonstrated that those principles were wrong in the given position. As someone who has watched a great many chess games, I found it riveting. I’ve played against Stockfish often, and Stockfish plays much like a human, albeit one that notices essentially everything about a position. In the games Google published, AlphaZero played almost like an alien intelligence. It had figured out its own patterns and rules to optimize its play, and they were obviously superior to what humans had come up with. It wasn’t using brute-force calculation, but rather pattern recognition. In fact, AlphaZero actually evaluated far fewer positions per second than Stockfish does, about 80,000 per second compared to Stockfish’s roughly 70,000,000, but evaluated them more thoroughly.
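A quick back-of-envelope check on that gap, using the figures quoted above:

```python
stockfish_nps = 70_000_000  # positions per second, as quoted above
alphazero_nps = 80_000      # positions per second, as quoted above
ratio = stockfish_nps / alphazero_nps
print(f"Stockfish searched roughly {ratio:.0f}x as many positions per second")  # ~875x
```

Winning while searching nearly 900 times fewer positions is exactly why the result reads as judgment rather than brute force.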

The implications of this result are fairly staggering. People taught the AlphaZero algorithm nothing about chess but the rules, and it figured out more about the principles of chess in four hours than we had in 500 years. Then it figured out Go and Shogi, just as a chaser. Games like these are closed systems, easier to master than many of the systems humans deal with in practice, so this result doesn’t immediately generalize to more significant human problems, but the indications are clear. A self-learning automated system has, for the first time to my knowledge, clearly bettered the best automated system built on human knowledge. In a sense, the algorithms don’t need our help anymore.
