How Machine Learning Beat the World’s Best Go Player

Lukas Biewald
5 min read · Mar 14, 2016

--

Go is the oldest board game that people still play today. Interestingly, it's also the last game where the best human players could beat the best computers. That changed last week, when DeepMind's AlphaGo beat Lee Sedol, the best Go player in the world.

I watched Deep Blue beat the world chess champion around the time I really started to get interested in Go. And while Deep Blue could already beat Garry Kasparov, Go programs at the time played worse than an above-average 7-year-old, and it didn't look like there was any easy way to improve them. Wondering how a computer could learn to play Go was what pushed me to study artificial intelligence.

Go almost feels like it was designed for humans to beat algorithms. Players take turns placing black and white stones on a grid. It’s an incredibly intuitive game that small children can learn to play decently well before they can count the score.

Every other board game was "solved" by computers with a variation on the same algorithm developed in the fifties: minimax game-tree search. The computer looks at every move it can make, then every possible counter its opponent can make, then every counter to that, and so on. It then estimates a score for each resulting position and chooses the move that gives it the best outcome, assuming its opponent plays the best possible moves. The number of positions it needs to examine grows exponentially with search depth, but as computers became exponentially faster, they eventually beat humans by brute force.
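To make that concrete, here's a minimal sketch of minimax search in Python. The `game` interface (`legal_moves`, `apply`, `score`, `is_over`) is hypothetical, invented for illustration rather than taken from any real library:

```python
# Minimax game-tree search: recursively explore moves and counter-moves,
# assuming both sides play optimally. `game` is a hypothetical interface.
def minimax(game, depth, maximizing):
    if depth == 0 or game.is_over():
        return game.score()  # heuristic estimate of who is ahead
    children = (game.apply(m) for m in game.legal_moves())
    if maximizing:
        return max(minimax(child, depth - 1, False) for child in children)
    return min(minimax(child, depth - 1, True) for child in children)

def best_move(game, depth):
    # Choose the move whose subtree guarantees the best outcome against
    # an opponent who also plays the best possible moves.
    return max(game.legal_moves(),
               key=lambda m: minimax(game.apply(m), depth - 1, False))
```

The exponential blow-up is visible right in the structure: every extra level of depth multiplies the work by the number of legal moves.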

By the time I started working in the Stanford AI Lab, no one thought that games were that interesting anymore. Who would have guessed that recognizing a face in a photograph would turn out to be a much deeper, harder problem than playing grandmaster-level chess?

But playing Go is much closer to recognizing a face in a photograph than anyone realized.

There are two things that made Go hard for computers. The first is that the number of possible moves is much higher than in most games (a typical Go position has around 250 legal moves, versus roughly 35 in chess), so it's not practical to consider every one. The second is that it's really hard to write a simple algorithm that tells you who is winning. In Poker, the person with more chips is winning by definition. In Backgammon, whoever has more pieces close to the end is probably ahead. In Chess or Shogi or Checkers, if you look at who has more pieces, you can pretty much guess who has the advantage.
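For those simpler games, the evaluation really can be a few lines of code. Here's a minimal sketch of a material-count evaluation for chess, assuming a hypothetical `board` represented as a mapping from squares to piece letters (uppercase for White, lowercase for Black):

```python
# Simple chess evaluation by material count: the kind of "who has more
# pieces" heuristic described above. The board representation here is a
# hypothetical dict of square -> piece letter, not any real library's API.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}

def material_score(board):
    """Positive means White is ahead; negative means Black is ahead."""
    score = 0
    for piece in board.values():
        value = PIECE_VALUES[piece.lower()]
        score += value if piece.isupper() else -value
    return score

# Example: White has an extra queen, so the score is +9.
print(material_score({"e1": "K", "d1": "Q", "e8": "k"}))  # 9
```

No equally simple count exists for Go, because a stone's value depends on the territory it might eventually help surround.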

Go is all about surrounding territory. But expert-level Go players never completely surround territory until the very end. They sketch out their territory, and much of it stays ambiguous. There is a concept of "aji" (the literal English translation is "flavor") that comes up in the analysis of any Go game. It means how much looseness is left in a position. So you might say Black should consider playing in White's territory in the lower left corner because there's still some aji, some flavor, there. Good Go players are fantastic at intuiting the level of aji in a position and can tell at a glance who is ahead. I'm convinced that this analysis uses the same part of my brain that recognizes a friend in a dark bar from a dim outline of part of their face.

So, if playing a game of Go at the highest level isn't simply a matter of "who has the most pieces," and if the number of possible games is as massive as it actually is, how did Google's DeepMind beat the world's best player? It turns out to be a lot like how aspiring Go experts train themselves.

Traditionally, aspiring Go experts learn to play by studying thousands of professional games and intuiting the subtleties and strategies that led to victory. In the same way that Google's search algorithm is trained on millions of hand-labeled search results, or the way Facebook's face-detection algorithm is trained on millions of hand-tagged photos, DeepMind trained its algorithm on hundreds of thousands of expert-level Go games.
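In machine-learning terms, this is supervised learning: show a network a board position, ask it to predict the move the professional actually played, and nudge its weights toward that answer. Here's a minimal sketch of the idea in PyTorch; the tiny network, the 19x19 board encoding, and the training helper are illustrative assumptions, not DeepMind's actual architecture:

```python
# A toy "policy network": given a board position, predict which of the
# 361 points an expert would play next. Illustrative only; AlphaGo's real
# network was far larger and used much richer input features.
import torch
import torch.nn as nn

policy_net = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(32 * 19 * 19, 19 * 19),  # one logit per board point
)

def train_step(boards, expert_moves, optimizer):
    # boards: float tensor (batch, 1, 19, 19); expert_moves: indices 0..360
    logits = policy_net(boards)
    loss = nn.functional.cross_entropy(logits, expert_moves)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```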

In the same way that an expert human player only considers a few moves, based on their experience of playing and studying thousands of games, DeepMind uses all of the games it has watched to build a human-like intuition that narrows its search. This is markedly different from the algorithms that "solved" chess. DeepMind isn't looking at every possible move, every possible counter, every counter to that, and so on. It's using machine learning to pick the promising moves, based on its experience of watching professional games and playing against itself. In other words, DeepMind plays like a human.
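Concretely, a learned policy can prune the game tree before the search even starts. Here's a minimal sketch that reuses the hypothetical `policy_net` from the previous example, plus an assumed `encode` helper that turns a board into a tensor:

```python
# Use the learned policy to narrow the search: instead of branching on
# all ~250 legal moves, keep only the few the network finds promising.
# `policy_net` and `encode` are the hypothetical pieces sketched above.
import torch

def promising_moves(board, legal_moves, k=5):
    with torch.no_grad():
        logits = policy_net(encode(board))        # shape (1, 361)
        probs = torch.softmax(logits, dim=-1)[0]  # probability per point
    # Rank legal moves (point indices 0..360) by the network's confidence
    # and keep the top k, so the tree branches ~5 ways instead of ~250.
    ranked = sorted(legal_moves, key=lambda m: probs[m].item(), reverse=True)
    return ranked[:k]
```

Plugged into a search (AlphaGo itself used Monte Carlo tree search rather than plain minimax), this is what turns an intractable exponential blow-up into something a computer can handle.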

What's interesting to a lot of us in the AI field is that DeepMind learned to play Go the same way most commercial machine learning products gain intelligence: by looking at massive training datasets. In other words, it's not a handcrafted system like a chess algorithm; it uses a more general machine learning approach. These days I run a startup, CrowdFlower, that helps companies train artificial intelligence for commercial purposes, so I get an inside view into how businesses use machine learning for real-world applications. And the way they all do it is by looking at giant training sets, finding patterns, and making decisions based on the patterns they see. That's how DeepMind beat Lee Sedol.

People working on artificial intelligence talk about the "AI effect": once a computer can do something, it's no longer considered real "intelligence." I think in some cases this is true. The way Deep Blue approached chess was quite different from how a human would, and you could see it in its playing style. But DeepMind's approach to Go is much more similar to the way our brains work, and consequently it plays much more like a human. From my vantage point, the strategies it chooses are indistinguishable from those of an extremely strong human Go player.

We shouldn't focus on what counts as "intelligence" when we talk about AI. Instead, we should realize that the closer AI comes to mirroring how people learn, the better it will get at applications older algorithms simply can't handle. That's why machine learning has made such strides in image recognition, self-driving cars, and disease diagnosis. AI no longer needs a person to set up narrow rules and handcrafted algorithms. And that's a truly exciting thing.

--

Lukas Biewald

I'm the founder of Weights & Biases. Previously founder of Figure Eight (formerly CrowdFlower).