Mastering the game of Go with deep neural networks and tree search [Miniseries: 2016’s top 100 journal articles]

RealKM Magazine
Dec 27, 2016 · 2 min read

This is part 1 of a miniseries reviewing selected papers from 2016’s top 100 journal articles.

In March 2016, Google’s AlphaGo artificial intelligence (AI) program made headlines when it beat Lee Sedol, one of the world’s strongest Go players, four games to one.

The ancient Chinese game of Go looks straightforward enough: players take turns placing black or white stones on a board, trying to capture the opponent’s stones or surround empty space to make points of territory. However, as Google DeepMind CEO and co-founder Demis Hassabis notes, Go is actually profoundly complex, with more possible positions than there are atoms in the universe.
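To put that comparison in rough numbers, here is a back-of-the-envelope sketch in Python. It assumes a standard 19×19 board and treats every point as empty, black, or white, which counts illegal arrangements too and so gives only an upper bound. The figure of 10^80 atoms is a commonly quoted order-of-magnitude estimate, not something taken from the paper, and the helper name is purely illustrative.

```python
# Upper bound on Go positions: each of the 361 points on a 19x19 board
# is empty, black, or white. This overcounts, because some arrangements
# are illegal under the rules of Go.
BOARD_POINTS = 19 * 19
upper_bound_positions = 3 ** BOARD_POINTS

# Assumption: a commonly quoted order-of-magnitude estimate for the number
# of atoms in the observable universe.
ATOMS_IN_UNIVERSE = 10 ** 80


def order_of_magnitude(n: int) -> int:
    """Return the power of ten of a positive integer (digit count minus one)."""
    return len(str(n)) - 1


print("Upper bound on positions: about 10^%d" % order_of_magnitude(upper_bound_positions))
print("More positions than atoms:", upper_bound_positions > ATOMS_IN_UNIVERSE)
```

Even allowing for the overcount, the gap between the two numbers is so large that the comparison holds comfortably.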

That complexity makes Go an irresistible challenge for AI researchers, who have used games such as checkers and chess as a testing ground since the 1950s. According to Hassabis, cracking Go required a different approach:

Traditional AI methods — which construct a search tree over all possible positions — don’t have a chance in Go. So when we set out to crack Go, we took a different approach. We built a system, AlphaGo, that combines an advanced tree search with deep neural networks. These neural networks take a description of the Go board as an input and process it through 12 different network layers containing millions of neuron-like connections. One neural network, the “policy network,” selects the next move to play. The other neural network, the “value network,” predicts the winner of the game.

We trained the neural networks on 30 million moves from games played by human experts, until it could predict the human move 57 percent of the time (the previous record before AlphaGo was 44 percent). But our goal is to beat the best human players, not just mimic them. To do this, AlphaGo learned to discover new strategies for itself, by playing thousands of games between its neural networks, and adjusting the connections using a trial-and-error process known as reinforcement learning. Of course, all of this requires a huge amount of computing power, so we made extensive use of Google Cloud Platform.
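The interplay Hassabis describes, with a policy network suggesting moves and a value network judging the results inside a tree search, can be illustrated with a small Python sketch. This is not DeepMind's implementation. The move names, prior probabilities, and value are made up, the class and function names are hypothetical, and the scoring rule is a simplified PUCT-style formula of the general kind used in this family of search methods. It shows only the idea: the search starts with the moves the policy favours, then shifts attention when the value estimates come back poor.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class MoveStats:
    prior: float            # stand-in for the policy network's probability for this move
    visits: int = 0         # how many times the search has tried this move
    value_sum: float = 0.0  # running total of value estimates seen after this move

    def mean_value(self) -> float:
        """Average value estimate so far (0.0 if the move is untried)."""
        return self.value_sum / self.visits if self.visits else 0.0


def select_move(stats: Dict[str, MoveStats], c_puct: float = 1.0) -> str:
    """Pick the move with the best mix of observed value and policy-guided exploration."""
    total_visits = sum(s.visits for s in stats.values())

    def score(s: MoveStats) -> float:
        # Assumption: the +1 under the square root keeps the prior in play
        # before any move has been visited.
        exploration = c_puct * s.prior * math.sqrt(total_visits + 1) / (1 + s.visits)
        return s.mean_value() + exploration

    return max(stats, key=lambda move: score(stats[move]))


def record_result(stats: Dict[str, MoveStats], move: str, value: float) -> None:
    """Feed a value estimate (stand-in for the value network) back into the statistics."""
    stats[move].visits += 1
    stats[move].value_sum += value


# Hypothetical candidate moves with made-up policy priors.
candidates = {"A": MoveStats(prior=0.6), "B": MoveStats(prior=0.3), "C": MoveStats(prior=0.1)}

first = select_move(candidates)         # the policy's favourite is tried first
record_result(candidates, first, -0.5)  # pretend the value estimate was poor
second = select_move(candidates)        # the search now looks elsewhere
print(first, second)
```

In the real system the priors and values come from trained networks and the search runs over many positions, but the same balance between following the policy and trusting the observed outcomes applies.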

Further details can be found in Google’s paper [1], which is article #9 in 2016’s top 100 journal articles.

References:

  1. Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., … & Dieleman, S. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489.

Originally published at realkm.com.

RealKM Magazine

Syndicating the articles published on realkm.com

RealKM brings you the findings of high-value knowledge management (KM) research in concise, practically-oriented articles.
