AlphaGo Zero: Revenge of the Algorithms?

Published in Software Is Eating the World, Oct 25, 2017

Steven Sinofsky, Sonal Chokshi, and I recorded an a16z podcast back in March 2016, when AlphaGo got its first burst of publicity after defeating Korean professional Go player Lee Sedol. It was a spectacular technical achievement, and we wondered whether it would set a new trajectory for AI (and specifically deep learning).

Fast forward to today, and the DeepMind team has a (you guessed it) better, faster, winning-er Go-playing system called AlphaGo Zero. Compared to previous systems (which DeepMind has now affectionately named AlphaGo Fan, AlphaGo Lee, and AlphaGo Master), AlphaGo Zero:

Trained an order of magnitude faster (3 days versus “several months”) to hit AlphaGo Lee’s skill level
Played using an order of magnitude fewer machines (1 machine with 4 TPUs versus a distributed cluster of servers running 48 TPUs)

Here’s a graph from DeepMind’s blog post about the system that shows how much power the systems draw (I think) while playing:

And crucially, AlphaGo Zero is the first of these systems to start with no human training data. Zero. See what they did with the name there? In other words, rather than get bootstrapped with strategies humans have learned over the years playing Go, AlphaGo Zero learned all its strategies from playing games against itself and learning what works.
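To make "playing games against itself and learning what works" a bit more concrete, here is a minimal sketch of self-play learning on a toy game. To be clear, this is not DeepMind's method: AlphaGo Zero pairs a deep neural network with tree search, while this sketch swaps in a plain lookup table with a simple value update. The game (a single pile of stones, take 1 to 3 per turn, whoever takes the last stone wins), the function names, and every parameter value are assumptions chosen purely for illustration.

```python
import random

MAX_TAKE = 3  # toy rule (assumption): remove 1-3 stones; taking the last stone wins


def legal_moves(pile):
    return range(1, min(MAX_TAKE, pile) + 1)


def best_move(q, pile):
    # Greedy choice for the player to move, read from the shared value table.
    return max(legal_moves(pile), key=lambda a: q.get((pile, a), 0.0))


def train_by_self_play(episodes=20000, max_pile=12, alpha=0.5, epsilon=0.3, seed=0):
    """Learn move values from self-play only; no human games are provided."""
    rng = random.Random(seed)
    q = {}  # (pile, move) -> estimated outcome for the player making the move
    for _ in range(episodes):
        pile = rng.randint(1, max_pile)
        while pile > 0:
            # Both "players" share one table: the system plays against itself.
            if rng.random() < epsilon:
                move = rng.choice(list(legal_moves(pile)))  # explore
            else:
                move = best_move(q, pile)  # exploit what has worked so far
            remaining = pile - move
            if remaining == 0:
                target = 1.0  # the mover took the last stone and wins
            else:
                # The opponent moves next, so their best outcome is our worst.
                target = -max(q.get((remaining, a), 0.0)
                              for a in legal_moves(remaining))
            old = q.get((pile, move), 0.0)
            q[(pile, move)] = old + alpha * (target - old)
            pile = remaining
    return q


if __name__ == "__main__":
    table = train_by_self_play()
    for pile in range(1, 13):
        print(pile, "stones -> take", best_move(table, pile))
```

The known winning strategy for this toy game is to leave your opponent a multiple of four stones. Nothing in the code says so; the table only ever sees the outcomes of its own games, which is the spirit (if nowhere near the scale) of what the "Zero" in the name refers to.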

This is another super impressive technical achievement, and we wanted to revisit some of the same questions in a follow-up podcast:

Where are we in AI?
Does this mean startups should stop trying to get labeled training data? Is this really a “revenge of the algorithms” moment?
Is this a major step towards Artificial General Intelligence, or just another impressive demonstration of narrow AI? (We’re mindful of the long history of AI forecasting, which typically gets way overheated after “solving” one of these games, as it did for chess, checkers, Othello, Texas Hold-’Em Poker, etc. In fact, I tweet-stormed about this when the AlphaGo Fan press hit in March 2016.)

A few other resources that you might find useful:

I created a 45-minute primer on the history of AI. It’s still one of our more popular videos.
Also, listen in on the podcast I did with Cameron Schuler about the Alberta Machine Intelligence Institute, which has built systems beating the best human players at checkers and Texas Hold-’Em Poker.

Enjoy the podcast:

a16z Podcast: Revenge of the Algorithms (Over Data)... Go! No?

There are many reasons why we're in an "A.I. spring" after multiple "A.I. winters" - but how then do we tease apart…

a16z.com

Written by Frank Chen