2017: When AI came of age at game playing
I’m sure that in years to come, 2017 will be seen as the time when artificial intelligence came of age in terms of game playing. At least, in terms of board games.
In fact, it was back in March 2016 that DeepMind’s AlphaGo system won a Go match against a top professional player, Lee Sedol. That was followed up with a 3–0 victory in a match against the world №1 ranked player, Ke Jie, in May 2017.
You could be forgiven for thinking that’s the end of the road and there’s little more to do. However, DeepMind stepped this up another level and created AlphaGo Zero. The difference here is that AlphaGo used machine learning techniques to derive a game playing strategy from examples of human play, whilst AlphaGo Zero did not. Instead, AlphaGo Zero was given the rules of play and then left to play itself, learning from its own successes and mistakes.
The result: in October 2017, after 40 days of training, AlphaGo Zero beat all of the previous iterations of AlphaGo.
That’s with no prior knowledge. It’s also worth noting that Zero beat the March 2016 version after just 3 days of training — a sobering thought, when you consider that the March 2016 version had the benefit of starting with the accumulated knowledge of thousands of years of human play.
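To make the "just the rules, then self-play" idea concrete, here is a minimal sketch on a toy game. It is emphatically not DeepMind's method: AlphaGo Zero combines a deep neural network with tree search, whereas this sketch uses a simple lookup table and a much smaller game that I have picked purely for illustration (a single pile of stones, each player removes 1 to 3, and whoever takes the last stone wins). The function names and the parameter values are my own assumptions.

```python
import random
from collections import defaultdict

MOVES = (1, 2, 3)  # the only "knowledge" supplied: the rules of the toy game


def legal_moves(stones):
    """Moves allowed by the rules when `stones` remain in the pile."""
    return [m for m in MOVES if m <= stones]


def train(start=10, episodes=20000, epsilon=0.2, alpha=0.05, seed=0):
    """Learn move values purely from games the program plays against itself."""
    rng = random.Random(seed)
    # q[(stones, move)] estimates the final outcome (+1 win, -1 loss)
    # for the player who makes `move` when `stones` remain.
    q = defaultdict(float)
    for _ in range(episodes):
        stones, history = start, []
        while stones > 0:
            moves = legal_moves(stones)
            if rng.random() < epsilon:
                move = rng.choice(moves)  # occasionally try something new
            else:
                move = max(moves, key=lambda m: q[(stones, m)])  # best known move
            history.append((stones, move))
            stones -= move
        # The player who made the last move took the last stone and won.
        # Walk backwards through the game, flipping the sign each ply
        # because the two sides alternate.
        outcome = 1.0
        for state_action in reversed(history):
            q[state_action] += alpha * (outcome - q[state_action])
            outcome = -outcome
    return q


def best_move(q, stones):
    """Greedy move according to the learned table."""
    return max(legal_moves(stones), key=lambda m: q[(stones, m)])


if __name__ == "__main__":
    table = train()
    for pile in (10, 7, 6, 5):
        print(pile, "stones ->", "take", best_move(table, pile))
```

No examples of human play go in at any point. With these settings the learned policy should settle on leaving the opponent a multiple of four stones, which is the known winning strategy for this little game.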
For me, this is hugely impressive. The game of Go is massively complex and has the interesting property that human masters play moves based on intuition rather than calculation. The neural networks within AlphaGo and its successors obviously employ considerable computing power, but are in some sense able to replicate the intuitive play of humans.
Although I’m not a Go player, I did watch some of the coverage of the Lee games. It was very interesting to see the commentators trying to make sense of key moves made by the computer. These moves were unexpected and creative, leaving the humans scrambling to understand the new ideas.
This mirrors chess, where computers have been better than humans for a couple of decades, ever since Deep Blue beat World Champion Garry Kasparov in 1997. Modern chess grandmasters now routinely use chess “engines” for analysis and to improve their own play. In fact, to some extent this has changed the face of chess competitions, where grandmasters are far more likely to play “blitz” matches (matches with short time limits, such as 5 minutes per player). After all, if the machines are better at the game in absolute terms, there is less attraction for humans to play long (4+ hours) matches to push the boundaries of a skill in which machines have already surpassed them.
Chess brings me to the final triumph of AI in 2017.
Although the AlphaGo matches were a fantastic demonstration of AI game playing, one caveat was that the AI was narrow, in the sense that it was tasked with just one job: playing Go. In fact, at Udacity Intersect 2017, in March 2017, Sebastian Thrun was asked what he thought of AlphaGo and what the future of AI would bring. Sebastian, as the founder of Google X, is as good as anyone out there at envisioning the future, so I was interested to hear his view that AlphaGo was impressive, but narrow and “well, it can’t play chess”.
But in December 2017, DeepMind announced that a new iteration of their AI system, now dubbed AlphaZero, had beaten the current state-of-the-art chess playing program, Stockfish. Again, the approach was to start from just the rules and learn by self-play. The really scary part: this took just 4 hours of training. And note that Stockfish is better than the current human world champion by some margin. As one chess grandmaster put it, “I always wondered how it would be if a superior species landed on earth and showed us how they played chess, now I know”. Again, there was a scramble for humans to figure out new moves and ideas from the AlphaZero vs Stockfish games.
AlphaZero can also play shogi, as well as both chess and Go, which is a major step forward for this kind of generic reinforcement learning algorithm.
OK, so there are some criticisms as to whether the AlphaZero matches against leading chess and shogi programs were conducted in a fair way. For example, in the chess match, AlphaZero and Stockfish were restricted to one minute per move and Stockfish was running on lesser computing hardware than AlphaZero. But that doesn’t diminish the achievement of having one algorithm which can learn this wide range of game-playing tasks, something which would have been unimaginable … well, less than 12 months ago.
So 2017 was the year in which an AI algorithm was able to surpass humans in multiple disciplines and without even needing human inputs to train the system — just the rules and the space to self-play. I think that in years to come, this will be seen as a critical advancement.