Why AlphaGo is a milestone but still has not achieved AGI

Congrats, Google Deepmind. You did it! The top-ranked Go player is no longer a human. Should we rejoice or fear this formidable feat? Is this milestone the start of the end game in the quest for Artificial General Intelligence (AGI)? My short answer is yes and no.

First, some context. Go is an ancient Chinese board game with very simple rules: black and white stones, placed in alternating moves by two opponents, fight to conquer the most territory on a 19x19 grid. Despite its simple rules, the game is of incredible complexity. Like chess, Go is a Markov game: the information needed for the next move is completely contained in the present configuration of stones on the board.

However, unlike chess, Go is a much more fluid game, where a single move can drastically change its course. In chess, after a number of moves the outcome is essentially decided (for a good enough player). In Go, the outcome between two top players is much harder to predict and often becomes clear only near the very end of the game, which may take 200 or more moves. Moreover, the number of possible continuations is beyond imagination: if we want to look just 50 moves ahead, we get about 3x10¹⁰⁰ possibilities, a number larger than the number of atoms in the observable universe.
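To get a feel for these numbers, here is a back-of-the-envelope calculation in Python. The branching factor of 100 legal moves per turn is an illustrative assumption, not a figure from AlphaGo's designers; Go's actual average is somewhat higher.

```python
# Back-of-the-envelope only: the branching factor is an assumption.
branching_factor = 100   # assumed average number of legal moves per turn
depth = 50               # how many moves ahead we want to look

positions = branching_factor ** depth            # 100**50 == 10**100
atoms_in_observable_universe = 10 ** 80          # commonly cited estimate

print(f"positions to examine: about 10^{len(str(positions)) - 1}")
print(f"that is 10^{len(str(positions // atoms_in_observable_universe)) - 1} "
      f"times the number of atoms in the observable universe")
```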

So, how did AlphaGo do it? How was it able to navigate this gigantic space of possibilities within seconds? Brute force would not work. The simple answer is that it used algorithms that come close to human-level intelligence. What are they? We don't know for sure, but they amount to some sort of high-level heuristics and shortcuts, like the ability to see that certain configurations are more defensible than others. These take humans years to master, but AlphaGo learned them from scratch, first from a large collection of games by strong human players and then by playing millions of games against itself. It learns from its mistakes and was able to devise what may be called strategies, pretty much the same way the human brain uses heuristics.
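As a purely hypothetical illustration of "learning from mistakes", the toy sketch below nudges the value of simple move patterns toward the outcomes of simulated games. It is a tabular update with invented names (play_game, pattern_value), not AlphaGo's actual training procedure.

```python
# Toy illustration: patterns that appear in winning games gain value.
import random

random.seed(1)
pattern_value = {"corner": 0.0, "edge": 0.0, "center": 0.0}

def play_game():
    """Stand-in for a self-play game: returns (patterns used, won?)."""
    used = random.sample(list(pattern_value), k=2)
    # By construction, games using the "corner" pattern win more often.
    won = ("corner" in used and random.random() < 0.8) or random.random() < 0.3
    return used, won

lr = 0.05
for _ in range(2000):
    patterns, won = play_game()
    reward = 1.0 if won else -1.0
    for p in patterns:
        # Move each used pattern's value toward the game outcome.
        pattern_value[p] += lr * (reward - pattern_value[p])

print(pattern_value)  # "corner" ends up with the highest value in this toy setup
```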

In this respect, this is a remarkable feat, as it relies on a completely different approach from the one Deep Blue used to beat Kasparov at chess. More technically, AlphaGo takes advantage of some of the most powerful machine learning techniques available: deep neural networks, in this case deep convolutional policy and value networks, trained first with supervised learning on expert games and then refined with reinforcement learning through self-play. On top of them sits Monte Carlo tree search, which uses the networks to focus the search on promising moves and to evaluate positions. But the core concept is the deep neural network.
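As an illustration only (not DeepMind's code), here is a minimal Python sketch of the general idea: a learned policy prior suggests which moves deserve attention, a value estimate scores positions, and a PUCT-style rule balances the two during the search. The functions policy_prior and value_estimate are hypothetical stand-ins that return random numbers in place of real network outputs.

```python
# Minimal sketch of policy/value-guided search; not a full tree search.
import math
import random

def policy_prior(state, moves):
    """Stand-in for a policy network: a prior probability per legal move."""
    weights = [random.random() for _ in moves]
    total = sum(weights)
    return {m: w / total for m, w in zip(moves, weights)}

def value_estimate(state):
    """Stand-in for a value network: expected outcome from this state."""
    return random.uniform(-1.0, 1.0)

def select_move(state, legal_moves, n_simulations=200, c_puct=1.0):
    """PUCT-style selection: exploit high value, explore high prior."""
    prior = policy_prior(state, legal_moves)
    visits = {m: 0 for m in legal_moves}
    value_sum = {m: 0.0 for m in legal_moves}

    for _ in range(n_simulations):
        total_visits = sum(visits.values()) + 1

        def score(m):
            # Mean value so far plus an exploration bonus weighted by the prior.
            q = value_sum[m] / visits[m] if visits[m] else 0.0
            u = c_puct * prior[m] * math.sqrt(total_visits) / (1 + visits[m])
            return q + u

        move = max(legal_moves, key=score)
        # A real search would descend the tree here; we just query the
        # stand-in value network for the resulting state.
        visits[move] += 1
        value_sum[move] += value_estimate((state, move))

    return max(legal_moves, key=lambda m: visits[m])

print(select_move(state="empty board", legal_moves=["A1", "B2", "C3"]))
```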

Neither the capability to generate heuristics nor neural networks are new concepts. Neural networks have been around for more than 50 years. For a long time they got little respect from the academic and AI communities, since how and why they work is rather obscure, somewhat like our brain. They only became popular, though for a short period, in the late 80s, when Hinton and others popularized an algorithm called back-propagation that made it possible to train networks with hidden layers (layers between the input and the output). Those architectures gained relevance because they can extract non-linear relationships between inputs and outputs (like the XOR problem) that earlier networks could not. They got some visibility in problems such as OCR (recognition of typeset or handwritten digits). However, after the 90s, enthusiasm declined in favor of more "elegant" and mathematically grounded machines such as SVMs (Support Vector Machines), especially kernel-based learning.
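To make the hidden-layer point concrete, here is a toy example, assuming a small 2-4-1 sigmoid network, a squared-error loss, and plain gradient descent; it learns XOR with back-propagation, which a single-layer perceptron cannot represent.

```python
# Toy back-propagation demo: a network with one hidden layer learns XOR.
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# XOR truth table: the output is 1 exactly when the inputs differ.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

H = 4  # hidden units (more than strictly necessary, for robust training)
w_h = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]  # input -> hidden
b_h = [random.uniform(-1, 1) for _ in range(H)]
w_o = [random.uniform(-1, 1) for _ in range(H)]                      # hidden -> output
b_o = random.uniform(-1, 1)

lr = 1.0
for _ in range(10000):
    for (x1, x2), target in data:
        # Forward pass
        h = [sigmoid(w[0] * x1 + w[1] * x2 + b) for w, b in zip(w_h, b_h)]
        out = sigmoid(sum(wo * hj for wo, hj in zip(w_o, h)) + b_o)

        # Backward pass: gradients of the squared error for each layer
        d_out = (out - target) * out * (1 - out)
        d_h = [d_out * w_o[j] * h[j] * (1 - h[j]) for j in range(H)]

        # Gradient-descent updates
        for j in range(H):
            w_o[j] -= lr * d_out * h[j]
            w_h[j][0] -= lr * d_h[j] * x1
            w_h[j][1] -= lr * d_h[j] * x2
            b_h[j] -= lr * d_h[j]
        b_o -= lr * d_out

for (x1, x2), target in data:
    h = [sigmoid(w[0] * x1 + w[1] * x2 + b) for w, b in zip(w_h, b_h)]
    out = sigmoid(sum(wo * hj for wo, hj in zip(w_o, h)) + b_o)
    print(f"{x1} XOR {x2} -> {out:.2f} (target {target})")
```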

(To be continued)