[Article] Why did DeepMind choose games like Go rather than some real-world problem?

The content below is extracted from my answer to a Quora question here.

Many games are reasonable approximations of reality, and they are reasonable to train on, because reality has far too many “pixels” (reality’s “resolution” is too high) for the current level of learning-aligned hardware and software.
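To put rough numbers on the “resolution” gap: the DQN Atari work (Mnih et al., 2015) preprocessed each game screen down to 84×84 grayscale and stacked 4 frames per state, while even a single full-HD colour camera frame (used here purely as a stand-in for “reality”) is far larger. The comparison below is a back-of-the-envelope sketch, not a claim about any particular real system:

```python
# Rough input-size comparison: why game inputs are "cheaper" than reality.
# DQN's Atari preprocessing: 84x84 grayscale, 4 stacked frames per state.
atari_state = 84 * 84 * 4       # 28,224 values per state
# One full-HD colour camera frame, as a stand-in for raw "reality":
hd_frame = 1920 * 1080 * 3      # 6,220,800 values per frame

print(atari_state)              # 28224
print(hd_frame // atari_state)  # 220 -> over 200x more input per frame
```

And a real robot sees many such frames per second, from several sensors, so the gap in practice is far wider than this single-frame ratio suggests.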

In simpler words, when a single Atari Q-model (the deep-learning basis of AlphaGo) is trained on multiple games, those games approximate reality by being varied and quite complex, yet cheap to train on compared with the dense input space that is reality. (Games are also safe scenarios in which machine-learning agents can be trained without being subject to the dangers of the real world; e.g. a self-driving truck can be trained in a virtual world, absent the possibility of real accidents.)
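The core learning rule behind that Atari Q-model is the Q-learning update; DQN replaces the table below with a neural network over pixel inputs, which is what lets one architecture handle many different games. Here is a minimal tabular sketch on a toy chain environment (the environment and all names are illustrative, not DeepMind’s actual code):

```python
import random

def train_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9, eps=0.1):
    # Toy chain: states 0..n_states-1, actions 0 (left) / 1 (right);
    # reaching the last state yields reward 1 and ends the episode.
    random.seed(0)  # reproducibility of the epsilon-greedy exploration
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # Epsilon-greedy action selection.
            if random.random() < eps:
                a = random.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2 = max(0, s - 1) if a == 0 else s + 1
            done = s2 == n_states - 1
            r = 1.0 if done else 0.0
            # The Q-learning update: move q[s][a] toward the bootstrapped target.
            target = r + (0.0 if done else gamma * max(q[s2]))
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q

q = train_chain()
# After training, "right" should beat "left" in every non-terminal state.
print(all(row[1] > row[0] for row in q[:-1]))  # True
```

In DQN the same update drives gradient steps on a network’s parameters instead of table entries, and that one network is retrained per game from raw frames, which is the sense in which one model “approaches multiple game scenarios”.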

The model is then a first indication of general artificial intelligence: one model can be applied to multiple game scenarios, which are very cheap yet still reasonably complex approximations of reality.

This model is quite general, and that is why Google bought a game-learning company for a reported 500 million pounds. As more scalable and powerful deep reinforcement learning models and hardware are built, a point will come when real-world “pixels”, or something very close, can be trained on efficiently (i.e. robots plus RL, reinforcement learning).

So, in the end, games are in fact real-world problems.


I am a casual bodybuilder and software engineer.