Why not AlphaVehicle Zero?

Apr 6, 2019

As in, why not train autonomous vehicle AIs without human data, as DeepMind did with AlphaGo Zero? (AlphaStar also leaned heavily on self-play, although it was first bootstrapped from human replays.) With a sufficiently advanced simulation, why not let the AI gain a hundred lifetimes’ worth of driving experience from nothing but the rules and self-play?

I would imagine such an AI would be much better equipped to handle extreme situations. With a well-defined reward system, it could figure out how to accelerate and decelerate comfortably for the passengers, but it could also learn to “understand” the physics of crumple zones and greatly increase the odds of survival in a collision. Say there is no way to avoid a head-on collision with an oncoming vehicle: it could adjust the angle of impact, maybe even spin the car around, to reduce the risk of serious injuries.
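To make “a well-defined reward system” a little more concrete, here is a minimal sketch of what a per-step reward might look like. Everything in it is my own assumption for illustration: the StepInfo fields, the weights, and the idea of penalizing impact energy by the square of the closing speed are hypothetical, not taken from any real simulator or from DeepMind’s work.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Hypothetical per-step signals a driving simulator might expose."""
    progress_m: float         # distance travelled along the route this step
    jerk_mps3: float          # rate of change of acceleration (passenger comfort)
    broke_traffic_law: bool   # e.g. crossed a solid line
    collided: bool
    impact_speed_mps: float   # closing speed at impact, 0.0 if no collision


def reward(step: StepInfo) -> float:
    """Illustrative reward: all weights are made-up numbers, not tuned values."""
    r = 0.1 * step.progress_m          # reward making progress
    r -= 0.05 * abs(step.jerk_mps3)    # penalize uncomfortable driving
    if step.broke_traffic_law:
        r -= 1.0                       # small penalty: laws matter...
    if step.collided:
        # ...but a crash costs far more, and grows with impact energy (~ v^2)
        r -= 10.0 + 0.5 * step.impact_speed_mps ** 2
    return r


# Breaking a traffic law to avoid a crash scores better than a lawful crash.
swerve = StepInfo(1.0, 0.0, True, False, 0.0)
crash = StepInfo(1.0, 0.0, False, True, 15.0)
print(reward(swerve) > reward(crash))
```

Because the collision penalty scales with impact speed, an unavoidable crash at a lower closing speed still scores better than one at a higher speed, which is the kind of gradient that could push an agent toward softening an impact rather than giving up.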

It could be subjected to the same kind of domain randomization as OpenAI’s robot hand, and learn to deal with flat tires, damaged brakes, missing sensor data, glitches, changes to vehicle weight due to passengers and cargo, etc. Again, with a properly adjusted reward system, such an AI would know when to break the traffic laws in an effort to save lives.
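Domain randomization here just means giving every simulated episode a differently configured car. A rough sketch of the idea, assuming a hypothetical EpisodeConfig that a simulator would accept; the field names, ranges, and probabilities are invented for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class EpisodeConfig:
    """Hypothetical per-episode vehicle settings; ranges below are guesses."""
    payload_kg: float            # passengers and cargo
    brake_efficiency: float      # 1.0 = healthy brakes, lower = damaged
    flat_tire: bool
    sensor_dropout_prob: float   # chance that a sensor reading is missing


def sample_episode_config(rng: random.Random) -> EpisodeConfig:
    """Draw a fresh, randomized configuration for one training episode."""
    return EpisodeConfig(
        payload_kg=rng.uniform(0.0, 600.0),
        brake_efficiency=rng.uniform(0.3, 1.0),
        flat_tire=rng.random() < 0.05,
        sensor_dropout_prob=rng.uniform(0.0, 0.2),
    )


rng = random.Random(0)  # seeded so runs are reproducible
for _ in range(3):
    print(sample_episode_config(rng))
```

The policy never sees the same car twice, so it cannot overfit to one set of vehicle dynamics and has to learn behavior that holds up across the whole range.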

I imagine that in most cases the data from human drivers is sufficient, but it is precisely the remaining cases, the ones where we humans fail, that an AI should be better equipped to handle, and I’m not convinced learning from human behavior will get us there.

Why not AlphaVehicle Zero?

Written by Balázs Suhajda