The world is not enough: Building imagined cities for cars to learn from

By Professor Philip Torr

Team Five
Five Blog
3 min read · Oct 30, 2018


Five’s cars need to be able to see as we humans do, so they can successfully avoid potential hazards on the road. That’s a given for self-driving vehicles, and it’s something computer vision makes possible by contributing to the creation of a range of smart sensors. But computer vision also unlocks less obvious benefits, like the ability to build virtual worlds and have cars learn from them.

Imagine you’re training a car to learn about the world: the forms our cities’ roads take, how others drive, what pedestrians look like, and how they behave. If you did this solely in the real world, the car couldn’t learn quickly, because it would have to learn in real time. The process would also be unsafe: you’d need to cover billions of road miles, and driving those miles with humans at the wheel would contribute to hundreds of serious injuries and deaths.

Simulation is vital, and this is where computer vision can really contribute. It’s possible to drive a car around London and capture a 3D likeness of the capital, taking in all sorts of ‘objects’ and recording the many ways in which they move.

You can then use this data, the simulated world you’ve built, to explore new test cases. You could simulate a road scene in which an ice cream van is doing a roaring trade and a kid runs out into the street in search of a treat. Within a simulation, you can run this scenario a billion times, varying every possible movement and interaction, so the car learns everything there is to know about how kids behave and move. You can train the car to deal with all situations, commonplace and unexpected, without having to do it in reality.
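To make that concrete, here is a minimal sketch in Python of how a single scenario can be expanded into a huge family of variations by randomising its parameters. The parameter names and the run_in_simulation placeholder are purely illustrative assumptions, not Five’s actual tooling.

```python
# A toy sketch (not Five's tooling) of expanding one scenario into many
# simulated variations by randomising its parameters.
import random

def sample_scenario():
    """Draw one variation of the 'kid runs towards the ice cream van' scene."""
    return {
        "child_speed_mps": random.uniform(1.0, 4.0),    # walking to sprinting
        "entry_point_m": random.uniform(-10.0, 10.0),   # where the kid steps out
        "entry_delay_s": random.uniform(0.0, 5.0),      # when, relative to the car
        "occluded_by_van": random.random() < 0.5,       # is the kid hidden at first?
    }

def run_in_simulation(scenario):
    """Placeholder: hand the scenario to a simulator and score the outcome."""
    # A real pipeline would build the virtual scene, run the driving stack
    # against it, and report whether the car responded safely.
    return scenario["entry_delay_s"] > 1.0 or not scenario["occluded_by_van"]

if __name__ == "__main__":
    outcomes = [run_in_simulation(sample_scenario()) for _ in range(100_000)]
    print(f"safe outcomes: {sum(outcomes)} / {len(outcomes)}")
```

Each run is cheap, so sweeping the whole space of plausible behaviours becomes a matter of compute rather than of road miles.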

This is a new field, and this kind of work comes under the umbrella of what we call ‘Deep Generative Models’. I’m helping Five put this into action. If we scan the entire city of London with our sensors, we have a wealth of data, from traffic lights to cars. Cameras can also capture how people act at junctions, zebra crossings, and more. Once we have this core data, we can ask the computer to generalise, then dream up new situations and solutions, including never-before-seen kinds of accidents and how to deal with them. The computer can also imagine new environments, including new road layouts and alternative objects. Entirely new, imagined cities unfold, for totally novel situations to be played out in.
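As a rough illustration of the idea, assuming a toy scene encoding of just eight numbers, the sketch below trains a small variational autoencoder (one common kind of deep generative model) on ‘observed’ scenes and then decodes random points from its learned latent space, which is the dreaming-up of new situations in miniature. The training data here is random stand-in data, not anything captured from real streets.

```python
# A toy deep generative model: learn a compressed representation of observed
# scenes, then sample it to imagine scenes that were never observed.
# The eight-number scene encoding and the random training data are stand-ins.
import torch
import torch.nn as nn

class SceneVAE(nn.Module):
    def __init__(self, scene_dim=8, latent_dim=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(scene_dim, 32), nn.ReLU())
        self.to_mu = nn.Linear(32, latent_dim)
        self.to_logvar = nn.Linear(32, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, scene_dim)
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation
        return self.decoder(z), mu, logvar

model = SceneVAE()
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
observed_scenes = torch.rand(256, 8)  # stand-in for features captured on real streets

for _ in range(200):
    recon, mu, logvar = model(observed_scenes)
    recon_loss = ((recon - observed_scenes) ** 2).mean()
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
    (recon_loss + kl).backward()
    optimiser.step()
    optimiser.zero_grad()

# 'Dreaming': decode random latent points into scenes the model never saw.
with torch.no_grad():
    imagined_scenes = model.decoder(torch.randn(5, 2))
print(imagined_scenes)
```

The systems behind this kind of work are far richer, generating full road layouts, agents and behaviours rather than eight numbers, but the principle is the same: learn from what the sensors have seen, then sample to create what they haven’t.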

In short, the world is not enough. When we give computers the power to generalise, and then to develop new worlds with endless permutations, we can train cars quickly and effectively, and ensure they have the knowledge and skill to drive autonomously with an extremely high level of safety.

As this technology rapidly advances and simulations become more and more realistic, their applications become both deeper and more diverse. As well as helping us build safe, smart self-driving cars, they’ll help us understand smart cities and design ones that function efficiently and sustainably, meet the needs of many, and enable everyone to thrive.

The world is not enough but, with computer vision, new worlds are already within our reach.

About Professor Philip Torr, BSc S’ton, DPhil Oxf
Philip runs the world-leading computer vision research group, Oxford University’s Torr Vision Group. In 2018, he was awarded a Royal Academy of Engineering Research Chair — the Five Chair in Computer Vision. Philip is Five’s Chief Scientific Advisor.
