Building a car with vision: Getting urban localisation right

By Team Five

Aug 8, 2018

Five is one of the few companies tackling the challenge of complex urban driving. We’re beginning in London — a dense city full of stimuli, from pedestrians and cyclists to roadworks and countless road types.

To build an autonomous vehicle that can drive safely and accurately on London’s roads, we need to solve a deeply exciting problem: localisation. If you can get localisation right, you know where you are. And if you know where you are, you can draw on huge amounts of useful information from a map.

First things first. When I say ‘map’, I’m not referring to a tattered old A-Z, or to Google Maps. I’m talking about a bank of broad, contextual information. What are the area’s traffic rules? Do people follow them? How do drivers and pedestrians behave in this city? Localisation unlocks the map, so the car can access all this knowledge and use it to make the right decisions, in the right moments.

Cities are challenging terrains

Achieving localisation in a complex urban environment is far more challenging than mastering it in non-urban areas, or on US roads. The highway driving scene is simple: it’s characterised by long lanes, and the car’s task is mainly to position itself correctly within a given lane. In a non-urban environment, you can use GPS, and you can get it pretty accurate. In a city, it’s an entirely different story. GPS just doesn’t cut it.

The complexity of the urban sprawl throws plenty of spanners into the works. For GPS to function as it should, you’ll ideally have a straight line from your GPS receiver to a satellite. Tall buildings block satellite signals, and reflections off their surfaces (multipath) distort the signals that do get through, making it impossible to get an accurate location. You could use GPS to tell which street you’re on, but that’s nowhere near accurate enough.

All this means that GPS alone won’t do. Far greater levels of accuracy are needed.

Vision is the way

To crack localisation in a dense and busy urban environment, you need to use something other than GPS. The solution? Vision. To understand vision, think about how humans navigate a space. We match what we see to what we know, overlaying what’s in front of us in a given moment with existing knowledge — a mental map, and/or a real map.

At Five, we’re making it possible for our cars to do something similar. We’re giving our car the power of sight and using sources with absolute reference (such as map coordinates) to enable the vehicle to match what it’s seeing to an exact location on a map, blending both sets of data to ensure accuracy and safety.

It’s all about symbiosis — we combine a smart, dynamic interpretation of the scene and key information from a map. This mix is vital. A map, of course, only includes static information. A car needs to be aware of pedestrians, cyclists, other vehicles and many, many other stimuli. By looking at what you see and comparing it to what you expect, i.e. your map data, you can successfully detect these dynamic ‘objects’. You can know what’s a building and what’s a person, what’s a road sign and what’s a cyclist.
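As a toy sketch of that comparison, anything detected in the scene that the static map doesn’t expect can be treated as dynamic. Real systems reason over geometry and semantics rather than string labels; the labels and feature set here are invented purely for illustration.

```python
# Hypothetical sketch: split scene detections into static features the map
# already expects, and dynamic 'objects' the map cannot contain.
STATIC_MAP_FEATURES = {"building", "road_sign", "lane_marking", "kerb"}

def split_detections(detections):
    """Return (static, dynamic) lists, preserving detection order."""
    static = [d for d in detections if d in STATIC_MAP_FEATURES]
    dynamic = [d for d in detections if d not in STATIC_MAP_FEATURES]
    return static, dynamic

static, dynamic = split_detections(
    ["building", "cyclist", "road_sign", "pedestrian"]
)
# dynamic → ["cyclist", "pedestrian"]
```

In practice the comparison happens in geometric space — matching observed structure against mapped structure — but the principle is the same: what the map can’t explain is dynamic.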

Crucially, too, maps date. In fast-evolving cities, roadworks and new developments make it tricky to keep maps current and exact. Even if you conduct daily updates, the data won’t be up-to-the-second. Sudden road closures and engineering works won’t be accounted for.

Vision fixes this. And the potential of vision is realised by maps, too. When you’re able to identify what you see against a scene in a map, you know where you are in the world.

So what exactly is ‘vision’?

We’re using a broad definition of ‘vision’. It’s not just about cameras, it’s about sensors too. It’s everything the car perceives, and everything it uses to perceive. Cameras are vital, but LiDAR makes a huge difference as well, giving us accurate depth measurements.

If you’re new to LiDAR, let me give you a simplified summary. Mounted on top of the car, the LiDAR unit spins, sending out pulses of laser light that reflect off surrounding surfaces and return to the sensor. By timing how long each pulse takes to come back, you know how far it has travelled and, as a result, can map out the space. It’s like a technically advanced bat! This equipment allows the car to know how far it is from potential obstacles and dangers, so it can act on this information.
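The underlying arithmetic is simple: a pulse’s round-trip travel time, multiplied by the speed of light and halved, gives the distance to whatever reflected it. A minimal sketch (the timing value is invented for the example):

```python
# Converting a LiDAR pulse's time of flight into a distance. The beam
# travels to the obstacle and back, so we halve the round trip before
# multiplying by the speed of light.
SPEED_OF_LIGHT_M_PER_S = 299_792_458

def time_of_flight_to_distance(round_trip_seconds):
    """Distance in metres to the surface that reflected the pulse."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0

# A return received 400 nanoseconds after emission lands just under 60 m away.
distance_m = time_of_flight_to_distance(400e-9)
```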

Multiple modalities are a must

Every sense is valuable, but they all have their issues. Sight is compromised in the dark, for example, and LiDAR returns can become sparse and noisy in a dense urban environment. At Five, we strongly believe that to make vision truly robust in a city you need to use multiple modalities and fuse them all, to arrive at the most accurate estimate of the car’s location.
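As a toy illustration of what fusing means, here is a minimal sketch of inverse-variance weighting, the idea at the heart of Kalman-style sensor fusion: each estimate is weighted by how confident its sensor is. The sensors, numbers and one-dimensional setup are invented for the example; a real system fuses full vehicle poses, not single coordinates.

```python
# Fuse two noisy 1-D position estimates by inverse-variance weighting.
# A low-variance (confident) sensor pulls the fused estimate towards it,
# and the fused variance is always smaller than either input variance.

def fuse(est_a, var_a, est_b, var_b):
    """Return (fused_estimate, fused_variance) for two 1-D estimates."""
    w_a = 1.0 / var_a
    w_b = 1.0 / var_b
    fused_est = (w_a * est_a + w_b * est_b) / (w_a + w_b)
    fused_var = 1.0 / (w_a + w_b)
    return fused_est, fused_var

# GPS says 12.0 m along the street, but with high uncertainty (variance 9.0);
# vision says 10.0 m with low uncertainty (variance 1.0).
position, variance = fuse(12.0, 9.0, 10.0, 1.0)
# position → 10.2, variance → 0.9
```

Notice that the fused answer sits close to the confident vision estimate, and its variance (0.9) is tighter than either sensor alone — fusion doesn’t just average, it sharpens.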

Quantifying uncertainty is an important part of our approach, too. A robust system needs to be able to know when it’s doing well and when it’s not, through sophisticated self-checking capabilities.
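One common form of self-checking is innovation gating: compare each new measurement with the current prediction, and flag it if the gap is large relative to the combined uncertainty. This is a generic sketch of that idea, not Five’s actual method; the threshold and values are illustrative.

```python
# Innovation gating: accept a measurement only if it falls within a set
# number of standard deviations of the prediction, given both
# uncertainties. Measurements that fail the gate signal a problem —
# a faulty sensor, or an over-confident localisation estimate.

def measurement_is_consistent(predicted, pred_var,
                              measured, meas_var,
                              gate_sigmas=3.0):
    """True if the measurement lies within the gate around the prediction."""
    innovation = measured - predicted
    combined_std = (pred_var + meas_var) ** 0.5
    return abs(innovation) <= gate_sigmas * combined_std

ok = measurement_is_consistent(10.0, 1.0, 10.5, 0.5)          # small gap
suspicious = measurement_is_consistent(10.0, 1.0, 25.0, 0.5)  # large gap
```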

It’s in our sights

Combining multiple modalities to create a smart, multi-sensory system that’s ultimately safer than a human is a challenging task, but it’s in our sights. Our team is made up of incredibly talented individuals, including world leaders in approaches to combining multiple modalities, across cameras, LiDAR and more.

We’re proud to have Professor Philip Torr and Professor Andrew Blake amongst our advisors. Philip runs the largest European computer vision research group, Oxford University’s Torr Vision Group, and Andrew is Founding Director of the Alan Turing Institute, the UK’s leading AI research centre. It’s a pleasure to work with so many diverse and brilliant thinkers.

Freedom and sharing make a difference

Importantly, our dedicated Test & Integration team has the freedom to approach the challenge in the way we believe is best scientifically — by building an algorithm that thinks the way we think. We’ve got the business support, sensor data and the people we need to succeed. It’s thrilling to create something then see it in action on the car, discovering how it plays out, and refining as needed. We work closely with our dedicated testing team in Millbrook to do exactly that.

Five’s shared personality is key to our success, too. Everyone here is remarkably bright — they use their initiative and knowhow to solve problems themselves. The team structure is also very flat, meaning it’s always a group effort and everyone can perform at the top of their game, bringing their unique insights, skills and backgrounds to the table. This diversity is a must. Building an autonomous car involves solving lots of complex problems, in many different areas of science and tech. To make it happen, you need to unite a wide range of leading experts.

If all that sounds like hard work, it is. But we have a huge amount of fun while we’re at it, and we find time to relax and unwind together too. We host a weekly games night across our offices where we have a drink, share food and unleash our competitive sides!

Want to know more about work and play at Five?
Email talent@five.ai

We’re building self-driving software and development platforms to help autonomy programs solve the industry’s greatest challenges.