How many nines does that make?

Edwin Olson
May Mobility
Sep 4, 2018

I came across an old photo of me during the 2007 DARPA Urban Challenge, when I worked with my MIT colleagues to develop a self-driving vehicle. Having just heard of an unexpected feature on the course, we were concerned that our car might screw up. So with minutes before launching our vehicle into the competition, I pulled out my laptop, made a few changes to the source code, compiled, and — with no time for testing — pushed it to the car. With nobody onboard, the car drove off and ultimately was one of only six vehicles that finished the race.

At the time, I joked with my advisor (MIT Professor John Leonard) that I’d “just added another 9 to our system”, referring to its reliability. (A 99.999% reliability would be “five nines”). John asked “How many nines does that make?”, to which I excitedly responded “one!”

Today, my company May Mobility is building self-driving cars that will carry everyday people — from home to work, to lunch, and anywhere else they want to go. (We’re averaging two hundred rides a day, making us the only commercially operating self-driving car company in the country.)

Safety is our top priority for a self-driving vehicle: we aim to exceed human-level performance, which is staggeringly good. Humans drive 100 million miles for every fatality. On a fatality-per-mile basis, that's eight nines. To put that in perspective, a typical driver might drive 13,500 miles per year. Waymo, probably the leader in autonomous miles driven, has racked up 8 million miles. Both of those numbers pale in comparison to 100 million.
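
To make the arithmetic concrete, here's a quick back-of-the-envelope sketch in Python. Only the mileage figures quoted above go in; everything else is just logarithms.

```python
import math

def nines(miles_per_fatality: float) -> float:
    """Count of leading nines in the per-mile reliability implied by one
    fatality per `miles_per_fatality` miles driven."""
    per_mile_failure_rate = 1.0 / miles_per_fatality
    return -math.log10(per_mile_failure_rate)

# Human drivers: about one fatality per 100 million miles -> eight nines.
print(nines(100_000_000))        # 8.0

# How much of that distance does a single driver or a fleet actually sample?
print(13_500 / 100_000_000)      # one driver-year: ~0.01% of 100M miles
print(8_000_000 / 100_000_000)   # Waymo's ~8M miles: 8% of 100M miles
```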

RAND Corporation published a study showing that it would take 8.8 billion miles of real-world testing to show that a self-driving car drove as well as a human with high confidence. That makes sense — think of it as running 88 different trials of 100 million miles each. Some trials might have zero fatalities, some might have more than one… but with 88 trials, you can average out the noise.
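
Here's a simplified way to get intuition for that scale. This is a stand-in for the statistics, not the RAND study's actual model: it assumes fatalities arrive as a Poisson process and leans on the "rule of three" approximation, which says that after D fatality-free miles, an approximate 95% upper confidence bound on the per-mile fatality rate is 3 / D.

```python
HUMAN_RATE = 1e-8  # about one fatality per 100 million miles

def miles_to_bound(target_rate: float) -> float:
    """Fatality-free miles needed before the 95% upper confidence bound
    on the per-mile fatality rate falls below `target_rate`."""
    return 3.0 / target_rate

# Merely showing parity with humans takes ~300 million flawless miles...
print(f"parity:          {miles_to_bound(HUMAN_RATE):.1e} miles")

# ...and demonstrating a clear improvement (say, 20% better) takes more,
# and any fatality along the way pushes the required mileage higher still.
print(f"20% improvement: {miles_to_bound(0.8 * HUMAN_RATE):.1e} miles")
```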

It’s not practical to demonstrate the safety of a self-driving vehicle by driving real-world miles alone. It is possible to drive billions of miles in simulation, but it is entirely unclear how valuable a billion simulated miles is. Some might argue that simulations are more valuable than real-world miles because you can systematically put the robot through scenarios that don’t come up in the real world very often. Others argue that the limitations of simulations mean that they’ll never match the awful complexity and variety of the real world. (I believe that both are right: simulations are incredibly valuable for the first few thousand miles or so, after which they become less useful than real-world testing.)

Frequently lost in the discussion about real-world versus simulated miles is the absolutely critical role of a rigorous design process. Smart engineers, crawling through the design of a system, can perform an FMEA: a Failure Mode and Effects Analysis. They consider all of the components in the system, how they fail, and how individual failures could conspire to make the whole system fail. This process is old news to the traditional automotive and aerospace industries, of course.
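
For readers who haven't seen one, here's a toy FMEA-style ranking. The components, failure modes, and 1-10 ratings are invented for illustration; they're not drawn from any real vehicle's analysis.

```python
# Severity x occurrence x detectability gives the Risk Priority Number (RPN)
# commonly used to decide where to add mitigation or redundancy first.
# All entries below are hypothetical.

failure_modes = [
    # (component, failure mode, severity, occurrence, detectability)
    ("LIDAR",          "loses a beam",              6, 4, 3),
    ("GPS",            "multipath in urban canyon", 5, 7, 4),
    ("brake actuator", "command timeout",           9, 2, 2),
]

for component, mode, severity, occurrence, detectability in failure_modes:
    rpn = severity * occurrence * detectability
    print(f"{component:15s} {mode:28s} RPN = {rpn}")
```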

There’s a critical connection between FMEAs and testing. FMEAs require engineers to assess the likelihood of failures and other events. Where do those probabilities come from? Did the engineers think of all of the possible failures and interactions? This is where testing comes into play: real-world and simulated testing can be used as an independent check on the FMEAs. The engineers might estimate that the LIDAR will lose a beam every million miles — what does the real world say about that? The beauty of this is that the FMEA lets engineers reason over much more likely events (individual component failures rather than vanishingly rare whole-system failures) and validate that their models match the empirical testing.
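
Here's a sketch of what that independent check might look like. The predicted rate, fleet mileage, and observed count are hypothetical; the point is just that a component-level estimate can be stress-tested with far fewer miles than a whole-system fatality rate can.

```python
import math

# Treat the FMEA estimate as a Poisson rate and ask how surprising the
# field data would be if the estimate held. All numbers are hypothetical.

predicted_rate = 1 / 1_000_000   # FMEA estimate: one beam loss per million miles
miles_driven   = 3_000_000       # hypothetical fleet mileage
observed       = 7               # hypothetical observed beam losses

expected = predicted_rate * miles_driven  # Poisson mean under the FMEA model

# Probability of seeing `observed` or more events if the estimate is correct.
p_tail = 1.0 - sum(math.exp(-expected) * expected**k / math.factorial(k)
                   for k in range(observed))

print(f"expected {expected:.1f}, observed {observed}, tail probability {p_tail:.3f}")
# A small tail probability says the real world disagrees with the FMEA's
# number, and the estimate (and everything downstream of it) should be revised.
```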

In short, we see safety as a process that combines functional safety (based on FMEAs) with simulation, log playback, and real-world testing, all supported by best practices in software development, regression testing, and continuous integration. The result of this process is not only high quality in each component, but also built-in redundancy for when something does break. My co-founder, Alisyn Malek, put it well: functional safety is about making the system safe even when it’s broken.

There’s an old saying in security systems that “you can’t add security to a software product — security isn’t a feature, it’s a process”. The same is true for safety.

CEO of May Mobility, Professor of Computer Science at University of Michigan