Uber Car Crash: What went wrong?

Published in DreamVu Articles & Posts · Apr 11, 2018 · 5 min read

Recently, an Uber self-driving car in Tempe, Arizona struck a woman at night on a multilane road, fatally injuring her. We tried to understand what went wrong, or rather, what could possibly have gone wrong.

Disclaimer: This incident has been a topic of interest for the past 20 days and has been deliberated upon by many. Our analysis surfaced some additional inferences. The content is purely a representation of our observations based on publicly available data about the accident, and what it may tell us about the accuracy and utility of the sensor system and the AI system behind it.

Graphical representation of the incident — ©DreamVu

As shown in the image (location image from Google Earth)

  • Length of one white stripe on the lane marker: 10 feet (source: Google Earth)
  • Length of one blank space on the lane marker: 28.5 feet (source: Google Earth)
  • Total distance: 280 feet (calculated from video)
  • Total time (car position 1 to 3): 4.59 seconds (calculated from video)
  • Road width: 27 feet (source: Google Earth)

The image was created through a careful analysis of the video released by the Tempe, Arizona police and other available data:

Video: From news sources
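As a first sanity check, the measurements above already pin down the car's average speed over the released clip. A minimal sketch (our own arithmetic, not Uber telemetry):

```python
# Average speed of the car over the clip, from the figures above:
# 280 ft covered (car position 1 to 3) in 4.59 s of video.
FT_PER_MILE = 5280

total_distance_ft = 280.0   # calculated from video
total_time_s = 4.59         # calculated from video

avg_speed_fps = total_distance_ft / total_time_s
avg_speed_mph = avg_speed_fps * 3600 / FT_PER_MILE

print(f"{avg_speed_fps:.1f} ft/s = {avg_speed_mph:.1f} mph")  # 61.0 ft/s = 41.6 mph
```

An average of roughly 41.6 mph sits between the per-second speed estimates derived later in the analysis, which is consistent with a car that was decelerating.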

Sensors on Uber’s autonomous cars are most likely the following:

  • Top-mounted LiDAR (1 laser)
  • Front- and back-mounted radar (for 360-degree coverage)
  • Short- and long-range optical cameras (7 cameras)
  • Inertial Measurement Unit (IMU)

Some of the assumptions that were made in order to analyze the crash:

  • Braking distance: calculated from reference data on the braking distance of an average car with a standard control system, plus the reaction distance typical of human drivers.
  • Woman’s velocity: estimated based on her country of origin and age.
  • Range of LiDAR: 120 m to 200 m (based on the LiDARs available in the market).

Observations (from the video)-

  1. There was a change in speed of the car over the course of the video, especially a few seconds before the crash.
  2. The woman was hit by the right side of the bumper (based on photographs of the car after the accident).
  3. Seven white stripes and six blank spaces were crossed from the beginning of the video till the time of the collision.

Using this information, our analysis follows:

Analysis 1: Walking speed of the woman. Two seconds before the collision, the woman had crossed almost three-fourths of the lane before being hit by the right bumper of the car. The estimated distance she covered in the 2 seconds before the crash is 9 feet. From this, her walking speed comes to around 4.5 feet per second, which is consistent with normal walking speed per research (the average walking speed of a 49-year-old US woman is 4.45 feet per second).
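The walking-speed estimate is simple division; the 9 ft and 2 s figures are our readings from the video, and 4.45 ft/s is the research average cited above:

```python
# Pedestrian walking speed: ~9 ft covered in the final 2 s before
# impact (our estimate from the video).
distance_ft = 9.0
time_s = 2.0

walking_speed_fps = distance_ft / time_s
print(f"estimated: {walking_speed_fps} ft/s vs. research average: 4.45 ft/s")
# estimated: 4.5 ft/s vs. research average: 4.45 ft/s
```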

Analysis 2: Car speed. Starting from position 1, the car covers two blank spaces and one white stripe in the first second, a distance of 67 feet. Therefore, the speed of the car at the beginning of the video is 45.7 mph (67 feet per second). One second prior to the collision, the car covers two white stripes and 1.3 blank spaces, adding up to around 55 feet, implying that the speed of the car one second before the collision was around 37.5 mph (55 feet per second).
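The lane markings act as a ruler: each stripe is 10 ft and each gap 28.5 ft, so counting markings crossed per second of video yields an approximate speed. A sketch of that conversion, using our stripe/gap counts from the video:

```python
# Speed from lane markings crossed per second of video.
# Stripe and gap lengths are from Google Earth; the counts passed in
# below are our readings from the footage.
STRIPE_FT, GAP_FT = 10.0, 28.5
MPH_PER_FPS = 3600 / 5280

def speed_mph(stripes, gaps, seconds=1.0):
    """Approximate speed given lane markings crossed in `seconds`."""
    distance_ft = stripes * STRIPE_FT + gaps * GAP_FT
    return distance_ft / seconds * MPH_PER_FPS

print(speed_mph(1, 2))    # start of video: 67 ft in 1 s ≈ 45.7 mph
print(speed_mph(2, 1.3))  # ~1 s before impact: noticeably slower
```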

Braking distance of the car: by running a regression on publicly available reference data, typical braking speed and distance were calculated, taking the speed of the car as 45 mph. The overall stopping distance, the sum of the reaction distance and the braking distance, comes to 169 feet.
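The 169 ft figure can be roughly reproduced from the standard stopping-distance model. Note the reaction time (1.0 s) and effective deceleration (0.66 g) below are our assumptions chosen as typical values; the article's own figure came from a regression on reference data, not from this formula:

```python
# Stopping distance = reaction distance + braking distance.
# reaction_s and decel_g are assumed typical values, not fitted data.
G_FPS2 = 32.17  # gravitational acceleration in ft/s^2

def stopping_distance_ft(speed_fps, reaction_s=1.0, decel_g=0.66):
    reaction_ft = speed_fps * reaction_s            # distance before braking starts
    braking_ft = speed_fps**2 / (2 * decel_g * G_FPS2)  # v^2 / (2a)
    return reaction_ft + braking_ft

print(stopping_distance_ft(66.0))  # 45 mph ≈ 66 ft/s → ~169 ft
```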

Analysis 3: Road. The time taken by the woman to walk across the road was 21/4.45 ≈ 4.7 seconds, which implies she started crossing the road 4.7 seconds before the collision, at which time the car was at least 288 feet away from her. Assuming the LiDAR’s range to be 120 m (about 394 feet), the woman should have been detectable from the moment she started crossing. The exterior-view video footage released is 4.6 seconds long, which shows that the woman was well within the LiDAR’s range.
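The timeline claim can be checked end to end: given her walking speed and the car's average speed over the clip, was the car inside the LiDAR's range when she started crossing? A sketch using the figures above (the 61 ft/s average speed is our estimate from the video):

```python
# Timeline check: pedestrian's crossing time vs. car distance vs.
# an assumed 120 m LiDAR range.
M_TO_FT = 3.2808

lane_crossed_ft = 21.0     # distance she walked before impact
walk_speed_fps = 4.45      # research average walking speed

crossing_time_s = lane_crossed_ft / walk_speed_fps   # ~4.7 s

car_speed_fps = 61.0       # average speed over the clip (our estimate)
car_distance_ft = crossing_time_s * car_speed_fps    # ~288 ft away

lidar_range_ft = 120 * M_TO_FT                       # ~394 ft

print(crossing_time_s, car_distance_ft, lidar_range_ft)
assert car_distance_ft < lidar_range_ft  # she was inside LiDAR range
```

At roughly 288 ft against a ~394 ft range, the pedestrian was comfortably within the LiDAR's reach for the entire crossing, which is what makes the non-detection so striking.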

Based on the above analysis, the following are some of our inferences:

  • The LiDAR might have been faulty; otherwise, the person should have been detected by the time the video starts (given its range and accuracy). Typically, a LiDAR uses previously created maps for sensing dynamic environments via localization. Failure to localize efficiently in this rare scenario possibly led to the woman being marked as a false positive. A problem with the LiDAR’s calibration is another possible explanation for its faulty behaviour.
  • As per observation 1 and analysis 2, the car slowed down one second prior to the collision. It is highly probable that this was due to detection by the camera sensors. In proper lighting conditions, the camera sensors would easily detect a person from a longer distance; had that been the case here, the car could have come to a halt and avoided the collision. Since the ambient lighting in this scenario was insufficient, the camera system’s detection was significantly delayed.
  • It is quite possible that the alignment of the sensor setup was distorted. Automatic re-calibration using vision algorithms could have solved this problem, but it was not possible in this case due to the unavailability of feature points on the highway.
  • Assuming the sensors captured the scene accurately, the recognition algorithm was probably unable to classify the object as a person/pedestrian. Clutter in the captured data, caused by the presence of additional objects (the bag and the bicycle) and the pose of the person, may have caused this failure to distinguish among the objects.
  • Another plausible contributing factor is the overall complexity of the sensing and processing system. This means that the processors were busy performing computationally intensive tasks for fusing the inputs from multiple sensors which led to a delay in understanding and reacting accordingly.
  • It can also be inferred that the AI was incapable of responding to a new, unexpected situation, due to lack of appropriate training data and/or the inability of current approaches to dynamically respond to such situations better.

Current self-driving car technologies are trapped in a whirlpool of escalating complexity, turning the search for an end-to-end system that can replace a human driver into a vicious cycle. Such occurrences, or worse, become more likely as the technology scales up and moves into different geographies and more challenging contexts. The pursuit of a system with superhuman capabilities is turning into a rat race: flawed, endless and self-defeating. Incidents such as this should be eye-openers that push us to do a reality check and develop technologies that are human-inspired, rather than spiraling down the proverbial rabbit hole.


DreamVu has developed the world's first omni-stereo camera hardware and software platform for unifying human and machine vision.