Tesla, MobileEye, and deep learning
Elon Musk has previously stated that Tesla only needs a camera to achieve autonomous driving. The first-principles reasoning behind this assertion is that vision is the only sensor input that humans use to drive. Recently, however, Tesla released version 8.0 of their software, which uses radar as a primary controller. Their blog post states:
After careful consideration, we now believe it (radar) can be used as a primary control sensor without requiring the camera to confirm visual image recognition.
Notice they call the radar a primary controller, not the primary controller. The phrasing has contradictory implications and it’s not clear to me what role the camera still plays in AutoPilot. It’s being portrayed by many media outlets as a major downgrade in importance for the camera. I suspect the camera still plays a large role, but let me explain why I assume Tesla is making such a big deal out of the radar system.
Tesla and MobileEye, which supplies artificial vision technology for driving, had a public falling out. I recently watched a video where MobileEye CTO Amnon Shashua says end-to-end deep learning for autonomous driving is not possible. He claims it’s relatively easy to get a demo that works 90% of the time, but exponentially more difficult to get something that works 99.999% of the time. Essentially, he claims it would require an infinite amount of data to reach a reasonable level of safety. He talks about corner cases, such as rare or unusual vehicles that will be difficult to detect and identify with a camera. He also cites three problems deep learning is unable to solve: the n-bit parity problem, multiplication, and a shape recognition problem. That last point seems irrelevant, since those are active areas of unrelated research, but the earlier problem of corner cases and rare or unusual vehicles is a legitimate challenge.
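To put the gap between a 90% demo and a 99.999% system in concrete terms, here is a minimal arithmetic sketch. The event count of one million is a hypothetical number chosen for illustration, and the calculation assumes every event succeeds or fails independently, which real driving does not guarantee.

```python
from fractions import Fraction


def expected_failures(success_rate: Fraction, events: int) -> Fraction:
    """Expected number of failed events, assuming independent events."""
    return (1 - success_rate) * events


# Success rates quoted in the talk, kept as exact fractions.
demo_rate = Fraction(90, 100)
target_rate = Fraction(99999, 100000)

# Hypothetical number of driving events (an assumption, not from the talk).
events = 1_000_000

demo_failures = expected_failures(demo_rate, events)
target_failures = expected_failures(target_rate, events)

# How many times fewer failures the 99.999% system is allowed.
improvement = demo_failures / target_failures

print(f"90% system:     {demo_failures} expected failures")
print(f"99.999% system: {target_failures} expected failures")
print(f"Required reduction in failures: {improvement}x")
```

Under these assumptions the 99.999% system has to eliminate all but one in every ten thousand of the failures the 90% demo makes, and the failures that remain are by definition the rare ones.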
In contrast, deep learning is widely accepted as the state of the art in many machine learning applications, such as speech and image recognition. NVidia recently released a paper titled End to End Learning for Self-Driving Cars. They also demonstrated an object detection and image segmentation application for autonomous driving at CES. Watch it below; it’s impressive. I can confirm the corner-case problem, though: if an object has never been seen in the training data, it’s unlikely to be recognized by a system such as this.
So Tesla is asserting that this problem with the rare or unusual vehicle or object in the road is solved by the radar system. This is presumably the reason that Elon Musk stated that the radar could identify a UFO. Tesla seems to be mitigating the corner cases by using the radar as a way to override ambiguous visual signals such as unusual objects or when there is glare from the sun.
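The difference between radar as a cross-check and radar as a primary sensor can be sketched in a few lines. This is purely illustrative and is not Tesla's implementation: the class names, the `should_brake` function, and both thresholds are hypothetical values invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VisionDetection:
    label: Optional[str]  # None when the classifier recognizes nothing
    confidence: float     # 0.0 to 1.0


@dataclass
class RadarReturn:
    range_m: float            # distance to the nearest in-path return
    closing_speed_mps: float  # positive when the gap is shrinking


def should_brake(
    vision: VisionDetection,
    radar: Optional[RadarReturn],
    radar_is_primary: bool,
    min_vision_confidence: float = 0.8,  # hypothetical threshold
    max_time_to_collision_s: float = 2.0,  # hypothetical threshold
) -> bool:
    """Decide whether to brake for an object ahead."""
    # No radar return, or the gap is not shrinking: nothing to react to.
    if radar is None or radar.closing_speed_mps <= 0:
        return False

    time_to_collision_s = radar.range_m / radar.closing_speed_mps
    if time_to_collision_s >= max_time_to_collision_s:
        return False

    # Primary mode: radar alone is enough to act.
    if radar_is_primary:
        return True

    # Supplemental mode: radar only acts when vision confirms an object.
    return vision.label is not None and vision.confidence >= min_vision_confidence


# An unusual object the camera cannot classify, 30 m ahead, closing at 20 m/s.
unknown = VisionDetection(label=None, confidence=0.1)
radar = RadarReturn(range_m=30.0, closing_speed_mps=20.0)

print(should_brake(unknown, radar, radar_is_primary=False))
print(should_brake(unknown, radar, radar_is_primary=True))
```

In the supplemental mode the unrecognized object is ignored because vision never confirms it; in the primary mode the radar return alone triggers braking, which is the corner-case mitigation described above.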
MobileEye seems biased against deep learning. One would think that a company with innovations such as adaptive cruise control would be more open-minded. I suspect they are too invested in the old way of doing things. Deep learning is very different from “semantic abstraction.” It requires fewer hand-crafted features.
I’m not alone in speculating that Tesla will soon upgrade the AutoPilot hardware. I would be surprised if MobileEye is still used. I think it’s likely they will use NVidia hardware, perhaps the Drive PX2.
Update: Here is a more recent talk from Amnon Shashua. Starting around 22:46, he grudgingly says they don’t use deep learning for feature detection, such as object or pedestrian detection, but they do use deep learning for image segmentation to detect drivable paths. He demonstrates their system and it is impressive. At 20:38 he takes some shots at NVidia’s demo, asserting that autonomous driving requires 3D bounding boxes, not 2D bounding boxes. Part of what is going on here is that MobileEye has done painstaking work for years to build a sophisticated driver assist system, and now NVidia and deep learning researchers are getting similar results in very little time.
Update 2: Tesla is shipping hardware for self-driving cars. MobileEye is out, and NVidia has confirmed Tesla is using the PX2 for their self-driving system. Elon Musk also confirmed on the Q3'16 conference call that vision is still a primary sensor in the self-driving system.
The blog that I wrote was very clear that radar is moving from a supplemental to also a primary sensor. It is not to the exclusion of vision, but it is also a primary sensor. Vision is still the main thing, but radar, instead of merely being, like, a cross-check against vision is really, when done well, and we’re very confident at this point that it can be done this way; it can be a primary sensor such that you can take actions based on radar information alone.