George Hotz on Tesla, Waymo and the self-driving industry

Branko Blagojevic
Published in ml-everything
Nov 9, 2020 · 8 min read

George Hotz is the founder of comma.ai, a company that sells a device that you install in your car that will drive for you. The device supports over 50 cars and has driven over 30 million miles. It leverages the adaptive cruise control and lane assist features and essentially hacks the system to run its own self-driving hardware.

I haven’t tried it yet. My Nissan isn’t supported, although I am encouraged to port it over myself.

Your ride is here

I have a lot to say about the idea, but for now I’ll just say that I’m glad people like George exist in this world. And the idea of un-bundling the software of self-driving cars from the hardware makes a lot of sense. But here I’ll discuss parts of his second interview with Lex Fridman. Unless otherwise noted, the quotes below are from that interview, and lightly edited.

The interview has also inspired me to build an app that hosts interview transcriptions (source available here). You can read/watch/listen to the full interview here.

Hiring bishop guys

Comma has a very different approach to other organizations trying to solve self-driving cars. Here’s Fridman explaining the traditional approach to the problem:

Fridman: …the driving task is a machine learning problem, and the way Tesla’s approaching it is with multitask learning, where you break the task of driving into hundreds of different tasks and you have this multi-headed network that’s very good at performing each task. And there’s presumably something on top that’s stitching stuff together in order to make control decisions, policy decisions about how you move the car. There’s a brilliance to this because it allows you to master each task, like lane detection, stop sign detection, traffic light detection…

You’ll often see images like this when self-driving is discussed.

A visualization of traditional self-driving cars

The general idea is that you first identify vehicles, lane markings, pedestrians, etc. Then you figure out their velocity or position over time. Finally, you hand off that information to a decision model. The benefit is that it’s more explainable than a black box, and it makes for sweet video clips on pitch decks.

To that, Hotz replies:

Hotz: If you were to start a chess engine company, would you hire a bishop guy?

Hotz argues that driving, at least at level 5 (full autonomy), will require an end-to-end approach rather than viewing the problem as a series of smaller controllable problems. This is a long-fought argument in machine learning: what layer of abstraction is appropriate?
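To make the two designs concrete, here is a toy sketch of my own, not anything from Tesla or comma.ai. The `Frame` encoding, the `Scene` fields, the thresholds and the hand-set `toy_weights` are all hypothetical stand-ins. The modular version passes through a hand-designed intermediate representation, while the end-to-end version maps the raw input straight to controls with weights that would, in a real system, be learned from driving data.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical stand-in for camera input: [lateral offset, obstacle score].
Frame = List[float]


@dataclass
class Scene:
    """Hand-designed intermediate representation (the "bishop guys" output)."""
    lane_offset: float     # distance from lane centre, arbitrary units
    obstacle_ahead: bool   # whether something is blocking the lane


def perceive(frame: Frame) -> Scene:
    # Each field plays the role of one specialised detection head.
    return Scene(lane_offset=frame[0], obstacle_ahead=frame[1] > 0.5)


def plan(scene: Scene) -> Tuple[float, float]:
    # Hand-written rules on top of the perception output.
    steer = -0.5 * scene.lane_offset
    throttle = 0.0 if scene.obstacle_ahead else 1.0
    return steer, throttle


def modular_drive(frame: Frame) -> Tuple[float, float]:
    return plan(perceive(frame))


def end_to_end_drive(frame: Frame, weights: List[List[float]]) -> Tuple[float, float]:
    # One mapping from input to controls, with no named intermediate concepts.
    # A linear map is used here only to keep the sketch tiny.
    steer = sum(w * x for w, x in zip(weights[0], frame))
    throttle = sum(w * x for w, x in zip(weights[1], frame))
    return steer, throttle


frame = [0.4, 0.9]
modular_controls = modular_drive(frame)
toy_weights = [[-0.5, 0.0], [0.0, 0.0]]  # hand-set here; learned in practice
e2e_controls = end_to_end_drive(frame, toy_weights)
```

In the modular version you can inspect `Scene` and see why the car braked. In the end-to-end version there is nothing to inspect but the weights, which is both the appeal and the worry.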

Originally researchers tried to encode grammatical logic into natural language processing systems. But later they realized that grammar and appropriate word usage can be more easily induced from large data sets rather than explicitly defined. Similarly, modern facial recognition doesn’t have a nose module and an eye module. And MuZero showed that the state of the art model in game play begins with no prior knowledge of the internal game mechanics or rules.

But Hotz does admit that an end-to-end approach may not be strictly necessary for level 5.

Fridman: I mean, that is a very compelling notion, that we can learn the task end to end, like the same compelling notion you might have for natural language conversation. But I’m not sure, because one thing you sneaked in there is the assertion that it’s impossible to get to level 5 without this kind of approach. I don’t know if that’s obvious.

Hotz: I don’t know if that’s obvious either. I don’t actually mean that. I think that it is much easier to get to level 5 with an end-to-end approach. I think that the other approach is doable, but the magnitude of the engineering challenge may exceed what humanity is capable of.

Feature engineering

He later goes on to criticize feature engineering in general, the practice of using domain knowledge to encode more meaningful state representation:

Hotz: It’s slightly better feature engineering, but it’s still fundamentally feature engineering. And if the history of AI has taught us anything, it’s that feature engineering approaches will always be replaced and lose to end to end. To be fair, I cannot really make promises on timelines, but I can say that when you look at the code for Stockfish and the code for AlphaZero, one is a lot shorter than the other. A lot more elegant. Required a lot less programmer hours to write.

Fridman pushes back that self driving cars are almost certainly more difficult than chess or anything else that has been solved by machine learning. But the fact remains that practically all state of the art machine learning models in complex domains use a very light abstraction. They certainly don’t try to orchestrate the decision making process as much as those in the self-driving space.

The only reason we’re approaching self-driving so differently from other domains in terms of feature engineering is that we fundamentally don’t trust the process enough on such a critical task.

Reinforcement learning vs supervised learning

Hotz also prefers reinforcement learning to supervised learning:

Hotz: That is the definition I like of reinforcement learning versus supervised learning. In supervised learning, the weights depend on the data, right? But in reinforcement learning, the data depends on the weights.

In supervised learning, you’re given a state and an output and try to create a model that gives you f(state) = output. If you have no data where your car is in a dangerous situation, the model won’t know what to do. The weights (model) depend on the data (states that it was exposed to).

In reinforcement learning, the model explores: it makes decisions that act on the state and then observes the subsequent states. The model then periodically evaluates the states that it finds itself in (e.g. a dangerous situation), and trains itself to avoid those states. The data (states that it explored) depends on the weights (model).
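Here is a minimal sketch of that distinction, using a made-up one-dimensional lane-keeping problem. Everything in it is an assumption of mine for illustration: the `expert_steer` rule, the dynamics in `rollout`, and the candidate weights. The last step is a crude policy search, not a real reinforcement learning algorithm, but it shows the direction of the dependency.

```python
def expert_steer(offset):
    # Hypothetical expert driver: steer back toward the lane centre.
    return -0.5 * offset


def fit_supervised(dataset):
    # Least-squares fit of steer = w * offset (line through the origin).
    # The weight w is fully determined by the fixed dataset.
    num = sum(o * s for o, s in dataset)
    den = sum(o * o for o, _ in dataset)
    return num / den


def rollout(w, start_offset=1.0, steps=5):
    # Drive with policy steer = w * offset and record the states visited.
    # Which states end up in this list depends on w.
    offset = start_offset
    visited = []
    for _ in range(steps):
        visited.append(offset)
        offset = offset + w * offset  # toy dynamics: steering shifts the offset
    return visited


def rollout_cost(w):
    # Penalise time spent far from the lane centre.
    return sum(abs(o) for o in rollout(w))


# Supervised: weights depend on the data. The dataset only covers small
# offsets, so the model never sees a large (dangerous) one.
fixed_dataset = [(o, expert_steer(o)) for o in (-0.2, -0.1, 0.1, 0.2)]
w_supervised = fit_supervised(fixed_dataset)

# Exploration: data depends on the weights. Different weights visit
# different states, including ones the fixed dataset never contained.
states_good = rollout(-0.5)
states_bad = rollout(0.1)

# Crude policy search: evaluate each candidate on its own rollouts and
# keep the one that spends the least time in bad states.
candidates = [0.1, -0.1, -0.5, -0.9]
best_w = min(candidates, key=rollout_cost)
```

With the weight 0.1 the car drifts further from the centre on every step, so it generates exactly the kind of dangerous states that the fixed dataset lacks. That is the data the model needs in order to learn to avoid them.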

Broader self-driving industry criticisms

Hotz has broader criticisms of the self-driving industry. He believes there are a lot of snake oil salesmen out there. Companies such as Zoox (recently purchased by Amazon for $1.2bn) promise to deliver not only a self-driving future, but one that’s built from scratch and also carbon neutral. Meanwhile, comma.ai has over 30 million miles of data from its users and is profitable with a few million in sales and a few dozen engineers.

But the most interesting criticism is about Waymo and the whole self-driving taxi market.

Hotz: I think that the product that they’re building doesn’t make sense… benchmark a Waymo against an Uber driver. The Uber driver’s faster… I like when my Uber driver doesn’t come to a full stop at the stop sign. And so let’s say the Waymos are 20% slower, right? You can argue they’re gonna be cheaper, and I argue that users already have the choice to trade off money for speed. It’s called Uber Pool. I think it’s like 15% of rides are Uber Pools, right? Users are not willing to trade off money for speed. So the whole product that they’re building is not going to be competitive with traditional ride-sharing networks.

I think he’s a little unfair to Waymo. Sure they’re corny, but they clearly have the best technology. Their disengagement rate is once every 13,000 miles, is about once every 100 miles. And Hotz admits the best technology wins. The problem is they have raised over $10 billion, and there’s no way you can right that ship and satisfy those that invested.

He then connects the self-driving market to the scooter market:

Hotz: I think that the level four autonomous ride-sharing vehicles market is going to look a lot like the scooter market, even if the technology does come to exist, which I question. Who’s doing well in that market?

It’s something I hadn’t thought about. I bought into the narrative that the self-driving taxi market is going to upend entire neighborhoods. I figured that an Uber driver probably takes home about 50% of the fare, and with self-driving, you can pass on a good chunk of that 50% to the consumer. And that’s before adding in efficiencies like redesign of the car and more optimized routes.

But it’s also true that self-driving cars will almost certainly be considerably slower. I’ve been watching the Tesla full self-driving videos, and the most common mistake they make is being too timid. Oftentimes the driver has to press the gas to get the car to go at a reasonable pace.

However, it’s a little unfair to compare it to Uber Pool. The primary reason I don’t take Pools is the awkwardness of potentially having to share a car with a stranger, plus the unpredictability. A driverless ride offers the opposite: some privacy and more predictability, even if the ride is a little slower.
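The trade-off discussed above can be put in rough numbers. This is a back-of-the-envelope sketch, not a forecast. The 50% driver share is my guess from earlier and the 20% slowdown is Hotz’s hypothetical. The `pass_through` fraction, the $20 fare and the 15-minute trip are numbers I made up purely for illustration.

```python
def robotaxi_tradeoff(fare, trip_minutes, driver_share=0.5,
                      pass_through=0.6, slowdown=0.2):
    """Return (new_fare, new_trip_minutes) for a driverless ride.

    driver_share: fraction of the fare the human driver takes home (my guess).
    pass_through: fraction of that saving handed to the rider (made up).
    slowdown:     how much longer the driverless trip takes (Hotz's 20%).
    """
    savings = fare * driver_share * pass_through
    return fare - savings, trip_minutes * (1 + slowdown)


# A hypothetical $20, 15-minute ride comes out at roughly $14 and 18 minutes.
new_fare, new_minutes = robotaxi_tradeoff(20.0, 15.0)
```

Under these assumptions the rider pays about 30% less for a trip that takes 20% longer. Whether that is a good deal is exactly the question Hotz is raising.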

As for Waymo, what would Hotz do as CEO?

Hotz: I would get Anthony Levandowski out of jail and I would put him in charge of the company.

Levandowski is the former Waymo exec who was sentenced to 18 months in prison after pleading guilty to stealing trade secrets before he went over to Uber. Hotz still sees self-driving as a problem that is not yet solved, and thinks Waymo and the other companies in this space got corny and corporate way too quickly:

Hotz: If you are pre-revenue and you’ve raised $10 billion, I have no idea. This just doesn’t work. No, it’s against everything Silicon Valley. Where’s your minimum viable product? You know, where’s your users? Where’s your growth numbers? This is traditional Silicon Valley. Why do you not apply it? What, you think you’re too big to fail already?

He does think Tesla is pretty much on the right track, however. He claims comma.ai will be 2 or 3 years behind Tesla, and that’s fine. He’ll be Android to Tesla’s iOS. He figures they’ll come around to his end-to-end system. The only real criticism he has of Tesla is the lack of real driver monitoring, as opposed to the touch sensors on the steering wheel.

But overall, it was a great interview and I highly recommend it. It opened up my eyes to some much needed skepticism on the self-driving market, from someone actually working in the field. I love the idea of software being separated from the hardware and it’s incredible that a product like that exists in the market today.

I have a lot more to say about his interview, and I may write a few more posts on this topic in the future.
