An Open Source Learning Platform For Self-Driving Car Engineers

TareeqAV
Nov 6, 2019 · 6 min read


Ulbrich et al. Towards a Functional System Architecture for Automated Vehicles https://arxiv.org/pdf/1703.08557.pdf

This platform will provide practical, hands-on knowledge and real-world experience that is affordable yet comparable to the real, challenging scenarios of self-driving, to help engineers master the techniques and skills required to be competitive in this very exciting and fast-moving industry.

Most of us (and by “us” I mean students of online self-driving car programs) are itching to become a part of this incredibly fascinating and very challenging technology, but understandably it is very difficult to do so.

Many of us have logged close to a hundred hours, probably more, in simulators like CARLA and Baidu’s Dreamland.

Some of us have picked up an Nvidia TX2 or a Nano and a simple robot car chassis with a camera, and made some very good attempts.

However, none of that has quite lived up to what working on an “actual” self-driving car should feel like!

So this is an open call for all those interested in becoming Self-Driving Car Engineers to come together and build “an open source learning platform” that anyone can use to create an affordable “consumer grade” proof of concept on which to practice the real techniques and challenges of self-driving.

What does “consumer grade” mean in this context? We’ll get to that in a minute.

Existing Open Source Platforms

There are already a number of great open source platforms, like Baidu’s Apollo or Autoware 2.0.

Better simulators, like CARLA or Baidu’s Dreamland.

Amazing datasets, like KITTI, Cityscapes, and most recently and perhaps most notably, Berkeley’s DeepDrive.

However, the scale and complexity of these platforms and datasets can be somewhat daunting for anyone working on them outside of a team or university setting.

If it is not yet evident, I speak from personal experience on most of the above points. I have a pretty good grasp of the perception module in Baidu’s Apollo and have already had several pull requests accepted.

But I want to do more!

Therefore, once again, this is an open call for all those interested in becoming Self-Driving Car Engineers to come together and build an open source platform for self-driving cars using easily obtainable consumer grade hardware.

“Consumer grade” here does not mean this platform will drive your actual car. It does, however, mean that we should be able to get real hands-on experience with many of the tasks of self-driving.

For example: the control module in a self-driving car typically combines a model of the vehicle’s dynamics with a feedback controller such as PID to actuate steering, acceleration, and braking based on inputs from the planning and perception modules.
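To make that concrete, here is a minimal sketch of a PID speed controller. Nothing below comes from an existing codebase: the gains are illustrative, and read_speed/apply_throttle in the usage comment are hypothetical hardware hooks.

```python
# Minimal PID controller sketch for longitudinal (speed) control.
# The gains are illustrative placeholders; on real hardware they
# must be tuned for the specific vehicle.
class PID:
    def __init__(self, kp, ki, kd, output_limits=(-1.0, 1.0)):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = output_limits
        self.integral = 0.0
        self.prev_error = None

    def step(self, setpoint, measurement, dt):
        """Return an actuation command clamped to [out_min, out_max]."""
        error = setpoint - measurement
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        output = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(self.out_min, min(self.out_max, output))

# Hypothetical usage: hold 2.0 m/s with a 20 Hz control loop.
# pid = PID(kp=0.5, ki=0.1, kd=0.05)
# while driving:
#     throttle = pid.step(setpoint=2.0, measurement=read_speed(), dt=0.05)
#     apply_throttle(throttle)
```

Tuning those three gains on a real vehicle, with real actuator lag and real friction, is exactly the kind of lesson a simulator glosses over.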

You can only claim to understand these concepts after you’ve implemented and tested them on something that has actual tire treads, on a bumpy street, going downhill, in the rain; that is, not in a simulator and not on a differential drive robot moving around your living room.

Similarly, you cannot truly appreciate the difficulty of perception until you’ve taken your cameras outside in bright sunlight, on a rainy day, or in the snow!

These are things you just can’t do with a simple robot chassis from Amazon!

And since most of us cannot afford to acquire the hardware necessary to run the Baidu Apollo platform, Nvidia DriveWorks, or any of the other “open source” platforms, we need an alternative, and not a simulator.

Luckily this can be “mostly” achieved with any go-kart, pedal car, golf cart, or similar vehicle that satisfies the following properties:

  • Uses Ackermann steering (see the sketch after this list).
  • Uses tires with treads.
  • Is large enough to drive down your neighborhood street without blocking traffic.
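The promised sketch: the point of Ackermann geometry is that the inner and outer front wheels turn at different angles so that all four wheels roll around a single turning center. Here is a minimal Python sketch of that calculation, with made-up dimensions for a go-kart-sized vehicle:

```python
import math

def ackermann_angles(steer_rad, wheelbase, track_width):
    """Given the angle of a virtual center front wheel, return the
    (inner, outer) front wheel angles that share one turning center.

    wheelbase:   front-to-rear axle distance in meters
    track_width: left-to-right wheel distance in meters
    """
    if steer_rad == 0.0:
        return 0.0, 0.0
    # Distance from the rear axle center to the turning center.
    turn_radius = wheelbase / math.tan(abs(steer_rad))
    inner = math.atan(wheelbase / (turn_radius - track_width / 2))
    outer = math.atan(wheelbase / (turn_radius + track_width / 2))
    sign = 1.0 if steer_rad > 0 else -1.0
    return sign * inner, sign * outer

# Made-up go-kart dimensions: 1.0 m wheelbase, 0.8 m track width.
inner, outer = ackermann_angles(math.radians(20), wheelbase=1.0, track_width=0.8)
print(f"inner: {math.degrees(inner):.1f} deg, outer: {math.degrees(outer):.1f} deg")
```

A differential drive robot never forces you to think about any of this, which is exactly why it matters for the list above.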

You May Be Asking Yourself: Is this guy stupid, crazy, or just naive?!

Andrej Karpathy. Multi-Task Learning in the Wilderness. https://slideslive.com/38917690/multitask-learning-in-the-wilderness

Probably stupid, not yet crazy, and definitely not naive!

The proof is in the diagrams and image above, taken from Andrej Karpathy’s presentation on Multi-Task Learning in the Wilderness. In this talk he discusses the network architecture, training, and teamwork required to solve the “one hundred” or so vision tasks, and how each of these has its own subtasks.

I am including it because I don’t want you thinking I am naive about how challenging this work is. At the same time, it is also a testament to how ill-prepared the Udacity and Coursera courses leave us once we’ve “graduated” from them.

These are very daunting tasks ... and there isn’t even a language for them yet in the literature.

The image on the left (the one with the arrows) depicts only about one tenth of what’s required from the vision/perception module, and above that sit, at a minimum, the localization, planning, and control modules.

But… if we start by building something that drives down your street, stops at the intersection, turns onto the cross street, and makes its way around the block back to your front door, I would consider that a pretty successful start.

And if you repeat that same trip on a rainy or (dare I say) snowy day and learn about the difference between them, then you’ve learned a lesson more valuable than anything Coursera can teach you!

Side Note: Even though I have referenced material from the head of AI at Tesla, I am not interested in the LiDAR vs. no-LiDAR argument. I don’t like people who speak in absolutes. While I do favor a camera-only vision solution, if we are professional, smart, and careful enough, we can design the architecture so that LiDAR is optional: if you want to use it, go ahead, and if you don’t, that’s up to you! Even if this kind of design comes at the cost of some code repetition and inefficiency at inference time, I think that’s OK! This is a learning platform!
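To sketch what I mean by that kind of design (every name below is hypothetical; none of this exists in any repo yet): each sensor pipeline implements one small interface that emits the same detection type, so a LiDAR pipeline can be dropped in, or left out, without touching the downstream modules.

```python
# Hypothetical sensor-agnostic perception interface; the class names
# here are illustrations only, not code from the project.
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List

@dataclass
class Detection3D:
    label: str
    center_xyz: tuple   # meters, in the vehicle frame
    size_lwh: tuple     # length/width/height in meters
    confidence: float

class PerceptionPipeline(ABC):
    @abstractmethod
    def detect(self, frame) -> List[Detection3D]:
        """Turn one raw sensor frame into 3D detections."""

class CameraPipeline(PerceptionPipeline):
    def detect(self, frame):
        # e.g. monocular depth -> pseudo-LiDAR -> 3D detector
        return []  # stub

class LidarPipeline(PerceptionPipeline):
    def detect(self, frame):
        # e.g. a point-cloud detector
        return []  # stub

# Downstream modules consume List[Detection3D] and never need to know
# which sensors produced it.
def fuse(pipelines, frames):
    detections = []
    for pipeline, frame in zip(pipelines, frames):
        detections.extend(pipeline.detect(frame))
    return detections
```

The cost is a little duplication between pipelines; the benefit is that a camera-only build and a camera-plus-LiDAR build are the same platform.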

Maybe Now You’re Asking: What Do You Have So Far?

  • A Slack Workspace. (By Invite Only)
  • A GitHub Org And Repo:
  • — Using Bazel as the build system
  • — Using Docker for distribution
  • — 2D object detection using YOLOv3 and the PyTorch C++ bindings (see the sketch after this list).
  • — 3D object detection.
  • An Agile Board.
  • An Nvidia Xavier.
  • A pedal car. (please hold your mockery until you join Slack :D )
  • An AWS Account. (By Invite Only)
  • A whole lot of research into:
  • — Self-supervised monocular depth perception: from [1] and [2]: [video1] and [video2]
  • — Stereo Depth Perception from [3]
  • — 3D Object Detection from [4] using Pseudo-LiDAR and [5] using LiDAR: [video5]. Note: the author of [5] currently works at Waymo.
  • — Autonomous Systems Software Architecture from [6]
  • — Coursera’s Self-Driving Car Specialization
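To give a flavor of the 2D detection piece mentioned in the list: the repo itself goes through the PyTorch C++ bindings, but a rough Python equivalent, assuming the ultralytics/yolov3 torch.hub entry point and a made-up image path, looks something like this:

```python
# Rough Python sketch of 2D object detection with YOLOv3. This assumes
# the ultralytics/yolov3 torch.hub entry point; the project's own
# pipeline uses the PyTorch C++ bindings instead.
import torch

model = torch.hub.load('ultralytics/yolov3', 'yolov3', pretrained=True)
results = model('street_scene.jpg')  # hypothetical image path

# Each row of results.xyxy[0] is: x1, y1, x2, y2, confidence, class index.
for *box, conf, cls in results.xyxy[0].tolist():
    print(f"{model.names[int(cls)]}: {conf:.2f} at {box}")
```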

Are You Asking: What Will This Cost Me? Why Are You Asking Me For Help?

This will cost you nothing! This is neither a company nor a teaching platform. This is an open source project. Contribute what you want/can and take what you need.

I am asking you because I am not George Hotz and I can’t do it alone!

In all seriousness though, I am very passionate about this technology, as I am sure anyone who has made it down to this paragraph must be. You must also fully believe that the only way to truly learn something is by doing it.

However, the self-driving problem is not something anyone can solve alone.

There are thousands of graduates from Udacity and Coursera who have toiled away nights, losing sleep while working full-time jobs and raising families, hoping they’ll get to work on this amazing technology.

But it’s just not easy to break into this industry without either meaningful real-world experience or a PhD.

I hope that together, we can get the real world experience that we need!

If You’re Asking Yourself: What Skill Sets Do I Need?

That’s the wrong question to ask. This is a learning platform. There should be enough documentation, links, and references (and a little hand-holding in the beginning) to help even the newest coders get started.

That being said, if your goal is to become a self-driving car engineer, there’s no escaping having to learn the following:

  • Python
  • C++
  • Some basics of Computer Vision
  • Deep Learning (at a minimum, Convolutional Neural Networks and LSTM networks)
  • Deep Learning Frameworks: Pytorch / Tensorflow
  • A willingness to keep hammering away at a problem until it starts to make some sense.

I Hope Now You’re Asking: How Can I Join?

You can reach me at the Gmail I set up for the GitHub org. Please allow a day or so for me to get back to you with the Slack and GitHub org access.
