Use Rydesafely to know the weaknesses of your Neural Network systems

Sanjay Thakur
Published in Rydesafely
6 min read · May 11, 2021

TL;DR

Rydesafely provably helps automotive clients save up to 90% of the cost, at 10x faster speeds, of delivering Neural Network (NN) based software for autonomous systems across all input modalities, use-cases, ODDs, and geographies. Rydesafely does that by proactively figuring out the failure modes of your NN systems; without this knowledge, these systems are set up for failure, which can sometimes mean the difference between life and death. Such failures manifest themselves as mindless collection of millions of miles of driving data, exorbitant labelling costs, thousands of wasted engineering hours, and still no guarantee of safety.

If you want to know more or book a demonstration session, reach out to Tom through his email (tom@rydesafely.com) or Calendly (https://calendly.com/thomas-stuart/30min).

The problem:

Ancient wisdom says that not knowing and acknowledging one’s weaknesses only brings doom. In the year 2021, this has quite literally come true for driving-automation stakeholders, who are currently going through a trough of disillusionment. The weaknesses in this context are so-called edge-cases, which are almost impossible to foresee because neural networks are considered black boxes. The result: millions of dollars spent on mindlessly gathering millions of miles of driving data without any guarantee of safety.

Why is this a problem:

The new wave of neural network based software for automation, let’s call it software-2.0, has unlocked untapped potential for autonomy, but it has also opened the floodgates to unprecedented problems in software testing and validation, a practice that is quite mature for software-1.0. If you wonder why the industry does not simply up the ante and solve this problem, know that traditional software testing works where the logic is known beforehand, not mystically obscured in the weights of a neural network. And if you haven’t seen how that might end up for homo sapiens, watch this video.
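To make the contrast concrete, here is a toy illustration of ours, purely for exposition (the braking-distance function is an invented example, not from any particular codebase). In software-1.0 the expected behaviour is explicit, so a unit test can pin it down; in software-2.0 the logic lives in millions of learned weights, and no equivalent test oracle exists.

```python
# Software-1.0: the logic is explicit, so a test can pin it down directly.
# (Toy example invented for this post.)
def braking_distance(speed_mps: float, decel_mps2: float = 8.0) -> float:
    return speed_mps ** 2 / (2 * decel_mps2)

assert abs(braking_distance(20.0) - 25.0) < 1e-9  # known physics, known answer

# Software-2.0: the "logic" is buried in learned weights. There is no
# equivalent assert for "detects every pedestrian", and that is exactly
# the testing and validation gap described above.
```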

Not accounting for the unique failure modes of NN systems is only going to break them, with no warning

A potential solution

Well, let’s take a step back and think about what made software-1.0 testing so mature: essentially, a first-principles philosophy. It is the ability to know proactively where the logic could falter. Now that’s some progress in the right direction. If we keep thinking, we realize that knowing the weaknesses of a neural network system opens the door to validating it in a guided manner, protecting it from failure modes, and strengthening the entire system. No wonder the ancient wisdom in our context has passed the test of time. And if your objection is that this is all nice theory, and that figuring out a realistic way to do it is the real question, then look no further, because Rydesafely’s first product is all you have been waiting for.

Rydesafely to the rescue:

Rydesafely’s first product is an off-the-shelf, on-premise platform that finds the failure modes in your raw data with respect to your NN system. It works across input modalities, use-cases, and geographies, runs in real time, and processes terabytes of data overnight. These features are intended to make our clients’ lives astronomically easier. Enough words for now; let’s dive into some demos. We will explore three reasons why failure modes show up.

Demo set-up: We use a 2D object detector trained on California roads as our NN system. Green boxes mean safe and red boxes mean a failure mode (a small code sketch of this color convention follows the list of cases below).

Green means safe, red means a failure mode in our demonstrations
  • Case one: Concept Interference
  • Case two: Unmodelled concepts
  • Case three: Cultural variants on going from one geography to another.

Note that we only cover three cases here, to keep the post succinct. To learn the full spectrum of reasons behind NN system failure modes, reach out to Tom either through his email (tom@rydesafely.com) or Calendly (https://calendly.com/thomas-stuart/30min).
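For readers who want to mirror the demo’s visual convention in their own tooling, here is a minimal sketch that draws detector boxes in green or red from a per-box failure flag. The OpenCV calls are standard; the flags themselves would come from a platform like ours and are simply assumed here.

```python
import cv2

GREEN, RED = (0, 255, 0), (0, 0, 255)  # BGR color order, as OpenCV expects

def draw_flagged_boxes(image, boxes, flags):
    """boxes: list of (x1, y1, x2, y2) pixel corners; flags: True where a
    box is a failure mode (drawn red), False where it is safe (green)."""
    for (x1, y1, x2, y2), is_failure in zip(boxes, flags):
        color = RED if is_failure else GREEN
        cv2.rectangle(image, (x1, y1), (x2, y2), color, thickness=2)
    return image
```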

That’s impressive, but does it work with other input modalities, like LiDAR, and with other tasks?

A resounding YES. Our product works with other input modalities and tasks as well. Let’s quickly show you similar demos with LiDAR point clouds, without going too deep. Again, a green box means safe and red means a failure mode.

In what form is your product available?

Our product comes in three forms for our clients:

  1. ROS package,
  2. Python package (a usage sketch follows this section), and
  3. A full-fledged front-end.

All versions of our product offer similar features. We offer a sneak peek of our front end below, with some of its features.

Full front-end demonstration is available on request. Reach out to Tom either through his email (tom@rydesafely.com) or Calendly (https://calendly.com/thomas-stuart/30min) for it.
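To give a flavour of the Python-package form mentioned above, here is a hedged usage sketch. Everything in it is an assumption for illustration only: the module name rydesafely, the Platform class, and every method shown are hypothetical stand-ins, not our documented API.

```python
# Hypothetical usage sketch only: the "rydesafely" module, Platform class,
# and all method names below are illustrative stand-ins, not the real API.
import rydesafely as rs

platform = rs.Platform(model="my_2d_detector.onnx")  # your NN system
report = platform.scan("raw_drive_logs/")            # raw data, no labels needed

for frame in report.failure_modes:                   # the "red boxes"
    print(frame.path, frame.reason)

report.export("failure_modes.json")                  # feed back into labelling
```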

That’s quite a lot to digest, do you have a summary table?

Again, a resounding YES. After all, Rydesafely is all you have been looking for.

I am intrigued, this is like magic. How do you do it? Tell me more

We do that using our patented technology, which understands the intricacies of the knowledge hidden in a neural network in a way that can be compared against incoming raw data to separate failure modes from safe cases. The best part: it doesn’t need ground-truth data to do that at all.
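Our exact method is patented and stays under the hood, but the family of ideas is easy to illustrate. One well-known generic approach is to characterise the feature space the network learned from its training data and flag incoming inputs whose features fall far outside it; the sketch below does that with a Mahalanobis-distance score over penultimate-layer features. It is our illustration of the general idea, not the patented algorithm itself.

```python
import numpy as np

# A generic out-of-distribution score over the network's own features.
# This illustrates the *family* of ideas only; it is NOT Rydesafely's
# patented method.
def fit_feature_stats(train_features: np.ndarray):
    """train_features: (n_samples, n_dims) penultimate-layer activations."""
    mean = train_features.mean(axis=0)
    cov = np.cov(train_features, rowvar=False)
    cov_inv = np.linalg.pinv(cov)              # pseudo-inverse for stability
    return mean, cov_inv

def mahalanobis_score(feature: np.ndarray, mean: np.ndarray,
                      cov_inv: np.ndarray) -> float:
    d = feature - mean
    return float(d @ cov_inv @ d)              # large => far from training data

def flag_failure_modes(features: np.ndarray, mean, cov_inv, threshold: float):
    """No ground-truth labels needed: each raw input is compared only
    against what the network saw during training."""
    return [mahalanobis_score(f, mean, cov_inv) > threshold for f in features]
```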

That looks good, but how do you tell that these red boxes are real failure modes?
A very good question indeed. We cross-examine what our product flags as a failure mode against ground-truth data, to see whether the neural network system actually goes on to commit the mistake our platform warned about. For this purpose, we segregate the failure modes and the safe cases that our platform identifies into two different sets. We then compute mAP, the standard evaluation metric for 2D object detection, on each set using ground-truth data. We use the Berkeley Deep Drive training data to create the NN system and its validation data to come up with the numbers below.
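For readers who want to reproduce the shape of this cross-check, here is a minimal sketch using torchmetrics’ mAP implementation: compute mAP separately on the set flagged as failure modes and on the set flagged safe, then compare. Loading the data and running the detector are elided; each split is assumed to be a list of (predictions, targets) pairs in torchmetrics’ detection format.

```python
from torchmetrics.detection.mean_ap import MeanAveragePrecision

def map_on_split(split):
    """split: list of (preds, targets) pairs in torchmetrics' detection
    format (dicts of boxes / scores / labels tensors per image)."""
    metric = MeanAveragePrecision()
    for preds, targets in split:
        metric.update(preds, targets)
    return metric.compute()["map"].item()

# failure_set / safe_set: the frames our platform flagged red vs. green,
# paired with ground-truth boxes purely for this cross-check (loading the
# Berkeley Deep Drive data and running the detector is elided here).
# map_failure = map_on_split(failure_set)
# map_safe = map_on_split(safe_set)
```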

Analysis suggests a clear correlation between what our product identifies as weaknesses and the weaknesses suggested by the ground-truth data.

Can you prove any commercial value for your clients right away?
Some of our clients are actively using our product to achieve data-efficiency when making their existing NN systems work on new datasets, use-cases, geographies, and ODDs. To give our clients more confidence, we ran another experiment to check whether the identified weaknesses also represent the minimal data needed to do well in a new setting. For this purpose, we trained a COCO base model for 2D object detection and measured the data required to reach good performance on Berkeley Deep Drive. The results are in the table below.
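As a sketch of that experiment’s shape: start from a COCO-pretrained detector and fine-tune it only on the frames the platform selected. The version below uses torchvision’s Faster R-CNN for illustration; the selected_loader (the roughly 14% subset) and the training-loop details are assumptions, not our exact training recipe.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

def finetune_on_selected(selected_loader, epochs: int = 1):
    """selected_loader yields (images, targets) for only the ~14% of BDD
    frames that the platform flagged as covering the model's weaknesses."""
    # COCO-pretrained base model, as in the experiment described above.
    model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
    model.train()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
    for _ in range(epochs):
        for images, targets in selected_loader:
            loss_dict = model(images, targets)  # dict of detection losses
            loss = sum(loss_dict.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```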

Analysis again suggests that training on a mere 14% of the data, as our platform suggests, yields performance as good as that obtained using all the data.

The commercial value here is the savings from reduced annotation and training requirements.

Is anyone using your product already?

Don’t just trust us; trust our customers. Here are some customer testimonials.

Endnotes:

In this blog post, we walked you through the major reason why the driving-automation industry has hit the trough of disillusionment and how Rydesafely is playing a major role in overcoming it. We showed you demonstrations of our work and publicly verifiable numbers to back up the claims of our patented technology. It helps you not only save up to 90% of your costs but also deliver your software at 10 times the speed you otherwise would. And we do this through an off-the-shelf platform that runs entirely on your premises, preempting any chance of your data and IP leaking out.

If you want to know more or book a demonstration session, reach out to Tom through his email (tom@rydesafely.com) or Calendly (https://calendly.com/thomas-stuart/30min).
