jamyn
Jul 12, 2018

Under the hood — the dirty truth about machine learning

Dash’s new predictive maintenance functionality, built with machine learning and human labor

You may not know it, but every time you type the blurred numbers from a Google Street View screenshot to prove you are not a bot, you are training Google’s computer vision models. Each crowd-sourced keystroke is one more human-generated data point to train their machine learning models — and they are getting it for free.

In all the excitement around machine learning and AI, it is sometimes overlooked that, at this early stage in their evolution, they depend not just on computing power but also on manual human labor. It takes millions and millions of hours of human effort to generate and tag the data and to train the models. Then more to test, iterate, reinforce. Human labor, human error, human bias, all reflected in a model that will never be complete.

And sometimes, that labor can land you in hospital, as the startup team from Dash found out.

Jamyn Edis, CEO / co-founder at Dash with resident mechanic, Ray

Dash, founded in 2014, was previously known for its ‘Fitbit for driving’ products. Today, while we continue to offer our consumer apps to over 450,000 users worldwide, we are focused on how this driving data can be leveraged in the enterprise for predictive maintenance and for training risk models for the autonomous future. The company has been quietly building out its Vehicle Intelligence Platform for enterprise business, working with organizations from Ford to the Department of Transportation, across the automotive, insurance, fleet, and smart city verticals.

Last year, Dash was approached by Johnson Controls, a $35bn Milwaukee-based power solutions company that manufactures 150 million car batteries every year. Johnson Controls has recently invested heavily in IoT, including the launch of Glas, its smart thermostat powered by Microsoft’s Cortana. The company challenged the Dash team to build an algorithm that predicts when car batteries will fail, using Dash’s remote diagnostics vehicle data and machine learning capabilities.

Within three months, the team had built a hybrid Bayesian machine learning model that achieved over 85% accuracy (versus industry-standard tools with a 60% confidence level).

Brian Langel, CTO / co-founder at Dash

So, how did we do it?

“Hours of back-breaking manual labor,” laughs Dash’s Chief Data Scientist, Professor Sam Hui.

To build the model, the team needed to source hundreds of car batteries, in varying sizes and states of health (from brand new to nearly dead), then test them in hundreds of vehicles of different makes, models, and years. Each test involved driving tightly scripted 20-minute routes, both with and without a full electrical load (e.g. radio, seat heater). And they did this in both hot and cold climates, in Europe and North America. As Brian Langel, CTO and co-founder, put it, “It included driving in the dead of Wisconsin winter with AC on full blast. Not fun. Especially when coupled with the manual labor of adding and removing hundreds of 40-pound car batteries.”

That’s how Dash’s CEO and co-founder, Jamyn Edis, ended up in the hospital, getting steroid injections in their back followed by hours of grueling physiotherapy. “Training machine learning models can be painful,” said Edis.

The hard work paid off, and now the team at Dash is working on half a dozen more predictive maintenance models around filtration, spark plug replacement, tire wear and more. In addition, we have built algorithms that use vehicle data both to predict actuarial risk for insurance and to identify individual drivers based on driving style, which is useful for fleets.

And we are just getting started on the opportunity. What most excites us is the renewed focus on recruiting enterprise drivers, to increase the velocity of data gathering, as well as using that data to train autonomous vehicles on how to react to human drivers. The latter is especially important, as human and robot drivers will likely share the roads for two decades before full Level 5 automation is ubiquitous.

“While we wait for the future, we have to service cars on the road today,” said Edis.

Dash’s Vehicle Intelligence Platform