5 Data Engineering challenges in the Connected Car industry

Laurence Hubbard
toyotaconnected
6 min read · Jul 29, 2019

Data is the new oil and it will fuel the next generation of mobility solutions, but what roadblocks lie in our path to success? #CarPunsAreCool

Photo by Sam Loyd on Unsplash

Toyota Connected Europe has only recently come into being, but the Connected Car industry has existed, in one form or another, since 1996. Starting as a bonus feature of the traditional vehicle-ownership model, the ability to plug transportation into your digital world is quickly becoming an overwhelmingly powerful concept, perhaps even more powerful than owning a car, and part of what Toyota acknowledges is a “once-in-a-century transformation”.

Europe, in particular, is seeing a revolution in the mobility space, with an invasion of scooters, changing energy consumption patterns and a plunge in vehicle ownership that is forcing factories to shut.

Leveraging data is clearly going to play a huge part in this evolving industry. We need to understand customer demands and provide the best possible mobility service instead of just the best possible vehicle. To do this we need to detect, gather and process that data into information, which is the trade of Data Engineering, but we’ve discovered it won’t be easy.

Here are 5 of the data engineering challenges we have identified in the world of connected cars:

  1. Balancing value with costs
  2. Software engineering principles in a hardware-focused ecosystem
  3. Data Quality & IoT maturity
  4. Personalisation
  5. Security & Privacy

Balancing Value with Costs

We have a chicken-and-egg problem with regard to optimising our data point selection.

Even with only a handful of sensors, connected vehicles can produce huge volumes of data. Unlike in some parts of the IoT industry, these signals are produced by a piece of hardware that is always on the move around a city, or even a country. This means the data has to travel over a third-party telecoms network to be collected, which makes it extremely expensive.

So we just need to limit the data points and the frequency of updates to what is strictly necessary to produce value, right? But how do we know where the value lies if we haven’t collected the data at scale and analysed it?
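To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python of how the number of data points and the update frequency drive data volume. Every figure in it (signal count, payload size, sampling rates, hours driven and the price per megabyte) is a made-up assumption for illustration, not a Toyota number or a real network tariff.

```python
def monthly_megabytes(num_signals, bytes_per_sample, sample_hz,
                      driving_hours_per_day, days=30):
    """Estimate uplink volume per vehicle per month, in megabytes (10^6 bytes)."""
    samples_per_signal = sample_hz * 3600 * driving_hours_per_day * days
    total_bytes = num_signals * bytes_per_sample * samples_per_signal
    return total_bytes / 1_000_000


def monthly_cost(megabytes, price_per_mb):
    """Network cost for one vehicle, given a flat (hypothetical) price per MB."""
    return megabytes * price_per_mb


if __name__ == "__main__":
    # Assumed values: 50 signals, 8 bytes per sample, one hour of driving a day,
    # and an invented price of 0.01 currency units per MB.
    for hz in (10, 1, 0.1):
        mb = monthly_megabytes(50, 8, hz, 1)
        print(f"{hz} Hz -> {mb:.1f} MB/month, cost {monthly_cost(mb, 0.01):.2f}")
```

Under these assumptions, dropping from 10 Hz to 1 Hz cuts the volume from 432 MB to 43.2 MB per vehicle per month. The sketch ignores protocol overhead and compression, and it says nothing about which signals are worth keeping, which is exactly the part we cannot know up front.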

(Image: this puppy is considering our conundrum.)

One of the impressive ways we’ve decided to tackle this problem is to take a subset of cars that happen to have a wider-than-normal range of sensors and very enthusiastic owners, and to run an end-to-end proof of concept on them.

These are, of course, the cars of Toyota’s World Rally Championship team, and our aim is to maximise their chances of winning rallies through the use of connected car data. This has allowed us to go through multiple phases of data analysis, with race-by-race updates acting as sprint cycles. Unfortunately, this doesn’t cover all mobility product use-cases in Europe, but it has pushed us to explore the data in greater depth and to gain invaluable experience in this connected world.

Software Engineering Principles in a Hardware-Focused Ecosystem

Both manufacturing cars and writing software are complex tasks but are traditionally tackled in significantly different ways. Building a connected car involves both. Successful companies organise themselves and their processes around these challenges, but where are the overlaps and where are the gaps between building cars and writing the software?

Engineering principles overlap significantly, with value placed on quality, security, usability, reducing repetition… and the list goes on.

One of the key gaps, however, lies in delivery schedules and the resulting planning and workflow methodologies. For example, for 2019 BMW have already planned out their delivery schedule, whereas Uber are continuously releasing updates to their platform. Adjusting an organisation to adapt to this new way of thinking will take time and is recognised as a key to success in this space.

Stuck right in the middle of this hardware schedule and mobility product backlog is the flow of data, and subsequently the task of engineering that data. The ability of an OEM to build feedback loops around data quality, the range of data points and even the required sensors will be a major litmus test for success, and the velocity of these feedback loops will need to be defined by the customer.

Data Quality & IoT maturity

The IoT space is gaining momentum, but there is still a long way to go. The lack of maturity of the devices, particularly in the automotive industry, inevitably results in poor data quality and unreliable assumptions.

A large number of the connected features being released rely on proactively asking the customer for information instead of sensing it, such as Tesla’s dog mode.

“Dog mode” activation simply results in a Boolean data point, which is easy to process, if not always reliable. GPS accuracy issues are harder to overcome, particularly in crowded cities with tall buildings and underground car parks.

Consider the seemingly simple problem of defining an accurate location for a vehicle as part of a car-sharing scheme. The customer needs to find a vehicle that they didn’t park themselves. Cross-referencing a GPS path with a real-world road layout enables you to correct each GPS point and apply a correction coefficient to predict missing points, but this needs specific tools and significant processing power and time.
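As a minimal sketch of the "correct each GPS point against the road layout" idea, the Python below snaps a single point to the nearest road segment. It is a deliberate simplification: it assumes coordinates have already been projected to a flat x/y grid in metres, it treats roads as straight segments, and the function names are ours for illustration. Production map matching typically looks at the whole path rather than one point at a time, which is where the processing cost comes from.

```python
from math import hypot, inf


def project_onto_segment(point, start, end):
    """Return the closest position to `point` on the segment start-end."""
    px, py = point
    ax, ay = start
    bx, by = end
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return start  # degenerate segment: both ends are the same position
    # Fraction along the segment, clamped so we never leave the road.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_to_road(point, segments):
    """Snap a GPS fix to the nearest segment; returns (snapped_point, error_m)."""
    best_point, best_dist = None, inf
    for start, end in segments:
        candidate = project_onto_segment(point, start, end)
        d = hypot(point[0] - candidate[0], point[1] - candidate[1])
        if d < best_dist:
            best_point, best_dist = candidate, d
    return best_point, best_dist


if __name__ == "__main__":
    # Hypothetical L-shaped road: 100 m east, then 100 m north.
    roads = [((0, 0), (100, 0)), ((100, 0), (100, 100))]
    print(snap_to_road((40, 7), roads))
    print(snap_to_road((108, 50), roads))
```

Even this naive version has to check every segment for every point, so a real road network needs a spatial index before it becomes practical.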

Personalisation

Personalisation is a powerful tool for engaging customers and enhancing their experience with your product. Allocating identifiers and linking together data sets to enable this personalisation lies within the art of data engineering. But while most of the tech industry is taking personalisation to the “next level”, the car manufacturers don’t even know who is driving to start with.

Unlike logging into your favourite social media platform or checking out an online shopping basket, unlocking a car door and starting an engine doesn’t require user-level authentication. Of course, this is slowly changing with mobility solutions coordinated by mobile apps and smart key boxes, but this underlying core issue remains for the majority of vehicles.

There is a possible solution though: using machine learning to take into account the nuanced differences in how we interact with the vehicle controls. This can be mapped out into a unique driver fingerprint, in a similar way to a walking gait.
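To illustrate the fingerprint idea in its simplest possible form, here is a Python sketch that averages per-trip driving features into one fingerprint per driver and then assigns a new trip to the closest fingerprint. This is a nearest-centroid toy, not the approach we have built: the feature choice (for example brake pressure, throttle position and steering rate, each scaled to the range 0 to 1) and all function names are assumptions, and a real system would need far richer features and a properly evaluated model.

```python
from math import dist
from statistics import mean


def build_fingerprints(trips_by_driver):
    """Average each driver's trip feature vectors into a single fingerprint."""
    fingerprints = {}
    for driver, trips in trips_by_driver.items():
        # zip(*trips) groups the values feature by feature across all trips.
        fingerprints[driver] = tuple(mean(column) for column in zip(*trips))
    return fingerprints


def identify_driver(trip_features, fingerprints):
    """Return the driver whose fingerprint is closest (Euclidean) to the trip."""
    return min(fingerprints,
               key=lambda driver: dist(trip_features, fingerprints[driver]))


if __name__ == "__main__":
    # Invented, pre-scaled features per trip:
    # (mean brake pressure, mean throttle position, mean steering rate)
    history = {
        "driver_a": [(0.2, 0.6, 0.1), (0.3, 0.7, 0.2)],
        "driver_b": [(0.8, 0.3, 0.6), (0.7, 0.2, 0.5)],
    }
    prints = build_fingerprints(history)
    print(identify_driver((0.25, 0.65, 0.15), prints))
```

Note that a sketch like this always returns one of the known drivers, so it cannot say "someone new is driving" without an additional distance threshold.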

Security & Privacy

Two key principles which are more important than product quality are security and privacy, and they impact all of the above challenges as well.

From a security perspective, every responsibility that an IoT device is allocated is one which must then be protected. Smart keys, remote air conditioning and remote engine ignition are all examples of features which, if hacked, could not only cause damage or financial loss but also pose a threat to life. We must ensure every physical device and every network jump is secured to the highest standards.

From a privacy perspective, the governance landscape is evolving quickly, with GDPR in Europe, CCPA in California and surely more to come. A connected car has the ability to collect an overwhelmingly large variety of potentially intrusive data points about a customer. With Toyota Connected Europe’s products quickly becoming of interest to customers all over Europe, ensuring the highest possible implementation standards for customer data rights is an essential part of our platform growth. There is huge interest in how this is being implemented across mobility solutions in Europe and we hope to be best in class.

This emerging industry is providing us with a massive and exciting challenge, not just in the area of Data Engineering, and we can’t do it alone.

Are you inspired by challenge?

Want to find out more?

Join us! → https://toyotaconnected.teamtailor.com/jobs

Laurence Hubbard
Technical Lead, Data Engineering, Toyota Connected Europe