Automotive Companies are Tragically Wrong About Their Approach to Data (or: The 7 Steps to AI Epiphany)

Ophir Sarusi
Acerta
Mar 19, 2018

The advancements in vehicle technology in past decades have put a significant strain on the effectiveness of quality control tools (even Statistical Process Control and Six Sigma tools), which still rely heavily on manual analysis. Looking at the recall rates and the costs of warranty claims over the past few years, it is evident that, as system complexity grows, engineers don’t have the right tools to meet ever-increasing quality and safety standards. This is where AI comes in.

By now, AI has affected almost every industry. Journalists and experts have largely hailed it as “the next step in the information age”, marketers are using it to reach ever-more-specific target audiences, pharma companies are using it as a new tool for drug development and studying the human genome, and investors are throwing their money at it. Clearly, AI is revolutionizing various industries, providing never-before-seen capabilities to automate complex processes using large amounts of data. In order to overcome the challenges they face with current processes, automotive companies must take advantage of this technology, and introduce AI into their manufacturing facilities.

Image: iStock.com/BeeBright

More often than not, as part of our process at Acerta, we encounter clients who seek to implement AI capabilities but do not have the rigorous data capture infrastructure in place to allow for it. This is surprisingly common, as many of them do not realize how important data is for the proper implementation of AI capabilities: its quality, its quantity, and even its very structure. This prompted us to develop a procedure by which we support our clients in creating the right process and architecture for data collection and storage, including a practical and tailored benchmark. While in many cases it is possible to deliver some value with existing infrastructure, only after achieving this crucial milestone does it become feasible to harness the full power of AI.

So, what should the first steps be for companies looking to benefit from the power of AI?

1. Get a Data Champion.

Creating an infrastructure for data collection and maintenance is an arduous process. Too often, knowledge of the structure, content, and nature of a company’s data is spread among a huge number of groups and minds throughout the organization. No one speaks on behalf of the data, or on behalf of its quality, fidelity, or organization. Give your data a champion: a person responsible for managing what is sure to be a crucial resource, for understanding the diverse needs of each data-consuming group in the company, from marketing to engineering, and for ensuring that these needs are met. Give this champion an important title (say, Chief Data Officer, or CDO) to truly emphasize their importance to your organization.

2. Store Data on the Cloud.

Many companies have a Bob or an Alice — that person, typically from IT or engineering, who happens to have data from different sources stored locally on their workstation and might just have what you’re looking for, or know where to find it. People from various departments then manually request the data they need from Alice, and hope that that data was stored, and not simply deleted to make space. To avoid these data silos, store all the collected data from all sources in a centralized location on the cloud. Since cloud storage is very cheap, even companies that maintain a large local data center (and diligently store all types of data in it) will see significant cost savings and redundancy advantages on their data storage when moving to the cloud. Lastly, since cloud storage is also extremely secure if done properly (subtle nod to your CDO here), there is absolutely no excuse to not use the cloud to store everything you can get your hands on.

3. Make Sure the Database is Properly Structured.

A key component of data collection is to define a cold storage architecture that maximizes data and information capture across sources over time while minimizing cost. The database must be highly standardized, with a very rigorous structure, in order to reduce data duplication, improve data consistency, integrity, and reliability, and allow for quick and efficient queries. The infrastructure should support integration of historical and new data sources, including the necessary handling of data inconsistencies (which are quite common when aggregating data from different collection systems). Data scientists spend a significant portion of their time “cleaning up” data and getting it ready for “machine consumption”, so this step will not only make data more discoverable but will also save an enormous amount of time and significantly accelerate automation.

To explain the importance of this step, think of this as your “compartmentalized” attic: everything you collect over time you simply shove in there, even if you’re not sure what you’re going to do with it yet. The only twist here (hence “compartmentalized”) is that you make sure that everything is labeled and put into a proper compartment. The idea is that when the kids are older and you recall you once shoved something in the attic, you’ll know what’s in each compartment and where it came from. And to bring it back to the case in point, when you are ready to implement AI capabilities, you’ll recall that time when you collected a heap of data that will now give you a significant advantage.
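As a rough illustration of the "labeled compartments" idea, the sketch below uses Python's built-in sqlite3 module to define a small normalized schema and an ingest function that reconciles one common inconsistency (different units from different collection systems) and ignores duplicate readings. The table names, column names, and the choice of newton-metres as the canonical unit are all assumptions made for this example, not a description of any particular production system.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE source (
    source_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE      -- e.g. a test rig or plant system
);
CREATE TABLE measurement (
    measurement_id INTEGER PRIMARY KEY,
    source_id   INTEGER NOT NULL REFERENCES source(source_id),
    unit_serial TEXT NOT NULL,          -- which part or vehicle was measured
    signal      TEXT NOT NULL,          -- what was measured
    recorded_at TEXT NOT NULL,          -- ISO 8601 timestamp, UTC
    value       REAL NOT NULL,          -- always stored in the canonical unit
    UNIQUE (source_id, unit_serial, signal, recorded_at)
);
"""

# Conversion factors to the canonical unit (assumed here: newton-metres).
TO_NEWTON_METRES = {"Nm": 1.0, "lbf_ft": 1.3558179483}


def open_store(path=":memory:"):
    """Open (or create) the store and apply the schema."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def ingest(conn, source, unit_serial, signal, recorded_at, value, unit):
    """Insert one reading, normalizing its unit and skipping duplicates."""
    if unit not in TO_NEWTON_METRES:
        raise ValueError(f"unknown unit: {unit}")
    # Register the source once; later readings reuse the same row.
    conn.execute("INSERT OR IGNORE INTO source (name) VALUES (?)", (source,))
    (source_id,) = conn.execute(
        "SELECT source_id FROM source WHERE name = ?", (source,)
    ).fetchone()
    # The UNIQUE constraint makes a repeated reading a no-op.
    conn.execute(
        "INSERT OR IGNORE INTO measurement "
        "(source_id, unit_serial, signal, recorded_at, value) "
        "VALUES (?, ?, ?, ?, ?)",
        (source_id, unit_serial, signal, recorded_at,
         value * TO_NEWTON_METRES[unit]),
    )
```

Because every reading records where it came from and is stored in one unit, a later query does not need to know which collection system produced it. A real deployment would likely need a per-signal unit table rather than the single conversion map assumed here.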

4. Get Data. More Data. More.

It’s simple: no data, no AI. Get as much data as possible, from as many sources as possible, and make sure it is all labeled properly. If bandwidth is limited, prioritize the data needed most frequently by the majority of data consumers. The data you are collecting will be used to train artificial neural networks, so make sure you collect a variety of examples: normal system behavior, deteriorating performance, systems with abnormal behaviors such as various failure modes, and any other relevant cases (don’t forget to label them!).
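One simple way to keep an eye on this is to audit label coverage as data comes in. The sketch below counts examples per label, lists samples that have no label at all, and reports which expected categories are still under-represented. The label names and the `min_per_label` threshold are assumptions chosen to mirror the categories mentioned above; a real project would define its own.

```python
from collections import Counter

# Assumed label set, mirroring the kinds of examples listed above.
REQUIRED_LABELS = {"normal", "deteriorating", "failure"}


def label_coverage(records, min_per_label=1):
    """Summarize how well a dataset covers the required labels.

    `records` is an iterable of (sample_id, label) pairs, where label
    is None for samples that have not been labeled yet.
    Returns (counts per label, ids of unlabeled samples, missing labels).
    """
    records = list(records)  # allow generators; we iterate more than once
    counts = Counter(label for _, label in records if label is not None)
    unlabeled = [sample_id for sample_id, label in records if label is None]
    missing = sorted(
        label for label in REQUIRED_LABELS if counts[label] < min_per_label
    )
    return counts, unlabeled, missing
```

Running a check like this on every new batch makes gaps visible early, for example a dataset with plenty of normal runs but no recorded failure modes.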

5. Make the Data Easily Accessible.

Make this data accessible to every department in your company that may benefit from it. Whether it’s engineering, marketing, manufacturing, or customer service, data discovery should be made as simple as possible by breaking silos and eliminating the bottleneck that is the human element. Even without AI in sight, there is still so much to be gained just by implementing this simple step. No more data silos!

6. Create a Hot Storage.

The number one objective of hot storage architecture is to support quick querying of data. This is a critical step for implementing any type of AI, because it allows data scientists to train models properly and in a reasonable amount of time. Where cold storage prioritizes a high degree of data structure normalization, hot storage does the opposite. Hot storage prioritizes the use of highly specialized data structures and should readily trade space for computation time. The hot storage architecture must define how data should be transferred from cold storage to the myriad of different structures inevitably needed by the different data consumers in a large organization, including AI. If you haven’t yet done so in the previous steps, it is recommended that you contract an AI solution provider (preferably one that specializes in solutions for automotive) to help you take care of the subtleties of this step.
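To make the space-for-time trade concrete, here is a minimal sketch of one possible cold-to-hot transfer: normalized rows are pivoted into one time-ordered series per unit and signal, and a few summary statistics are precomputed so that consumers do not recalculate them on every query. The row layout and the function name `build_hot_store` are assumptions for illustration; an actual hot store would be shaped by the access patterns of its consumers.

```python
from collections import defaultdict


def build_hot_store(cold_rows):
    """Pivot normalized rows into one series per (unit, signal).

    `cold_rows` is an iterable of (unit_serial, signal, recorded_at, value)
    tuples, as a normalized cold store might return them (assumed layout).
    """
    series = defaultdict(list)
    for unit_serial, signal, recorded_at, value in cold_rows:
        series[(unit_serial, signal)].append((recorded_at, value))

    hot = {}
    for key, points in series.items():
        # ISO 8601 timestamps in a single timezone sort chronologically.
        points.sort()
        values = [v for _, v in points]
        hot[key] = {
            "timestamps": [t for t, _ in points],
            "values": values,
            # Precomputed summaries: extra space spent to avoid recomputation.
            "mean": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
        }
    return hot
```

The result duplicates information that already exists in cold storage, which is the point: a lookup by unit and signal becomes a single dictionary access instead of a filtered, sorted scan.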

7. It’s AI Time.

If you implemented the previous steps properly, you should by now be ready to finally implement AI capabilities in your company’s process. You have enough data, it is properly stored, structured and labeled, and is easily accessible for fast querying by a machine.

Following these steps might take some time and effort, but it’s necessary. The growing complexity of vehicle systems, along with tools that have reached their peak capabilities, creates a product quality gap. Large amounts of complex data pose a serious challenge for current methods such as SPC, which rely heavily on manual analysis both for fault detection and for root cause analysis.

AI algorithms, on the other hand, thrive on such abundance, and typically perform better as data volume increases. With proper data-management practices, these algorithms can churn through massive amounts of data in search of insight in a way that is simply impossible for humans, providing engineers with never-before-seen tools to obtain invaluable insight. But for this to be achievable, companies must have enough data in a rigorously defined structure to allow for any sort of machine intelligence application. Putting this infrastructure in place will enable you to capture the value of your data using machine intelligence, which is without a doubt crucial to remaining relevant in this extremely competitive industry.
