Synthetic data to train and validate in-cabin monitoring systems

Published in Anyverse™, Jun 10, 2022

Why should you seriously consider synthetic data to train and validate in-cabin monitoring systems? What are the advantages of synthetic data versus real-world data to train these systems? And why are many DMS/OMS developers already implementing synthetic data in their data generation pipelines?

If these questions have caught your attention, read on: all of them are answered below.

Why is real-world data not viable for training automotive interior monitoring systems?

As we saw in the previous articles of this series, in-cabin monitoring systems cover several use cases, each with its own data needs.

Collecting the data needed to develop each use case raises serious challenges: privacy issues, lack of variability, insufficient data accuracy, and more. These make developing DMS/OMS with real-world data inefficient, expensive, and in many cases impossible.

Privacy: The first problem comes directly from filming real people in everyday situations.
GDPR: Any system that uses personal information must be GDPR compliant and meet all of its strict requirements.
Children: Closely related to the previous two. Filming children is not easy: the protection of children’s privacy is stricter, and parents are unlikely to let their children ride thousands of miles in a car, potentially in dangerous situations.
Setup variability: target position, camera locations, sensor stack…
Car and environment variability: car brand and interior, material and colors, environment, time of day, weather conditions…
Driver and occupants variability: driver and occupants pose, occupant distribution, age or ethnicity…
Objects and pets variability: parcels or groceries in the front seat/back seat, unsecured pet…
Corner or edge cases: unusual, low-probability situations that your system must still recognize if they ever occur. The real world offers very little data on them.
Expensive and time-consuming: how many resources do you need to drive thousands of miles and recreate, with enough precision and variability, all the scenes needed to train a system that works worldwide?
Not enough data accuracy: real-world data needs to be manually annotated, and manual annotation is error-prone.

Advantages of training your in-cabin monitoring system with synthetic data

Make the privacy issue go away

Training and validating your system with synthetic data means privacy issues no longer affect you. You don’t need to deploy a fleet of cars with real drivers and occupants, and you don’t need to worry about filming children in potentially risky situations, because you can recreate any in-cabin scene synthetically.

Customize variability

Add automatic, customized, or random variability. You can define the data you need to generate depending on the use case and its data needs.

Choose camera locations, the car interior, the types of objects in the scene, their distribution and how the occupants interact with them, environmental conditions, the light in the scene… Reproduce any corner case or situation that is unlikely to occur. The range of possibilities is endless.
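To make the idea of random variability concrete, here is a minimal sketch of how scene parameters could be sampled before rendering. It is not the Anyverse™ API: the option lists, the `SceneSpec` fields, and the 40% seat occupancy probability are all hypothetical values chosen for illustration.

```python
import random
from dataclasses import dataclass

# All option lists below are illustrative assumptions, not Anyverse settings.
CAMERA_LOCATIONS = ["rear_view_mirror", "dashboard", "a_pillar", "overhead_console"]
TIMES_OF_DAY = ["dawn", "noon", "dusk", "night"]
WEATHER = ["clear", "overcast", "rain", "snow"]
OBJECTS = ["parcel", "groceries", "child_seat", "unsecured_pet"]
SEATS = ["driver", "front_passenger", "rear_left", "rear_middle", "rear_right"]


@dataclass(frozen=True)
class SceneSpec:
    """Hypothetical description of one in-cabin scene to be rendered."""
    camera_location: str
    time_of_day: str
    weather: str
    occupied_seats: tuple
    objects: tuple


def sample_scene(rng: random.Random) -> SceneSpec:
    # The driver seat is always occupied; each other seat is filled
    # with an assumed probability of 0.4.
    passengers = [seat for seat in SEATS[1:] if rng.random() < 0.4]
    n_objects = rng.randint(0, 2)  # zero, one, or two loose objects
    return SceneSpec(
        camera_location=rng.choice(CAMERA_LOCATIONS),
        time_of_day=rng.choice(TIMES_OF_DAY),
        weather=rng.choice(WEATHER),
        occupied_seats=tuple(["driver"] + passengers),
        objects=tuple(rng.sample(OBJECTS, n_objects)),
    )


def sample_dataset(n: int, seed: int = 0) -> list:
    rng = random.Random(seed)  # a fixed seed keeps the sampled specs reproducible
    return [sample_scene(rng) for _ in range(n)]
```

Calling `sample_dataset(1000, seed=42)` would yield 1,000 scene descriptions. Replacing the uniform `rng.choice` calls with weighted ones is one way to move from random to customized variability for a given use case.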

Avoid bias in your system

This advantage follows directly from the previous one. Wide variability helps you avoid introducing bias into your system. That matters if you want to develop a competitive interior monitoring system that performs equally well worldwide, no matter the ethnicity of the occupants or the environmental conditions.
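One simple way to check whether a dataset is balanced is to measure how often each attribute value appears and flag the values that fall below a chosen share. The sketch below assumes each sample comes with a metadata dictionary; the `time_of_day` attribute, its values, and the 20% threshold are hypothetical.

```python
from collections import Counter


def underrepresented(samples, attribute, expected_values, min_share):
    """Return the expected values whose share of the dataset is below min_share."""
    if not samples:
        return sorted(expected_values)
    counts = Counter(sample[attribute] for sample in samples)
    total = len(samples)
    # Counter returns 0 for values that never appear, so missing values are flagged too.
    return sorted(v for v in expected_values if counts[v] / total < min_share)


# Hypothetical metadata for ten generated samples.
metadata = (
    [{"time_of_day": "noon"}] * 6
    + [{"time_of_day": "dusk"}] * 3
    + [{"time_of_day": "night"}] * 1
)

gaps = underrepresented(
    metadata, "time_of_day", ["dawn", "noon", "dusk", "night"], min_share=0.2
)
print(gaps)  # ['dawn', 'night']
```

With synthetic data, the flagged values can be fed back into generation to produce more scenes of exactly those kinds.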

Build a more trustworthy and reliable DMS/OMS

In-cabin monitoring is a safety-critical use case. Human safety is at stake, so there is no room for error, and pixel-accurate synthetic data with ground truth can make a difference here.

This means generating physically correct data, automatic annotations (avoiding error-prone manual annotations), and sensor-specific data. Such close-to-real data helps the system interpret the images coming from the real sensors and optical systems and, ultimately, become safer.
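As an illustration of why automatic annotations need no manual labeling, the sketch below derives bounding boxes from a per-pixel instance mask. It assumes the renderer exports such a mask as a 2D grid of integer instance IDs with 0 as background; that format and the function name are assumptions made for this example, not a description of any specific tool's output.

```python
def boxes_from_instance_mask(mask):
    """Map each instance ID to its tight box (x_min, y_min, x_max, y_max).

    `mask` is a 2D list of integers; 0 is treated as background.
    """
    boxes = {}
    for y, row in enumerate(mask):
        for x, instance_id in enumerate(row):
            if instance_id == 0:
                continue
            if instance_id not in boxes:
                boxes[instance_id] = (x, y, x, y)
            else:
                x0, y0, x1, y1 = boxes[instance_id]
                boxes[instance_id] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return boxes


# Tiny hypothetical mask: instance 1 could be the driver, instance 2 a parcel.
mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 0, 2],
]
print(boxes_from_instance_mask(mask))  # {1: (1, 0, 2, 1), 2: (3, 1, 3, 2)}
```

Because the mask comes from the rendered scene itself, the boxes follow the rendered pixels exactly rather than an annotator's judgment.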

Save time, money, and resources

Synthetic data can be generated in unlimited quantities, costs less, takes less time to produce, and brings the added value of automatic, pixel-accurate annotations.

Synthetic data facilitates the development of in-cabin monitoring systems

Synthetic data is widely used in the autonomous vehicle and ADAS industries because of its many advantages, but for automotive interior monitoring the case is even stronger. Privacy problems, lack of variability, high costs, and the difficulty of obtaining images of many different people in different poses and scenes make synthetic data the best option to train and validate a trustworthy in-cabin monitoring system.

About Anyverse™

Anyverse™ helps you continuously improve your deep learning perception models and reduce your system’s time to market by applying new software 2.0 processes. Our synthetic data production platform allows us to provide high-fidelity, accurate, and balanced datasets. Combined with a data-driven iterative process, this helps you reach the required model performance.

With Anyverse™, you can accurately simulate any camera sensor and decide which one will perform best with your perception system. No more complex and expensive experiments with real devices, thanks to our state-of-the-art photometric pipeline.

Need to know more?

Visit our website, anyverse.ai, anytime, or our LinkedIn, Instagram, and Twitter profiles.