Autonomous driving bus at the Weltenburg monastery in Bavaria, Germany (Unsplash)

System Design for Autonomous Vehicle Systems — Part 1

Anayo Samson Oleru

--

In various parts of the world, city streets are witnessing an exciting evolution with the presence of Autonomous Vehicles (AVs), or driverless cars. These futuristic vehicles are a marvel of technology, relying on an array of sophisticated sensors, cameras, and radar. They harness the power of artificial intelligence (AI) and draw upon vast pools of data from image recognition systems, machine learning, and neural networks. This innovative integration enables them to navigate from one destination to another with complete autonomy, eliminating the need for a human driver. It’s a glimpse into a high-tech future, happening right now on our city streets.

Figure uploaded by Sara El Hamdani

Suppose we have a hypothetical client, which we’ll refer to as SWV (SmartWheels Ventures), a name I’ve created solely for illustrative purposes in this discussion. SWV is an imaginary company specializing in the development of autonomous systems for vehicles 🚘. Their current objective is to develop a robust computer system capable of reliably collecting telemetry data from their vehicles and effectively displaying information about these vehicles.

Telemetry is the automatic recording and transmission of data from a remote source to an IT system in a different location for monitoring and analysis purposes.

Gathering telemetry data from vehicles is crucial for SmartWheels to gain insights into issues like malfunctions, breakdowns, accidents, etc. By analyzing this data, they can enhance vehicle design and functionality, leading to the creation of even more advanced autonomous vehicles.

This computer system holds significant importance for SWV, as they will depend heavily on it for crucial insights and operations. Therefore, in this article, we will focus on developing a system design or software architecture tailored to this essential system.

Understanding the requirements

The first step we will take in designing this system is to understand what SWV wants.

Functional Requirements
SWV's system analyst has outlined the essential requirements for the system, which include:

  • The system should be capable of gathering real-time telemetry data from cars. This covers a range of information, such as vehicle breakdowns, speed, location, accidents, and more.
  • All received data must be stored in a persistent storage system.
  • There should be a dashboard to effectively summarize and present the collected data.
  • The system is expected to perform a detailed analysis of the data, extracting key metrics like the car’s speed, frequency of breakdowns, etc.
  • The end users should be able to view the data in the system via a web-based platform, primarily used through browsers.

The system won’t have a large user base. Its primary users will be data analysts, who will examine the incoming data and derive valuable insights from it.

Non-functional Requirements
Based on the functional requirements identified, it’s clear that the system will handle extensive data. We’re not just dealing with telemetry data from a single vehicle; there will be a significant influx of data from multiple vehicles. This implies a substantial amount of data that needs to be stored and processed efficiently.

Additionally, the requirement to display real-time data about the vehicles highlights the critical importance of performance in this system. High-speed processing and data handling capabilities will be essential to meet this demand.

To make sure we thoroughly understand the requirements, we also made sure SWV answered the following questions:

  1. How many users will be using the system at the same time?
    SWV answered: 12
    (Not many, which is good for us 🙂.)
  2. How many telemetry data points are received every second?
    (A very important question to accurately gauge the system’s load.)
    SWV answered: 8,000
    (That is a significantly high volume.)
  4. What is the average size of each data point?
    SWV answered: 1 KB
  4. Is there a specific structure of the data?
    (To help us know how to access the data and what kind of queries we need to do on the data.)
    SWV answered: No
  5. If some data is lost in the process of transporting or saving it, can that be tolerated?
    SWV answered: Maybe
  6. What should the uptime of the system be?
    SWV answered: Highest possible

Based on SWV’s responses, the fact that there are only 12 concurrent users is relatively manageable, which works in our favor 👍.

However, receiving 8,000 telemetry data points per second is a considerable amount. This brings us to the need to plan carefully how to handle such a volume of data effectively without overburdening the system.

Then, the lack of a specific data structure implies that we cannot predetermine the exact format of the incoming data. Therefore, our system must be flexible enough to accommodate any data structure that it encounters.

In the context of telemetry transmitted every second, a single data point lost in transit typically does not pose a significant issue: we already know the car's position and state from a second or a minute before the loss, and fresh data will arrive in the subsequent second or minute. The loss of a single data point over such a brief window is therefore not substantial, and minor data losses of this kind are generally tolerable.

As for the question about system uptime, SWV's desire for the highest possible uptime is realistic and beneficial. It works in our favor that they understand the impracticality of a 100% uptime guarantee; instead, they are asking us to design the system so that it delivers the highest uptime achievable.

Data Volume
Given the responses from SWV to questions 2 and 3, we need to calculate the expected data volume based on their specifications.

The average size of each data point = 1 KB
Data per second = 8,000 data points -> 8 MB
Data per hour = 28,800 MB -> approximately 28.13 GB
Data per day = ~675 GB
Data per year = ~240.60 TB

To elaborate:

  • With 8,000 pieces of data being received every second, we accumulate 8 MB per second.
  • This means in one hour, the data volume totals 28,800 MB, or approximately 28.13 GB.
  • On a daily basis, we expect to handle around 675 GB of data.
  • Over the course of a year, this would amount to approximately 240.60 TB.
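As a quick sanity check, the short Python script below reproduces these figures. It follows the same rounding as above: 8,000 KB per second is treated as 8 MB, while 1,024 MB make a GB and 1,024 GB make a TB.

```python
# Sanity check of the volume estimates, using the article's rounding:
# 8,000 KB/s -> 8 MB/s, then binary units (1 GB = 1,024 MB).
mb_per_second = 8_000 * 1 / 1_000        # 8,000 points x 1 KB each -> 8 MB/s
mb_per_hour = mb_per_second * 3_600      # 28,800 MB
gb_per_hour = mb_per_hour / 1_024        # ~28.13 GB
gb_per_day = gb_per_hour * 24            # ~675 GB
tb_per_year = gb_per_day * 365 / 1_024   # ~240.60 TB
gb_per_week = gb_per_day * 7             # ~4,725 GB (relevant for retention later)

print(gb_per_hour, gb_per_day, tb_per_year, gb_per_week)
# prints: 28.125 675.0 240.6005859375 4725.0
```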

That’s an astonishingly large amount of data! 😲 No ordinary database at this stage can deal with such a volume without special, dedicated tuning. Therefore, we must consider the data retention period before planning to store such massive quantities. Data at this scale can degrade query performance and make the database difficult to maintain, which would affect the system immensely.

The concept of a retention period is key in determining the duration for which data should be stored in a database. Once this period elapses, the data can either be deleted or transferred to an archival storage system, which is not used for day-to-day operations. The specific approach to handling data post-retention depends on the organization’s needs and the nature of the data. Cloud platforms like AWS, Google Cloud, and Azure offer support for configuring data retention, allowing organizations to manage their data lifecycle effectively.

For our scenario, to establish the data retention period, we need to know the kind of data we're storing. According to the requirements provided by SWV, they store two kinds of data: real-time telemetry data and aggregated (ready-for-analysis) data. While the former is real-time, the latter isn't, but it is used by various third-party Business Intelligence tools.

Following our discussions with SWV about the effects of storing substantial data volumes on system performance, we arrived at these decisions:

  • Real-time telemetry data should be kept for one week.
  • Storing telemetry data for more than a week is unnecessary.
  • Aggregated data will be stored indefinitely.

The rationale for indefinitely preserving aggregated data is that SWV requires comprehensive data for precise analysis and future predictions. Therefore, our system design must ensure that the storage of aggregated data does not hinder the system’s performance.
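To make the retention decision concrete, here is a minimal sketch of what a scheduled cleanup job could look like. SQLite stands in for the real operational and archive databases, and the table and column names are assumptions invented for illustration; a production system would more likely rely on its platform's built-in lifecycle or partitioning features.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

def archive_expired_telemetry(op_db: sqlite3.Connection,
                              archive_db: sqlite3.Connection) -> None:
    """Move telemetry rows older than one week into the archive database."""
    cutoff = (datetime.now(timezone.utc) - timedelta(days=7)).isoformat()
    # Copy expired rows to the archive first, so nothing is lost mid-move.
    expired = op_db.execute(
        "SELECT id, received_at, payload FROM telemetry WHERE received_at < ?",
        (cutoff,),
    ).fetchall()
    archive_db.executemany(
        "INSERT INTO telemetry_archive (id, received_at, payload) VALUES (?, ?, ?)",
        expired,
    )
    archive_db.commit()
    # Only after the archive commit do we delete from the operational store.
    op_db.execute("DELETE FROM telemetry WHERE received_at < ?", (cutoff,))
    op_db.commit()
```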

Revising our calculations to accommodate the new one-week retention period significantly reduces our storage requirements. Here’s a revised breakdown of the data volumes:

The average size of each data point = 1 KB
Data per second = 8,000 data points -> 8 MB
Data per hour = 28,800 MB -> approximately 28.13 GB
Data per day = ~675 GB
Data per week = ~4,725 GB -> approximately 4.6 TB

The key adjustment here is:

  • Over a week (no longer considering a yearly calculation, as telemetry data won’t be retained beyond a week), the data volume will be approximately 4,725 GB, equating to around 4.6 TB.

This is a far more manageable scenario. While roughly 4.6 TB is still a sizable amount of data, it represents the maximum we need to accommodate, making it a technically feasible challenge.

With this new storage estimate in hand, let’s proceed to map out the system components.

Mapping the components

Because of the nature of this project, we will map a single service to a single entity; a single service or component shouldn't handle multiple entities, because that would make the application difficult to manage and maintain.

Components Map

According to the diagram, the Vehicles are the source of the data. We have no control over the vehicles, so the vehicle component is greyed out.

The Telemetry Gateway component simply collects the incoming data and feeds it into the pipeline, doing nothing beyond this basic handling. Given the rate, size, and variety of the incoming data, the load on the system will be heavy, and that load falls first on the Telemetry Gateway; keeping the gateway minimal enhances performance and prevents crashes caused by overburdening. This component does nothing else: no data validation, no processing, no storage. It only receives data from vehicles and passes it down to the pipeline.
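To illustrate just how thin this component is, here is a minimal sketch in Python. The in-process queue stands in for the real Telemetry Pipeline, and the HTTP endpoint is only an assumption; the actual communication protocol is the subject of Part 2.

```python
import queue
from http.server import BaseHTTPRequestHandler, HTTPServer

pipeline = queue.Queue()  # stand-in for the real Telemetry Pipeline

class TelemetryGateway(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = self.rfile.read(length)  # raw bytes; no parsing or validation
        pipeline.put(payload)              # hand off immediately to the pipeline
        self.send_response(202)            # 202 Accepted: queued, not yet processed
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), TelemetryGateway).serve_forever()
```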

The Telemetry Pipeline component acts as a queue system, organizing and lining up the telemetry data in preparation for further processing.

The Telemetry Processor component monitors the queue and, upon detecting an update, retrieves data from the queue. It then validates each row of data before storing it in the operational database.
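A sketch of that loop, continuing the queue from the gateway example above. Treating each payload as JSON is an assumption (SWV specified no fixed structure), and `save_to_operational_db` is a hypothetical helper standing in for the real database write.

```python
import json

def process_forever(pipeline, save_to_operational_db) -> None:
    while True:
        payload = pipeline.get()          # blocks until new data arrives
        try:
            record = json.loads(payload)  # schema-less: accept any JSON shape
        except ValueError:
            continue                      # per SWV, occasional bad or lost data is tolerable
        save_to_operational_db(record)    # persist the validated row
```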

The Telemetry Viewer component accesses the operational database to retrieve data, which it then presents in formats specified by the end users.

The Data Warehouse serves as the storage location for the aggregated data we previously mentioned.

We store the two types of data in different databases because the aggregated data, intended for analysis and report generation, places a significant load on a database. By segregating it, we prevent this load from impacting the operational data.

The Business Intelligence application is designed to interact solely with the data warehouse, extracting data to produce reports and analyses. It is not linked to the operational database and operates exclusively with the information stored in the data warehouse.

The Archive Database is designated for storing data that has exceeded its retention period. It is expected to accommodate large volumes of data. Performance concerns are minimal for the archive database since it is never queried directly. Instead, if archived data is needed, the relevant records are copied back into the operational database for querying; there are no direct query interactions with the Archive Database.

In the next part, Part 2, we will discuss which communication protocols the Telemetry Gateway, Pipeline, and Processor will use, what their application types and technology stacks will be, and how we will manage their architecture and ensure redundancy.

Part 1: Current
Part 2: Read
Part 3: Read
