Vehicle Telematics

Nipun Agarwal
TVS Motors technology blog
4 min read · Jun 20, 2019

This is the second part of the telematics series, where I will be concentrating on the vehicle telematics architecture. You can go through part-1 for a brief introduction to the telematics design if you have not already. Below are a few factors I kept in mind while designing the detailed architecture:

  • Scalability
  • Security
  • Storage
  • Monitoring
  • Management
  • Ease of access
  • Display

I will elaborate on each of these points, but let me first show you the detailed architecture diagram.

Fig 1: Detailed Design Architecture

As you can see in Fig 1, I have divided the design into 5 layers: sources, ingestion, storage, processing and monitoring.

Sources

Data is produced by various sensors in the vehicle and sent to the application server either directly through the telemetry unit or through the mobile app, which is connected to the vehicle over Bluetooth. The data can arrive in real time and in high volume from EV bikes, cars, charging stations etc. MQTT (Message Queuing Telemetry Transport), a lightweight publish/subscribe protocol designed for constrained devices and low-bandwidth networks, caters well to these high-velocity telematics needs. It can use TLS to encrypt the messages exchanged between the client and the broker, and an authentication mechanism can also be set up so that clients are authenticated before they can publish to the MQTT broker.
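To make the source side concrete, here is a minimal sketch of how a device might encode a batch of sensor readings and prepare a TLS context for the broker connection. The field names, vehicle ID and topic are my own placeholder assumptions, and the actual publish (shown in comments) would use an MQTT client library such as Eclipse Paho:

```python
import json
import ssl


def make_payload(vehicle_id, readings):
    """Encode one batch of sensor readings as a compact JSON message."""
    return json.dumps({
        "vehicle_id": vehicle_id,
        "ts": readings["ts"],
        "data": {k: v for k, v in readings.items() if k != "ts"},
    }, separators=(",", ":")).encode("utf-8")


def make_tls_context(ca_file=None):
    """TLS context the MQTT client would use to encrypt traffic to the broker.

    `ca_file` is the broker's CA certificate (placeholder here)."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


payload = make_payload("TVS-EV-001",
                       {"ts": 1561000000, "speed_kmph": 42.5, "battery_pct": 87})

# With a client library such as Eclipse Paho, this payload would then be
# published to a topic over the encrypted, authenticated connection, e.g.:
#   client.tls_set_context(make_tls_context("broker-ca.pem"))
#   client.username_pw_set("device-001", "secret")  # broker-side auth
#   client.publish("telemetry/TVS-EV-001", payload, qos=1)
```

Keeping the payload compact matters here, since telemetry units often run on metered cellular links.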

Ingest

For high ingestion rates, the MQTT broker needs to be highly scalable and support a large number of concurrent connections. MQTT is not built for high scalability, long-term storage or easy integration, which brings Kafka, a highly scalable distributed log, into the picture. The MQTT broker is placed behind a load balancer to scale to a high number of connections, and the data is transferred from the MQTT broker to the Kafka broker using Kafka Connect. This makes the solution highly scalable and fault tolerant. If you look at Fig 1, there is a Telegraf agent in the ingestion layer that accumulates monitoring data and sends it to our monitoring system to track the health of the server. I will talk about this in more detail in the Monitor section below.
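The MQTT-to-Kafka bridge is just a connector configuration submitted to the Kafka Connect REST API. As a sketch, based on my recollection of Confluent's MQTT source connector (the connector name, host names and topic names here are placeholders, and exact property names vary between connector implementations):

```json
{
  "name": "mqtt-to-kafka",
  "config": {
    "connector.class": "io.confluent.connect.mqtt.MqttSourceConnector",
    "tasks.max": "2",
    "mqtt.server.uri": "ssl://mqtt-broker.internal:8883",
    "mqtt.topics": "telemetry/#",
    "kafka.topic": "vehicle-telemetry"
  }
}
```

Once this is POSTed to Kafka Connect, every message published under `telemetry/#` lands in the `vehicle-telemetry` Kafka topic without any custom bridging code.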

Store

Storage is where you keep all the raw and processed data. There are a lot of connectors/consumers that can be used to send data from Kafka to HDFS. You can save the raw data, use KSQL to process or convert it to a particular format or enrich it with other data streams in Kafka, and store the result in your HDFS data lake. We can use a database to store processed data here, or send telematics data from Kafka to a time-series database like InfluxDB for real-time processing and analysis. A good BI tool can then be connected to this processed data for nice visualisations. We can also extend it to a data warehouse with a star or snowflake schema for analysis, reporting and dashboarding purposes.
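The KSQL step above can be sketched in two statements: one declares a stream over the raw Kafka topic, and a second persists a cleaned-up, converted copy back to Kafka for downstream sinks. The topic, stream and column names are my own assumptions for illustration:

```sql
-- Declare a stream over the raw telemetry topic (JSON as produced by the devices)
CREATE STREAM telemetry_raw (vehicle_id VARCHAR, ts BIGINT,
                             speed_kmph DOUBLE, battery_pct DOUBLE)
  WITH (KAFKA_TOPIC='vehicle-telemetry', VALUE_FORMAT='JSON');

-- Persist a filtered copy in a different format (here Avro),
-- ready for a Kafka-to-HDFS or Kafka-to-InfluxDB sink connector
CREATE STREAM telemetry_clean WITH (VALUE_FORMAT='AVRO') AS
  SELECT vehicle_id, ts, speed_kmph, battery_pct
  FROM telemetry_raw
  WHERE battery_pct IS NOT NULL;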

Process

Data can be processed in real time with Kafka connectors or in batch mode using data stored in HDFS. There are a lot of open-source tools in this space like Spark, Hive and Presto. Again, this processed data is saved back to HDFS for further analysis. My personal favourite is Spark, but you are free to use any of the Hadoop-ecosystem processing tools. I am not going into detail about the processing frameworks, as a lot of articles have already been written in this space.
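To give a feel for what such a batch job does, here is the core aggregation in plain Python (record fields are my own assumptions). In the real pipeline the same group-by/average would be a few lines of Spark over DataFrames read from HDFS, distributed across the cluster:

```python
from collections import defaultdict


def avg_speed_per_vehicle(records):
    """Batch-style aggregation: mean speed per vehicle.

    `records` stands in for rows read from the HDFS data lake; in Spark
    this would be roughly df.groupBy("vehicle_id").avg("speed_kmph")."""
    sums = defaultdict(lambda: [0.0, 0])   # vehicle_id -> [total, count]
    for rec in records:
        acc = sums[rec["vehicle_id"]]
        acc[0] += rec["speed_kmph"]
        acc[1] += 1
    return {vid: total / count for vid, (total, count) in sums.items()}
```

The result of a job like this is what would be written back to HDFS or the warehouse for reporting.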

Monitor

At various layers we can add the monitoring component by installing a Telegraf agent on each server. This agent collects metrics and sends them to InfluxDB, which is a time-series database. Grafana is then connected on top of it to build dashboards and set up alerts. Alerts can be routed to Slack, PagerDuty or email. Grafana provides pretty cool plugins and visualisations, and it has a lot of data connectors like InfluxDB, Prometheus, CloudWatch etc. to fetch data from these source systems and show them in its console. Telegraf also supports StatsD, a very lightweight protocol for sending metrics.
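StatsD is simple enough to sketch in a few lines: each metric is a plain-text line of the form `<name>:<value>|<type>` fired over UDP, which Telegraf's StatsD input listens for (port 8125 is the conventional default). The metric name below is a placeholder of my own:

```python
import socket


def statsd_line(name, value, metric_type="g"):
    """Format one metric in the plain-text StatsD wire format:
    <name>:<value>|<type>  (g = gauge, c = counter, ms = timer)."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_metric(name, value, metric_type="g", host="127.0.0.1", port=8125):
    # StatsD is fire-and-forget over UDP; no handshake, no ack,
    # which is exactly what keeps it so lightweight.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(statsd_line(name, value, metric_type), (host, port))
    finally:
        sock.close()

# e.g. send_metric("mqtt.active_connections", 1200, "g")
```

Because it is UDP, a down monitoring agent never blocks or slows the ingestion path, only drops a metric sample.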

There are alternatives to all the open-source tools I have mentioned, not only in the open-source community but also in the cloud space. If manageability is a concern, you can choose any of the cloud providers, which offer managed solutions around these open-source tools.

All the layers apart from sources can be kept inside a VPC, with only the load balancer publicly available for ingestion. This makes the whole system highly secure. In terms of scalability, every component chosen is horizontally scalable, which allows the entire system to scale to very high volumes. The monitoring component is an important piece here, constantly tracking the health of the systems and servers.

I have tried to make the architecture as clean and easy to understand as possible. Would love to hear your feedback on this.
