Realtime Temperature Analytics using Kafka Streams

Appu V
NeST Digital
Sep 14, 2021

Life is a series of natural and spontaneous changes. Don’t resist them; that only creates sorrow. Let reality be reality. Let things flow naturally forward.
— Lao-Tzu, 6th–5th century BCE

While playing around with Kafka, an interesting use case came through: tracking and storing temperature readings from electronic sensor devices in real time. After evaluating a couple of different approaches and directions, Kafka Streams emerged as the most suitable framework.

Even so, self-managing Apache Kafka (deploying ZooKeeper, maintaining a schema registry, etc.) proved to be a hassle. The better alternative was Confluent Kafka, as it is available as a subscription-based offering on all the major cloud providers. With the exactly-once semantics provided by Kafka Streams, it turned out to be the de facto choice on the server side, while Spring Boot was leveraged to serve the user interface.

Architecture Diagram

Sensors deployed in the devices generate temperature readings in Avro format, which are pushed to a Kafka topic. With multiple sensors sending these readings, we wanted to maintain metadata for them. Schema Registry is an excellent tool for solving this challenge: it acts as a serving layer for metadata, a centralized repository for schemas. Leveraging Schema Registry, producers and consumers can interact and exchange data without the burden of managing and sharing schemas between them. In the future, the sensors will be changed and the corresponding schemas will evolve (schema evolution), which can easily be carried out through Schema Registry.
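A minimal producer sketch of this step might look like the following, assuming a local broker and registry; the schema, the topic name ("temperature-readings") and the field names are illustrative, not taken from the linked repos.

```java
// Sketch: publish one Avro reading; KafkaAvroSerializer registers the schema
// with Schema Registry automatically on first use.
import java.util.Properties;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SensorProducer {
    // Assumed schema: sensor id, temperature value, event timestamp.
    private static final String SCHEMA_JSON =
            "{\"type\":\"record\",\"name\":\"TemperatureReading\",\"fields\":["
          + "{\"name\":\"sensorId\",\"type\":\"string\"},"
          + "{\"name\":\"temperature\",\"type\":\"double\"},"
          + "{\"name\":\"timestamp\",\"type\":\"long\"}]}";

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");

        Schema schema = new Schema.Parser().parse(SCHEMA_JSON);
        GenericRecord reading = new GenericData.Record(schema);
        reading.put("sensorId", "sensor-1");
        reading.put("temperature", 23.7);
        reading.put("timestamp", System.currentTimeMillis());

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            // Key by sensor id so all readings from one sensor land in the same partition.
            producer.send(new ProducerRecord<>("temperature-readings", "sensor-1", reading));
        }
    }
}
```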

These streams of data are materialized as a KTable, which represents the latest state of each sensor at a particular point in time. This data is tracked in the web UI.
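A sketch of that topology, assuming Confluent's GenericAvroSerde and the illustrative topic names from above; exactly-once is enabled here as mentioned earlier (the EXACTLY_ONCE_V2 constant needs Kafka 3.0+):

```java
import java.util.Properties;
import io.confluent.kafka.streams.serdes.avro.GenericAvroSerde;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KTable;

public class TemperatureTopology {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "temperature-analytics");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, GenericAvroSerde.class);
        props.put("schema.registry.url", "http://localhost:8081");
        // Exactly-once processing, as discussed above.
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);

        StreamsBuilder builder = new StreamsBuilder();
        // Reading the topic as a table keeps only the latest reading per sensor key.
        KTable<String, GenericRecord> latest = builder.table("temperature-readings");
        latest.toStream().to("latest-temperature");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```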

There is also static data, such as name, phone number, etc., that is not a real-time value. This data usually resides in databases or file systems, so we needed to implement Change Data Capture (CDC) to pick up changes in these fields. Kafka Connect helps us tackle this: a source connector can pull the data from a file or table into a topic, which removes the overhead of writing a producer/consumer duo to do the same. A KTable join on the key then captures this static change, as sketched below.
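Assuming the metadata lands on a hypothetical "sensor-metadata" topic keyed by sensor id, and the default serdes configured in the earlier sketch, the enrichment join might look like:

```java
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KTable;

public class EnrichmentTopology {
    // Joins the live readings table with the Connect-fed metadata table on the sensor key;
    // the "name" field is an assumed attribute of the metadata records.
    static KTable<String, String> enrich(StreamsBuilder builder) {
        KTable<String, GenericRecord> readings = builder.table("temperature-readings");
        KTable<String, GenericRecord> metadata = builder.table("sensor-metadata");
        return readings.join(metadata,
                (reading, meta) -> meta.get("name") + ": " + reading.get("temperature"));
    }
}
```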

One of the challenges was writing unit tests for the application. We did not want to touch the existing cluster, and we wanted a solution that could run the tests without a Kafka installation. kafka-streams-test-utils helps us achieve that.
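kafka-streams-test-utils ships TopologyTestDriver, which processes records synchronously in memory. A minimal test sketch, using String serdes in place of Avro so no schema registry is needed:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.List;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.TestInputTopic;
import org.apache.kafka.streams.TestOutputTopic;
import org.apache.kafka.streams.TopologyTestDriver;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;
import org.junit.jupiter.api.Test;

class TemperatureTopologyTest {

    @Test
    void keepsLatestReadingPerSensor() {
        // String serdes stand in for Avro so the test needs no schema registry.
        StreamsBuilder builder = new StreamsBuilder();
        builder.table("temperature-readings", Consumed.with(Serdes.String(), Serdes.String()))
               .toStream()
               .to("latest-temperature", Produced.with(Serdes.String(), Serdes.String()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "test");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234");

        try (TopologyTestDriver driver = new TopologyTestDriver(builder.build(), props)) {
            TestInputTopic<String, String> in = driver.createInputTopic(
                    "temperature-readings", new StringSerializer(), new StringSerializer());
            TestOutputTopic<String, String> out = driver.createOutputTopic(
                    "latest-temperature", new StringDeserializer(), new StringDeserializer());

            in.pipeInput("sensor-1", "21.5");
            in.pipeInput("sensor-1", "23.7");

            // Updates flow through in order; the last one carries the latest state.
            List<KeyValue<String, String>> results = out.readKeyValuesToList();
            assertEquals("23.7", results.get(results.size() - 1).value);
        }
    }
}
```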

As we were only dealing with the server side for now, we developed a Python-based framework for producing mock data from both the sensor and Connect paths. The framework can be configured to simulate various scenarios such as metadata updates, schema evolution, etc.

The live flow of temperature readings is displayed in the web UI as a line graph, pushed from Spring Boot over a WebSocket. It also displays the latest temperature readings.
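One common way to wire this up (a sketch, not necessarily how the linked dashboard repo does it) is a Spring component that listens to the output topic and relays each reading to STOMP subscribers; the WebSocket broker configuration is assumed to be in place, and the topic, group, and destination names are illustrative:

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.messaging.simp.SimpMessagingTemplate;
import org.springframework.stereotype.Component;

@Component
public class TemperatureRelay {
    private final SimpMessagingTemplate template;

    public TemperatureRelay(SimpMessagingTemplate template) {
        this.template = template;
    }

    // Forward every new reading to subscribed browsers for the live line graph.
    @KafkaListener(topics = "latest-temperature", groupId = "dashboard")
    public void onReading(String reading) {
        template.convertAndSend("/topic/temperature", reading);
    }
}
```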

Live Web UI

In the future, this framework could be expanded to cover various use cases, such as finding the average temperature over a window period, storing readings in a database, or sending alerts when the temperature goes above/below a threshold, and much more. A sketch of two such extensions follows.
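A hedged sketch of the windowed average and the threshold alert, assuming the readings have already been parsed into a Double-valued stream; the (count, sum) aggregate is folded into a string purely to reuse the built-in serdes, and the 40 °C threshold and topic names are made up for illustration:

```java
import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class TemperatureExtensions {
    // Adds a one-minute average and a threshold alert on top of a per-sensor stream.
    static void extend(KStream<String, Double> temps) {
        // Windowed average: accumulate "count,sum" per sensor per one-minute window.
        temps.groupByKey(Grouped.with(Serdes.String(), Serdes.Double()))
             .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)))
             .aggregate(
                 () -> "0,0.0",
                 (sensorId, temp, agg) -> {
                     String[] p = agg.split(",");
                     return (Long.parseLong(p[0]) + 1) + "," + (Double.parseDouble(p[1]) + temp);
                 },
                 Materialized.with(Serdes.String(), Serdes.String()))
             .toStream()
             .map((windowedKey, agg) -> {
                 // Unwrap the windowed key and emit sum / count as the average.
                 String[] p = agg.split(",");
                 double avg = Double.parseDouble(p[1]) / Long.parseLong(p[0]);
                 return KeyValue.pair(windowedKey.key(), avg);
             })
             .to("average-temperature", Produced.with(Serdes.String(), Serdes.Double()));

        // Alerting: forward any reading above a (hypothetical) 40 °C threshold.
        temps.filter((sensorId, temp) -> temp > 40.0)
             .to("temperature-alerts", Produced.with(Serdes.String(), Serdes.Double()));
    }
}
```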

Links:

  1. Data Generation : https://github.com/appuv/KafkaDataGen
  2. Temperature Analytics : https://github.com/appuv/KafkaTemperatureAnalytics
  3. Web UI : https://github.com/appuv/Live-Dashboard-using-Kafka-and-Spring-Websocket
  4. A recording of the working setup on my YouTube channel: https://youtu.be/Cj3BeA4bV1c
