Making Sense of Unbounded Data

A reference architecture for a real-time event processing system

Photo by Phil Goodwin on Unsplash

In reality, a lot of data is unbounded because it arrives gradually over time: your users produced data yesterday and today, and they will continue to produce more data tomorrow. Unless you go out of business, this process never ends, and so the dataset is never “complete” in any meaningful way.

— Martin Kleppmann, Designing Data-Intensive Applications

Bounded and unbounded data
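
The distinction in the quote above can be sketched in a few lines. This is a minimal illustration, not part of the original article: the function names `bounded_total`, `unbounded_source`, and `running_totals` are hypothetical, and the endless counter stands in for a real event source. A bounded dataset can be reduced to one final answer, while an unbounded one can only yield results that are correct "so far".

```python
import itertools
from typing import Iterator, List


def bounded_total(events: List[int]) -> int:
    """Batch style: the dataset is complete, so a single final answer exists."""
    return sum(events)


def unbounded_source() -> Iterator[int]:
    """Hypothetical endless event source; it never signals completion."""
    for n in itertools.count(1):
        yield n


def running_totals(events: Iterator[int]) -> Iterator[int]:
    """Stream style: emit an updated result after every incoming event."""
    total = 0
    for value in events:
        total += value
        yield total


# The stream never ends, so we can only observe a prefix of the results.
first_five = list(itertools.islice(running_totals(unbounded_source()), 5))

print(bounded_total([1, 2, 3, 4, 5]))  # 15
print(first_five)                      # [1, 3, 6, 10, 15]
```

Calling `sum()` directly on `unbounded_source()` would never return, which is why stream processing works with incremental or windowed results rather than one final computation.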

Events, messages, and streams

Benefits of real-time data and practical examples

  • Real-time customer 360 view implementations
  • Recommendation engines
  • Fraud/anomaly detection
  • Predictive maintenance using IoT
  • Streaming ETL systems

Making sense of streaming data

Photo by Cédric Dhaenens on Unsplash

Event ingestion and processing is challenging

A reference architecture for stream processing

Logical components of a stream processing architecture.

1. Event Sources

2. Ingestion system

Anatomy of a Kafka Topic. Source — https://sookocheff.com/post/kafka/kafka-in-a-nutshell/

3. Stream processing system

4. Cold storage

5. Analytical datastore

6. Monitoring and notification

7. Reporting and visualization

8. Machine learning

Conclusion

Dunith Dhanushka

Editor of Event-driven Utopia (eventdrivenutopia.com). Technologist, Writer, Senior Developer Advocate at Redpanda. Event-driven Architecture, DataInMotion