Completeness, Speed, and Cost — Three Knobs Controlling Your Data Analytics Strategy

How do speed, accuracy, and cost-effectiveness influence the choice between batch and streaming analytics?

Photo by yinka adeoti on Unsplash

Event-first thinking

A user with ID 1234 purchased item 567 on 2022/06/12 at 12:23:212

Data analytics — making sense of events

A data analytics system collects, stores, and processes business events to derive meaningful insights

Event time vs. processing time

Batch analytics systems

A typical batch processing system, such as Apache Spark, Hadoop, or Hive

Time to insights = Total (event ingestion time + event processing time + query time)
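As a rough illustration of the formula above, the sketch below adds the three components for two pipelines. The function name `time_to_insight` and all durations are hypothetical, chosen only to show the arithmetic and not measured from any real system.

```python
from datetime import timedelta


def time_to_insight(ingestion: timedelta, processing: timedelta, query: timedelta) -> timedelta:
    """Sum the three components named in the formula above."""
    return ingestion + processing + query


# Hypothetical numbers: events wait up to a day for a nightly batch load,
# the job runs for two hours, and the final query takes 30 seconds.
batch = time_to_insight(
    ingestion=timedelta(hours=24),
    processing=timedelta(hours=2),
    query=timedelta(seconds=30),
)

# Hypothetical numbers for a pipeline that processes events as they arrive.
streaming = time_to_insight(
    ingestion=timedelta(seconds=1),
    processing=timedelta(seconds=2),
    query=timedelta(milliseconds=200),
)

print("batch time to insight:    ", batch)
print("streaming time to insight:", streaming)
```

With these assumed inputs, ingestion delay dominates the batch total, which is why shortening the load interval usually changes time to insight more than tuning the query does.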

Streaming analytics systems

Typical streaming analytics system

Completeness, speed, and cost — know the three knobs

Speed — when the whole business depends on the speed of making decisions.

Completeness of source data — when the accuracy matters

Skewness in event time vs. processing time
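To make the skew concrete, here is a minimal sketch that measures the gap between when an event happened and when the system processed it. The sample timestamps, the `skew` helper, and the `allowed_lateness` threshold are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical (event_time, processing_time) pairs for three events.
events = [
    (datetime(2022, 6, 12, 12, 23, 21), datetime(2022, 6, 12, 12, 23, 22)),
    (datetime(2022, 6, 12, 12, 23, 25), datetime(2022, 6, 12, 12, 23, 40)),
    # A late arrival, for example from a device that was briefly offline.
    (datetime(2022, 6, 12, 12, 20, 0), datetime(2022, 6, 12, 12, 24, 0)),
]


def skew(event_time: datetime, processing_time: datetime) -> timedelta:
    """Delay between an event occurring and the system processing it."""
    return processing_time - event_time


skews = [skew(event_time, processing_time) for event_time, processing_time in events]
max_skew = max(skews)

# Assumed threshold: how long we are willing to wait for stragglers
# before treating a result as complete.
allowed_lateness = timedelta(seconds=30)
late = [s for s in skews if s > allowed_lateness]

print("skews:", [str(s) for s in skews])
print("max skew:", max_skew)
print("events later than allowed:", len(late))
```

Waiting longer than the largest observed skew favors completeness, while a short threshold favors speed and accepts that some late events will be missing from the first result.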

Cost of generating insights

Conclusion

A typical analytics architecture like this helps organizations adjust their analytics approach based on speed, accuracy, and cost.

Dunith Dhanushka

Editor of Event-driven Utopia (eventdrivenutopia.com). Technologist, Writer, Senior Developer Advocate at Redpanda. Event-driven Architecture, DataInMotion