Completeness, Speed, and Cost — Three Knobs Controlling Your Data Analytics Strategy

How do speed, accuracy, and cost-effectiveness influence the choice between batch and streaming analytics?

Photo by yinka adeoti on Unsplash

Event-first thinking

A user with ID 1234 purchased item 567 on 2022/06/12 at 12:23:212
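To make the event above concrete, here is a minimal sketch of how such a purchase could be modeled as an immutable record. The class name `PurchaseEvent`, its field names, and the `describe` helper are illustrative assumptions, not part of any particular system. The seconds of the timestamp are left out because the original value is ambiguous.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)  # events are facts about the past, so keep them immutable
class PurchaseEvent:
    user_id: int
    item_id: int
    event_time: datetime  # when the purchase happened at the source


def describe(e: PurchaseEvent) -> str:
    """Render the event as a human-readable sentence."""
    return f"user {e.user_id} purchased item {e.item_id} at {e.event_time.isoformat()}"


# Hypothetical instance mirroring the example sentence (seconds omitted).
event = PurchaseEvent(user_id=1234, item_id=567,
                      event_time=datetime(2022, 6, 12, 12, 23))
print(describe(event))
```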

Data analytics — making sense of events

A data analytics system collects, stores, and processes business events to derive meaningful insights.

Event time vs. processing time

  1. Batch or historical analytics
  2. Streaming or real-time analytics

Batch analytics systems

A typical batch processing system, such as Apache Spark, Hadoop, or Hive

Time to insights = Total (event ingestion time + event processing time + query time)
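The formula above can be worked through with a few lines of arithmetic. This is only a sketch: the function name `time_to_insights` and the durations plugged in are made-up values for illustration, not measurements from a real system.

```python
from datetime import timedelta


def time_to_insights(ingestion: timedelta,
                     processing: timedelta,
                     query: timedelta) -> timedelta:
    """Sum the three stages an event passes through before it yields an insight."""
    return ingestion + processing + query


# Hypothetical batch pipeline: hourly ingestion, a 30-minute job, a 20-second query.
batch_latency = time_to_insights(
    ingestion=timedelta(hours=1),
    processing=timedelta(minutes=30),
    query=timedelta(seconds=20),
)
print(batch_latency)
```

Shrinking any one of the three terms shortens the total, which is why streaming systems aim to cut the ingestion and processing stages.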

Streaming analytics systems

A typical streaming analytics system

Completeness, speed, and cost — know the three knobs

  1. Speed
  2. Completeness of data
  3. Cost

  1. Detect and block abnormal user login attempts.
  2. Provide a daily spending report for the marketing department on the ads displayed on the website.
  3. Provide a real-time dashboard for e-commerce ad performance.
  4. Train an ML model that makes product recommendations.
  5. Perform ad-hoc analyses on past sales data to detect trends.
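One way to reason about the five use cases above is to ask how long each can wait for an answer. The sketch below turns only the speed knob: it suggests streaming when the latency budget is tight and batch otherwise. The latency budgets, the one-minute threshold, and the names `UseCase` and `suggest_mode` are all assumptions made for illustration. A real decision would also weigh completeness and cost.

```python
from dataclasses import dataclass
from datetime import timedelta

# Assumed cut-off: anything that must be answered within a minute goes to streaming.
STREAMING_THRESHOLD = timedelta(minutes=1)


@dataclass
class UseCase:
    name: str
    latency_budget: timedelta  # hypothetical: how stale an answer may be


def suggest_mode(use_case: UseCase) -> str:
    """Suggest 'streaming' for tight latency budgets, 'batch' otherwise."""
    if use_case.latency_budget <= STREAMING_THRESHOLD:
        return "streaming"
    return "batch"


# Latency budgets below are illustrative guesses, not requirements from the article.
USE_CASES = [
    UseCase("block abnormal login attempts", timedelta(seconds=1)),
    UseCase("daily ad spending report", timedelta(days=1)),
    UseCase("real-time ad performance dashboard", timedelta(seconds=10)),
    UseCase("train product recommendation model", timedelta(days=1)),
    UseCase("ad-hoc analysis of past sales", timedelta(hours=1)),
]

suggestions = {uc.name: suggest_mode(uc) for uc in USE_CASES}
for name, mode in suggestions.items():
    print(f"{name}: {mode}")
```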

Speed — when the whole business depends on the speed of making decisions.

Completeness of source data — when accuracy matters

Skewness in event time vs. processing time
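The skew between event time and processing time can be sketched as a simple difference of timestamps. In the hypothetical example below, an event counts as late when its skew exceeds an allowed lateness. A result computed before a late event arrives would be incomplete. The function names, the timestamps, and the five-minute allowance are assumptions for illustration.

```python
from datetime import datetime, timedelta


def skew(event_time: datetime, processing_time: datetime) -> timedelta:
    """How long after it occurred the event was actually processed."""
    return processing_time - event_time


def is_late(event_time: datetime, processing_time: datetime,
            allowed_lateness: timedelta) -> bool:
    """True if the event arrived later than the system is willing to wait."""
    return skew(event_time, processing_time) > allowed_lateness


# Hypothetical timestamps: one event processed promptly, one delayed in transit.
event_time = datetime(2022, 6, 12, 12, 23)
processed_promptly = datetime(2022, 6, 12, 12, 23, 2)
processed_delayed = datetime(2022, 6, 12, 12, 40)
allowed = timedelta(minutes=5)  # assumed tolerance

print(skew(event_time, processed_delayed))
print(is_late(event_time, processed_promptly, allowed))
print(is_late(event_time, processed_delayed, allowed))
```

A larger allowance improves completeness but delays results, which is the trade-off between the speed and completeness knobs.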

Cost of generating insights


A typical analytics architecture like this helps organizations adjust their analytics to their needs for speed, accuracy, and cost.



EdU is a place where you can find quality content on event streaming, real-time analytics, and modern data architectures

Dunith Dhanushka

Editor of Event-driven Utopia. Technologist, Writer, Senior Developer Advocate at Redpanda. Event-driven Architecture, DataInMotion