TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Introduction to Streaming Frameworks

Pier Paolo Ippolito
Published in TDS Archive · 6 min read · Nov 8, 2023

Photo by Joao Branco on Unsplash

Introduction

As data architectures mature, streaming is no longer considered a luxury but a technology with a wide range of applications across industries. Because of technical and resource limitations, batch processing has long been the preferred way to process data and deliver applications. With the development of micro-batch and native streaming frameworks in Apache-based distributed systems, however, high-scale streaming has become much more accessible (Figure 1).

Example applications of streaming systems include processing transaction data to spot anomalies, weather data, IoT data from remote locations, and geo-location tracking.

Figure 1: Batch vs Streaming (Image by Author).

Real-Time vs Micro-Batch Processing

There are two key types of stream processing systems, micro-batch and real-time:

  • In real-time streaming processing, each record is processed as…
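As a rough illustration of the general distinction between the two approaches, here is a minimal Python sketch that does not use any particular streaming framework. The function names, the running-total computation, and the count-based batching are all assumptions made for the example. Real micro-batch engines typically group records by a time interval rather than by a fixed record count.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def process_per_record(records: Iterable[float]) -> List[float]:
    """Real-time style: update the result as each record arrives."""
    running_total = 0.0
    outputs = []
    for value in records:
        running_total += value
        outputs.append(running_total)  # one output per incoming record
    return outputs


def micro_batches(records: Iterable[float], batch_size: int) -> Iterator[List[float]]:
    """Group an incoming stream into small batches (hypothetical helper).

    Batching by count is a simplification for illustration only.
    """
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def process_micro_batch(records: Iterable[float], batch_size: int) -> List[float]:
    """Micro-batch style: update the result once per small batch."""
    running_total = 0.0
    outputs = []
    for batch in micro_batches(records, batch_size):
        running_total += sum(batch)
        outputs.append(running_total)  # one output per batch
    return outputs


stream = [1.0, 2.0, 3.0, 4.0, 5.0]
per_record = process_per_record(stream)
batched = process_micro_batch(stream, batch_size=2)
print(per_record)
print(batched)
```

Both functions end with the same final total. The per-record version emits an updated result after every record, while the micro-batch version emits fewer, later results in exchange for processing several records at once.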
