Data Engineering concepts: Part 10, Real time Stream Processing with Spark and Kafka

11 min read · May 22, 2024

This is the last part of my 10-part series on Data Engineering concepts. In this part, we will discuss stream processing.

Contents:
1. What is Stream Processing
2. Kafka features
3. Kafka configuration
4. Kafka services — Kafka Streams, ksqlDB, Schema Registry
5. Spark Structured Streaming API
6. Databricks Delta Lake
7. Practical project

Here is the link to my previous part on Data Security:

Data Engineering concepts: Part 9, Data Security


What is Stream Processing?

Stream processing is a data processing technology used to collect, store, and manage continuous streams of data as they are produced or received.

Batch processing handles data at regular intervals, one chunk at a time, and suits use cases where you don't need immediate insights. Stream processing, in contrast, is suited to the other scenarios, which we will discuss in this article.
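To make the contrast concrete, here is a minimal sketch in plain Python (no Spark or Kafka involved). The function names and the `sensor_feed` source are hypothetical, made up purely for illustration: the batch function waits for a complete chunk before producing one result, while the streaming function emits an updated result as each record arrives.

```python
from typing import Iterable, Iterator


def batch_average(readings: list[float]) -> float:
    """Batch style: the whole chunk is collected first, then processed once."""
    return sum(readings) / len(readings)


def streaming_average(readings: Iterable[float]) -> Iterator[float]:
    """Stream style: update and emit the result as each record arrives."""
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count  # a fresh result is available after every record


def sensor_feed() -> Iterator[float]:
    """Hypothetical event source standing in for a live feed (assumption)."""
    yield from [20.0, 22.0, 24.0]


if __name__ == "__main__":
    # Batch: one answer, only after all the data has been gathered.
    print("batch:", batch_average(list(sensor_feed())))

    # Stream: an answer after every incoming record.
    for running in streaming_average(sensor_feed()):
        print("stream:", running)
```

In a real pipeline the feed would be an unbounded source such as a Kafka topic rather than a fixed list, but the shape of the computation is the same: state is kept between records and results are produced continuously.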


Written by Mudra Patel