Data Engineering Concepts: Part 10, Real-Time Stream Processing with Spark and Kafka
This is the last part of my 10-part series on Data Engineering concepts. In this part, we will discuss stream processing.
Contents:
1. What is Stream Processing
2. Kafka features
3. Kafka configuration
4. Kafka services — Kafka Streams, ksqlDB, Schema Registry
5. Spark Structured Streaming API
6. Databricks Delta Lake
7. Practical project
Here is the link to my previous part on Data Security:
What is Stream Processing?
Stream processing is a data processing technology used to collect, store, and manage continuous streams of data as they are produced or received.
Batch processing handles data at regular intervals, one chunk at a time, and fits use cases where you don’t need immediate insights. Stream processing, by contrast, is suited to the scenarios we will be discussing in this article:
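To make the batch-versus-stream distinction concrete, here is a minimal sketch in plain Python (not Spark or Kafka, which come later in the article). The `batch_process` and `stream_process` function names are illustrative only: the batch version waits until a full chunk has accumulated before producing a result, while the stream version emits an updated result the moment each event arrives.

```python
from typing import Iterable, Iterator, List

def batch_process(events: Iterable[int], batch_size: int) -> List[int]:
    """Batch style: buffer a full chunk, then compute one aggregate per chunk."""
    results, buffer = [], []
    for e in events:
        buffer.append(e)
        if len(buffer) == batch_size:
            results.append(sum(buffer))  # result only when the batch completes
            buffer = []
    if buffer:  # flush a final partial batch
        results.append(sum(buffer))
    return results

def stream_process(events: Iterable[int]) -> Iterator[int]:
    """Stream style: update running state and emit a result per event."""
    total = 0
    for e in events:
        total += e
        yield total  # insight is available immediately, event by event

events = [3, 1, 4, 1, 5, 9]
print(batch_process(events, batch_size=3))  # -> [8, 15]
print(list(stream_process(events)))         # -> [3, 4, 8, 9, 14, 23]
```

The trade-off this illustrates is latency: the batch consumer learns nothing until a chunk closes, whereas the stream consumer sees every intermediate total as data flows in, which is exactly what tools like Kafka and Spark Structured Streaming provide at scale.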