Efficient Event Streamlining and Dynamic De-Duplication across Message Brokers

A Technology-Agnostic approach

Hari Ohm Prasath
Geek Culture

--

Introduction

Nowadays, every modern application emits large amounts of data that must be processed and stored to derive meaningful business insights. Data arriving from different sources (web, mobile, etc.) through message brokers (Kinesis, SQS, Kafka, etc.) can be unstructured and contain many duplicates. All of this data needs to be processed in real time and stored in a data warehouse for further analysis.


As a developer, if you are tasked with building a central system that can process data from these different message brokers, run de-duplication on each individual dataset, and store the results in a data warehouse, then this blog is for you.

As part of this blog, we will dive deep into the following topics:

  • What are the common problems when dealing with different message brokers?
  • What are some general solutions that people implement to solve these problems?
  • How to build a system that streamlines data processing across different message brokers?
  • How can we build a dynamic de-duplication window per dataset, using a state store that is independent of the message broker?
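Before diving in, the last bullet can be illustrated with a minimal sketch: a de-duplication window keyed by dataset, with a per-dataset TTL, held in a state store that knows nothing about the broker the message came from. The class and parameter names below are hypothetical, and the in-process dict stands in for an external state store (such as Redis or DynamoDB) that a real deployment would use.

```python
import time


class DedupWindow:
    """Broker-agnostic de-duplication window (illustrative sketch).

    A message key is remembered for a per-dataset TTL; seeing the same
    key again inside that window marks the message as a duplicate.
    The in-memory dict is a stand-in for an external state store.
    """

    def __init__(self, window_seconds_by_dataset, default_window=60):
        self.windows = window_seconds_by_dataset  # dataset -> TTL seconds
        self.default_window = default_window
        self.seen = {}  # (dataset, key) -> expiry timestamp

    def is_duplicate(self, dataset, key, now=None):
        now = time.time() if now is None else now
        # Lazily evict entries whose window has expired.
        self.seen = {k: exp for k, exp in self.seen.items() if exp > now}
        entry = (dataset, key)
        if entry in self.seen:
            return True
        ttl = self.windows.get(dataset, self.default_window)
        self.seen[entry] = now + ttl
        return False
```

Because the window state lives outside any one consumer, the same check works whether the message was read from Kinesis, SQS, or Kafka; only the key extraction is broker-specific.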

--