Published in Wrong AI
Re-thinking Data Pipeline Patterns

A data pipeline is a sequence of transformations that converts raw data into actionable insights. In the past, processing and storage engines were coupled together; a traditional MPP warehouse, for example, combines processing and storage in a single system. With decoupled processing engines (such as Spark and Redshift Spectrum) becoming mainstream in both the open-source world and the AWS big data ecosystem, what are the popular data pipeline patterns? This post describes the data pipeline patterns we have defined in the context of decoupled processing engines.
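To make the decoupled model concrete, here is a minimal sketch (not from the original post) of such a pipeline: Spark supplies the processing engine, while object storage (S3) holds both the raw input and the curated output that other engines can later query. The bucket paths, column names, and aggregation are hypothetical placeholders.

```python
# Minimal sketch of a decoupled pipeline: Spark handles processing,
# S3 object storage holds the data. All paths and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("decoupled-pipeline").getOrCreate()

# Extract: read raw events directly from object storage
# (the processing engine does not own a storage tier).
raw = spark.read.json("s3a://example-raw-bucket/events/")

# Transform: clean the raw events and aggregate them into an
# analytics-friendly daily summary.
daily_counts = (
    raw.filter(F.col("event_type").isNotNull())
       .withColumn("event_date", F.to_date("event_timestamp"))
       .groupBy("event_date", "event_type")
       .count()
)

# Load: write curated Parquet back to object storage, where a separate
# query engine (e.g., Redshift Spectrum or Athena) can read the same files.
daily_counts.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3a://example-curated-bucket/daily_event_counts/"
)

spark.stop()
```

Because the output lands in shared object storage rather than inside the processing engine, the same curated data can be consumed by multiple downstream engines without copying it.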

Sandeep Uttamchandani

Sharing 20+ years of real-world exec experience leading Data, Analytics, AI & SW Products. O'Reilly book author. Founder of AIForEveryone.org. #Mentor #Advisor