Towards Data Engineering
Navigating the Path to Data Engineering Excellence
Trending
Fail Fast or Quarantine? Two Data Quality Patterns Every Spark Engineer Should Know
Learn when to fail fast or quarantine bad data in Spark pipelines.
Marcel Kennert
May 12
15 Common Spark Errors in the Big Data Industry — Causes, Detection & Detailed Fixes
Apache Spark is widely used for building distributed data processing pipelines, but it frequently encounters operational and runtime…
Solon Das
May 11
KNN Algorithm in Data Mining with Example
Your Friendly Guide to Smarter Data!
Mayur Koshti
May 15
Latest
100 Days of Data Engineering Day 90: Dashboarding Insights with Databricks SQL
Driving Business Impact from ML + GenAI
THE BRICK LEARNING
May 22
100 Days of Data Engineering on Databricks Day 89: Unifying Machine Learning and GenAI in a Single…
A Databricks Solution Architect’s Perspective
THE BRICK LEARNING
May 22
100 Days of Machine Learning on Databricks Day 2: Why Machine Learning Still Matters in the Age of…
The AI Landscape Has Changed… But ML Is Far From Obsolete
THE BRICK LEARNING
May 22
100 Days of Engineering on Databricks Day 88: Architecting Retail GPT Assistant Using LangChain
After building Generative AI pipelines for retail — such as product review summarization (Day 86) and campaign narration from SAP data…
THE BRICK LEARNING
May 22
🚀 Designing a Metadata-First Data Platform: Where Metadata Drives Every Action
“In the next-gen data platforms, metadata will not be documentation — it will be the orchestration.”
Mili Tripathi
May 21
100 Days of Data Engineering on Databricks Day 87: Using GenAI to Narrate SAP Campaign Outcomes
Retail marketing teams run numerous campaigns across geographies and customer segments. These campaigns generate data captured in systems…
THE BRICK LEARNING
May 21
Say Goodbye to Dirty Data: Build Trustworthy Pipelines with These Pro Tips
Ritam Mukherjee
May 21
Accelerate your Analytics Journey from On-Prem to Databricks: Introducing Prism
A new lens on SAP data — clear, intelligent, real-time
THE BRICK LEARNING
May 20
100 Days of Data Engineering with Databricks Day 86: Summarizing Product Reviews Using GenAI and…
So far in our SmartRetail 360+ journey, we’ve analyzed structured data from SAP to build ML models for churn, CLV, recommendations, and…
THE BRICK LEARNING
May 20
100 Days of Data Engineering with Databricks Day 85: Introducing Generative AI in Retail Pipelines
Powered by Databricks ai_query()
THE BRICK LEARNING
May 20