Towards Data Engineering
Navigating the Path to Data Engineering Excellence
Trending
Parquet is Good for OLAP but Not for OLTP Use Cases. But Why?
Many engineers and data scientists praise Parquet for its efficient compression and fast query performance. While it’s a highly valued…
Ritam Mukherjee
Sep 28
Inside a Netflix Data Engineering Interview.
A Real-Time Media (OTT Domain) Use Case Question and How to Solve It Together.
Brahma, The Data Engineer.
Sep 18
A Python Library every Data Engineer should know
As a data engineer in a large company, ensuring data quality is a key responsibility. Even if you perform your tasks diligently and rarely…
Robin von Malottki
Sep 25
Latest
Python Automation for Data Engineers: How to Save Time and Streamline Your Workflow
Learn How to Eliminate Repetitive Data Tasks with Simple Python Techniques
Satyam Sahu
Oct 9
Cracking PySpark JSON Handling: from_json, to_json, and Interview-Ready Insights
Master PySpark’s from_json(), to_json(), json_tuple(), get_json_object(), and schema_of_json() with practical tips for interview success
Pritam Deb
Oct 9
Fundamentals of Data Engineering Chapter - 01 Summary
Join me in learning about data engineering. Getting the book Fundamentals of Data Engineering by Joe Reis and Matt Housley is the first…
Adarsh Menon
Oct 9
A Look Inside a Dell Data Engineering Interview
Let Us Solve It Together and Learn Data Engineering Better.
Brahma, The Data Engineer.
Oct 8
The Rise of Data Lakehouses: Is Your Data Warehouse Becoming Obsolete?
Discover the Benefits, Challenges, and Migration Tips for Adopting a Data Lakehouse
Satyam Sahu
Oct 7
Why Engaging with Upstream Stakeholders Matters in Data Engineering
How often have unexpected job failures or data quality issues caught you off guard because of unplanned changes from upstream sources? Such…
Samhitha Poreddy
Oct 6
Next Stop After Data Analyst (Data Science or Data Engineering)
Graduating from business school and making a career transition into the data world, I hopped on the “Data Science” hype train dreaming of…
Harris Wan
May 24
Data Skew in Spark : Using Salting while avoiding common mistakes
Data skew occurs when the data distribution across partitions is uneven. Imagine you’re working with user transaction data, and two users…
Ritam Mukherjee
Oct 5
Scaling Apache Spark: Understanding Cluster Utilisation with a 50-Node Setup
In this article, we will explore how resource management impacts performance in Apache Spark. We will use a 50-node Spark cluster setup to…
Ritam Mukherjee
Sep 22
A Beginner’s Guide to Apache Airflow with GCP Composer.
An effective approach to orchestrating data workflows.
Adediwura Boluro-Ajayi
Sep 27