Towards Data Engineering
Navigating the Path to Data Engineering Excellence
Trending
Why Parquet is the Best File Format for Big Data
When working with big data, you want to store your data in a way that makes it easy to handle, quick to process, and doesn’t take up too…
Vishal Barvaliya
Aug 31
Mastering SQL Recursive CTEs: Key Concepts and Interview Questions
Unlock the Power of Recursive CTEs in SQL: Learn Key Concepts, Hierarchical Data Solutions, and Top Interview Questions to Boost Your…
Pritam Deb
Sep 6
Latest
Inside a Netflix Data Engineering Interview.
A Real-Time Media (OTT) Domain Use Case Question and How to Solve It Together.
Brahma, The Data Engineer.
Sep 18
Create Event Driven Airflow Pipeline with SNS, SQS and Lambda
How I set up a simple demo to use AWS SNS/SQS with Lambda to trigger Airflow DAGs with message content as parameters
Obinna Onyema
Sep 18
Integrating GA4 Data into Snowflake
As a Junior Data Engineer, my first major project involved integrating Google Analytics 4 (GA4) data into Snowflake. This task was…
Vanessa Andersson
Sep 16
What to Do If There’s No Time Zone Information in Your Data: A Guide for Data Engineers
Time zones can be a headache for data engineers.
Aleh Belausau
Sep 16
Behind the Scenes of Spark Submit: How Spark Executes Your Code
Explore the inner workings of Spark Submit, from DAG creation to resource management, task execution, and performance optimization on YARN
Pritam Deb
Sep 14
Optimizing Spark: Strategies for Scalable and Efficient Data Pipelines
Optimize Spark Performance: Tuning Garbage Collection, Memory Management, and Tungsten Execution for Efficient Data Processing
Pritam Deb
Aug 23
How to Trigger Airflow DAG Using REST API
Sometimes you want to trigger an Airflow DAG programmatically. Airflow allows you to enable this functionality when needed. In this…
Obinna Onyema
Nov 10, 2023
Create Event Driven Airflow Pipeline with Amazon SQS
I have been thinking about making Airflow pipelines more event driven and wondering how Amazon’s SQS could facilitate that. Partly I would…
Obinna Onyema
Sep 14
Spark Out of Memory Issue: Memory Tuning and Management
A Complete Closeup.
RAKESH CHANDA
Sep 4
End-to-End AWS KMS Encryption and Decryption Tutorial
We’re excited to share our new tutorial on Keyper. Keyper v0.0.3 now supports AWS (in addition to GCP) for end-to-end data and file…
Lulu Cheng
Sep 10