Towards Data Engineering
Navigating the Path to Data Engineering Excellence
Trending
The Most discussed Spark Questions in 2024
Solon Das
Apr 27
Airbyte — Getting Started with Useful Extract and Load Tool
Introduction to no-code data pipeline with PostgreSQL, BigQuery, OpenweatherAPI
Ihor Lukianov
May 26
Most Asked Questions on Data Pipeline Design
These are the most asked questions:
Solon Das
Apr 15
Latest
Testing Apache Airflow DAGs locally with Testcontainers and LocalStack
This article presents a simple strategy for testing Airflow DAGs locally using LocalStack for mocking AWS cloud services.
Sebastian Daberdaku
Jun 9
Building a custom Apache Spark Docker image with AWS Glue Data Catalog support as metastore
AWS Glue is not supported out of the box by Spark. In this article we will see how to build the latest 3.5.1 version with Glue support.
Sebastian Daberdaku
Jun 8
Performing Delta Table operations in PySpark with Spark Connect
Introduced with Spark 3.4, Spark Connect provides a decoupled client-server architecture allowing remote connectivity to Spark clusters…
Sebastian Daberdaku
Jun 7
Azure Storage Account : The Nuances
Azure Storage Account is a cloud storage solution provided by Microsoft Azure. It offers a scalable and durable platform to store various…
Solon Das
Jun 6
What’s in a domain?
John Hawkins
Jun 5
Kubernetes for Data Engineering: An End-to-End Guide
Yusuf Ganiyu
Jan 26
End to End Data Engineering for Data Lakehouse with Airflow, Minio, Kafka, Apache Spark, Apache…
Data Lakehouses (in term and usage) have been around for more than a decade but the potentials are most recently being actualised with the…
Yusuf Ganiyu
May 3
End to End Data Engineering for Data Lakehouse with Airflow, Minio, Kafka, Apache Spark, Apache…
In the last section we discussed the important details about Data Lakehouse, the what, how, and why it is important in modern data…
Yusuf Ganiyu
May 11
Building a Smart City: An End-to-End Big Data Engineering Project
Yusuf Ganiyu
Feb 18
CI/CD For Modern Data Engineering
Yusuf Ganiyu
Dec 27, 2023