A Beginner’s Guide to Data Engineering — The Series Finale

From ETL Pipelines To Data Engineering Frameworks

Robert Chang
Jun 24, 2018

Image credit: Well-designed Data Engineering Frameworks Can Open a Lot of Doors and New Possibilities :)

At Last, The Finale

A Common Scenario

Source: A Lot of ETL Hard Work Is Required to Power a Simple Dashboard like This (Referenced from Superset)

From Pipelines To Frameworks

Image credit: From ETL pipelines to ETL frameworks

Design Patterns For Data Engineering Frameworks

Source: From Max’s meet-up talk titled “Advanced Data Engineering Patterns using Apache Airflow”

1. Incremental Computation Framework

Reference: An illustration of the Incremental Computation Framework

2. Backfill Framework

Reference: An illustration of the backfill framework

3. Global Metrics Framework

Source: Airbnb’s metrics framework talk, presented by Lauren Chircus during DataEngConf18

4. Experimentation Reporting Framework

Reference: An illustration of the Experimentation Reporting Framework

Conclusion

Image credit: We Have Finally Reached the End of the Data Engineering Tunnel :)

Written by Robert Chang

Data @Airbnb, previously @Twitter. Thoughtfully opinionated, weakly held. Opinions are my own.