TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.


Why you should try something other than Airflow for data pipeline orchestration

mehdio
Published in TDS Archive
5 min read · Sep 20, 2021

Fan [Digital image] by rajat sarki, https://unsplash.com/photos/Gx2SU87s4WY

While Airflow has dominated the market in terms of usage and community size as a data pipeline orchestrator, it’s fairly old and wasn’t initially designed to meet some of the needs we have today. Airflow is still a great product, but this article’s goal is to raise awareness of the alternatives and of what the ideal orchestration tool would be for your data use case. Let’s evaluate AWS Step Functions, Google Workflows, and Prefect next to Airflow.

So what are the criteria for a good data orchestration tool nowadays?

API-First design ⚙

As the cloud providers are API-first, you want your orchestration tool to be the same. Ideally, you want to be able to do a few things through the API:

  • Create and delete workflows
  • Serialize and deserialize DAGs easily, for non-static (evolving) workflows
  • Run parameterized workflows
  • Handle access management
  • Deploy the orchestration tool (if not serverless) through IaC frameworks (Terraform/Pulumi)
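
To make the serialization and parameterized-run points more concrete, here is a minimal, tool-agnostic sketch in Python. It is not the API of Airflow, Step Functions, Workflows, or Prefect: the `Workflow` class, the `build_run_request` helper, and the shape of the request body are all hypothetical and exist only for illustration. The idea is that a DAG that round-trips through JSON can be generated, stored, and submitted through an API instead of living only in static code.

```python
import json
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Workflow:
    """Hypothetical workflow definition: task name -> list of upstream task names."""
    name: str
    tasks: dict[str, list[str]] = field(default_factory=dict)

    def to_json(self) -> str:
        # Serialize the DAG so it can be stored or sent over an API.
        return json.dumps({"name": self.name, "tasks": self.tasks}, sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "Workflow":
        # Rebuild the DAG from its serialized form.
        data = json.loads(payload)
        return cls(name=data["name"], tasks=data["tasks"])

    def execution_order(self) -> list[str]:
        # Topological sort; raises graphlib.CycleError if the graph is not a DAG.
        return list(TopologicalSorter(self.tasks).static_order())


def build_run_request(workflow: Workflow, params: dict) -> dict:
    """Assumed request body a client might POST to start a parameterized run."""
    return {"workflow": workflow.name, "parameters": params}


wf = Workflow(
    "daily_sales",
    {"extract": [], "transform": ["extract"], "load": ["transform"]},
)
restored = Workflow.from_json(wf.to_json())
order = restored.execution_order()
request_body = build_run_request(restored, {"run_date": "2021-09-20"})
```

With a definition like this, an evolving workflow is just new JSON, and each run differs only by the parameters passed in the request. How closely a real orchestrator matches this model is exactly what the criterion is meant to test.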
