Deploying Apache Airflow with Docker Compose

Guillermo Barreiro
Published in Gradiant Talks · 6 min read · Jul 1, 2020


A workflow is an orchestrated sequence of steps that make up a business process. Workflows help define, implement, and automate these business processes, improving efficiency and synchronization among their components. An ETL workflow extracts data from several sources, processes it to extract value, and stores the results in a data warehouse, where they can later be read by others. ETL processes offer a competitive advantage to the companies that use them, since they facilitate data collection, storage, analysis, and exploitation, improving business intelligence.
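The extract–transform–load steps above can be sketched in plain Python. This is a toy illustration, not a real pipeline: the data, field names, and the in-memory "warehouse" list are all invented for the example.

```python
def extract():
    # Extract: pull raw records from a source (hard-coded sample data here;
    # in practice this would be an HTTP API, a database query, a file, etc.)
    return [{"item": "widget", "price": "9.50"},
            {"item": "gadget", "price": "12.00"}]

def transform(rows):
    # Transform: clean the raw data and extract value from it
    # (here, just parsing the price strings into numbers)
    return [{"item": r["item"], "price": float(r["price"])} for r in rows]

def load(rows, warehouse):
    # Load: store the results where downstream consumers can read them
    warehouse.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
print(warehouse)
```

A workflow orchestrator like Airflow takes over exactly this kind of chaining: it runs each step on a schedule, retries failures, and records what happened.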

Photo by Tyler Franta on Unsplash

Apache Airflow is an open-source tool to programmatically author, schedule and monitor workflows. Developed back in 2014 at Airbnb, and later released as open source, Airflow has become a very popular solution, with more than 16,000 stars on GitHub. It’s a scalable, flexible, extensible and elegant workflow orchestrator, where workflows are designed in Python, and monitored, scheduled and managed with a web UI. Airflow can easily integrate with data sources like HTTP APIs, databases (MySQL, SQLite, Postgres…) and more. If you want to learn more about this tool and everything you can accomplish with it, check out this great tutorial on Towards Data Science.
