Run Airflow in Docker

Aşkın TAMANLI
4 min read · Jul 24, 2023


What is Docker?

Imagine you are a software engineer developing projects. To build software you need programming languages and libraries, often in specific versions, and sometimes you need more than one version of the same library or language on the same machine. For example, you may need both Python 2.7 and Python 3.11 at the same time on your computer. You can manage this by hand, but it quickly becomes complex and hard to maintain. This is the biggest reason we use Docker.

Docker is a software platform that allows you to build, test, and deploy applications quickly. Docker packages software into standardized units called containers that have everything the software needs to run including libraries, system tools, code, and runtime. Using Docker, you can quickly deploy and scale applications into any environment and know your code will run.

Docker works by providing a standard way to run your code. Docker is an operating system for containers. Similar to how a virtual machine virtualizes (removes the need to directly manage) server hardware, containers virtualize the operating system of a server. Docker is installed on each server and provides simple commands you can use to build, start, or stop containers.
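
To make this concrete, here is a minimal sketch of those commands (the my-app image and container names are just examples):

docker build -t my-app .             # build an image from the Dockerfile in the current folder
docker run -d --name my-app my-app   # start a container from that image in the background
docker ps                            # list running containers
docker stop my-app                   # stop the container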

What is Airflow?

Apache Airflow is an open-source tool to programmatically author, schedule, and monitor workflows. It is one of the most robust platforms used by data engineers for orchestrating workflows or pipelines. You can easily visualize your data pipelines’ dependencies, progress, logs, and code, trigger tasks manually, and track their success status.

With Airflow, users can author workflows as Directed Acyclic Graphs (DAGs) of tasks. Airflow’s rich user interface makes it easy to visualize pipelines running in production, monitor progress, and troubleshoot issues when needed. It connects with multiple data sources and can send an alert via email or Slack when a task completes or fails. Airflow is distributed, scalable, and flexible, making it well-suited to handle the orchestration of complex business logic.

Let’s run Airflow in Docker

The operating system of our computer is Windows, so the commands below are written for Windows. (The Airflow containers themselves run Linux images; Docker Desktop takes care of that for us.)

  1. Download and install “Docker Desktop” from the Docker website; it already includes Docker Compose. Then start the Docker Desktop application. (The docker-compose.yaml file we use below comes from the Airflow official website.)
  2. Open a cmd or PowerShell terminal on your computer. Let’s check our versions of Docker and Docker Compose.
docker --version
docker compose version

3. Everything seems alright. Create a folder on the Desktop and go into it in the terminal. My folder name is ‘run_airflow_in_docker’.
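
In the terminal, that looks like this (the folder name is just an example, and we assume you start in your home directory):

cd Desktop
mkdir run_airflow_in_docker
cd run_airflow_in_docker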

4. Next, we need to download the docker-compose.yaml file that Airflow provides; it defines all the services Airflow needs, including the webserver, scheduler, workers, a Postgres database, and Redis. On Windows, run:

curl -LfO "https://airflow.apache.org/docs/apache-airflow/2.6.2/docker-compose.yaml"

5. The next step is to create a few folders that Airflow expects: dags for your DAG files, logs for task and scheduler logs, plugins for custom plugins, and config for configuration overrides. On Windows, run:

mkdir .\dags 
mkdir .\logs
mkdir .\plugins
mkdir .\config

Check that the folders were created.
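
One more optional step: Docker Compose may warn that AIRFLOW_UID is not set. On Windows, the Airflow docs say you can safely ignore this warning, or silence it by creating an .env file next to docker-compose.yaml (PowerShell shown here):

Set-Content .env "AIRFLOW_UID=50000"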

6. Now we need to run database migrations and create the first user account.

docker compose up airflow-init
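
When it finishes, you should see a message that the account was created (per the official quick start, the login is airflow) and the airflow-init container exiting with code 0.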

7. The most exciting step: running Airflow.

docker compose up

Open the Docker Desktop application and check the services.
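
Note that docker compose up keeps the terminal attached; docker compose up -d starts everything in the background instead. Either way, you can also check the services from a second terminal:

docker compose ps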

8. Let’s open the Airflow webserver. For that, go to this URL: http://localhost:8080/

Username: airflow
Password: airflow

And boom. We’re in. Now you can create DAGs, tasks, connections, and more.
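
As a first test, here is a minimal sketch of a DAG (the file name, dag_id, and tasks are all just examples). In PowerShell you can write it straight into the dags folder we created earlier, and the scheduler should pick it up and show it in the UI within a few minutes:

@'
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# A tiny DAG with two tasks: "hello" must finish before "goodbye" starts.
with DAG(
    dag_id="hello_docker",
    start_date=datetime(2023, 7, 1),
    schedule=None,  # no schedule; trigger it manually from the UI
    catchup=False,
) as dag:
    hello = BashOperator(task_id="hello", bash_command="echo hello")
    goodbye = BashOperator(task_id="goodbye", bash_command="echo goodbye")
    hello >> goodbye  # set the dependency between the tasks
'@ | Set-Content dags\hello_docker.py

Trigger it with the play button in the UI and watch the two tasks run in order.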

THANK YOU FOR READING
