How to Install a Custom Airflow Image on Docker (Ubuntu)

Muhammad Rivaldi Idris
Jul 9, 2023

Introduction

Apache Airflow is a powerful platform for orchestrating and scheduling complex workflows. I'm installing it on Docker because Docker provides containerization, which keeps Airflow and its dependencies isolated from the host system and from other containers. This avoids dependency conflicts and keeps the Airflow environment consistent. In my case, I have to customize the Apache Airflow image because I also need some Python libraries installed.

In this tutorial, I will show the steps to install a custom Apache Airflow image on Docker in Ubuntu.

Prerequisites

  • Ubuntu operating system installed on your machine.
  • A text editor such as VS Code.
  • Docker installed and running on your Ubuntu system. You can follow the official Docker installation guide for Ubuntu (link: https://docs.docker.com/engine/install/ubuntu/) to set up Docker correctly.

1. Create an airflow directory and, inside it, create the dags, config, logs, and plugins directories.
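
As a minimal sketch, the directory layout from step 1 can be created from the terminal. I'm assuming here that the project lives in ./airflow under the current directory; the variable name AIRFLOW_PROJECT_DIR is only for illustration.

```shell
# Assumption: the project lives in ./airflow under the current directory.
AIRFLOW_PROJECT_DIR="./airflow"

# -p creates parent directories as needed and does not fail if they already exist.
mkdir -p \
    "$AIRFLOW_PROJECT_DIR/dags" \
    "$AIRFLOW_PROJECT_DIR/config" \
    "$AIRFLOW_PROJECT_DIR/logs" \
    "$AIRFLOW_PROJECT_DIR/plugins"

# List the result to confirm the four directories exist.
ls "$AIRFLOW_PROJECT_DIR"
```

The remaining files in this tutorial (requirements.txt, Dockerfile, docker-compose.yaml) are then created inside that directory.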

2. Create a requirements.txt file that lists the required libraries:

pandas
oauth2client==4.1.3
google-api-python-client==2.45.0
google-cloud-bigquery==3.2.0
google-cloud-storage==2.5.0
fastavro==1.6.1
gcsfs==2022.10.0
yfinance==0.2.22
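
A package that is listed twice, once pinned and once unpinned, is easy to miss in a file like this. The helper below is a rough sketch (the function name find_duplicate_requirements is my own, not a pip feature) that strips version specifiers and prints any package name that appears more than once. It assumes one requirement per line.

```shell
find_duplicate_requirements() {
    # $1: path to a requirements file, one requirement per line.
    # 1. Drop blank lines and comment lines.
    # 2. Cut everything from the first version specifier, marker, or extras bracket.
    # 3. Lowercase the names, then print those that occur more than once.
    grep -v -E '^[[:space:]]*(#|$)' "$1" \
        | sed -E 's/[[:space:]]*[<>=!~;[].*$//' \
        | tr '[:upper:]' '[:lower:]' \
        | sort \
        | uniq -d
}
```

Running find_duplicate_requirements requirements.txt prints nothing when every package is listed only once.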

3. Create a Dockerfile that contains the instructions for building the custom Airflow image:

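# Start from the official Airflow image
FROM apache/airflow:2.6.2

# Copy the DAGs and the dependency list (source paths are relative to the build context)
COPY dags/ ./dags/
COPY requirements.txt requirements.txt

# Install the extra Python libraries as the airflow user, not as root
USER airflow
RUN pip install --no-cache-dir -r requirements.txt

# Ports used by the webserver (8080), the worker log server (8793), and Flower (5555)
EXPOSE 8080
EXPOSE 8793
EXPOSE 5555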

4. Build the Docker image from the terminal in VS Code with this command:

docker build -t apache-airflow:aldi .

This command builds the Docker image from the Dockerfile in the current directory (the trailing . is the build context) and tags it with the name apache-airflow:aldi.
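
To check that the build worked, something like the following sketch can help. The function name verify_image is hypothetical. It relies on the official image's entrypoint passing python commands straight through, and it assumes pandas is one of the libraries in requirements.txt.

```shell
verify_image() {
    # $1: image name to check (defaults to the tag used in this tutorial).
    image="${1:-apache-airflow:aldi}"

    # Returns a non-zero exit code if the image was not built or tagged.
    docker image inspect "$image" >/dev/null || return 1

    # Import one of the extra libraries inside a throwaway container
    # and print its version.
    docker run --rm "$image" python -c "import pandas; print(pandas.__version__)"
}
```

Calling verify_image with no arguments checks apache-airflow:aldi. If the import fails, the requirements were not installed into the image.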

5. Fetch docker-compose.yaml with this command:

curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.6.2/docker-compose.yaml'

If you want to know more about docker-compose.yaml, please refer to the official Apache Airflow documentation.

6. Modify docker-compose.yaml: replace the image name with the name of the Docker image that was built in step 4 (apache-airflow:aldi).
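
In the official 2.6.2 compose file, the line to change looks like image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.6.2}. The edit can be done by hand in the editor. As a sketch, it can also be scripted with GNU sed, which is what Ubuntu ships. The function name set_compose_image is my own.

```shell
set_compose_image() {
    # $1: path to the compose file, $2: image name to use.
    # Only "image:" lines that mention apache/airflow are rewritten,
    # so the postgres and redis images are left alone.
    sed -i -E 's|^([[:space:]]*image:[[:space:]]*).*apache/airflow.*$|\1'"$2"'|' "$1"
}
```

Usage: set_compose_image docker-compose.yaml apache-airflow:aldi. Because the original line reads the AIRFLOW_IMAGE_NAME variable, an alternative is to leave the file untouched and put AIRFLOW_IMAGE_NAME=apache-airflow:aldi in a .env file next to it.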

7. Start the services defined in docker-compose.yaml with this command:

docker-compose -f docker-compose.yaml up
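
The Airflow documentation for this compose file also recommends two things on Linux before the first start: recording your user id in a .env file, and running the one-off airflow-init service. The sketch below wraps that sequence in a hypothetical start_airflow function. It assumes the Docker Compose v2 plugin (docker compose); with the standalone v1 binary, replace that with docker-compose.

```shell
start_airflow() {
    # Record the host user id so files created in the mounted directories
    # are not owned by root. Skipped if .env already sets AIRFLOW_UID.
    if ! grep -q '^AIRFLOW_UID=' .env 2>/dev/null; then
        printf 'AIRFLOW_UID=%s\n' "$(id -u)" >> .env
    fi

    # One-off initialisation: runs the database migrations and creates the first user.
    docker compose up airflow-init || return 1

    # Start all services in the background.
    docker compose up -d
}
```

Run start_airflow from the directory that contains docker-compose.yaml. With the default compose file, the init step creates a login named airflow with the password airflow.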

8. After the deployment has finished, verify the status of the containers using the docker ps command:

docker ps

The Apache Airflow installation is now done, and you can view the Airflow UI (webserver) by visiting http://localhost:8080 in your web browser.
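
The webserver can take a minute or two to come up. As a sketch, the helper below polls the webserver's /health endpoint with curl until it answers. The function name wait_for_airflow and the WAIT_SECONDS variable are my own, and the defaults (30 attempts, 5 seconds apart) are arbitrary.

```shell
wait_for_airflow() {
    # $1: health URL to poll, $2: maximum number of attempts.
    url="${1:-http://localhost:8080/health}"
    attempts="${2:-30}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f makes curl fail on HTTP errors, -sS hides progress but keeps errors.
        if curl -fsS "$url" >/dev/null 2>&1; then
            echo "Airflow webserver is up: $url"
            return 0
        fi
        sleep "${WAIT_SECONDS:-5}"
        i=$((i + 1))
    done
    echo "Airflow webserver did not respond after $attempts attempts" >&2
    return 1
}
```

Calling wait_for_airflow with no arguments checks http://localhost:8080/health and returns as soon as the UI is reachable.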

Conclusion

In this tutorial, I showed how to install a custom Apache Airflow image on Docker in Ubuntu. By containerizing Airflow, you can easily manage and deploy workflows in a consistent and reproducible manner. Docker provides an efficient and portable way to run Airflow, which makes it a popular choice among developers and data engineers.

That’s it for this article. Feel free to connect with me on LinkedIn if you have any further questions.
