Writing unit tests for an Airflow DAG

Jun Wei Ng
3 min read · Mar 7, 2022

Unit tests are the foundation of the test pyramid. They should be fast to execute and cover as much of the code as is logically sensible. They should also describe the code’s behaviour under its various use cases so that others in the team can quickly understand it.

The same goes for an Airflow DAG. Unit tests should cover how the DAG behaves under different use cases, which makes it easier for others to maintain the DAG and to add or change behaviours.

In this article, we will explore how to write unit tests for an Airflow DAG.

Photo by Sigmund on Unsplash

Essential unit tests for an Airflow DAG

To ensure that our Airflow DAG has been coded correctly, we need to assert that:

  1. The DAG is created correctly (start_date, end_date, catchup, etc.)
  2. The tasks are ordered correctly
  3. The tasks trigger on the correct rules

We shall use the following example DAG, which represents an ETL pipeline, in the following sections:

Simple ETL example

Asserting that the DAG is created correctly

Sometimes, we need to create a DAG with specific date ranges, or to perform catchup runs, or to run on a specific schedule interval. These conditions can be explicitly captured by unit tests:

These unit tests make it easier for the next developer to understand how the DAG should be created, when the DAG should run, etc.

Asserting that tasks are ordered correctly

We can write unit tests for an Airflow DAG to assert that the pipeline is ordered correctly:

Using the downstream_list API from the BaseOperator, we can assert that the tasks have been wired up correctly.

Asserting that tasks are triggering on the correct rules

It might seem obvious, but we want each downstream task to run only when its upstream task has completed successfully. Airflow’s default trigger rule, all_success, already models this behaviour, but it doesn’t hurt to document it via unit tests. We can write unit tests for the above DAG as such:

This way, the unit tests explicitly describe how the ETL DAG should work. If the ETL pipeline changes, the unit tests will change with it, and that change will be captured in your version control history. This makes it easier for the next developer to understand changes to a DAG, thus improving the DAG’s maintainability.

Conclusion

We should write unit tests to cover the basics of our Airflow DAGs, such as the DAG setup and whether the tasks are wired up correctly. These unit tests should execute quickly to give developers fast feedback so that errors are fixed in a timely fashion.

Other articles that I wrote that are also along the lines of testing for Airflow:

Happy coding! (:

Jun Wei Ng

Software developer @ Thoughtworks. Opinions are my own