Deploy and Run Apache Airflow on AWS ECS Following Software Development Best Practices

Thierry Turpin · Published in The Startup · Jul 14, 2020 · 5 min read
Solution architecture

This blog post covers how to apply software development best practices to the deployment of Apache Airflow. To achieve true scaling, Airflow is deployed with the Celery executor.

Motivation for the architecture choices:

  • CodeBuild: fully managed, easy to set up and configure, and able to link multiple pipelines to a single GitHub repository
  • Elastic Container Registry (ECR): fully managed, with vulnerability scanning included
  • SonarCloud: a unique code quality offering with strong integration capabilities
  • Elastic File System (EFS): fully managed file system that can be shared across all Airflow nodes
  • Elastic Container Service (ECS): fully managed container solution that can work from a docker-compose file
  • Secrets Manager: fully managed secrets management, including rotating credentials for the RDS database
  • Aurora RDS and ElastiCache for Redis: fully managed services
  • GitHub: community, collaboration, and seamless integration with CodeBuild

The solution is based on 2 CodeBuild pipelines:

  • the first is responsible for building the Docker image, adding Docker tags, and pushing the image to the private ECR registry
  • the second executes Python tests and validates DAG code quality using SonarCloud. Only if all quality gates pass does the DAG definition get pushed to EFS. Since EFS is mounted on the ECS instances, the DAG definitions immediately become available on all hosts of the Airflow cluster

Building the Docker image

Apache Airflow is composed of many Python packages and deployed on Linux.

The best practice here is to have a reliable build chain for the Docker image and to be able to trace the image back to the exact Git commit. This can be achieved simply by using the --build-arg argument of the docker build command, as sketched below.
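As a minimal sketch, the build step could forward CodeBuild’s own environment variables as build arguments (the argument names and ECR URI are illustrative; CODEBUILD_BUILD_NUMBER and CODEBUILD_RESOLVED_SOURCE_VERSION are set automatically by CodeBuild):

```bash
# Bake the build number and Git commit into the image as build arguments
docker build \
  --build-arg LABEL_BUILD_NUMBER="${CODEBUILD_BUILD_NUMBER}" \
  --build-arg SOURCE_VERSION="${CODEBUILD_RESOLVED_SOURCE_VERSION}" \
  -t 123456789012.dkr.ecr.eu-west-1.amazonaws.com/airflow:latest .
```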

The Dockerfile we use is based on the very popular puckel/docker-airflow. It declares build arguments that are composed into LABEL instructions.
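The relevant Dockerfile lines could look like this sketch (the base image tag is illustrative):

```dockerfile
FROM puckel/docker-airflow:1.10.9

# Build arguments supplied via docker build --build-arg
ARG LABEL_BUILD_NUMBER=unknown
ARG SOURCE_VERSION=unknown

# Bake the values into the image metadata as labels
LABEL LABEL_BUILD_NUMBER=${LABEL_BUILD_NUMBER} \
      SOURCE_VERSION=${SOURCE_VERSION}
```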

It’s possible to inspect Docker images without downloading them locally by using skopeo.

The Docker image carries the label values:

  • LABEL_BUILD_NUMBER: 23, a value derived from CodeBuild
  • SOURCE_VERSION: 2eea3da019f6eb2…, the hash of the Git commit
skopeo inspect
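For example (the repository URI is a placeholder):

```bash
# Read the image manifest, including its labels, straight from the
# registry without pulling the image
skopeo inspect docker://123456789012.dkr.ecr.eu-west-1.amazonaws.com/airflow:latest
```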

Git commit history view on GitHub:

The Git commit: 2eea3da019f6eb28672f5fe1bceef249a917a109

CodeBuild history:

CodeBuild Build Number 23

If the build is successful, the image is pushed to the private ECR Docker registry. We use the option to automatically scan the Docker image for vulnerabilities.

AWS ECR uses the open-source CoreOS Clair project and provides you with a list of scan findings.
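Scan-on-push can be enabled when the repository is created; a minimal sketch with the AWS CLI (the repository name is illustrative):

```bash
# Create the private repository with automatic vulnerability scanning
aws ecr create-repository \
  --repository-name airflow \
  --image-scanning-configuration scanOnPush=true
```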

Vulnerabilities found in the Docker image

The scan detected 2 critical CVEs in Linux kernel 5.0.21.

DAGs pipeline

Python unittest is used to check our DAG definitions, hooks and plugins.

All tests are stored in a tests directory alongside our dags folder in the repository. This allows us to use the discover option of Python unittest; as a result, writing additional checks does not require changing the buildspec.yml.
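The test step then boils down to a single discovery command (paths are illustrative):

```bash
# Discover and run every test module under tests/
python -m unittest discover --start-directory tests --verbose
```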

After the Python syntax checking is done, the dags folder is sent to SonarCloud for code quality analysis (done in script.sh, which saves the output to results.json).

From the SonarCloud scan, we use the new_security_rating score to determine whether the build phase can continue and copy the dags directory to EFS, or whether the build should stop with a failed status.
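Assuming results.json contains the quality-gate section of the SonarCloud response (the exact JSON shape is an assumption here), the gate could be a few lines of shell:

```bash
# Extract the status of the new_security_rating condition from results.json
STATUS=$(jq -r '.projectStatus.conditions[]
                | select(.metricKey == "new_security_rating").status' results.json)

# Stop the CodeBuild phase when the condition is not satisfied
if [ "$STATUS" != "OK" ]; then
  echo "SonarCloud new_security_rating is ${STATUS} - failing the build"
  exit 1
fi
```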

Introduce some bugs and non-compliant code

Let’s check the SonarCloud scan in action.

In the next DAG task we first write “dummy” twice to a file under /tmp/. After this we introduce some non-compliant use of tempfile, and to finish we add a bug.
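A sketch of what such a deliberately flawed task could look like (the function and file names are made up for illustration):

```python
import tempfile


def flawed_task():
    # Write "dummy" twice to a hard-coded file in the world-writable /tmp/
    # directory (SonarCloud flags this as a security hotspot)
    with open("/tmp/dummy.txt", "w") as f:
        f.write("dummy")
        f.write("dummy")

    # Non-compliant: tempfile.mktemp() is deprecated and insecure;
    # tempfile.mkstemp() should be used instead
    tmp_name = tempfile.mktemp()

    # Bug: identical expressions on both sides of the comparison,
    # so the condition is always true
    if tmp_name == tmp_name:
        print("this always runs")
```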

From the moment the code is committed, CodeBuild starts handling the steps defined in the buildspec.yml.

CodeBuild Phase details

In the build logs we can trace every statement’s output, and we see that SonarCloud returned the value ERROR for new_security_rating.

CodeBuild Build log

In the SonarCloud dashboard, we see that 1 bug was detected, along with 1 vulnerability, 2 security hotspots and 3 code smells. We get a full view of the quality of our code.

SonarCloud overview

Issues can be further analyzed, commented on, assigned to a team member, and so forth.

SonarCloud detected issues

The status of the build is also immediately visible on our GitHub commit:

GitHub commits with status

Airflow on ECS

Airflow and dockerized workloads can be deployed in many ways. Here we opted for ECS because of its ease of use and its support for the docker-compose format. This is a small step compared to bringing the solution to Kubernetes.

This way we are able to run an almost identical setup on a local workstation with docker-ce to the one we have on ECS.

Below, on the left, we have the docker-compose file for use on a workstation; on the right, the ECS version.

The differences we have are:

  • links: this is needed for the webserver to be able to fetch a log from a worker node
  • environment: the database credentials are retrieved from Secrets Manager via the additional file ecs-params.yml (see the sketch below)
  • logging: Docker supports many log drivers; here we use CloudWatch for the container logs
docker-compose files
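As a rough sketch, the Secrets Manager wiring could look like this in ecs-params.yml (the service name and secret ARN are placeholders):

```yaml
# ecs-params.yml - inject the database password from Secrets Manager
version: 1
task_definition:
  services:
    webserver:
      secrets:
        - value_from: arn:aws:secretsmanager:eu-west-1:123456789012:secret:airflow/postgres
          name: POSTGRES_PASSWORD
```

On the logging side, the compose services use Docker’s awslogs driver (with awslogs-group, awslogs-region and awslogs-stream-prefix options), which is what sends the container logs to CloudWatch.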

Spinning up an ECS cluster

The creation of an ECS cluster, including cluster nodes and the mounting of the EFS file share, requires a single command. The cluster is ready in less than 2 minutes.
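With the ECS CLI, this could look roughly as follows (the key pair, sizes, and the user-data script that mounts the EFS share are all illustrative):

```bash
# Create the cluster with EC2 container instances; mount-efs.sh is a
# hypothetical user-data script that mounts the EFS share on each node
ecs-cli up --cluster airflow --capability-iam \
  --keypair my-keypair --size 3 --instance-type t3.large \
  --extra-user-data mount-efs.sh
```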

With just one more “docker-compose, ECS style” command, and a couple of seconds later, we have Apache Airflow on Celery up and running.
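A sketch of that step with the ECS CLI (file names as introduced above):

```bash
# Deploy the compose services as ECS tasks on the cluster
ecs-cli compose --file docker-compose.yml --ecs-params ecs-params.yml up
```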

ECS tasks

Getting further

This was an introduction to the subject. Micropole delivers advanced analytics projects in the cloud at any scale. This introduction will also serve as the basis for live-streaming sessions on bringing your analytics workloads to the cloud.

Stay tuned: https://www.lucyinthecloud.com/
