Apache Airflow 1.10.0 Released | Highlights
After a long wait of eight months, we finally have a new release of Apache Airflow: Apache Airflow 1.10.0. This release includes ~800 commits on top of the previous (1.9.0) release.
Highlights
- New Role-Based Access Control (RBAC) web interface in beta. Flask AppBuilder is used for the new Airflow RBAC web UI instead of Flask-Admin. This supports integration with different authentication backends out of the box, and generates permissions for views and ORM models that simplify view-level and DAG-level access control. To use this UI, add the following line to your airflow.cfg file:
rbac = True
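This flag sits in the webserver settings, so a minimal sketch of the relevant airflow.cfg snippet (section name taken from the 1.10 default config, so double-check against your own file) looks like this:
[webserver]
# Switch the web UI to the new FAB-based RBAC interface
rbac = True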
- Addition of Kubernetes Operator and Executor to natively launch arbitrary Kubernetes Pods using the Kubernetes API. This makes Airflow even more extensible and flexible than ever before since developers are not limited to an existing set of features in Airflow. This isolates the execution of one workflow from another, eliminating the need to manage many potentially conflicting Python packages.
For example, this allows your data scientists to use different Python versions when building their machine learning models. It also allows easier secret management with Kubernetes Secrets. Track the Airflow-Kubernetes integration on this Jira issue.
For example usage visit https://kubernetes.io/blog/2018/06/28/airflow-on-kubernetes-part-1-a-different-kind-of-operator/.
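To give a flavour of what this looks like, here is a minimal sketch (DAG id, namespace, and image are placeholders, not from the release notes) using the contrib KubernetesPodOperator that ships with 1.10:
from datetime import datetime
from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

dag = DAG(
    dag_id='k8s_pod_example',
    start_date=datetime(2018, 8, 1),
    schedule_interval=None,
)

# Each task runs in its own pod, so it can pin its own image and Python version
# without worrying about conflicting packages on the workers.
train_model = KubernetesPodOperator(
    task_id='train_model',
    name='train-model',
    namespace='default',
    image='python:3.6',
    cmds=['python', '-c'],
    arguments=['print("training the model")'],
    get_logs=True,
    dag=dag,
)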
- Performance Optimisation for large DAGs: DAGs containing a huge number of tasks will perform better now thanks to this PR: https://github.com/apache/incubator-airflow/pull/3116
- Timezone support: Check the following link to read more on timezone support: https://github.com/apache/incubator-airflow/blob/master/docs/timezone.rst (a minimal sketch of a timezone-aware DAG is shown after this list).
- Added GCP & AWS integration: This release adds various operators for both GCP and AWS, among them Google Kubernetes Engine, Google Cloud Storage, BigQuery, Dataproc and Dataflow operators. Check the following page for documentation on Airflow's integration with all the major cloud providers: https://airflow.readthedocs.io/en/1.10.0/integration.html
- Documentation: There have been lots of additions to the documentation on using Hooks, Operators and Plugins. Documentation is now also maintained for previous versions on ReadTheDocs.
For documentation of the latest stable release (currently 1.10.0), visit https://airflow.apache.org/
For versioned documentation, visit https://airflow.readthedocs.io. For example, the documentation for Airflow 1.9.0 is at https://airflow.readthedocs.io/en/1.9.0/
If you are a developer and want to keep track of the docs for the latest development on the master branch, visit https://airflow.readthedocs.io/en/latest
- Tons of Bug Fixes: For a detailed changelog, see CHANGELOG.txt.
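As promised above, here is a minimal sketch of a timezone-aware DAG (the DAG id and timezone are illustrative), along the lines of the linked timezone documentation:
import pendulum
from datetime import datetime
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

# Build the start_date with an explicit timezone; this makes the whole DAG timezone-aware.
local_tz = pendulum.timezone('Europe/Amsterdam')

dag = DAG(
    dag_id='tz_aware_example',
    start_date=datetime(2018, 8, 1, tzinfo=local_tz),
    schedule_interval='@daily',
)

noop = DummyOperator(task_id='noop', dag=dag)
Schedules are then computed in the given timezone, while Airflow keeps datetimes in UTC internally.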
If you are updating your Airflow version, please don’t forget to have a look at UPDATING.md for backwards-incompatible changes that might affect you.
Installation / Upgrading
Installing or upgrading requires setting SLUGIFY_USES_TEXT_UNIDECODE=yes or AIRFLOW_GPL_UNIDECODE=yes in your environment. In the case of the latter, a GPL runtime dependency will be installed because of a dependency chain (python-nvd3 -> python-slugify -> unidecode). This is a licensing concern that may not affect most users, in which case you can simply set AIRFLOW_GPL_UNIDECODE=yes.
To do this, copy and paste one of the following into your .bashrc, .bash_profile, or directly into your shell:
export AIRFLOW_GPL_UNIDECODE=yes
OR
export SLUGIFY_USES_TEXT_UNIDECODE=yes
If you get an error that says “Unknown column” when you try to upgrade from previous versions, run the following command:
airflow upgradedb
Many new parameters have also been added to the airflow.cfg file in 1.10, so you might need to add the missing settings; otherwise Airflow will give errors like the one below because it cannot find, for example, the lineage config in the airflow.cfg file:
[2018-09-14 10:03:53,260] {base_task_runner.py:107} INFO - Job 45989: Subtask fetch_requests [2018-09-14 10:03:53,260] {cli.py:464} ERROR - Section lineage Option backend does not exist in the config!
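For the lineage error shown above, for instance, adding the new [lineage] section from the 1.10 default configuration to your airflow.cfg should be enough. A minimal sketch, assuming you do not actually need a lineage backend:
[lineage]
# Lineage backend to use; leave empty to disable lineage support.
backend =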
So what are you waiting for? Just run the following command and enjoy your coffee :)
pip install -U apache-airflow
Also, do let us know if you find any bugs or have any comments on the Airflow dev mailing list :)