Running Argo Workflow in Azure Kubernetes Service
Workflow and process orchestration tools help companies that deal with information processing be more efficient. They do this by automating repetitive manual tasks, which enables faster and better decision making, lowers expenses, and facilitates compliance processes by providing monitoring and audit trails. When it comes to data pipelines and machine learning, a workflow tool is an essential component that drives the overall project.
Overall, the focus of any orchestration tool is to provide a centralized platform for repeatable, reproducible, and efficient automated tasks. The benefits of workflow orchestration solutions are hard to ignore, but then comes the challenge of selecting the right tool.
When determining which workflow orchestration tool to select, there were some guiding points which helped us:
- The components that were initially developed were standalone and written in different languages. To make them language agnostic, we placed them in containers, so the orchestration tool needed to be able to deal with container-based tasks.
- Kubernetes native: Kubernetes provided us the capability to easily scale out when load increased, so the tool had to run within a Kubernetes context.
- Can handle simple directed acyclic graph (DAG) scenarios and dependencies; no complex situations were foreseen.
Based on these guiding points, we chose Argo Workflows, an open source workflow engine for orchestrating parallel tasks on Kubernetes. Because Argo is designed for containers, it scales easily and avoids the limitations of server-based environments, while remaining cloud agnostic, as it can run in any Kubernetes cluster.
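To make the "simple DAG" requirement concrete, here is a minimal sketch of how three dependent container steps could be expressed as an Argo workflow. The task names (extract, transform, load), the echo template, and the alpine image are illustrative assumptions and not part of our actual pipeline.

```yaml
# Illustrative sketch only: task names, template name, and image are assumptions.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: simple-dag-
spec:
  entrypoint: pipeline
  templates:
  # The DAG template declares the tasks and the dependencies between them.
  - name: pipeline
    dag:
      tasks:
      - name: extract
        template: echo
        arguments:
          parameters:
          - name: message
            value: extract
      - name: transform
        dependencies: [extract]      # starts only after extract has completed
        template: echo
        arguments:
          parameters:
          - name: message
            value: transform
      - name: load
        dependencies: [transform]    # starts only after transform has completed
        template: echo
        arguments:
          parameters:
          - name: message
            value: load
  # Each task runs this container template with its own message.
  - name: echo
    inputs:
      parameters:
      - name: message
    container:
      image: alpine:3.18
      command: [echo, "{{inputs.parameters.message}}"]
```

Each task runs in its own container, so components written in different languages can be chained simply by pointing each task at a different image.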
Below you will find some tips on how to get started with Argo in Azure Kubernetes Service (AKS), including step-by-step instructions and links to the code.
Getting started with Argo in AKS
I had already assessed Argo on my local machine using minikube by following the Argo Workflows guide, and it was quite simple to get started. However, when moving from a local workstation to our Azure Kubernetes Service cluster, there were some hurdles.
1. Install the Argo CLI
If you have not already done so, download and install the latest Argo CLI on your workstation.
2. Create an Argo namespace in Kubernetes
kubectl create ns argo
3. Install Argo Workflows
Argo offers various installation manifests. Download the standard install manifest to your workstation.
With AKS 1.19 and above, Azure moved away from using Docker as the container runtime for its Kubernetes service and now uses containerd¹. Argo’s default workflow executor happens to be the Docker runtime, so it needs to be changed.
In the install manifest file that you downloaded, update the container runtime executor to k8sapi in the ConfigMap section:
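As a sketch, the edited ConfigMap could look like the following. It assumes that the manifest defines a ConfigMap named workflow-controller-configmap and that your Argo release reads the containerRuntimeExecutor key. Check the manifest you downloaded, since the names and the supported executors vary between releases.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap   # assumed name; match the one in your install.yaml
data:
  # Use the Kubernetes API executor instead of the default docker executor,
  # because AKS nodes run containerd rather than docker.
  containerRuntimeExecutor: k8sapi
```

Only the data section is new. Leave the rest of the manifest unchanged.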
Save and apply the updated install.yaml:
kubectl apply -n argo -f ./install.yaml
4. Configure Service Account & RBAC
Argo requires a service account to communicate with the Kubernetes resources in order to support its features. The service account also needs a minimum set of role-based access permissions. Create a YAML file as below to configure the service account, the role, and the binding of the service account to that role. Take note of the name of the service account.
The role grants only the permissions Argo needs: pod get/watch is used to identify the container IDs of the current pod, pod patch is used to annotate the step's outputs back to the controller (e.g. artifact location), and logs get/watch are used to get the pod's logs for script outputs and for log archival. The role binding then lists the service account as its subject (kind: ServiceAccount).
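A minimal sketch of such a file, saved as argo-service-rbac.yaml, might look like the following. The names argo-workflow-sa, argo-workflow-role, and argo-workflow-binding are hypothetical placeholders, and the rules cover only the permissions described above. Depending on your Argo release and executor, additional permissions may be required.

```yaml
# Hypothetical names throughout; adjust them to your own conventions.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: argo-workflow-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: argo-workflow-role
rules:
# pod get/watch is used to identify the container IDs of the current pod
# pod patch is used to annotate the step's outputs back to controller (e.g. artifact location)
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "patch"]
# logs get/watch are used to get the pods logs for script outputs, and for log archival
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: argo-workflow-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: argo-workflow-role
subjects:
- kind: ServiceAccount
  name: argo-workflow-sa
  namespace: argo   # the namespace created in step 2
```

A namespaced Role and RoleBinding keep the permissions limited to the argo namespace rather than the whole cluster.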
Apply the yaml to the argo namespace in your Kubernetes cluster.
kubectl apply -n argo -f ./argo-service-rbac.yaml
5. Test Argo Workflow
We will use the Argo hello world example to test the Argo setup. The main thing to note is that we specify the service account created above; otherwise Argo would use the default service account, which does not have the necessary permissions:
This is a simple hello world example: a single template named whalesay that is passed the argument "hello world".
You can also run it in Python: https://couler-proj.github.io/couler/examples/#hello-world
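A sketch of the workflow file, saved as argo-hello-world.yaml, is shown here. It follows the standard Argo hello world example. The service account name argo-workflow-sa is a hypothetical placeholder, so replace it with the name you noted in step 4.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-    # Argo appends a random suffix to this prefix
spec:
  entrypoint: whalesay
  # Hypothetical name: use the service account you created in step 4.
  # Without this field, Argo falls back to the namespace's default service account.
  serviceAccountName: argo-workflow-sa
  templates:
  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [cowsay]
      args: ["hello world"]
```

Because the manifest uses generateName rather than name, it has to be submitted with argo submit (or kubectl create), as kubectl apply requires a fixed name.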
Submit the above workflow using the following command:
argo submit -n argo --watch ./argo-hello-world.yaml
The --watch flag keeps the command running so that you can follow the workflow's progress until it completes.
That concludes the installation and setup of Argo within our Azure Kubernetes Service. With the above you should be ready to get your own Argo workflow orchestration running.
You can find the code on the following GitHub page: https://github.com/Bongani/argo-aks