Running Argo Workflow in Azure Kubernetes Service

Workflow and process orchestration tools help companies that deal with information processing be more efficient. This is done by automating the execution of repetitive manual tasks, thus enabling faster and better decision making, lowering expenses, but also facilitating compliance processes by providing monitoring and audit trails. When it comes to data pipelines and machine learning, a workflow tool is an essential component that drives the overall project.

Overall, the focus of any orchestration tool is to provide a centralized platform for repeatable, reproducible, and efficient automated tasks. The benefits of using workflow orchestration solutions cannot be ignored but then comes the challenge of selecting the right tool.

Overview of our workflow

Tool Selection

When determining which workflow orchestration tool to select, a few guiding points helped us:

  • Container based: the components were initially developed standalone and written in different languages, so they were containerized to keep them language agnostic. The orchestration tool therefore needed to handle container-based tasks.
  • Kubernetes native: Kubernetes provided us the capability to easily scale out when load increased, so the tool had to run within a Kubernetes context.
  • Simple DAGs: can handle simple directed acyclic graph (DAG) scenarios and dependencies; no complex situations were foreseen.

Based on these guiding points, we chose Argo Workflows, an open source workflow engine for orchestrating parallel tasks on Kubernetes. Argo is designed for containers, which sidesteps the limitations of server-based environments and makes scaling out straightforward, and it is cloud agnostic since it can run in any Kubernetes cluster.

Below you will find some tips on how to get started with Argo in Azure Kubernetes Service (AKS), including step by step instructions and links to the code.

Getting started with Argo in AKS

I had already assessed Argo on my local machine using minikube, following the Argo Workflows guide, and it was quite simple to get started. When moving from a local workstation to Azure Kubernetes Service, however, there were some hurdles.

1. Install the Argo CLI

If you have not already done so, download and install the latest Argo CLI onto your workstation.
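On a Linux amd64 workstation this can be done from the Argo Workflows GitHub releases page; the version number below is an assumption, so substitute the latest release:

```shell
# Download the Argo CLI binary from the GitHub releases page
# (v3.4.11 is an example version; check the releases page for the latest)
curl -sLO https://github.com/argoproj/argo-workflows/releases/download/v3.4.11/argo-linux-amd64.gz

# Unpack, make executable, and move onto the PATH
gunzip argo-linux-amd64.gz
chmod +x argo-linux-amd64
sudo mv argo-linux-amd64 /usr/local/bin/argo

# Confirm the CLI is installed
argo version
```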

2. Create an Argo namespace in Kubernetes

kubectl create ns argo

3. Install Argo Workflows

Argo offers various installation manifests. Download the standard install manifest to your workstation.
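The standard install manifest is published alongside each release; again, the version in the URL below is an assumption and should be pinned to the release you are using:

```shell
# Fetch the standard install manifest for a specific release
# (v3.4.11 is an example version; pin this to your chosen release)
curl -sLo install.yaml https://github.com/argoproj/argo-workflows/releases/download/v3.4.11/install.yaml
```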

With AKS 1.19 and above, Azure moved away from Docker as the container runtime for its Kubernetes service and now uses containerd¹. Argo's default workflow executor is the Docker runtime, so it needs to be changed.

In the downloaded install manifest, update the container runtime executor to k8sapi in the ConfigMap section:

apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
data:
  containerRuntimeExecutor: k8sapi

Save and apply the updated install.yaml:

kubectl apply -n argo -f ./install.yaml
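A quick way to confirm the install succeeded is to check that the Argo components came up in the namespace; the exact pod names depend on the manifest version, but a workflow controller and an Argo server are expected:

```shell
# List the Argo pods; the workflow-controller and argo-server
# pods should reach Running state
kubectl get pods -n argo
```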

4. Configure Service Account & RBAC

Argo requires a service account to communicate with Kubernetes resources in order to support its features. The service account also needs a minimum set of role-based access permissions. Create a yaml file as below to configure the service account and role, and to bind the service account to that role. Take note of the name of the service account.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: workflow
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: workflow-role
rules:
# pod get/watch is used to identify the container IDs of the current pod
# pod patch is used to annotate the step's outputs back to controller (e.g. artifact location)
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - watch
  - patch
# logs get/watch are used to get the pods logs for script outputs, and for log archival
- apiGroups:
  - ""
  resources:
  - pods/log
  verbs:
  - get
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: workflow-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: workflow-role
subjects:
- kind: ServiceAccount
  name: workflow

Apply the yaml to the argo namespace in your Kubernetes cluster.

kubectl apply -n argo -f ./argo-service-rbac.yaml
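You can verify both that the objects were created and that the permissions actually took effect; `kubectl auth can-i` with `--as` impersonates the service account to test a specific verb:

```shell
# Confirm the service account, role, and binding exist
kubectl get sa,role,rolebinding -n argo

# Check that the workflow service account may patch pods,
# one of the permissions Argo needs (prints "yes" if allowed)
kubectl auth can-i patch pods -n argo \
  --as=system:serviceaccount:argo:workflow
```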

5. Test Argo Workflow

We will use the Argo hello world example to test the setup. The main thing to note is that we specify the service account created above (otherwise Argo would use the default service account, which does not have the necessary permissions):

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-
  labels:
    workflows.argoproj.io/archive-strategy: "false"
  annotations:
    workflows.argoproj.io/description: |
      This is a simple hello world example.
      You can also run it in Python: https://couler-proj.github.io/couler/examples/#hello-world
spec:
  serviceAccountName: workflow
  entrypoint: whalesay
  templates:
  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [cowsay]
      args: ["hello world"]

Submit the above workflow using the following command:

argo submit -n argo --watch ./argo-hello-world.yaml

The command submits the workflow and lets you watch its progress in the terminal.
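After it completes, the CLI can list past runs and fetch their logs; `@latest` refers to the most recently submitted workflow:

```shell
# List workflows in the argo namespace, with their status
argo list -n argo

# Fetch the logs of the most recently submitted workflow
argo logs -n argo @latest
```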

That concludes the installation and setup of Argo within our Azure Kubernetes Service. With the above you should be ready to get your own Argo workflow orchestration running.
