What to consider before choosing Argo Workflows?
To go full Kubernetes-native or not?
The recent explosion of task and data orchestration tools should make you wonder whether you’re still doing the right thing. Purely based on GitHub stars of the open-source frameworks, Airflow is still the most popular one. This does not take into account the popularity of closed-source or cloud vendor tools. Where these tools overlap or differ has been described fairly well by others (this one, or that one).
This article focuses on the least overlapping one: Argo Workflows.
Task orchestration
Imagine you are tasked with four items: cleaning data, training a model, evaluating the model, and using the model to make inferences on unseen data. In the beginning this happens ad hoc, because you’re the only one in the organization performing these tasks. This is fine, as you manage to deliver what the downstream consumer expects from you, and it required almost zero initial effort. Victim of your own success,
- your team grows,
- more similar use-cases are being worked out,
- and more teams and products will depend on you performing these tasks in a stable manner.
This is when a task orchestrator comes into play. You leverage this tool to model each task as a vertex (node) in a graph of tasks. An edge (arrow) represents an execution dependency. This type of graph is called a directed acyclic graph (DAG). You rely on your orchestrator to trigger and monitor these flows reliably.
Secondly, your orchestrator should be language and framework agnostic. You might start off with a Python-specific orchestrator (like Luigi) because it seems easy to get started with. Eventually there will be another “hot thing” that your orchestrator will need to support. Back to the drawing board… Running each task in a container on k8s offers this flexibility.
Thirdly, defining a workflow should be as light as possible for end-users. The cognitive load of reading (let alone maintaining) a 250-line DAG definition is not to be underestimated. Look for easy templating options to remove boilerplate configuration.
Using Argo Workflows
Argo Workflows is an open-source, container-native workflow engine for orchestrating parallel jobs on Kubernetes (k8s). Argo Workflows is implemented as a k8s custom resource definition (CRD). CRDs are used to define custom API objects, and allow for extending the vanilla k8s experience in a k8s-compliant fashion.
Argo Workflows is part of the Argo project, which offers a range of, as they like to call it, Kubernetes-native get-stuff-done tools (Workflows, CD, Events, Rollouts).
Users can interact with it through the Argo CLI, the UI, or via kubectl.
To get a better feel of what the end-user will be dealing with, let’s go over a few key concepts.
Core concepts
A Workflow is the most fundamental object. It both defines the workflow to be executed and stores the state of its execution. Consider it a dynamic, live object rather than a static definition.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-  # Name of this Workflow
spec:
  entrypoint: whalesay        # Will run "whalesay" step first
  templates:
    - name: whalesay
      container:
        image: docker/whalesay
        command: [cowsay]
        args: ["hello world"]
Although the basis will always be to run a container, there are other template types, which are divided into two groups.
Template definitions
These define actual work to be done in a step.
container — most popular type
script — templatable convenience wrapper for container

- name: gen-random-int
  script:
    image: python:alpine3.6
    command: [python]
    source: |
      import random
      i = random.randint(1, 100)
      print(i)
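A useful property of script templates is that whatever the script prints to standard output is captured in the template’s outputs.result, so a later step can consume it. A minimal sketch of consuming the random number above (the print-message template is hypothetical and assumed to exist elsewhere in the Workflow):

```yaml
- name: generate-and-print
  steps:
    - - name: generate
        template: gen-random-int
    - - name: print
        template: print-message          # hypothetical template taking a "message" parameter
        arguments:
          parameters:
            - name: message
              value: "{{steps.generate.outputs.result}}"  # stdout of gen-random-int
```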
resource — any operation on Kubernetes resources
The example here creates a ConfigMap.

- name: k8s-owner-reference
  resource:
    action: create
    manifest: |
      apiVersion: v1
      kind: ConfigMap
      metadata:
        generateName: owned-eg-
      data:
        some: value
suspend — more useful than you think

- name: delay
  suspend:
    duration: "20s"
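Why “more useful than you think”? If you omit the duration, the workflow suspends indefinitely until someone resumes it (for example with argo resume from the CLI), which gives you a simple manual approval gate:

```yaml
- name: approval-gate
  suspend: {}   # no duration: waits until resumed manually, e.g. `argo resume <workflow-name>`
```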
Template invocations
These invoke other templates, and typically define the structure of a workflow.
steps — define steps in a “list of lists” way
The example here runs step1 first, then step2a and step2b in parallel.

- name: hello-hello-hello
  steps:
    - - name: step1
        template: prepare-data
    - - name: step2a
        template: run-data-first-half
      - name: step2b
        template: run-data-second-half
dag — define steps as a dependency graph
The example here runs A first, then B and C in parallel, and finally D.

- name: diamond
  dag:
    tasks:
      - name: A
        template: echo
      - name: B
        dependencies: [A]
        template: echo
      - name: C
        dependencies: [A]
        template: echo
      - name: D
        dependencies: [B, C]
        template: echo
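The diamond example invokes an echo template that must be defined elsewhere in the same Workflow. A minimal version could look like this (image and command are illustrative):

```yaml
- name: echo
  container:
    image: alpine:3.7          # illustrative choice of image
    command: [echo, "hello"]   # each of A, B, C, D runs this container
```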
WorkflowTemplate to the rescue!
To reduce the number of lines in your Workflow YAML files, use a WorkflowTemplate. It allows for the reuse of common components, similar in spirit to the k8s-native podTemplate. The basic hello-world example then becomes:
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: workflow-template-submittable
spec:
  arguments:
    parameters:
      - name: message
        value: hello world
  templates:
    - name: whalesay-template
      inputs:
        parameters:
          - name: message
      container:
        image: docker/whalesay
        command: [cowsay]
        args: ["{{inputs.parameters.message}}"]
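Once the WorkflowTemplate is registered in the cluster, a Workflow can reference it instead of repeating the template definitions. In recent Argo versions this can be done with workflowTemplateRef, roughly like so:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-from-template-
spec:
  # Reuse the spec of the WorkflowTemplate defined above
  workflowTemplateRef:
    name: workflow-template-submittable
```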
More concepts and examples can be found in the documentation.
Cool, but why Argo Workflows and not just Airflow or something else?
Argo is designed to run on top of k8s. Not a VM, not AWS ECS, not Container Instances on Azure, not Google Cloud Run or App Engine. This means you get all the good of k8s, but also the bad.
If you are already quite invested in k8s, then it makes sense to first look at Argo. You will recognise all of the mechanisms known in vanilla k8s.
The good
- Resilience to container crashes and failures, inherited from k8s.
- Autoscaling and options to configure this. Simultaneously triggering hundreds or thousands of Argo Workflows is not a problem with minimal tuning (setting cpu and memory requirements per task correctly, etc.).
- Possibility for endless configurability.
- Full support for RBAC, inherited from k8s. Their RBAC model also integrates nicely with SSO. For full isolation requirements (each project has its own k8s namespace and own privileges), common in enterprises, this is a big plus compared to Airflow.
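The per-task cpu and memory tuning mentioned above maps directly onto standard k8s resource requests and limits, set on a template’s container. A minimal sketch (the template name, image, and values are hypothetical):

```yaml
- name: train-model            # hypothetical template name
  container:
    image: python:3.9          # hypothetical image
    command: [python, train.py]
    resources:
      requests:                # what the scheduler reserves for this task
        cpu: "500m"
        memory: 1Gi
      limits:                  # hard ceiling for the container
        memory: 2Gi
```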
The bad
The relevance of the following three considerations will depend on your situation at hand.
#1. Everyone will write and maintain YAML files
A short YAML file for a single project is maintainable. Once the number of workflows starts increasing and the requirements become more complex, Argo offers you tricks and templating features to keep it manageable.
If your organization is used to this way of working (thanks to the use of other k8s-native tools), then you might find it acceptable. Otherwise, don’t jump for it yet.
Just look at the official examples to get a feel for what your repo will look like.
#2. Users will need to be Kubernetes experts
If your team consists of seasoned k8s experts, using Argo will feel like second nature. A novice user will first need to understand containers and k8s, and that burden might be a huge initial slowdown. On the other hand, this cost is fair if IT management is betting on the full k8s way of working.
For maintainers of the Argo setup it is even more important to know your way around k8s, and most probably also to be very knowledgeable about AWS, GCP or Azure.
#3. Maintenance of a full-fledged enterprise setup is heavy
Installing Argo Workflows on an existing k8s cluster is relatively easy.
kubectl create ns argo
kubectl apply -n argo -f https://raw.githubusercontent.com/argoproj/argo-workflows/stable/manifests/namespace-install.yaml
Maintaining all the YAML files needed for an enterprise-IT-security-compliant setup is not something to take lightly. Have a look at the number of configuration options in the community-maintained Helm chart to get a feel for the number of moving parts.
To put this more into perspective, configuring and supporting Airflow (and others) to the highest security compliance levels is equally non-trivial. This could explain the high number of “fully-managed” orchestrators out there. For example, AWS recently released Amazon Managed Workflows for Apache Airflow. The industry’s cry has been heard.
Conclusion
If you are already heavily invested in Kubernetes, then yes, look into Argo Workflows (and its brothers and sisters from the parent project).
The broader and harder question you should ask yourself is: to go full k8s-native or not? Look at your team’s cloud and k8s experience, its size, and your growth targets. Most probably you will first land somewhere in the middle, as there is no free lunch.