Argo workflows as alternative to Cloud Composer

Anders Elton
Published in Compendium
May 5, 2020


Background

In previous posts (scheduling jobs #1, scheduling jobs #2) I have written about how to do workflow scheduling using GCP's Cloud Composer (Airflow).

Something that has been bugging me about Cloud Composer is the steep price ($380/month minimum!). For small clusters and a small number of jobs, the spend in dollars and infrastructure does not really add up to the value provided.

A perfect example of this is our in-house Computas application.

It used to run in a Kubernetes cluster as cron jobs. At some point we started getting dependencies between jobs, so cron was no longer an option.

Every employee in Computas has an app installed that gives you access to all the other employees (image, phone, mail, etc.). In addition, it contains personal info on each employee, like vacation days, sick days and kks budget.

Computas also hosts several in-house conferences each year, with several parallel tracks. Using the application we have created, each employee can decide which tracks to attend. The master data for this is in JIRA.

The application uses Firebase as its backend, and to fill Firebase with relevant data from our on-prem systems we have several ETL jobs.

(Figure: conceptual diagram of the infrastructure involved)

If we were to spin up a Composer cluster to handle the ETL for this, the cost would be overkill, as the rest of the infrastructure costs less than $100/month.

Argo workflows

Argo workflows to the rescue!

There are several alternatives to managed Composer. A decent choice is running Airflow on a VM, but seeing as the jobs were already Kubernetes cron jobs, we wanted something that could be installed into the Kubernetes cluster with minimal overhead and run the jobs as-is.

Argo Workflows is Kubernetes-native and has a relatively small footprint compared to Airflow. It uses custom resources to describe jobs and deploys a controller to run them, all native Kubernetes concepts. This means that Argo does not need an external database to keep state, as state is kept in the resource itself.

It also has a (simple) UI, but by default it is not exposed to the internet.

Workflows can run on a cron schedule or as one-off jobs. For our use case, the cron schedule fits best. You can configure what happens if the job is already running (Forbid, Replace, Allow).

The workflows are defined in YAML.
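As a sketch, a minimal scheduled workflow might look like this (the name, schedule and image are made-up examples):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: nightly-etl               # hypothetical name
spec:
  schedule: "0 3 * * *"           # run every night at 03:00
  concurrencyPolicy: Forbid       # do not start if the previous run is still going
  workflowSpec:
    entrypoint: etl
    templates:
      - name: etl
        container:
          image: eu.gcr.io/my-project/etl-job:1.0.0  # hypothetical image
          command: [python, main.py]
```

Applying this with kubectl is all it takes; the Argo controller picks up the resource and schedules it.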

Argo also has a rich ecosystem. For example, it has plugins for event-driven architectures (i.e. spawning workflows after consuming Pub/Sub messages, etc.), something that is a bit cumbersome to do in Composer.

Deploying workflows

Argo Workflows has a really nice concept for defining workflows that makes the CI/CD process easy in both mono-repos and micro-repos. A workflow can consist of several templates, and a template can be built alongside a Docker image (with a version). This means your build process can publish a named template, and the workflow will simply pick it up when it becomes available: once you have committed code and the build is done, your workflow is ready to run.
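The pattern above can be sketched with a WorkflowTemplate published by the build pipeline and a workflow referencing it by name (all names and images here are hypothetical):

```yaml
# Published by the build pipeline, alongside the Docker image
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: load-employees
spec:
  templates:
    - name: run
      container:
        image: eu.gcr.io/my-project/load-employees:1.2.3
---
# The workflow only references the template by name, so it never
# needs to change when a new image version is built
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: etl-
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        - - name: load
            templateRef:
              name: load-employees
              template: run
```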

Compare this to how you have to do it in Cloud Composer:

  • First, create a trigger that syncs the repo folder to a bucket on each commit.
  • Behind the scenes, this bucket is synced to each Airflow worker every 30 seconds.
  • Then you keep refreshing the DAG in the UI until you see its content changing.
  • Then you run your job.

Creating an ingress with IAP

Argo Workflows does not come with an ingress out of the box. Of course, you can create a tunnel on demand to see the Argo UI, but doing that every time you want an overview of things is not really optimal. Do not worry, it is easy to expose argo-server to the internet!

  • You need to own a domain.
  • Create an A record in the domain pointing at the IP of your load balancer.
  • Patch the argo-server service so it can be exposed by an L7 load balancer:
kubectl patch svc argo-server -n argo -p '{"spec": {"type": "NodePort"}}'
  • Create an ingress using the GKE UI: select argo-server and hit "Create ingress".
  • Enable IAP on the load balancer so a login is required to access resources. In the IAP UI you can add who should be allowed into your site!
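If you prefer to create the ingress declaratively rather than through the GKE UI, a minimal equivalent might look like this (the hostname is a placeholder; argo-server listens on port 2746 by default):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argo-server
  namespace: argo
spec:
  rules:
    - host: argo.example.com      # placeholder: your own domain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: argo-server
                port:
                  number: 2746
```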

What about all the Airflow operators?

There is something you are going to lose: all the Airflow operators. You can mimic some or most of them by running gcloud (or any Docker image) with bash commands like "gsutil", "bq", etc. However, if your use case relies heavily on the built-in Airflow operators, Argo Workflows is not for you.
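As a sketch of that approach, a step that does what an Airflow BigQuery operator would do can simply run bq inside the Cloud SDK image (the project, dataset and query are made up):

```yaml
- name: bq-count
  container:
    image: google/cloud-sdk:slim
    command: [bash, -c]
    args:
      - >
        bq query --use_legacy_sql=false
        'SELECT COUNT(*) FROM `my-project.my_dataset.employees`'
```

You trade the convenience of a purpose-built operator for the flexibility of running any CLI you already know.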

Other points — composer vs argo

Autoscaling

UI

  • Argo has a nice, minimalistic UI, but does not have all the features of Airflow.
  • Composer has all the features you need, but it looks crap (hopefully this will change at some point).

Fundamental difference in scheduling

  • Argo runs on a cron schedule with a DAG.
  • Composer runs a job at a specific time relative to a start time (and does backfill for you).

Logging

  • Argo logs are stored with the Kubernetes resource, which means they are also deleted once the job is deleted (unless shipped to Stackdriver). Logs are streamed in real time in the UI.
  • Composer logs are stored and versioned in a bucket.

Final remarks

All in all, I really like Argo. I have deployed it twice now (once internally and once at a customer as a Cloud Composer replacement), and in both places it performs very nicely.

It is a great alternative to Composer, especially if you already have a Kubernetes cluster and want to schedule jobs in that cluster!

It is also great for those for whom a $380/month minimum is way too much to spend on workflow scheduling.

Composer still has its role to play; it is a strong managed solution where you do not need to know any Kubernetes and can focus on creating business value from day one!
