Stackable Demo: airflow. Data Engineering on K8s

Alex McLintock
2 min read · Dec 6, 2023

So I have been trying out stackable.tech, which is a way of launching a data engineering platform on Kubernetes (k8s). I blogged about setting up my environment and also about the initial trino-taxi-data demo.

Over the last couple of days I have been working on two other demos.

  • airflow
  • nifi-kafka-druid-earthquake-data

Nothing links these two demos together; they seem to have no overlap, so in theory I could install one and then the other on the same cluster.

Airflow

The Airflow demo installed very quickly and without too much trouble. You need the stackablectl command and a k8s cluster. There is no .deb or .rpm file for stackablectl, so here’s how I installed mine:

#!/bin/sh -x
echo See https://docs.stackable.tech/management/stable/stackablectl/installation
# Download the release binary for x86_64 Linux
wget -O stackablectl https://github.com/stackabletech/stackable-cockpit/releases/download/stackablectl-23.11.0/stackablectl-x86_64-unknown-linux-gnu
# or
# curl -L -o stackablectl https://github.com/stackabletech/stackable-cockpit/releases/download/stackablectl-23.11.0/stackablectl-x86_64-unknown-linux-gnu
chmod +x stackablectl
# Make sure ~/bin exists (and is on your PATH); otherwise mv would rename the binary to a file called "bin"
mkdir -p ~/bin
mv stackablectl ~/bin/

The current version seems to be 23.11 but you should check.
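Since the version number is baked into the download URL, one way to make the script above easier to update is to build the URL from a version argument. This is only a minimal sketch: the function name `stackablectl_url` is my own invention (not part of any Stackable tooling), and it assumes future releases keep the same URL layout as the 23.11.0 release used here.

```shell
#!/bin/sh
# Build the stackablectl download URL for a given release.
# Assumption: other releases follow the same URL layout as 23.11.0.
stackablectl_url() {
    version="$1"                              # e.g. 23.11.0
    target="${2:-x86_64-unknown-linux-gnu}"   # release asset suffix, defaults to x86_64 Linux
    printf 'https://github.com/stackabletech/stackable-cockpit/releases/download/stackablectl-%s/stackablectl-%s\n' "$version" "$target"
}

# Example: print the URL for the release used in this post.
stackablectl_url 23.11.0
```

The result can then be passed to wget or curl, for example `wget -O stackablectl "$(stackablectl_url 23.11.0)"`.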

#!/bin/sh -x
stackablectl demo install airflow-scheduled-job

This runs pretty quickly, and you can check the status with:

#!/bin/bash -x
# bash rather than sh: "read -p" and $'...' quoting are bash features
helm list
read -r -p $'Press Enter to continue...\n' key
stackablectl operator installed
read -r -p $'Press Enter to continue...\n' key
stackablectl stacklet list

This last command gave me the external endpoints for the Airflow web interface. The credentials command then confirmed my guess for the admin user:

$ stackablectl stacklet credentials airflow airflow
Credentials for airflow (airflow) in namespace 'default':

USERNAME admin
PASSWORD adminadmin

[Figure: The Airflow User Interface, showing two directed acyclic graphs (DAGs)]
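If you want to use those credentials in a script rather than read them off the screen, the output is simple enough to pick apart with awk. This is a rough sketch only: `credential_field` is a hypothetical helper of mine, and it assumes the output keeps the "KEY value" layout shown above, which may well change between stackablectl versions.

```shell
#!/bin/sh
# Extract one field (USERNAME or PASSWORD) from the credentials output on stdin.
# Assumption: each credential is printed as "KEY value" on its own line.
credential_field() {
    awk -v key="$1" '$1 == key { print $2 }'
}

# Example, using the output from this post as canned input.
sample="Credentials for airflow (airflow) in namespace 'default':

USERNAME admin
PASSWORD adminadmin"

printf '%s\n' "$sample" | credential_field USERNAME
printf '%s\n' "$sample" | credential_field PASSWORD
```

Against a live cluster the same helper would sit at the end of a pipe: `stackablectl stacklet credentials airflow airflow | credential_field PASSWORD`.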

After some investigation I was happy that it was running.

[Figure: One DAG has been running on schedule. The Airflow User Interface shows the execution of one DAG over the last 24 hours.]

Spurred on by this quick success, I tried another demo without deleting this one.

See my next blog post for nifi-kafka-druid-earthquake-data.

Alex McLintock

Big Data Enthusiast, Analytics/DS/ML Platform Consultancy in London