How to Develop for ARM on a Budget

With a Kubernetes cluster based on Raspberry Pis, GitLab, and spare time.

Remco Hendriks
Oct 1, 2020 · 29 min read
Photo by Craig Dennis from Pexels

Developing a full computer cluster in one’s bedroom may seem like an exotic or complicated thing to do. However, with the wonderfully versatile Raspberry Pi platform, any interested tinkerer can now easily play with building such a cluster, and on a reasonable budget! While any developer can start some nodes on AWS or Azure at the click of a button, building your own physical cluster has a satisfaction all its own, and teaches you things you would never learn otherwise. At the end of this cookbook, you will have a small but fairly fast and stable arm64-based Kubernetes cluster, paired with GitLab to use as a build and deployment platform, so the cluster can be used for something real.

While Raspberry Pis are simple and cheap, they are real computers running a real OS, making them an ideal tinkering platform. One of the major differences with other ‘real’ computers is the CPU architecture, but this may soon change as well. Intel and x86 have dominated the server and desktop markets for many years, but some major moves hint that this landscape is about to shift: Amazon AWS released its 2nd-generation arm-based 64-bit CPU instance type, Canonical released Ubuntu 20.04 with support for 64-bit Arm, and Apple announced ARM-based Apple silicon for its upcoming Mac computers. Raspberry Pis offer probably the cheapest and easiest way to gain some real experience with arm64 right now!

The moves around ARM ignited my interest in doing something interesting with a couple of Raspberry Pis I had lying around. Along the way I ran into some issues getting everything to work, which I’ve tried to document in this story. I used six Raspberry Pis to form a Kubernetes development cluster, which I integrated into my workflow for developing web applications. It fulfills an important part of my personal development pipeline: the place where I test the apps I make before they are shipped to production. I touch on the various subjects and tools that come together to make this work. It’s quite involved, so there’s no deep dive on the architecture or software used; you can use this as a cookbook to replicate the set-up I made. Some steps are abbreviated, and I presume basic knowledge of Ubuntu Server, the command line, shell usage, and editing files.

Materials used

My experimental setup uses the following hardware. Like other home-brew Kubernetes clusters, it is built from commodity parts. I’m particularly curious how much performance I can buy compared with cloud providers, without enterprise-grade security or failover.

List of required hardware:

I’ve decided to go with the Raspberry Pi 4 with 4GB of memory, because it has the power required to run the ‘regular’ k8s version of Kubernetes maintained by CNCF and Google. The ‘lightweight’ k3s may work as well, leaving more memory to spare and thus making Pis with 2GB of memory or less workable, but I opt for the regular version to stay as close as possible to a production-grade cluster.

To power the Pis, I went with the cheapest option: six USB-C wall adapters. This isn’t aesthetically pleasing or as convenient as six power-over-ethernet HATs, but it’s cheaper than either those or a USB charging hub with cables.

Putting your cluster together is fairly straightforward and should take a few hours. There are plenty of posts on the internet that assemble a similar set-up, so I won’t go into the details. When completed, it feels quite sturdy and is easy to work with. When you’re ready to put it on your bookshelf, get a small desk fan to blow air through it. I use two spare case fans connected to a 5V USB cable, which move just enough air and are completely silent. This way, my Pis rarely get hotter than 50 degrees Celsius.

My home-brew Pi cluster

The one-time cost of this set-up is USD 594.85. While this seems quite expensive upfront, it quickly earns itself back compared with renting a similar cluster from a cloud provider; I’ll make that comparison later in this cookbook.

Let’s put this together.

Next: Install Ubuntu 20 LTS 64-bit on USB

Part 1: Install Ubuntu 20 LTS 64-bit on USB

I want each Pi to boot from USB, because it is much faster than a micro-SD card. This blog demonstrates that using USB drives increases performance tremendously. USB drives are also much less prone to data corruption on a sudden reboot, which can happen frequently considering the novelty of the stack I am using. I’m not using SSDs; a regular fast USB 3 thumb drive is cheaper and performant enough for my case.

I’ve followed this thread on the Raspberry Pi forum to make this work.

$ sudo fdisk -l
Disk /dev/sda: 59.77 GiB, 64160400896 bytes, 125313283 sectors
Disk model: Flash Drive
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x87c6153d
Device Boot Start End Sectors Size Id Type
/dev/sda1 * 2048 526335 524288 256M c W95 FAT32 (LBA)
/dev/sda2 526336 125313249 124786914 59.5G 83 Linux

Here, the boot partition is /dev/sda1. Mount it to your filesystem:

$ sudo mkdir -p /mount/data1
$ sudo mount /dev/sda1 /mount/data1

4. Decompress vmlinuz on the boot partition:

$ cd /mount/data1
$ sudo su
$ zcat vmlinuz > vmlinux

5. Edit config.txt, changing the [pi4] section to:

[pi4]
max_framebuffers=2
dtoverlay=vc4-fkms-v3d
boot_delay
kernel=vmlinux
initramfs initrd.img followkernel

This is all that is required to boot Ubuntu directly from USB. Shut down the Pi, remove the micro-SD card, and boot. It should present you with the usual freshly-installed Ubuntu prompts to set up a password.

6. Enable automatic kernel decompression. This is required because kernel updates downloaded by the operating system replace the old decompressed kernel, and the Pi won’t boot without a decompressed kernel. To handle this automatically after each update session, add a new script auto_decompress_kernel to the boot partition:

$ sudo su
$ cd /boot/firmware
$ curl https://gist.githubusercontent.com/remcohendriks/6ceb8e39396aabf25db0a5322445ec8b/raw > auto_decompress_kernel
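
For reference, the script essentially re-decompresses the kernel whenever vmlinuz changes. A minimal sketch of what it does (the actual gist is more thorough):

#!/bin/bash -e
# Sketch only: if the compressed kernel is newer than the decompressed one,
# decompress it again so the Pi can boot.
BOOT=/boot/firmware
if [ ! -f "$BOOT/vmlinux" ] || [ "$BOOT/vmlinuz" -nt "$BOOT/vmlinux" ]; then
  zcat "$BOOT/vmlinuz" > "$BOOT/vmlinux"
  echo "Decompressed new kernel to $BOOT/vmlinux"
else
  echo "Decompressed kernel is up to date"
fi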

In /etc/apt/apt.conf.d/, create a file 999_decompress_rpi_kernel containing:

DPkg::Post-Invoke {"/bin/bash /boot/firmware/auto_decompress_kernel"; };

Make the decompression script executable:

$ chmod +x /boot/firmware/auto_decompress_kernel

To test that the automation works, run sudo apt-get upgrade. It should report whether a new kernel was decompressed or not.

Rinse and repeat for all the Pis. You can re-use the micro-SD card each time; there’s no need to flash it again after every use.

Next: Install Kubernetes

Part 2: Install Kubernetes

Getting Kubernetes up and running requires a fair amount of preparation, which is tedious to repeat for every single node. Consider an automation tool like Ansible to save some time on these steps.

I’ve used this post to set up my cluster. I’ve added and changed some instructions to my liking.

1. Configure a static IP address for each Pi with netplan. On Ubuntu Server, edit the netplan YAML under /etc/netplan/ to something like:

network:
  version: 2
  renderer: networkd
  ethernets:
    eth0: # eth0 is the gigabit ethernet adapter.
      dhcp4: no
      addresses: [192.168.3.10/24] # I use 10-15 for my nodes.
      gateway4: 192.168.3.1 # change to your router's IP.
      nameservers:
        # change nameservers as you like:
        addresses: [192.168.3.1,8.8.8.8]
      dhcp6: no
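
Apply the configuration and check that the address took effect (a quick sanity check):

$ sudo netplan apply
$ ip addr show eth0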

2. Install Docker. This is pretty straightforward. Apply:

$ sudo apt install -y docker.io
$ sudo systemctl enable --now docker
$ sudo usermod -aG docker ubuntu

Exit the shell using exit and log in again to use the docker command.

Test that the installation works by running the hello-world container:

$ docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
256ab8fe8778: Pull complete
Digest: sha256:7f0a9f93b4aa3022c3a4c147a449bf11e0941a1fd0bf4a8e6c9408b2600777c5
Status: Downloaded newer image for hello-world:latest
Hello from Docker!
This message shows that your installation appears to be working correctly.
[...]

3. Change the control group driver to systemd instead of Docker’s default cgroupfs, as recommended by Kubernetes for stability. Change /etc/docker/daemon.json to:

{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}

Additionally, the memory and cpuset cgroups need to be enabled when the system boots, so append the cgroup flags to /boot/firmware/cmdline.txt. Mine looks like:

net.ifnames=0 dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=LABEL=writable rootfstype=ext4 elevator=deadline rootwait fixrtc cgroup_enable=cpuset cgroup_enable=memory cgroup_memory=1 swapaccount=1
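
After the next reboot, you can confirm the kernel picked up the cgroup flags:

$ cat /proc/cmdline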

4. Set up iptables for correct network routing. Put the following snippet into /etc/sysctl.d/k8s.conf:

net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

Apply with sudo sysctl --system.

5. Assign a hostname to each node; Kubernetes uses these as recognizable node names. I use kubernetes-master-1 for the master and kubernetes-worker-[x] for the workers. Set it with the following command:

$ sudo hostnamectl set-hostname kubernetes-worker-1

6. Reboot. Check that the hostname is set by entering hostname, and check that the cgroups are set up properly with docker info. It shouldn’t show any warnings in the cgroup section anymore:

$ docker info
Client:
[...]
Logging Driver: json-file
Cgroup Driver: systemd
Plugins:
[...]

7. Set up the Kubernetes apt repository and install the packages:

$ sudo apt-get update && sudo apt-get install -y apt-transport-https curl
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
$ cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
$ sudo apt-get update
$ sudo apt-get install -y kubelet kubeadm kubectl

If you disabled automatic updates in part 1, you shouldn’t worry about automatic updates of these packages. Otherwise, pin the packages with:

$ sudo apt-mark hold kubelet kubeadm kubectl

8. Initialize the Kubernetes control plane. This is the point where all pre-installation comes together, and your first Pi will become the master node. With the initialization command, I already set the pod network CIDR for use with Flannel, the container network interface:

$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16

If all goes well, it will report that the control plane has initialized successfully, tell you how to start using the cluster, and print the command for other nodes to join. It looks like:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.3.10:6443 --token 17b49p.jpram6b1rpj579w4 --discovery-token-ca-cert-hash sha256:ee70e44ea9f07285b10dee9c72c3ef56a93bb002eba9eb145d48666958f49801

I usually use the generated config both on the master Pi node and on my development notebook, which makes tinkering with the cluster more convenient. Check that you can access the cluster by listing the nodes:

$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kubernetes-master-1 Ready master 11m2s v1.18.5
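
To use the same config from my notebook, I copy it over and point kubectl at it. A sketch, assuming the default ubuntu user and my master’s address:

$ scp ubuntu@192.168.3.10:~/.kube/config ~/.kube/config-pi-cluster
$ export KUBECONFIG=~/.kube/config-pi-cluster
$ kubectl get nodes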

9. Install the Container Network Interface (CNI). This is required to create the virtual networks between nodes. Flannel is a lightweight solution with an arm64 implementation, which suits my cluster. Apply it with this one-liner:

$ curl -sSL https://raw.githubusercontent.com/coreos/flannel/v0.12.0/Documentation/kube-flannel.yml | kubectl apply -f -

Verify that the installation succeeded by checking on pod statuses for coredns and kube-flannel. These should get a Running status after a while, so keep checking with the following command:

$ kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-f9fd979d6-vb8hq 1/1 Running 0 2m31s
coredns-f9fd979d6-wtkcv 1/1 Running 0 2m31s
etcd-k8s-m-1 1/1 Running 0 2m38s
kube-apiserver-k8s-m-1 1/1 Running 0 2m38s
kube-controller-manager-k8s-m-1 1/1 Running 0 2m38s
kube-flannel-ds-arm64-sncmx 1/1 Running 0 54s
kube-proxy-7fx9t 1/1 Running 0 2m31s
kube-scheduler-k8s-m-1 1/1 Running 0 2m38s

10. Add the other Pis to the cluster: repeat steps 1–7 on each of the other Pis, then run the join command:

$ sudo kubeadm join 192.168.3.10:6443 --token 17b49p.jpram6b1rpj579w4     --discovery-token-ca-cert-hash sha256:ee70e44ea9f07285b10dee9c72c3ef56a93bb002eba9eb145d48666958f49801
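
If you lost the join command or the token has expired (kubeadm tokens are valid for 24 hours by default), generate a fresh one on the master:

$ sudo kubeadm token create --print-join-command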

For every node that joins, a flannel pod is created on the new node, and should get a Running status. Check on your master node or development laptop (sample log for 2 Pis):

$ kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-f9fd979d6-vb8hq 1/1 Running 0 15m
coredns-f9fd979d6-wtkcv 1/1 Running 0 15m
etcd-k8s-m-1 1/1 Running 0 15m
kube-apiserver-k8s-m-1 1/1 Running 0 15m
kube-controller-manager-k8s-m-1 1/1 Running 0 15m
kube-flannel-ds-arm64-s96gz 1/1 Running 0 63s
kube-flannel-ds-arm64-sncmx 1/1 Running 0 14m
kube-proxy-7fx9t 1/1 Running 0 15m
kube-proxy-t864p 1/1 Running 0 63s
kube-scheduler-k8s-m-1 1/1 Running 0 15m

11. Install the Kubernetes Dashboard. For me, this is the proof that all previous steps succeeded and the cluster is running OK. The dashboard provides a useful web user interface to collect information about your cluster, and also allows you to manage its resources. The default installation one-liner works for arm64:

$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.3/aio/deploy/recommended.yaml

Check that the dashboard deployed correctly by checking the status of its pods; they should be Running:

$ kubectl get po -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
dashboard-metrics-scraper-7b59f7d4df-xjgvz 1/1 Running 0 68s
kubernetes-dashboard-5dbf55bd9d-qb8lr 1/1 Running 0 69s

12. To access the dashboard, you will need to create a user and get an access token. This official guide details how to do that. I’ll summarize the steps to do so.

Make a file dashboard-sa.yaml, and apply it with kubectl apply -f dashboard-sa.yaml:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard

Similarly, do so for dashboard-crb.yaml :

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: admin-user
    namespace: kubernetes-dashboard

Obtain the access token by running:

$ kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')

This will output the details of the service account token, with a long token string in the data section. Copy and save it for use with the dashboard.

To access the dashboard, run in a separate terminal window:

$ kubectl proxy

Leave this running, and open in a web browser:

http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/

It should prompt with a login screen:

Login screen, image from the Kubernetes documentation

Select token, paste your access token, press sign in. You are now logged into the Kubernetes dashboard.

13. For the final step, I want the dashboard to display simplified resource usage statistics. This is done by metrics-server, a component that collects CPU and memory usage from pods and nodes, which other tooling, such as autoscaling policies, can act on. First, download the configuration:

$ curl -LO https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml

In your favorite editor, open components.yaml, and locate the metrics-server deployment. At spec.template.spec.containers.args, add the following elements to the list:

- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
- --metric-resolution=30s

Next, at spec.template.spec add the following property:

hostNetwork: true
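
After these edits, the relevant part of the metrics-server Deployment looks roughly like this (abbreviated; the image and default args are from the v0.3.7 manifest and may differ slightly in your copy):

spec:
  template:
    spec:
      hostNetwork: true
      containers:
        - name: metrics-server
          image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
          args:
            - --cert-dir=/tmp
            - --secure-port=4443
            - --kubelet-insecure-tls
            - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
            - --metric-resolution=30s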

Apply the configuration:

$ kubectl apply -f components.yaml

After a few minutes, CPU and memory usage counters and graphs show up in the Dashboard, and one of my favorite simple commands works:

$ kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
kubernetes-master-1 659m 16% 2172Mi 58%
kubernetes-worker-1 612m 15% 2872Mi 77%
kubernetes-worker-2 523m 13% 2685Mi 72%
kubernetes-worker-3 871m 21% 2834Mi 76%
kubernetes-worker-4 505m 12% 2402Mi 65%
kubernetes-worker-5 593m 14% 2432Mi 65%

I use this a lot to check up on the load of the Pis, especially when running pipelines and building images, which I describe in Part 5.

Next: Set up Persistent Volumes with Rook and Ceph

Part 3: Set up Persistent Volumes with Rook and Ceph

Now that the cluster is set up and running, you can host any software on your basic Pi Kubernetes setup. In practice, most production software configurations on Kubernetes use Persistent Volumes to store data: for databases, simple file storage, and more. If you are like me and want to simulate a production environment as closely as possible, you’re going to need them. Cloud platforms like AWS, Azure and GCP offer Persistent Volumes out of the box, usable immediately without caring much about availability and performance. In my case, I use Rook with Ceph to make Persistent Volumes work.

For my setup, I use decently large 128GB drives, because the Ceph cluster replicates data across nodes for availability. Replication eats into the raw capacity: although I have a total of 640GB to use, a 30GB volume will have 90GB provisioned for it, so the data survives a node failure. It is also possible to over-provision your Ceph cluster, allocating much more than the 640GB available; filling up the volumes may then cause the cluster to become unstable.

Setting up Rook with Ceph is straightforward, but requires a few configuration changes to make it work with arm64. I use the Rook with Ceph quickstart documentation.

1. Prepare the data drives. Plug the extra USB thumb drive into each Pi and check how it shows up:

$ sudo lsblk -f
NAME FSTYPE LABEL UUID FSAVAIL FSUSE% MOUNTPOINT
[...]
sda
├─sda1 vfat system-boot B726-57E2 105.7M 58% /boot/firmware
└─sda2 ext4 writable 483efb12-d682-4daf-9b34-6e2f774b56f7 52.3G 7% /
sdb
├─sdb1 vfat EFI 67E3-17ED
└─sdb2 vfat UNTITLED C29D-16F3

It shows the boot USB thumb drive as sda, and the data USB thumb drive as sdb. Now I need to remove the default partition(s) the drive ships with when bought new. To do so, enter:

$ sudo sgdisk --zap-all /dev/sdb

Clearing the thumb drive entirely makes it ready for use with Ceph. If you check again, sdb shows as empty:

$ sudo lsblk -f
NAME FSTYPE LABEL UUID FSAVAIL FSUSE% MOUNTPOINT
[...]
sda
├─sda1 vfat system-boot B726-57E2 105.7M 58% /boot/firmware
└─sda2 ext4 writable 483efb12-d682-4daf-9b34-6e2f774b56f7 52.3G 7% /
sdb

2. Clone the Rook repository and change the configuration to work with arm64. Enter:

$ git clone --single-branch --branch release-1.4 https://github.com/rook/rook.git
$ cd rook/cluster/examples/kubernetes/ceph

The default configuration of Rook uses CSI (Container Storage Interface) images hosted on quay.io. These are mostly built for x86, so I need to swap them for arm64 versions. Fortunately, the Raspbernetes repository has images built specifically for Raspberry Pis and arm64. Open operator.yaml and enable unsupported ceph-csi images on line 44:

ROOK_CSI_ALLOW_UNSUPPORTED_VERSION: "true"

At line 48, uncomment and change the image paths:

ROOK_CSI_CEPH_IMAGE: "raspbernetes/ceph-csi:v3.1.0-arm64"
ROOK_CSI_REGISTRAR_IMAGE: "raspbernetes/csi-node-driver-registrar:1.3.0"
ROOK_CSI_RESIZER_IMAGE: "raspbernetes/csi-external-resizer:0.5.0"
ROOK_CSI_PROVISIONER_IMAGE: "raspbernetes/csi-external-provisioner:1.6.0"
ROOK_CSI_SNAPSHOTTER_IMAGE: "raspbernetes/csi-external-snapshotter:2.1.1"
ROOK_CSI_ATTACHER_IMAGE: "raspbernetes/csi-external-attacher:2.2.0"

3. Deploy the Rook operator:

$ kubectl create -f common.yaml
$ kubectl create -f operator.yaml

## verify the rook-ceph-operator is in the `Running` state before proceeding
$ kubectl -n rook-ceph get pod
NAME READY STATUS RESTARTS AGE
rook-ceph-operator-775d4b6c5f-pf79c 1/1 Running 0 2m46s
rook-discover-kgssc 1/1 Running 0 45s

The pods associated with Rook should be in a Running status. Then, deploy the Ceph cluster. The default values in cluster.yaml suffice for clusters with three or more workers. Apply the cluster configuration:

$ kubectl create -f cluster.yaml
## to verify:
$ kubectl -n rook-ceph get pod

After fifteen minutes or so, the csi pods should show up and get a Running status.

If all pods in the rook-ceph namespace are Running (and jobs Completed), you are ready to configure a Storage Class to use the Rook-Ceph pool you just created.
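
If you want to inspect the health of the Ceph cluster itself, the same examples directory contains a toolbox deployment you can use to run ceph commands (a quick check, following the Rook documentation):

$ kubectl create -f toolbox.yaml
$ kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- ceph status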

4. Set up the block pool and storage class, and set it as default. From the ceph directory, enter:

$ kubectl apply -f csi/rbd/storageclass.yaml

The rook-ceph-block storage class should show up in the Kubernetes dashboard. For convenience, set it as the default storage class:

$ kubectl patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

This allows you to omit the storage class from your Persistent Volume definitions, one less configuration item to change between your Pi cluster and your production cluster.
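
To confirm, list the storage classes; rook-ceph-block should now carry the (default) marker:

$ kubectl get storageclass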

5. Test the setup with the example from the Rook with Ceph documentation. In the cluster/examples/kubernetes folder, open mysql.yaml in your editor and locate the wordpress-mysql deployment. Under spec.template.spec.containers, change image: mysql:5.6 to image: mariadb; unfortunately, the official mysql image doesn’t support arm64 yet.

Similarly for wordpress.yaml, change the wordpress deployment image from wordpress:4.6.1-apache to wordpress:5-apache.

Afterwards, apply the configurations:

$ kubectl create -f mysql.yaml
$ kubectl create -f wordpress.yaml

Together, these apps create two Persistent Volume Claims, which you can check by entering:

$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESSMODES AGE
mysql-pv-claim Bound pvc-95402dbc-efc0-11e6-bc9a-0cc47a3459ee 20Gi RWO 1m
wp-pv-claim Bound pvc-39e43169-efc1-11e6-bc9a-0cc47a3459ee 20Gi RWO 1m

It can take a few minutes for the volumes to show up, and get bound by the pods.

Your Pi cluster is now set up for use with Persistent Volumes.

If you are looking for more detailed configuration options, this post covers more information.

Next: Set up MetalLB to access your services

Part 4: Set up MetalLB to access your services

Now that your Pi cluster is set up and able to use Persistent Volumes for data storage, I want to access services running in my cluster from other computers on my local network. Currently, all configured services get a cluster IP which is only accessible from the Pis themselves. If you still have the WordPress example running from the previous step, you can check this with:

$ kubectl get svc wordpress
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
wordpress LoadBalancer 10.99.152.63 <pending> 80:30763/TCP 4m

The Cluster-IP assigned to the service is a virtual network address, which you cannot reach from the rest of your local network. It also shows a <pending> External-IP, which will never be fulfilled; this is where MetalLB comes in.

MetalLB acts as a ‘virtual’ load balancer that is created automatically when you configure a service as type: LoadBalancer, just like cloud vendors do. No physical external load balancer is provisioned; MetalLB does it virtually within the cluster itself. It works on arm64, perfect for my development Pi cluster.

MetalLB also supports load balancing over BGP, which could talk to my UniFi Security Gateway router. That would be exciting to configure and try out, but I’m not interested in a high-performance setup for my development needs.

Setting up MetalLB is easy and requires little configuration. I use snippets from this detailed blog post.

The only requirement for this set-up is that you have a small range of IP addresses to spare on the network of your Pis. In my case, I use 192.168.3.100-192.168.3.199.

$ kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/namespace.yaml
$ kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/metallb.yaml
$ kubectl create secret generic -n metallb-system memberlist --from-literal=secretkey="$(openssl rand -base64 128)"

This will start the MetalLB deployment, but it won’t work until the IP address range is set up in a ConfigMap. Make a file metallb-config.yaml with the following content:

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.3.100-192.168.3.199 # change this to your own range

Apply:

$ kubectl apply -f metallb-config.yaml

MetalLB automatically activates, and accepts services with type: LoadBalancer.

If you still have the WordPress example running, it should automatically obtain an external IP now. To check this out, look in the Kubernetes Dashboard or execute:

$ kubectl get svc wordpress
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
wordpress LoadBalancer 10.99.152.63 192.168.3.100 80:30763/TCP 5m

Here you go: you should now be able to access the WordPress example site at 192.168.3.100 from any machine on the same network.

Well, that was unexpectedly easy to do.
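
MetalLB also honors a requested address. If a service should always receive the same IP from the pool, you can ask for it explicitly; a sketch for a hypothetical service named my-app:

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer
  loadBalancerIP: 192.168.3.150 # must fall inside the MetalLB address pool
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080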

Next: Organize workflow with GitLab

Part 5: Organize workflow with GitLab

I have arrived at the point where my cluster and its tools are ready for use as a development cluster. To summarize, you should have:

With this set up, I am going to extend my development workflow to use the Pi cluster for building and deploying artifact images. I use GitLab to host my private repositories, mainly because it comes with the pipeline tools I like most. It lets me organize how my software is tested and built, how artifact images are made, and how they are deployed to Kubernetes clusters. I assume basic knowledge of GitLab and how its pipelines work.

For this guide, I am using a full-stack JavaScript web app as the example. It consists of a NodeJS API as the back-end and an Angular.IO app as the front-end. Both have their own Docker image, which I build in the pipeline.

Before, a common pipeline of mine would look like:

I left out steps for additional things like software testing and hardening. Unless you are planning to run an arm64 production cluster, this pipeline will not work with my new Pi cluster: I cannot simply re-tag the develop image for production, as images made for my arm64 Pi cluster won’t run on an x86 production cluster. Thus, I need to build images separately for arm64 and x86.

Although the free GitLab runners might support cross-architecture building, I am not exploring this, because the buildx feature is experimental, and building images can cost a fair amount of time, which is limited in the free tier. I have six Pis with arm64 quad-cores; why not use those instead?

So the change in the pipeline is simple:

Leaving aside the additional x86 image build step, this looks pretty straightforward. In reality, it is not quite that simple, as you will find out later on.

To make this work, I need to install and configure gitlab-runner and docker-in-docker on the Pi cluster. For the former, I modify and use the official Helm chart. Here are the steps:

1. Install Helm:

$ curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
$ chmod 700 get_helm.sh
$ ./get_helm.sh

2. Download and modify the values.yaml configuration file for gitlab-runner. The original chart repository is here; my changed helm values file is here.

$ curl -s https://gist.githubusercontent.com/remcohendriks/339594095369db210429fdcb3cdf4434/raw -o values.yaml

I won’t discuss all changes; the important ones are:

3. Set up the runner registration token (line 25). You will find this token at Settings -> CI/CD -> Runners, in your repository or group in GitLab.

If you run your own GitLab server, you will need to change the registration URL at line 19.

I specifically do not set up caching at this point. I’ve tried several methods, and settled on caching through a shared docker-in-docker service next to the GitLab runners, which works for Docker layer caching. If you want artifact caching between pipeline steps, you can set up an S3 cache using Minio in your cluster.

For all other configuration options, refer to the GitLab runner configuration page.

4. Install gitlab-runner with Helm. First, add the GitLab repository:

$ helm repo add gitlab https://charts.gitlab.io

Next, install the chart with your values file:

$ kubectl create ns gitlab-runner 
$ helm install --namespace gitlab-runner gitlab-runner -f values.yaml gitlab/gitlab-runner

It should set up the gitlab-runner namespace with the runner deployment. When the runner pod is running, it should automatically register itself in GitLab, visible in the GitLab Runners settings section:

Two private group runners set-up, one for arm64 and one for x86 (named arm64 here)

5. Set up a service account to allow the runner to manage Kubernetes resources. Although you won’t need this for building images, I also use the runners to set up resources in my cluster. In the values.yaml file, on line 271 the service account name is gitlab-sa, which doesn’t exist yet. Create it by applying:

$ kubectl apply -f https://gist.githubusercontent.com/remcohendriks/c6529d3b5d14be2cf7d41f7881808f72/raw

This is enough to build images, so you can continue with the next step. If you’re interested in managing cluster resources with the runners, you’ll need to attach a role to the service account. The simplest option is to attach a cluster-admin cluster role binding, which lets the runners manage all resources in the cluster. This is generally a bad idea, as there are no boundaries on what the runners can do with your cluster, and that extends to anyone who can access your repository on GitLab and run pipelines of their own. Alternatively, you can create a role with only the appropriate permissions for the runner (a sketch follows below). If you want to apply the cluster-admin role anyway, do:

# Don't do this in production:
$ kubectl apply -f https://gist.githubusercontent.com/remcohendriks/b7423c888879af05bb8debb722dd39a3/raw
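
If you prefer something narrower than cluster-admin, a namespace-scoped Role bound to the gitlab-sa service account could look roughly like this. This is only a sketch; my-app is a hypothetical namespace, and you should trim the resources and verbs to what your deploy jobs actually touch:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: gitlab-deployer
  namespace: my-app # hypothetical namespace the runner deploys into
rules:
  - apiGroups: ["", "apps"]
    resources: ["deployments", "services", "configmaps", "secrets", "pods"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: gitlab-deployer
  namespace: my-app
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: gitlab-deployer
subjects:
  - kind: ServiceAccount
    name: gitlab-sa
    namespace: gitlab-runner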

6. Set up the docker-in-docker service to enable Docker layer caching. This is a specific solution to a big problem when building Docker images for different architectures. I’ll summarize the differences briefly, to give an idea of the intricacies of adding package dependencies in a Dockerfile.

In my front-end Angular.IO app, I have node-sass as a dependency. Looking at its release artifacts page on GitHub, there are pre-built bindings for x86, but not for arm64. When installing node-sass on an x86 computer, it automatically detects and downloads the right binding. On unlisted architectures, it builds its own binding using node-gyp. The good news is that this happens automatically, as long as the operating system has Python and general build tools installed (e.g. build-essential for Ubuntu, build-base for Alpine). The bad news is that it takes forever to build. Below is an excerpt of a build log.

Step 5/10 : RUN npm ci
---> Running in 1a2cb8ad725a
[...]
> node-sass@4.12.0 install /app/node_modules/node-sass
> node scripts/install.js
Downloading binary from https://github.com/sass/node-sass/releases/download/v4.12.0/linux-arm64-72_binding.node
Cannot download "https://github.com/sass/node-sass/releases/download/v4.12.0/linux-arm64-72_binding.node":
HTTP error 404 Not Found
[...]
> node-sass@4.12.0 postinstall /app/node_modules/node-sass
> node scripts/build.js
Building: /usr/local/bin/node /app/node_modules/node-gyp/bin/node-gyp.js rebuild --verbose --libsass_ext= --libsass_cflags= --libsass_ldflags= --libsass_library=
[... 800 lines of gyp build log ...]gyp info ok
Installed to /app/node_modules/node-sass/vendor/linux-arm64-72/binding.node
[...]
added 1986 packages in 737.857s

That’s 12 minutes of installing packages for arm64. In comparison, the same build on x86:

Step 5/10 : RUN npm ci
---> Running in 6124215968d3
[...]
> node scripts/install.js
Downloading binary from https://github.com/sass/node-sass/releases/download/v4.12.0/linux_musl-x64-72_binding.node
Download complete
Binary saved to /app/node_modules/node-sass/vendor/linux_musl-x64-72/binding.node
Caching binary to /root/.npm/_cacache/node-sass/4.12.0/linux_musl-x64-72_binding.node
[...]
added 1986 packages in 18.086s

No build required for x86: 12 minutes faster, and it hasn’t even started building the Angular.IO artifact yet. For every push, I need to wait 34 minutes for an image to build:

That’s more than one coffee of waiting.

I sorely need Docker layer caching in the build step, in a way that works on my Pi cluster.

Unfortunately, the issue post on GitLab is old, long, and not really resolved for distributed runners. The directions in the official guide are correct, but do not work for distributed runners either, because the cache gets deleted along with the pod it lives in after each run. A workable solution is mentioned in the middle of the issue post: a separate docker-in-docker service alongside the gitlab-runner in your cluster, which is what I am going to use.

Essentially, the Docker images are built by a perpetually running docker-in-docker service, backed by persistent storage that serves as the cache. Each pipeline job uses the docker-in-docker service as its Docker host, sharing its compute resources, and it can run multiple jobs in parallel.

To set up the docker-in-docker service, apply this gist:

$ kubectl apply -f https://gist.githubusercontent.com/remcohendriks/abb6bee55952837f33debe13882b7cf2/raw/

This starts one pod in the gitlab-runner namespace, next to the runner installed by the Helm chart above. The deployment is deliberately not set up to scale: scaling would split the cache across two pods, and runs could miss cache layers.

Your runner(s) and cache are now set up. In the next step, I verify that the cache service delivers the expected speed-up.

7. Set up an arm64 build job and verify that it runs. As mentioned earlier, I have two images to build: one for the back-end and one for the front-end. For brevity, I only demonstrate the build job for the front-end, which is the more complex of the two.

My .gitlab-ci.yml file (other stages left out for brevity):

image: docker:19.03.12

services:
  - docker:19.03.12-dind

stages:
  - build-dev

build-app-dev:
  before_script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
  stage: build-dev
  variables:
    DOCKER_HOST: tcp://dind:2375 # dind = service in k8s
  script:
    - docker build --build-arg CONFIGURATION=dev -t $CI_REGISTRY_IMAGE/app:arm64-develop ./app
    - docker push $CI_REGISTRY_IMAGE/app:arm64-develop
  tags:
    - arm64

The build log:

$ docker build --build-arg CONFIGURATION=dev -t $CI_REGISTRY_IMAGE/app:arm64-develop ./app
Step 1/11 : FROM node:12-alpine AS build
---> 137cb187b393
Step 2/11 : ARG CONFIGURATION=development
---> Using cache
---> 0b993343edde
Step 3/11 : RUN apk add python-dev build-base
---> Using cache
---> c98ae4b33fea
Step 4/11 : WORKDIR /app
---> Using cache
---> d90bda790614
Step 5/11 : COPY package*.json ./
---> Using cache
---> c28772974aba
Step 6/11 : RUN npm ci
---> Using cache
---> 4884a88c4d1c
Step 7/11 : COPY . .
---> 28ac0277648b

Much better.
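
The x86 build for the production cluster would be a near-identical job. A sketch, assuming a second runner tagged x86 (how it reaches a Docker daemon depends on that runner’s setup):

build-app-dev-x86:
  stage: build-dev
  before_script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
  script:
    - docker build --build-arg CONFIGURATION=dev -t $CI_REGISTRY_IMAGE/app:x86-develop ./app
    - docker push $CI_REGISTRY_IMAGE/app:x86-develop
  tags:
    - x86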

You’re all set with GitLab connected to your Pi cluster to build images.

Next: Comparison and conclusion

Part 6: Comparison and conclusion

To find out how well this setup integrates into my daily workflow, I spent a couple of weeks using it to run several apps simultaneously. To compare performance, I benchmarked the Pi cluster against two AWS EKS clusters, one using arm64 instances and one using x86 instances. The goal was to look at affordability, so I focused on hardware I could get at a fixed price point. The only additional constraint was at least 4GB of memory per node; otherwise the cluster is not usable for the use cases I looked at.

Benchmark

I test the performance of the cluster using sysbench, measuring the CPU, memory and I/O performance of the environment. The benchmark runs in a single pod on ubuntu:latest. It’s a simple test: it doesn’t run bare-metal and only exercises one node. It does not reflect the performance of the entire cluster, but it shows what you can get from a Pi configured as a Kubernetes node.

The clusters in the test:

For the EKS clusters, I chose 3 nodes per cluster and a 64GB root EBS volume.

Although the vCPU count varies across the clusters, I want to simulate budget conditions with a fixed 4GB memory constraint. The m6g.medium and t3.medium are the minimum viable choices, costing USD 31 and USD 33 per month respectively (on demand, without EBS, eu-west-1).

To fire up a temporary shell, do:

$ kubectl run my-shell --rm -i --tty --image ubuntu:latest -- bash

Install sysbench:

$ apt update && apt install -y sysbench

Run benchmarks:

# CPU benchmark:
$ sysbench cpu run --threads=4 --time=60 --cpu-max-prime=20000
# Memory benchmark:
$ sysbench memory run --memory-block-size=1M --memory-total-size=100G --time=60 --memory-oper=read
$ sysbench memory run --memory-block-size=1M --memory-total-size=100G --time=60 --memory-oper=write
# I/O benchmark:
$ sysbench --test=fileio --file-total-size=100M --file-extra-flags=direct --file-num=10 prepare
$ sysbench fileio run --file-num=10 --file-total-size=100M --file-test-mode=rndrw --file-extra-flags=direct --time=60

Results

+--------------------+---------+------------+-----------+
| | Pi 4 | m6g.medium | t3.medium |
+--------------------+---------+------------+-----------+
| CPU events/s | 1973.60 | 1063.70 | 658.89 |
| Memory read MiB/s | 4781.64 | 27168.50 | 20818.73 |
| Memory write MiB/s | 3203.49 | 12367.71 | 15289.32 |
| IO read MiB/s | 3.89 | 23.40 | 19.63 |
| IO write MiB/s | 2.59 | 15.60 | 13.08 |
+--------------------+---------+------------+-----------+

Looking at the results, the AWS instances are faster on every benchmark except CPU. The memory and I/O speeds in particular are outstanding by comparison. The Pi’s CPU result is noteworthy, but only because it can use all four cores simultaneously in the test.

The big differentiator in this comparison is the quality of the hardware. My Pi cluster is made of inexpensive and power-efficient components, while the AWS instances are enterprise-grade. This is especially true for the storage: the cheap USB thumb drives are at least five times slower than AWS’s high-speed SSD-backed EBS volumes.

I am happy with the CPU performance. Granted, the Pi has twice as many cores as the t3.medium and four times as many as the m6g.medium, but it is nearly twice as fast. Not bad for a budget cluster.

Cost

Running a development cluster on a commercial cloud provider can be very costly. For exploring Kubernetes and the occasional tinkering, a set-up with 24/7 availability is remarkably expensive for my needs. Here I compare the running costs of my Pi cluster with a budget offering from AWS. Details of the comparison:

Total cost of AWS EKS:

5x Linux on m6g.medium @ 100% usage/mo, on-demand: USD 140.95
5x 64 GB EBS general-purpose SSD (gp2) per month: USD 32.00
1x AWS EKS cluster @ USD 0.10/hr, per month: USD 73.20
Total cost of services: USD 246.15

At USD 246.15 per month, the AWS EKS cluster has enough power and features to use reliably as a development cluster. For a solo developer, though, this is obviously not an economical choice; running a home-brew Pi cluster at a one-off price of USD 594.85 may be much more worthwhile.

Other observations

Changing package dependencies to work with arm64 is cumbersome, as some do not ship prebuilt binaries. The usual fix is to add the required build tools such as gcc or Python. However, building both x86 and arm64 images from one Dockerfile means installing tools that are unnecessary for one platform or the other. Separating the Dockerfile into per-platform versions may work, but duplicates all maintenance.

While building an image with a large number of dependencies, a node became unresponsive, showing as NotReady in kubectl get nodes. I couldn’t reach it over ssh, but after about 15 minutes it became responsive again on its own. Nothing to worry about.

To test a more severe situation, I pulled the power plug out of one of the worker nodes. As in the situation described above, it reported as NotReady until I plugged it back in. The image build job failed, as expected, and needed a manual restart. After about 5 minutes, everything was working again.

Conclusion and what’s next

I’m very happy with how this Pi Kubernetes cluster worked out for me. The biggest take-away was how reliable this setup ended up being. Once the cluster was up and running, I could confidently use it during long development sessions, building dozens of artifacts on the cluster without worrying that it would fail. After going through the process of building this cluster several times while writing this article and actively using it for a few weeks, I’m fully ready to recommend it as a great tool to work and play with.

After installing Kubernetes and setting up the resources, there are plenty of interesting things to try next. Some topics I didn’t touch on in this cookbook:

Thanks for reading, please let me know what you think of it!


Thanks to Shabaz Sultan for reviewing
