Published in Geek Culture

Bare Metal Kubernetes with MetalLB, HAProxy, Longhorn, and Prometheus

A few years back I wrote a blog post about building a Kubernetes cluster with Rook/Ceph. My main goal at that time was to see if we could use Kubernetes as a clustering solution for our software stack on the edge. Although k3s worked great, everything else still had some rough edges.

For one, ARM support was still lacking for many dependencies, and MetalLB was still in its early days. HAProxy was not yet available for Kubernetes, so the only choice was to use Traefik with k3s. Traefik is a fine option, but we use HAProxy extensively and know it works well with our stack.

Things have moved on fast since then, though! Installing MetalLB is dead simple, and HAProxy now supports Kubernetes too. Moreover, the people at Rancher have developed Longhorn, which is an excellent alternative to Rook/Ceph.

So let’s give everything a spin and see how it all works out. We will deploy k3s with MetalLB, HAProxy, Longhorn, Prometheus, and a test echo server on a 7-node ARM-based cluster.

Prepare Cluster Nodes

We will create a 7-node cluster with 3 nodes as controllers and 4 nodes as workers. On each node we will install Ubuntu Server 20.04 LTS, and to help us install everything else we will use Ansible.

We start with the following:

For convenience create a user admin on each node:

For the password I used my-secret-pw, but you should use whatever works for you. To make your life easy, keep it the same on each host. After running through this guide you can change it or disable it completely if you want.

Node1 Cluster Admin

We will use node1 to bootstrap our cluster. Log into node1 as user admin and install Ansible on it:

Next, if none exists yet, create a key pair for user admin on node1. Its public key will be copied to all the other nodes so that admin can log in without a password:

Now use ssh-copy-id to copy over the keys of user admin to all the other nodes:

Now create a file called hosts with the following content:
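As a rough sketch, assuming the nodes are reachable as node1 to node7 and using group names of my own choosing (controllers and workers are not taken from the original listing), such an inventory could be written like this:

```shell
# Write a hypothetical Ansible inventory for the nodes other than node1.
# Host names and group names are assumptions; adjust them to your network.
cat > hosts <<'EOF'
# Remaining controllers (node1 is bootstrapped by hand)
[controllers]
node2
node3

# Worker nodes
[workers]
node4
node5
node6
node7

[all:vars]
ansible_user=admin
EOF

# Example check, once SSH keys are in place:
#   ansible -i hosts all -m ping
```

Keeping controllers and workers in separate groups makes it possible to target each set of nodes with a different install command later on.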

Let’s test everything out:

Ansible on node1 should be able to ping all the other nodes without problems. With Ansible working we can now complete our initial preparation of the nodes for k3s installation.

First remove some software we will not need to save some resources:

For good measure update the software on each node to the latest:

Install k3s

Our nodes are now ready for installation of k3s. On node1 install k3s with the following:

This installs a Kubernetes controller without the servicelb and traefik components that k3s normally provides. For the load balancer we will use MetalLB instead, and for Ingress we will use HAProxy instead of Traefik. For K3S_TOKEN I used the not-so-secret my_super_secret. This token will be used later to connect the other nodes to the Kubernetes cluster, so change the token as you see fit.

With node1 now running the first controller, we can copy the Kubernetes configuration file to admin's home directory so that we do not need sudo to run kubectl:
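A minimal sketch of such an install, wrapped in a function so nothing runs until you call it on node1. The --cluster-init flag is my assumption: it starts the embedded etcd datastore so that more controllers can join later, and the exact flags used originally may have differed.

```shell
# Placeholder token from the text; replace it with your own secret.
K3S_TOKEN="my_super_secret"

install_first_controller() {
  # Install k3s in server mode without the bundled load balancer
  # (servicelb) and ingress controller (traefik).
  curl -sfL https://get.k3s.io | K3S_TOKEN="$K3S_TOKEN" sh -s - server \
    --cluster-init \
    --disable servicelb \
    --disable traefik
}

# On node1:
#   install_first_controller
```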

You can add export KUBECONFIG=~/.kube/k3s-config to ~/.profile so that the environment variable is set with every new shell.

With that out of the way we can install the remaining controllers on node2 and node3. We can do so using Ansible:

As you can see, we use the same K3S_TOKEN as we defined earlier on node1. For server we now use a URL that points to the controller on node1. As with node1, we also disable servicelb and traefik.
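To illustrate the idea, here is a sketch of a join command pushed out through Ansible. It assumes an inventory file named hosts with a group called controllers (both names are mine), and that node1 serves the Kubernetes API on the k3s default port 6443.

```shell
K3S_TOKEN="my_super_secret"        # same token as on node1
K3S_SERVER="https://node1:6443"    # assumed address of the first controller

# Command each additional controller runs to join the existing cluster.
JOIN_CMD="curl -sfL https://get.k3s.io | K3S_TOKEN=$K3S_TOKEN sh -s - server --server $K3S_SERVER --disable servicelb --disable traefik"

join_controllers() {
  # -b runs the command with sudo, -K prompts for the sudo password.
  ansible controllers -i hosts -b -K -m shell -a "$JOIN_CMD"
}

# From node1:
#   join_controllers
```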

We can now install the workers:

Check if everything is running as expected:

You should see all the controllers and all the workers. From the output you will see that the controllers have a role but the workers do not. Let’s fix that by giving the workers the appropriate role:

To control which workloads can be deployed on which nodes, we will also add another label called node-type that we can use in deployment specs:
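Both labels can be applied in one loop. This is only a sketch: the worker host names and the label value worker are assumptions, so substitute the names that kubectl get nodes reports for your cluster.

```shell
# Assumed worker host names.
WORKERS="node4 node5 node6 node7"

label_workers() {
  for node in $WORKERS; do
    # Role label, shown in the ROLES column of `kubectl get nodes`.
    kubectl label node "$node" node-role.kubernetes.io/worker=worker
    # Custom label that deployment specs can match with a nodeSelector.
    kubectl label node "$node" node-type=worker
  done
}

# Usage:
#   label_workers
#   kubectl get nodes --show-labels
```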

You can check all the labels given to all the nodes as follows:

Install Helm

With Kubernetes now running on all our nodes, follow the instructions from here:

I would recommend using the script install method.

Install MetalLB

With Helm now installed we can use Helm to install MetalLB:

With the MetalLB repo added we can install MetalLB, but before we do, create a values file called metallb-values.yaml with the following content:

The above configuration instructs MetalLB to assign addresses from the configured IP range (ending at 250 in my setup) to services of type LoadBalancer. You should of course change this to values that make sense for your environment.
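For reference, a sketch of what such a values file could look like. It assumes an older chart release that still accepts configInline (MetalLB 0.13 and later configure address pools through IPAddressPool resources instead), and the address range is a placeholder, not the one used in this guide.

```shell
# Write a hypothetical MetalLB values file with a layer 2 address pool.
cat > metallb-values.yaml <<'EOF'
# Placeholder range; replace with free addresses on your own network.
configInline:
  address-pools:
    - name: default
      protocol: layer2
      addresses:
        - 192.168.1.240-192.168.1.250
EOF
```

Pick a range outside your DHCP server's pool so that MetalLB never hands out an address that is already in use.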

Let’s now install MetalLB:

Excellent! MetalLB is now installed. We can now install HAProxy.

Install HAProxy

Like with MetalLB, we can use Helm to install HAProxy. First add the repo:

Before we install HAProxy, however, note that there is an ongoing issue (#222) when installing on ARM hardware. To work around it, use the following command:

If you are using AMD64 (x86-64) hardware, you can use the following instead:

After installing HAProxy, you can change the service type from NodePort to LoadBalancer instead. You can do that by editing the service spec as follows:

This will open up the HAProxy service spec. Look for the type that currently is set to NodePort. Change this to LoadBalancer. Once changed, save your changes and exit the editor.
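If you prefer not to edit the spec interactively, the same change can be expressed as a patch. This is a sketch: the namespace and service name depend on how your Helm release was created, so both are passed in as arguments rather than taken from this guide.

```shell
# Build the JSON merge patch that sets a Service's type.
service_type_patch() {
  printf '{"spec":{"type":"%s"}}' "$1"
}

# Apply the patch: set_service_type <namespace> <service> <type>
set_service_type() {
  kubectl -n "$1" patch service "$2" -p "$(service_type_patch "$3")"
}

# Example with hypothetical names:
#   set_service_type default haproxy-kubernetes-ingress LoadBalancer
```

The same helper works for any other service whose type you want to switch later on.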

Install Longhorn

We can now install the shared storage service called Longhorn. Before we can install it we need to make sure all its requirements are installed on our cluster. First install open-iscsi on all the nodes:

Next install nfs-common on all nodes:

Now check if our cluster is ready:

The script checks whether our cluster satisfies all requirements. If something went wrong, check the Longhorn Install Guide on how to fix it. If you followed this guide then the script *should* have completed successfully.

Now we are ready to install Longhorn:

You can check the Longhorn Helm Install page for more info.

Check if Longhorn is running properly:

If Longhorn is running properly, go change the type of the longhorn-frontend from ClusterIP to NodePort:

Check the longhorn-system services to see what port number has been assigned for the longhorn-frontend app:

You should see something like this:

From the output we can see that longhorn-frontend is running on port 32057 on each cluster node. This means we can use any node IP to access it, for example:

Install Prometheus

Now that we have a cluster block device service available via Longhorn, we can deploy Prometheus. Why? Because we will have Prometheus place its database on the cluster block device so that if Prometheus goes down on one node, it can safely restart on another node without (serious) data loss.

Let’s first create a namespace monitoring where we will deploy Prometheus:

Next we need to create a cluster role for Prometheus. Create a file called prometheus-role.yaml with the following content:

Create the RBAC role:

Now create a file called prometheus-config-map.yaml with the following content:

This will configure Prometheus to scrape metrics from deployments, services, etc. See the Prometheus Kubernetes Example for more information about what the above does.

Create the configuration map for prometheus:

Now create a Storage Claim for Prometheus. Create a file called prometheus-pvc.yaml with the following content:

We only request 1G of storage since this is just a test cluster, so that should be more than enough. In production you would definitely want more.
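A sketch of such a claim, assuming Longhorn's default storage class name (longhorn) and a claim name of my own choosing (prometheus-pvc):

```shell
# Write a hypothetical PersistentVolumeClaim backed by Longhorn.
cat > prometheus-pvc.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-pvc        # assumed name
  namespace: monitoring
spec:
  accessModes:
    - ReadWriteOnce           # mounted by a single node at a time
  storageClassName: longhorn  # Longhorn's default storage class
  resources:
    requests:
      storage: 1Gi
EOF
```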

Create the storage claim:

Now we are ready to create the deployment spec for Prometheus. Create a file called prometheus-app.yaml with the following:

You should note a couple of things here. We are using a persistent storage volume which we will mount in the container under /prometheus. To ensure this mount gets the correct permissions, we set the securityContext (prometheus runs as user nobody which is UID 65534 with GID 65534). We also configure Prometheus to only retain 12h worth of data, and never more than 500MB. In production you should change this (be sure to have it match the size you gave for the persistent volume claim!). Lastly, we define the web.external-url to match the LoadBalancer IP of HAProxy as we want to have HAProxy front it (we will define this further in the ingress spec of prometheus). In production you would probably use a DNS name instead.
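To make these points concrete, here is a sketch of a deployment spec along those lines. The resource names (prometheus, prometheus-config, prometheus-pvc) and the address 192.0.2.10 are placeholders of mine; use your own ConfigMap and claim names and the LoadBalancer IP of your HAProxy service.

```shell
# Write a hypothetical Prometheus deployment spec.
cat > prometheus-app.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      securityContext:          # user nobody, so the volume is writable
        runAsUser: 65534
        runAsGroup: 65534
        fsGroup: 65534
      containers:
        - name: prometheus
          image: prom/prometheus
          args:
            - --config.file=/etc/prometheus/prometheus.yml
            - --storage.tsdb.path=/prometheus
            - --storage.tsdb.retention.time=12h
            - --storage.tsdb.retention.size=500MB
            - --web.external-url=http://192.0.2.10/prometheus
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: config
              mountPath: /etc/prometheus
            - name: data
              mountPath: /prometheus
      volumes:
        - name: config
          configMap:
            name: prometheus-config   # assumed ConfigMap name
        - name: data
          persistentVolumeClaim:
            claimName: prometheus-pvc # assumed claim name
EOF
```

Because the external URL carries the /prometheus path, Prometheus serves its UI and API under that prefix, which is what lets a path-based ingress rule route to it.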

Create the prometheus deployment on the cluster:

Next create the service to expose the prometheus port, and tell it to scrape itself too. Create a file called prometheus-service.yaml with the following:

Create the service:

Now that we have a service, we can create an ingress. Create a file called prometheus-ingress.yaml with the following:

Here we instruct HAProxy to handle the ingress for Prometheus. Any request received by HAProxy that starts with path /prometheus will be forwarded to the prometheus service which we exposed on port 8080. The service will in turn forward it to port 9090 running inside the container.
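A sketch of an ingress spec matching that description. It assumes the service is named prometheus and that the HAProxy chart registered an ingress class called haproxy; older controller versions select the class through an annotation instead, so check what your installation expects.

```shell
# Write a hypothetical ingress that routes /prometheus to the service.
cat > prometheus-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus
  namespace: monitoring
spec:
  ingressClassName: haproxy     # assumed ingress class name
  rules:
    - http:
        paths:
          - path: /prometheus
            pathType: Prefix
            backend:
              service:
                name: prometheus  # assumed service name
                port:
                  number: 8080    # service port, forwarded to 9090
EOF
```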

Deploy the ingress configuration:

Now you should be able to access prometheus at the following URL:

Excellent! Now we have a running Prometheus server that uses clustered block storage. Although it already collects information about our cluster, it does not yet collect any metrics generated by the nodes themselves (e.g. CPU utilization). For this we will need to install Prometheus Node Exporter.

Prometheus Node Exporter

Create a file called node-exporter-daemon.yaml with the following:

Where the other specs so far were of kind Deployment, here we use a DaemonSet. This ensures the node exporter runs on every node, including any nodes we add to the cluster later.
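A deliberately minimal sketch of such a DaemonSet, with names of my own choosing. A production setup would usually also mount the host's /proc and /sys into the container so the exporter reports host-level rather than container-level metrics.

```shell
# Write a hypothetical, minimal node exporter DaemonSet.
cat > node-exporter-daemon.yaml <<'EOF'
apiVersion: apps/v1
kind: DaemonSet               # one pod per node, new nodes included
metadata:
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      containers:
        - name: node-exporter
          image: prom/node-exporter
          ports:
            - containerPort: 9100   # node exporter's default port
EOF
```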

Create the daemon set:

Now create a service so Prometheus can scrape from the node exporters. Create a file called node-exporter-service.yaml with the following:

Deploy it to the cluster:

Good! Now Prometheus will also give us all the metrics about each node in the cluster. Next up, let’s deploy a test application to our newly created cluster.

Test Echo Server

To test our cluster we will deploy a simple test echo server. We will deploy it in its own namespace called test:

Now create a file called echo-server-app.yaml with the following content:

A couple of things to note here:

  • The property replicas is set to 3 so we will create 3 instances of the echo server
  • A selector is defined to ensure the echo server only gets deployed on worker nodes.
  • We added a readinessProbe which will test each instance at the root path to see if the service is up.
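The three points above could be sketched as follows. The image name is purely hypothetical, since any echo server listening on port 9000 would do, and the node-type: worker label assumes the workers were labeled that way.

```shell
# Write a hypothetical echo server deployment spec.
cat > echo-server-app.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-server
  namespace: test
spec:
  replicas: 3                   # three instances of the echo server
  selector:
    matchLabels:
      app: echo-server
  template:
    metadata:
      labels:
        app: echo-server
    spec:
      nodeSelector:
        node-type: worker       # schedule on worker nodes only
      containers:
        - name: echo-server
          image: example/echo-server:latest   # hypothetical image
          ports:
            - containerPort: 9000
          readinessProbe:       # instance is ready once / responds
            httpGet:
              path: /
              port: 9000
EOF
```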

Let’s deploy the app to the cluster:

Now we need to create a service that will expose the echo server service to the cluster. Create a file called echo-server-service.yaml with the following:

Some things to note here too:

  • We instruct Prometheus to also scrape the metrics from the echo server at path /metrics on port 9000.
  • We open up port 9000 on the cluster to expose the service inside the container also running on port 9000.
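A sketch of a service spec with both properties. It assumes the pods carry the label app: echo-server and that the Prometheus scrape configuration honors the conventional prometheus.io/* annotations; without matching relabel rules, the annotations have no effect.

```shell
# Write a hypothetical echo server service spec.
cat > echo-server-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: echo-server
  namespace: test
  annotations:
    prometheus.io/scrape: "true"   # ask Prometheus to scrape this service
    prometheus.io/path: /metrics
    prometheus.io/port: "9000"
spec:
  selector:
    app: echo-server               # assumed pod label
  ports:
    - port: 9000                   # port exposed inside the cluster
      targetPort: 9000             # port the container listens on
EOF
```

Annotation values must be strings, which is why "true" and "9000" are quoted.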

Create the service:

Now define an ingress configuration that will allow HAProxy to proxy requests for our echo server. Create a file called echo-server-ingress.yaml with the following:

Deploy it to the cluster:

Now test it! As our HAProxy runs at we should be able to access the echo server at

Useful Sources

While creating this guide I used the following sources, which you should check out too:

Thanks for reading!


