Build Your Own Multi-Node Kubernetes Cluster with Monitoring

Building your own multi-node Kubernetes cluster (BYOC) from scratch with Grafana monitoring.

Syed Salman Qadri
Apr 1 · 12 min read

At Seeloz, we’re building AI to reduce waste and inefficiencies in supply chain management. We’ve recently begun exploring a migration of our big-data pipeline to Kubernetes. While we would highly recommend that a startup use a managed Kubernetes solution, we’ve been given access to several powerful non-cloud VMs that we wanted to take advantage of.

In this piece, we’ll walk you through building your own multi-node Kubernetes cluster (BYOC) from scratch, including setting up the network (via Calico) and storage volumes (via StorageOS) for your cluster. We will also set up the Kubernetes dashboard for visual orchestration, Helm/Tiller to help us deploy pre-packaged solutions, and Prometheus and Grafana to provide detailed real-time monitoring and alerting for your cluster.

If you want to learn more about what Kubernetes actually is and what it can do for you, check out the official Kubernetes documentation.


Cluster Design Considerations

Cluster topology and the number of nodes

The first decision you need to make is the kind of cluster topology you wish to build. The simplest option requires just one node, and the most highly available topology requires a minimum of nine. Your decision will depend on various factors, from the number of nodes you have available to your budget constraints. You have the following options:

  • Single-node clusters
    Building a single-node Kubernetes cluster for production, for example with Minikube, is not recommended. One of the key reasons to use Kubernetes is its ability to handle node failures, so a single-node cluster defeats much of the point. If you primarily want the benefits of containerization, consider installing only Docker on your node or VM and running your services with a restart policy.
  • Single-master, multi-node cluster
    This is what we’re going to focus on in this piece. It means we will have a single Kubernetes master running on a node all by itself, plus three or more worker nodes (aka minions), for a minimum total of four nodes.
  • HA (High-availability) cluster
    There are two options here: Stacked etcd and External etcd. For the Stacked approach, you need at least six nodes (three nodes for the control plane and at least three nodes for your minions). For the External etcd approach you would need at least nine nodes. You can read more about them here: https://kubernetes.io/docs/setup/independent/ha-topology/
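
The three-node control plane in the HA options follows from how etcd reaches agreement: a write needs a majority (quorum) of members. As a rough sketch of that arithmetic (the function names are ours, purely for illustration):

```shell
# Quorum size for an etcd cluster with N members: a strict majority.
etcd_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a quorum is still available.
etcd_fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for members in 1 3 5; do
  echo "members=$members quorum=$(etcd_quorum "$members") tolerated_failures=$(etcd_fault_tolerance "$members")"
done
```

A single member tolerates no failures, while three members tolerate one, which is why three is the usual minimum for an HA control plane.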

In this piece, we will be building a single-master, multi-node cluster, with one master and three or more minions.


Node Specs

According to the official documentation, each node in the cluster should have at least two CPUs and 2 GB of RAM, but depending on what you intend to run on the nodes, you will probably need more. You must also have at least Ubuntu 16.04.6 LTS or CentOS 7.5+ (minimum requirements for some add-ons). However, we highly recommend Ubuntu 18.04.2 LTS because it ships a much more recent Linux kernel, version 4.15 (you may also consider the non-LTS Ubuntu 18.10, which has the latest containerization improvements from kernel 4.18). The newer Linux kernels contain enhancements that better support containers, a world that did not exist in the same form when kernel 3.10, which CentOS 7.x uses, was released six years ago. Having said that, this guide is written against nodes running CentOS 7.6, since that is what is installed on the VMs given to us. On Ubuntu, the main difference in the instructions is the use of apt-get instead of yum.
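
A quick way to check a node against these minimums is sketched below. The function name is hypothetical, and the 1700 MB memory threshold is our assumption: it sits a little under the nominal 2 GB because MemTotal excludes memory reserved by the kernel.

```shell
# Minimums: 2 CPUs per the text; 1700 MB is an assumed allowance below a nominal 2 GB.
MIN_CPUS=2
MIN_MEM_KB=$((1700 * 1024))

# Succeeds only if both the CPU count and the memory (in kB) meet the minimums.
meets_min_specs() {
  cpus="$1"
  mem_kb="$2"
  [ "$cpus" -ge "$MIN_CPUS" ] && [ "$mem_kb" -ge "$MIN_MEM_KB" ]
}

# Usage on a Linux node:
#   cpus=$(nproc)
#   mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   meets_min_specs "$cpus" "$mem_kb" && echo "OK" || echo "below minimum"
```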


Networking

The next most important decision is which networking solution to use; until you make it, you cannot even start a Kubernetes cluster. Unfortunately, the choice is confusing: there is a long list of official options and no default or recommended one. The popular ones seem to be Calico, Flannel, and Weave Net. We found a great comparison article to help with the decision.

At the end of that article, the author concludes that the best overall networking solution is Calico, which offers not just great performance but also good security, resource utilization, and setup experience. This guide will show you how to set up Calico as the networking layer for Kubernetes.


Storage Volumes

Finally, to have a workable Kubernetes cluster, you must have a persistent storage class set up. Again, there’s a long list of options to choose from, and as we went through it, we found that each option had its own set of constraints. For example, the gcePersistentDisk option can only be used if the nodes that run the cluster are Google Cloud VMs. At the same time, the simpler local node storage option, HostPath, is highly discouraged by Kubernetes:

HostPath (Single node testing only — local storage is not supported in any way and WILL NOT WORK in a multi-node cluster)

After scanning all the options, the ones that seemed most applicable for our use-case were these:

  • Glusterfs
    This is a popular free option that can turn any storage mounts you may have into a distributed file-system that your pods can use.
  • StorageOS
    This option is free up to 500 GB, but is much simpler to set up and get going with than Glusterfs or Portworx. The main issue with StorageOS is the lack of support for the ReadWriteMany policy, which means you cannot have multiple pods with write access to the same underlying storage.
  • Portworx
    This seemed like another decent option. However, it requires running several services on the VMs which we wanted to avoid and is generally a little heavy.

In this guide, we will use StorageOS as our storage volume solution mainly for its simplicity.


Setting up the Cluster

Here are the commands you should run on every node that you want to have in your cluster. These instructions assume a CentOS 7+ node. The Ubuntu instructions are mostly similar, except, of course, that you use apt-get instead of yum. Some of the kubeadm installation steps are also a little different; see the official kubeadm installation documentation for details.

You should be able to just copy-paste the following on a CentOS node with no changes:

sudo -i                 # Become root
sudo yum update -y # Update all packages
# Install the yum-config-manager and add the repo to install docker
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

# Configure iptables for Kubernetes
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
modprobe br_netfilter   # The net.bridge.* keys only exist once this module is loaded
sysctl --system

# Add the kubernetes repo needed to find the kubelet, kubeadm and kubectl packages
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kube*
EOF

# Set SELinux in permissive mode (effectively disabling it)
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

# Turn off swap: required for Kubernetes to work.
# This lasts only until the next reboot; also remove any swap entries from /etc/fstab.
sudo swapoff -a

# Install Kubernetes and Docker
sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes docker-ce docker-ce-cli containerd.io

# Start Docker
sudo systemctl enable --now docker

# Start Kubernetes
systemctl enable --now kubelet
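
Before moving on, it can be worth confirming that swap really is off on each node. The helper below is a minimal sketch (the function name is ours); it counts the entries in a /proc/swaps-style file, where the first line is a header.

```shell
# Count active swap devices listed in a /proc/swaps-style file.
# Defaults to /proc/swaps; a path can be passed for testing.
active_swap_count() {
  awk 'NR > 1 && NF > 0 {n++} END {print n + 0}' "${1:-/proc/swaps}"
}

# Usage on a node:
#   [ "$(active_swap_count)" -eq 0 ] && echo "swap is off" || echo "swap is still on"
```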

Additional instructions for the Master Node

# Make sure you are NOT root for these commands! But it must be a sudoer user.
# For sanity, just disable the entire firewall until you've figured out exactly what services you'll want to install.
sudo systemctl disable firewalld --now # Disable the firewall
# Start kubeadm
sudo kubeadm init --pod-network-cidr=192.168.0.0/16
### NOTE THE OUTPUT: It has the token you need to add workers!!! ###

After running the above command, you should see a line in the output that starts with ‘kubeadm join’. Copy that line; you will need it later to add nodes to the cluster.
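
The later steps assume a ~/.kube/config file exists for your non-root user on the master. kubeadm init normally prints instructions for creating it from /etc/kubernetes/admin.conf; the sketch below wraps the same idea in a small function (the function name and the chmod are our additions, not something from kubeadm).

```shell
# Copy the admin kubeconfig into the current user's home directory.
# Falls back to sudo when the source file is readable only by root.
install_kubeconfig() {
  src="${1:-/etc/kubernetes/admin.conf}"
  dest="${2:-$HOME/.kube/config}"
  mkdir -p "$(dirname "$dest")" || return 1
  if [ -r "$src" ]; then
    cp "$src" "$dest" || return 1
  else
    sudo cp "$src" "$dest" && sudo chown "$(id -u):$(id -g)" "$dest" || return 1
  fi
  chmod 600 "$dest"   # The file holds cluster-admin credentials
}

# Usage on the master, as the non-root sudoer user:
#   install_kubeconfig
```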

Install a Pod Network (Calico)

Next, on our Kubernetes master we need to install a Pod Network. As discussed earlier, we will be using Calico. Here are the commands to run on your Kubernetes master to install Calico (from their Quickstart):

kubectl apply -f https://docs.projectcalico.org/v3.5/getting-started/kubernetes/installation/hosted/etcd.yaml
kubectl apply -f https://docs.projectcalico.org/v3.5/getting-started/kubernetes/installation/hosted/calico.yaml

Install the Kubernetes Dashboard

# Install the K8s dashboard
kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/master/aio/deploy/recommended/kubernetes-dashboard.yaml
# Create an admin account called k8s-admin
kubectl --namespace kube-system create serviceaccount k8s-admin
kubectl create clusterrolebinding k8s-admin --serviceaccount=kube-system:k8s-admin --clusterrole=cluster-admin

Set up kubectl on your laptop / workstation

macOS

brew install kubernetes-cli

Ubuntu

sudo apt-get update && sudo apt-get install -y apt-transport-https
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee -a /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubectl

CentOS

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF

yum install -y kubectl

Set up a proxy on your workstation

Create a ~/.kube directory on your laptop and then scp the ~/.kube/config file from the k8s (Kubernetes) master to your ~/.kube directory.

Then get the authentication token you need to connect to the dashboard:

kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep k8s-admin | awk '{print $1}')
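
The command above prints the whole secret description. If you only want the token itself, one option is to filter the output, as in this sketch. It assumes the description keeps its usual `token:` line, and the function name is ours.

```shell
# Print only the value of the "token:" line from `kubectl describe secret` output.
extract_token() {
  awk '$1 == "token:" {print $2}'
}

# Usage:
#   kubectl -n kube-system describe secret <secret_name> | extract_token
```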

Store this token somewhere safe but readily accessible; you’ll need it (frequently) to access the Kubernetes dashboard. Now start the proxy:

kubectl proxy

Now open the dashboard by going to:
http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/

You should see the dashboard’s sign-in dialog. Select ‘Token’, then paste the token from the prior step and sign in. Hopefully that was a success!


Set Up Your Minion Nodes

# Set up your firewall settings
#sudo firewall-cmd --zone=public --add-port=10250/tcp --permanent # Kubelet API
#sudo firewall-cmd --zone=public --add-port=30000-32767/tcp --permanent # NodePort Services
#sudo firewall-cmd --reload
# For sanity, we'll just disable the firewall. Once you know exactly what services you want you can expose desired ports via firewall-cmd.
sudo systemctl disable firewalld --now # Disable the firewall

# Now paste the kubeadm join command from the kubeadm init command on master
sudo kubeadm join <master-cluster-ip>:6443 --token <something.token> --discovery-token-ca-cert-hash sha256:<hash>

Note that the token in the join command printed by kubeadm init expires (after 24 hours by default). If that happens, you can generate a new join command by running this on your k8s master:

sudo kubeadm token create --print-join-command

Repeat the steps for every single node you wish to add to the cluster. Note that the join command will be the same for every node.
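
Once the nodes have joined, you can confirm from the master (or your workstation) that they have registered and become Ready. A small sketch, with a helper name of our own choosing:

```shell
# Count nodes whose STATUS column is exactly "Ready" in
# `kubectl get nodes --no-headers` output (read from stdin).
count_ready_nodes() {
  awk '$2 == "Ready" {n++} END {print n + 0}'
}

# Usage:
#   kubectl get nodes --no-headers | count_ready_nodes
```

Nodes can take a minute or so to move from NotReady to Ready while the pod network starts on them. Cordoned nodes (status "Ready,SchedulingDisabled") are deliberately not counted here.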


Installing Helm

Helm is like a package manager for Kubernetes. It lets you deploy a pre-packaged set of Kubernetes objects as a unit, and later remove that same set just as easily. For example, Helm is what we’ll use to install StorageOS. To install Helm, just run the following on your workstation:

macOS

brew install kubernetes-helm
helm init

Linux

The Snap package for Helm is maintained by Snapcrafters.

sudo snap install helm --classic
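
On a kubeadm-built cluster, RBAC is enabled by default, so Tiller (the server-side half of Helm 2) generally needs a service account with enough permissions before helm install will work. The sketch below is one common way to do that for Helm 2 only. The account name tiller is a convention, the function name is ours, and binding cluster-admin is broad, so treat it as an assumption suitable for a test cluster rather than a recommendation.

```shell
# Create a service account for Tiller, grant it cluster-admin, and initialize Helm 2 with it.
init_tiller() {
  kubectl --namespace kube-system create serviceaccount tiller &&
  kubectl create clusterrolebinding tiller \
    --clusterrole=cluster-admin \
    --serviceaccount=kube-system:tiller &&
  helm init --service-account tiller
}

# Usage, from a workstation where kubectl already talks to the cluster:
#   init_tiller
```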

Installing StorageOS

We will use Helm to install StorageOS. In the command below, replace node1–3 with the hostnames of each of your nodes:

helm install storageos/storageos --name=storageos --namespace=storageos --set cluster.join="node1\,node2\,node3"
ClusterIP=$(kubectl get svc/storageos --namespace storageos -o custom-columns=IP:spec.clusterIP --no-headers=true)
ApiAddress=$(echo -n "tcp://$ClusterIP:5705" | base64)
kubectl patch secret/storageos-api --namespace storageos --patch "{\"data\": {\"apiAddress\": \"$ApiAddress\"}}"
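
The base64 step above is needed because values under a Secret's data field must be base64-encoded, and the -n flag keeps a trailing newline out of the encoded address. The same encoding as a small reusable function (the name is ours), using printf, which behaves more consistently across shells than echo -n:

```shell
# Base64-encode the StorageOS API address for a given cluster IP,
# without a trailing newline in the encoded payload.
encode_api_address() {
  printf '%s' "tcp://$1:5705" | base64
}

# Usage:
#   ApiAddress=$(encode_api_address "$ClusterIP")
```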

Now we need to create a new Kubernetes StorageClass based on StorageOS that we will use for all our PersistentVolume objects. For this one, let’s use the Kubernetes dashboard to see how we can deploy objects through that:

Start your proxy:

kubectl proxy

Go to http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy. If you haven’t recently logged in, you’ll see the sign-in screen again.

Again, copy-paste the token you saved earlier. If you lost it, just run the following again:

kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep k8s-admin | awk '{print $1}')

Once logged in, press the “+ CREATE” button at the top right of the dashboard.

Now in the Create from Text Input screen, enter the following and press the Upload button:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast
provisioner: kubernetes.io/storageos
parameters:
  pool: default
  description: Kubernetes volume
  fsType: ext4
  adminSecretNamespace: default
  adminSecretName: storageos-secret

Note that in the StorageClass above, I call it ‘fast’ because the storage system on my cluster is all SSDs. If that’s not true for you or if you have a mix, it is best to create a StorageClass for each type of storage and name it appropriately.

Finally, let’s go back to the terminal and make the new StorageClass the default one, so that subsequent deployments that need storage don’t have to name it explicitly:

kubectl patch storageclass fast -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
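
To confirm that the patch took effect, you can list the storage classes; kubectl marks the default one with "(default)" after its name. A sketch that pulls out just that name (the function name is ours):

```shell
# Print the name of the storage class marked "(default)" in
# `kubectl get storageclass --no-headers` output (read from stdin).
default_storage_class() {
  awk '$2 == "(default)" {print $1}'
}

# Usage:
#   kubectl get storageclass --no-headers | default_storage_class
```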

That completes the StorageOS installation. There is also a StorageOS dashboard, which you can access by running the following:

kubectl --namespace storageos port-forward svc/storageos 5705

Now just visit http://localhost:5705. The default username and password should be ‘storageos/storageos’.

That’s it! Our cluster is now ready to use. Next up, we’ll deploy a monitoring solution on top of our new cluster.


Setting up Cluster Monitoring

Now that we have a Kubernetes cluster up and running, we need a good monitoring system. In this guide, we’ll be using Prometheus to collect all the metrics and Grafana to visualize them. There is a great project called the prometheus-operator that gives us not just Grafana and Prometheus, but also pre-built dashboards to use. First clone the repo:

git clone https://github.com/coreos/prometheus-operator.git
cd prometheus-operator/contrib/kube-prometheus

Unfortunately, I found that I had to tweak it a little to preserve the state of any changes I make in Grafana, such as adding users or installing custom plugins. As such, before we start, I’m going to create a PVC (Persistent Volume Claim) for Grafana. Note that a PVC can only be mounted by pods in its own namespace, so it needs to live in the monitoring namespace that the Grafana deployment uses. Apply the following to your cluster (either via a yaml file or by copy-pasting into the Kubernetes dashboard as shown earlier):

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: grafana-pvc
spec:
  accessModes:
  - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10Gi

Next, add the following initContainers entry to the pod spec of your Grafana deployment (located at contrib/kube-prometheus/manifests/grafana-deployment.yaml):

initContainers:
- name: "init-chown-data"
  image: "busybox:1.30.0"
  command: [
    "chown",
    "-R",
    "472:472",
    "/var/lib/grafana"
  ]
  resources:
  volumeMounts:
  - name: "grafana-storage"
    mountPath: /var/lib/grafana

Then, in the volumes: section of that same file, grafana-storage is defined as emptyDir: {}. Change it to point at the grafana-pvc we created:

volumes:
- persistentVolumeClaim:
    claimName: grafana-pvc
  name: grafana-storage

I also had to remove the securityContext section of that grafana-deployment.yaml file.

That’s it! Now just apply it all into Kubernetes with the following commands:

kubectl create -f manifests/
sleep 30
kubectl create -f manifests/
kubectl apply -f manifests/
sleep 30
kubectl apply -f manifests/
kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090 &
kubectl --namespace monitoring port-forward svc/grafana 3000 &
kubectl --namespace monitoring port-forward svc/alertmanager-main 9093 &

The sleeps and repeated commands in the script above are there because, according to the project’s documentation, “This command sometimes may need to be done twice (to workaround a race condition).”
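
If you would rather not hard-code the repetition, the same workaround can be expressed as a generic retry loop. This is a sketch of our own, not something the project ships; the attempt count and delay in the usage line are assumptions you can tune.

```shell
# Run a command up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage:
#   retry 3 30 kubectl apply -f manifests/
```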

To open the Prometheus dashboard, visit http://localhost:9090.

To open the Alertmanager dashboard, visit http://localhost:9093.

Finally, and most importantly, to visit Grafana go to http://localhost:3000. The default username and password should be “admin/admin”.

In Grafana, click on a pre-built dashboard, such as the one called “Kubernetes / Compute Resources / Cluster”.

Grafana truly is beautiful, but what’s even cooler about it is that it has so many pre-built plugins and dashboards. Some of the plugin installations require you to go to the Grafana machine and install them in the plugin folder (which we can now do thanks to the Persistent Volume we created for Grafana via StorageOS). To do this, you will first have to find out the name of the Grafana pod either via the Kubernetes dashboard or by running the following:

kubectl --namespace monitoring get pods

Then run this:

kubectl exec --namespace monitoring -t <pod_name> -- grafana-cli plugins install <plugin_name>
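
Looking up the pod name by hand can also be scripted. The sketch below assumes the Grafana pod's name starts with "grafana-", as it does when created by a deployment named grafana; the function name is ours.

```shell
# Print the first pod whose name starts with "grafana-" from
# `kubectl get pods --no-headers` output (read from stdin).
grafana_pod_name() {
  awk '$1 ~ /^grafana-/ {print $1; exit}'
}

# Usage:
#   pod=$(kubectl --namespace monitoring get pods --no-headers | grafana_pod_name)
#   kubectl exec --namespace monitoring -t "$pod" -- grafana-cli plugins install <plugin_name>
```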

Thanks for reading!

Better Programming

Advice for programmers.

Written by Syed Salman Qadri, VP Engineering, Seeloz. Xoogler. Former Yahoo. Former Microsoft.
