Kubernetes v1.4.4 Cluster on bare metal CentOS 7 nodes with SSL

I’m writing this guide mostly for my future self, while it is still fresh in my head. A bit of housekeeping first: this guide uses the following software versions:

  • Kubernetes v1.4.4 (latest stable version available at the time of writing)
  • CentOS 7 operating system
  • weave-kube v1.7.x for Pod network overlay
  • kubernetes-dashboard v1.4
  • Heapster (including InfluxDB) addon
  • kube-dns addon

I began with 3 CentOS 7 VMs on my local network. I know, containers inside virtual machines, but for my development purposes it was fine. Since each VM had a single “real” network interface, the cluster was not complicated at all, with no exposure to a public network. I used an awesome tool called kubeadm (v1.5.0-alpha), which dramatically simplified initializing the cluster and joining worker nodes to the master. All I had to do was roughly this:
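
A sketch from memory, not the verbatim commands; the actual token is printed by kubeadm init when it finishes:

```bash
# On the master: initialize the control plane.
# kubeadm prints a join token on completion.
kubeadm init

# On each worker: join the cluster with that token.
kubeadm join --token=<token> <master-ip>
```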

The 3-node cluster was up in no time.
Now, initially I tried the same approach on the bare metal production nodes, but immediately ran into issues. I wanted to keep all intra-cluster communication on a separate VLAN and not expose any of it on the public IPs allocated to my production nodes. All was good until I loaded the weave-kube DaemonSet, which immediately went into a crash loop and did not provide the expected Pod network overlay. I tried re-creating the cluster with several combinations of the flags kubeadm accepts, but with no luck; after spending a day and a half on this, I gave up.

Then I found a guide on how to set up a Kubernetes cluster on bare metal nodes running CoreOS. The guide’s structure seemed OK, but the actual configuration values like ADVERTISE_IP, DNS_SERVICE_IP and SERVICE_IP_RANGE were unclear. I spent two days creating and tearing down the same cluster multiple times, getting closer to a working setup each time. So, here is the final version of the steps one needs to take in order to fire up a Kubernetes cluster.


Cluster planning

Before even starting to play with Kubernetes, we need to plan the cluster. The cluster in this guide consists of 3 bare metal nodes: one master and two workers. All nodes have public IP addresses as well as private IP addresses belonging to the same VLAN. Since Kubernetes will run in its own virtual network, we must plan that now; let’s make it 10.100.0.0/16. All nodes should be up and running CentOS 7 by now, with the following packages installed: Docker, socat and ebtables. If any of them are missing, Kubernetes will complain when you try to install the Kubernetes RPMs. The sketch below summarizes the addressing plan used throughout this guide.
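
The node IP values here are made up for illustration; substitute your own:

```bash
# Hypothetical addressing plan used throughout this guide.
MASTER_IP=192.168.10.1                   # master node VLAN IP (substitute your own)
WORKER_IPS="192.168.10.2 192.168.10.3"   # worker node VLAN IPs
SERVICE_IP_RANGE=10.100.0.0/16           # virtual cluster IP pool
MASTER_CLUSTER_IP=10.100.0.1             # first IP of the pool, goes into the API server cert
DNS_SERVICE_IP=10.100.0.10               # one IP from the pool, reserved for kube-dns
```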

Master node setup

  • ETCD Server
    yum -y install etcd should install the server. There are only two parameters you need to change in the etcd config file:
    ETCD_LISTEN_CLIENT_URLS: set this to http://<master IP>:2379
    ETCD_ADVERTISE_CLIENT_URLS: set this to http://<master IP>:2379
    After the configuration is done, start etcd with systemctl. I’m using a single etcd node for this setup, but I don’t see why this couldn’t be a multi-node cluster.
  • The Kubernetes RPMs
    Use the bash script below to download the kubernetes-release GitHub repository and build the RPM packages. Once the packages are built, install them on the node.
  • Root certificates
    Generate self-signed root certificates for your master node using the bash script below.

The script takes 3 input parameters. The first parameter is master_ip: the real IP of your master node, which the other nodes will use to access it. This could be a public IP or an internal VLAN IP. The second parameter is master_cluster_ip, and this one is very important: it must be the first IP address of your virtual cluster IP pool from earlier (10.100.0.1 in this guide). If you pass the wrong IP address here, you will only be able to get the core Kubernetes components up, and nothing else will work, including all of the worker nodes. Not so cool, huh? The last parameter is the certificate output directory, which should be set to /etc/kubernetes/ssl.
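
The original Gist is not reproduced here, but a minimal sketch of such a script, assuming the standard openssl workflow and the subject alternative names the API server needs, looks like this:

```bash
#!/usr/bin/env bash
# Usage: ./make-root-certs.sh <master_ip> <master_cluster_ip> <output_dir>
# Minimal sketch, not the original Gist.
set -e
MASTER_IP=$1
MASTER_CLUSTER_IP=$2   # first IP of the service range, e.g. 10.100.0.1
OUT=$3                 # /etc/kubernetes/ssl
mkdir -p "$OUT" && cd "$OUT"

# Root CA.
openssl genrsa -out ca-key.pem 2048
openssl req -x509 -new -nodes -key ca-key.pem -days 10000 \
  -out ca.pem -subj "/CN=kube-ca"

# OpenSSL config with the SANs the API server must present.
cat > openssl.cnf <<EOF
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[v3_req]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster.local
IP.1 = $MASTER_CLUSTER_IP
IP.2 = $MASTER_IP
EOF

# API server keypair, signed by the root CA.
openssl genrsa -out apiserver-key.pem 2048
openssl req -new -key apiserver-key.pem -out apiserver.csr \
  -subj "/CN=kube-apiserver" -config openssl.cnf
openssl x509 -req -in apiserver.csr -CA ca.pem -CAkey ca-key.pem \
  -CAcreateserial -out apiserver.pem -days 365 \
  -extensions v3_req -extfile openssl.cnf
```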

  • Systemd service
    Change the default kubelet service configuration to the one from the Gist below. Before doing systemctl daemon-reload, change the variables in this service: ${MASTER_IP} is the real IP address of your master node; ${DNS_SERVER_IP} should be an address picked from your virtual cluster IP pool from the Cluster planning section. Note: I’m not sure whether the kubeadm RPM or the kubelet RPM automatically installs a drop-in for kubelet.service; if one is present, remove it before starting the kubelet service, as it messes up the configuration. The easiest way to check is to run systemctl status kubelet, which shows whether any drop-ins are picked up by systemd.
  • Kubernetes system components (/etc/kubernetes/manifests/)
    Kubernetes has the following main components: kube-apiserver, kube-proxy, kube-controller-manager and kube-scheduler. If you wish, read more about them in the Kubernetes documentation. There are several approaches to running these components. If you build Kubernetes from source, or download the pre-built tarballs, you get all the core components as separate binaries, which then have to be wired into systemd to run properly. The other, and IMHO better, way is to run the core components as containers on the master node itself. This way we don’t have to create systemd services for all of them, and we can use a very cool thing called Hyperkube: all the Kubernetes core components built into a single binary called hyperkube, wrapped in a CoreOS-based Docker image. Very convenient. The image is continuously updated and can be found here.
    kube-apiserver.yaml Pod manifest (see below). Things you need to replace there are: 
    ${ETCD_ENDPOINT} — the ETCD server endpoint on master node from earlier
    ${SERVICE_IP_RANGE} — this is the Virtual cluster IP pool from earlier (10.100.0.0/16)
    ${MASTER_IP} — the IP address of your master node, always the same
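
If you keep the placeholders literally as ${…} in the template, filling them in is a single sed invocation. The IPs below are the hypothetical ones from the Cluster planning section:

```bash
# Fill the kube-apiserver.yaml template with this cluster's values.
sed -i \
  -e 's|${ETCD_ENDPOINT}|http://192.168.10.1:2379|' \
  -e 's|${SERVICE_IP_RANGE}|10.100.0.0/16|' \
  -e 's|${MASTER_IP}|192.168.10.1|' \
  /etc/kubernetes/manifests/kube-apiserver.yaml
```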

kube-proxy.yaml Pod manifest (see below). It is good as-is; nothing needs to be changed.

kube-controller-manager.yaml Pod manifest (below). Good as-is again.

kube-scheduler.yaml Pod manifest (below). Good as-is.

  • Kubernetes addons (/etc/kubernetes/addons/)
    After the core component manifest files are prepared, there is still work to do. There are 3 essential Kubernetes addons: kube-dns (DNS service), kubernetes-dashboard (monitoring and operations) and weave-kube (Pod overlay network).
    kube-dns.yaml Pod and Service manifest, all-in-one (below). The only thing you need to change is ${DNS_SERVICE_IP}; use the same IP as in kubelet.service earlier. This is very important.

weave-kube.yaml DaemonSet and configuration manifest. This addon is very important, because it will automatically assign IP addresses to all your Pods.

kubernetes-dashboard.yaml manifest

heapster addon. If you want to have nice charts of CPU and memory utilization in the Kubernetes dashboard, install this addon. All instructions are here: https://github.com/kubernetes/heapster
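
These manifests live in /etc/kubernetes/addons/, but they need a running API server, so load them right after the master startup in the next step. Assuming kubectl on the master is already talking to the API server:

```bash
# Load every addon manifest in the directory.
kubectl create -f /etc/kubernetes/addons/

# Watch the addon pods come up.
kubectl get pods --namespace=kube-system -w
```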

  • Master Startup
    After all this configuration, start your Kubernetes master node by executing the commands below.
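
A sketch of the startup sequence, assuming the kubelet unit and manifests from the previous steps are in place:

```bash
# Start the kubelet; it picks up the static pod manifests from
# /etc/kubernetes/manifests and launches the core components.
systemctl daemon-reload
systemctl enable kubelet
systemctl start kubelet

# Give the hyperkube images a minute to pull, then verify.
kubectl get pods --namespace=kube-system
```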

Worker node setup

The worker node needs far fewer components than the master node, so the setup is much easier. Just follow the steps below to prepare your worker node and join it to the Kubernetes cluster.

  • Generate certificates
    Use the root certificates generated on the master node to generate self-signed certificates for the worker node. It is very important to use the same root certificates as on the master node, otherwise the worker node will not be able to join the cluster. The script takes 3 input parameters:
    worker_ip — IP address of your worker node
    worker_fqdn — hostname of your worker node
    output_directory — should be /etc/kubernetes/ssl
  • Install Kubernetes RPMs
    Use scp or another tool to transfer the Kubernetes RPMs built on the master node to the worker node. After the transfer is complete, install the RPMs.
  • Kubelet systemd service. Similar to the master kubelet.service, except here we use https and port 6443 for security reasons. Things to change in this template:
    ${MASTER_IP} — IP of the master node
    ${DNS_SERVER_IP} — IP address of the DNS service from above
  • kube-proxy.yaml manifest for worker. As you can see in the template below, a couple of things need to be updated to reflect your configuration:
    ${MASTER_IP} — IP address of your master node
    ${cluster ip pool} — virtual cluster IP pool from the Cluster Planning section
    This manifest should be copied to the /etc/kubernetes/manifests directory on the worker node.
  • kubeconfig manifest
    This file tells kubelet and kube-proxy about the certificates we generated earlier. It should be copied to the /etc/kubernetes/ directory.
  • Start the worker node
    After all the configuration, it is time to start your worker node. Before doing so, the Kubernetes master node should be fully up, including the “mandatory” addons described above in this guide.
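
The startup mirrors the master; a sketch:

```bash
# On the worker: start the kubelet. It registers the node with the
# master and launches the static kube-proxy pod.
systemctl daemon-reload
systemctl enable kubelet
systemctl start kubelet
```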

After you start the kubelet service on the worker node, it should automatically join the cluster, and weave-kube should deploy a DaemonSet Pod to it, indicating that the Pod network overlay works as expected. After a few seconds, you can check whether the node has joined by issuing the kubectl get nodes command on the master. It should display all the nodes which are currently in the cluster.

Final thoughts

After trying kubeadm (still under active development) to initialize the cluster, it seems to me that it uses the same approach as described in this guide. All the steps are automated, and the effort required from the end user to set up the cluster and join worker nodes is dramatically reduced. But as the kubeadm disclaimer says: it is still in beta and should not be used for production systems. There is a reason why. One might think: what could go wrong with it, if it works on my local cluster? The answer is: everything.

One thing is network interfaces. While kubeadm performs beautifully on a simple setup with 1 network interface per node, it struggles when the master node has multiple interfaces: it picks up the IP address from the interface that holds the default gateway route. Of course you can pass the advertise-address flag to it, but it still does not work as expected. At this point in time, there simply are not enough configuration flags for kubeadm to be flexible and predictable. Until it is stable, I recommend investing the time into manually configuring the cluster and having it exactly as you want it.
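
At the time of writing, that flag looked roughly like this. Treat the exact flag name as an assumption, since kubeadm’s flags were changing fast:

```bash
# Tell kubeadm which address the API server should advertise,
# instead of letting it pick the default-route interface.
kubeadm init --api-advertise-addresses=192.168.10.1
```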