Deploying an Anthos Cluster on Bare Metal - Beware of Proxy Servers Involved - Part 2

Faizan Qazi
Google Cloud - Community
4 min read · Jan 3, 2023

Anthos clusters on bare metal is software to create, manage, and upgrade Kubernetes clusters on your own hardware in your own data center.

This article is Part 2 of a three-part series on deploying an Anthos cluster on bare metal.
Part 1
Part 3

Let’s walk through the steps from the documentation and call out the additional steps our team had to debug to make them work.

Prerequisites:

Compute:

OS:
CentOS | 8.2, 8.3, 8.4, 8.5
RHEL | 8.2, 8.3, 8.4, 8.5, 8.6
Ubuntu | 18.04, 20.04
Python | >=3.6
VM (Admin):
Resource          | Minimum | Recommended
CPUs / vCPUs      | 2 core  | 4 core
RAM (Ubuntu)      | 4 GiB   | 8 GiB
RAM (CentOS/RHEL) | 6 GiB   | 12 GiB
Storage           | 128 GiB | 256 GiB

VM (Control):
# It is recommended to have 3 control nodes for an HA cluster,
# however you can also opt for a single control node
# and as many worker nodes as needed by your application.
Resource     | Minimum | Recommended
CPUs / vCPUs | 4 core  | 8 core
RAM          | 16 GiB  | 32 GiB
Storage      | 128 GiB | 256 GiB

# Luckily we went ahead with the recommended configuration
# as we used the admin workstation for setting up
# a private registry as well. We will find out more about it later.

You should update your respective package manager as described here.
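As a quick reference, that update typically looks like the following (a minimal sketch; pick the commands that match your distro):

# Ubuntu
sudo apt-get update && sudo apt-get -y upgrade
# CentOS / RHEL
sudo dnf -y update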

You should also install bmctl, the gcloud CLI, gsutil, and kubectl, along with Docker, on your admin workstation.
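For reference, fetching bmctl usually looks like the sketch below. The release version in the path is only an example, substitute the Anthos clusters on bare metal version you plan to deploy:

# download bmctl from the Anthos release bucket (version shown is an example)
gsutil cp gs://anthos-baremetal-release/bmctl/1.13.1/linux-amd64/bmctl .
chmod +x bmctl && sudo mv bmctl /usr/local/bin/
# kubectl can be installed as a gcloud component
gcloud components install kubectl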

To install Docker on RHEL you have to do a little tweak: RHEL ships with Podman, and the preflight check will fail unless Docker > 19.03 is installed (which was the first tiny hiccup for us), as shown below.

[2022-10-28 11:20:49+0000] Error creating cluster: error to parse the target cluster: error parsing cluster config: 1 error occurred:
* Docker checks failed: 1 error occurred:
* Docker version too old: got 4.1.1, want at least 19.03.0
# Install Docker on RHEL
sudo yum install -y yum-utils   # provides yum-config-manager
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# the plain install may fail due to conflicts with the preinstalled podman/buildah;
# --allowerasing lets yum remove them in favour of Docker
sudo yum install docker-ce docker-ce-cli containerd.io
sudo yum install docker-ce docker-ce-cli containerd.io --allowerasing
# to install a specific version
# yum list docker-ce --showduplicates | sort -r
# sudo yum install docker-ce-XX.YY.ZZ docker-ce-cli-XX.YY.ZZ containerd.io
sudo systemctl start docker
sudo groupadd docker
sudo usermod -aG docker $USER   # log out and back in (or run `newgrp docker`) for this to take effect
docker --version

While it’s also worth reading through the node machine prerequisites here, I'd like to point out an additional soft requirement we found in one of the preflight checks: the free-space requirement on /var. To sum it up, make sure the node VMs have the following storage space available (a quick check follows the table):

/                | 17 GiB (18,253,611,008 bytes)
/var             | 30 GiB (32,212,254,720 bytes)
# depending on the container runtime, /var/lib/docker or /var/lib/containerd
/var/lib/docker  | 30 GiB (32,212,254,720 bytes) for control plane nodes
/var/lib/docker  | 10 GiB (10,737,418,240 bytes) for worker nodes
/var/lib/kubelet | 500 MiB (524,288,000 bytes)
/var/lib/etcd    | 20 GiB (21,474,836,480 bytes, control plane nodes only)
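A quick way to eyeball this on each node before the preflight checks run (a minimal sketch, assuming a standard mount layout):

# check free space on the paths the preflight check cares about
df -h / /var /var/lib/docker /var/lib/kubelet /var/lib/etcd 2>/dev/null
# if some of these are not separate mounts yet, check the parent filesystem instead
df -h /var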

Network:
The link here defines
1. the steps needed to configure your package manager to use the proxy server (a minimal proxy-configuration sketch follows this list)
2. the list of URLs that should be allow-listed on the proxy server
3. the ports that must be enabled on your VMs, as defined here.
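For reference, pointing the package manager and the shell at the proxy usually boils down to something like this (a minimal sketch; proxy.example.com:3128 is a placeholder for your proxy):

# dnf/yum proxy (CentOS/RHEL)
echo 'proxy=http://proxy.example.com:3128' | sudo tee -a /etc/dnf/dnf.conf
# apt proxy (Ubuntu)
echo 'Acquire::http::Proxy "http://proxy.example.com:3128";' | sudo tee /etc/apt/apt.conf.d/99proxy
echo 'Acquire::https::Proxy "http://proxy.example.com:3128";' | sudo tee -a /etc/apt/apt.conf.d/99proxy
# shell environment for gcloud, gsutil and bmctl
export HTTPS_PROXY=http://proxy.example.com:3128
export NO_PROXY=localhost,127.0.0.1   # add node IPs and internal ranges as needed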

You should also enable passwordless root access to all cluster node machines over SSH, as described here.
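In practice that usually means generating a key on the admin workstation and copying it to every node (a minimal sketch; the node IP is a placeholder):

# on the admin workstation
ssh-keygen -t rsa -b 4096 -N '' -f ~/.ssh/id_rsa
# copy the public key to every cluster node (repeat per node)
ssh-copy-id root@10.200.0.3
# confirm passwordless login works
ssh root@10.200.0.3 'hostname'
# if root SSH is disabled, make sure PermitRootLogin allows key-based login in /etc/ssh/sshd_config on the nodes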

You can also reserve the CIDR block per node as per the following table

Maximum pods per node | CIDR block per node | Number of IP addresses
32                    | /26                 | 64
33 - 64               | /25                 | 128
65 - 128              | /24                 | 256
129 - 250             | /23                 | 512
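For example, at the Kubernetes default of 110 maximum pods per node you land in the 65 - 128 row, so each node needs a /24, i.e. 256 addresses, roughly double the pod count so that addresses can be recycled as pods are created and destroyed.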

We also need a static control plane VIP (virtual IP) and an ingress VIP for our cluster, as required by the load balancer. Additional requirements for the load balancer are as follows:

We must provision a load balancer that listens on port 443, which is configurable in the cluster config file under loadBalancer.ports.controlPlaneLBPort.
Its backend group must contain all of the IP addresses of the cluster’s control plane nodes, with the backends listening on port 6444.
We also MUST set up a health check on the load balancer that monitors the backend nodes; without it, cluster creation fails. I can't emphasize this enough: don't skip it, because the logs generated aren't very specific about the failure reason. The health check should use the HTTPS protocol and probe the /readyz endpoint with the following send and receive strings (a rough HAProxy translation of all this follows the strings):

#  Send String
GET /readyz HTTP/1.1\r\nHost: \r\nConnection: close
# Receive String
HTTP/1.1 200
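If you happen to front the control plane with something like HAProxy, the requirements above roughly translate into the sketch below. This is a minimal, hypothetical fragment, not our exact setup; the node IPs and backend names are placeholders:

# quick manual probe of a control plane node once the cluster starts coming up
curl -k https://10.200.0.3:6444/readyz
# hypothetical /etc/haproxy/haproxy.cfg fragment
sudo tee -a /etc/haproxy/haproxy.cfg >/dev/null <<'EOF'
frontend anthos-control-plane
    bind *:443
    mode tcp
    default_backend anthos-control-plane-nodes
backend anthos-control-plane-nodes
    mode tcp
    option httpchk GET /readyz
    http-check expect status 200
    server cp1 10.200.0.3:6444 check check-ssl verify none
    server cp2 10.200.0.4:6444 check check-ssl verify none
    server cp3 10.200.0.5:6444 check check-ssl verify none
EOF
sudo systemctl restart haproxy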

Storage:
Anthos clusters on bare metal use the local volume provisioner (LVP) to manage local persistent volumes. The disks backing these volumes need to be formatted and mounted by the user, which can be done before or after cluster creation.
For better isolation, configuring disks through LVP node mounts is recommended.
You may also set up storage using CSI Provisioning.
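Setting up an LVP node mount essentially comes down to formatting a spare disk and mounting it under the path configured for lvpNodeMounts (a minimal sketch; /dev/sdb and the /mnt/localpv-disk path are assumptions, check your cluster config for the actual values):

# format the extra disk and mount it where the LVP node mount config points
sudo mkfs.ext4 /dev/sdb
sudo mkdir -p /mnt/localpv-disk
sudo mount /dev/sdb /mnt/localpv-disk
# make the mount persistent across reboots
echo "UUID=$(sudo blkid -s UUID -o value /dev/sdb) /mnt/localpv-disk ext4 defaults 0 2" | sudo tee -a /etc/fstab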

Well, these were all the prerequisites, and now it's time to go through all the debugging it took to get our Anthos cluster running behind an MITM proxy server, in Part 3 of this series.

Disclaimer:

This warning could be removed in the future, as the product team had raised a feature request to support MITM proxies at the time this article was written. Hopefully, it will land soon.
