How to Set Up a Raspberry Pi Cluster
Everything you need to buy, install, and stand up your own data center
This is the second article of the series described in Develop and Deploy Kubernetes Applications on a Raspberry Pi Cluster. After completing the steps outlined in this article, you’ll be ready to learn How to Install Kubernetes on a Raspberry Pi Cluster, the next article in the series.
As this article references a couple of other articles relevant to the task, you might find it helpful to have them open in another browser window so you can cross-reference between them and this one as needed.
There are five main sections to this article:
- My Target Cluster Topology — This section describes what we’ll build in this article.
- Initial Setup — This section describes what equipment is needed, how to install the OS, and how to complete the initial configuration of each Raspberry Pi.
- Configure the Network — This section describes the process to follow to configure the actual Raspberry Pi cluster. At the end of the section, we’ll have a working cluster.
- Set Up Keyless SSH Access Between Hosts — This section will cover how to configure keyless ssh access between the nodes of the cluster. It will also describe how to set up a reverse-ssh tunnel to a host in the external network to allow any permitted host in the external network ssh access into any of the permitting cluster nodes. This isn’t possible out of the box because the Pi router won’t forward IP traffic from the external network into the cluster network.
- Set Up i2cssh — This section covers how to have simultaneous terminal sessions open to several hosts. i2cssh is a terminal multiplexer similar to tmux. With i2cssh, you can have commands replicated to every terminal session. This is useful if you want to execute the same commands on multiple hosts at the same time.
I’ve also appended a list of references that I found helpful. They include information covered in this article, as well as supporting and background information.
This is going to take a while. So grab some coffee and dive in!
1. My Target Cluster Topology
The above diagram illustrates the target cluster, excluding software like MySQL, that will be built in this article. The details shown and described below match my specific configuration.
The intent of my Pi cluster is to be used as a way to deploy Kubernetes-hosted applications, and to make those applications’ services available and observable from an external network. As such, we need nodes with the following capabilities:
- A router node that will provide DHCP and routing capability for the Kubernetes worker nodes deployed on what is essentially a private subnet. I call this the cluster network. It also has two network interfaces, one each on the private and public subnets.
- A Kubernetes master node. This node provides all the Kubernetes control and administrative capabilities (e.g., cluster networking support, automated restart capabilities, and various administrative capabilities, such as deployment, application monitoring, and application management). In my topology, this node doubles as the router described above.
- A Kubernetes node to host common application capability such as Prometheus, Traefik, MySQL, Grafana, Telegraf, and InfluxDB (used by Telegraf). These applications will be covered in more detail in a later article.
- Kubernetes worker nodes. These are the nodes that host the application services and are actively managed by the Kubernetes master node.
The actual cluster topology is:
- “kubemaster” is the hostname of the Pi router. Its roles include the first three described above, namely as router, Kubernetes master node, and host for the common software applications such as MySQL. It has two network interfaces. The external-facing IP address is static and is 10.0.0.100 on my home network. Its internal, static, cluster IP address is 192.168.1.1.
- “kube-node<n>”, where n is a unique number used to distinguish the individual nodes. These are the Kubernetes worker nodes. They will also run a MySQL client and Telegraf agent software.
2. Initial Setup
This section will supply a parts list, describe the OS installation process, and cover the required configuration steps.
As stated in my initial article in the series, I will not be replicating readily available information. In keeping with this philosophy, I’ll be referencing Tim Downey’s Unlimited Power! My Unstoppable Raspberry Pi Kubernetes Cluster. This article is an excellent description of what to buy and how to install the Raspbian OS.
In this section, I’ll present an outline of his article highlighting any differences between our two setups. Most notably, unlike his approach, I decided to defer installing Kubernetes until later in the process. Other than using 5 vs. 8 Pis, and using a Raspberry Pi 4B for the Pi Router, our clusters are identical. The next subsections cover setting up each Raspberry Pi in the cluster.
Parts list and rationale
I pretty much followed his parts list with the exception of the Raspberry Pi 4B for the Pi Router. I also used this host as the kube-apiserver host. I initially tried using a Raspberry Pi 3B+ for this, but it was too underpowered for the role.
I did use a keyboard and monitor during the initial setup of all Pis except the Pi router (you can ssh to it from the external network). You’ll need these as the instructions assume you won’t initially be logging on to the non-router Raspberry Pis via ssh. You can get by without a keyboard and monitor if you use the Pi router to perform all admin activities. It may take a little work to determine which IP address was assigned to which Pi, though. You’ll also have to disable Wi-Fi access on the non-router nodes later in the process.
Install Raspbian Stretch
I had some trouble using Etcher so I chose a different way to burn the OS image onto the SD cards. I used the native Mac command-line tools for accomplishing this, as documented on the Raspberry Pi image installation page. I found this approach easy to use. I also installed Buster Lite instead of Stretch Lite on the Pi 4B. The Pi 3Bs were set up with Stretch Lite as Buster Lite came out after I had an initial cluster set up.
As I was using a 64GB SD card on the Pi 4B, I had to take a different approach for formatting the card. SD cards over 32GB have to be formatted differently. I followed the instructions at How to Format SD Card to FAT32 for Pi 3 by “Moses” to accomplish this.
I also chose not to perform a “headless” installation, and instead connected a keyboard and monitor directly to the Pis and configured them after initial boot-up. I did this because, for the most part, I didn’t want any of the Pis to connect to my wireless network. I didn’t want this to happen because I wanted to keep the Pis on a separate, somewhat protected, network managed by my Pi Router host. The exception to this rule is the Raspberry Pi I intend to configure as a router. As it will live on both networks, it will have an address on my wireless network. For this host, the wireless network address is statically assigned.
Configure hostname and change password
I used different hostnames than those described in the article. See the cluster topology section above for details.
One key item wasn’t mentioned in Tim’s article that is absolutely critical when installing Kubernetes on the cluster. In addition to the steps listed in Tim’s article, you’ll also need to do this if you’re going to proceed on to installing Kubernetes:
sudo systemctl disable dphys-swapfile
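Disabling the service affects what happens at the next boot, so it can be worth confirming afterwards that no swap is actually in use. The following is a minimal sketch of such a check, not part of the referenced article. It relies on /proc/swaps listing one line per active swap area after a header line; the function name active_swap_count is made up for illustration, and its optional file argument exists only so the function can be tried against a sample file.

```shell
#!/bin/sh
# Count active swap areas. /proc/swaps has one header line followed by
# one line per active swap file or partition.
active_swap_count() {
    swaps_file="${1:-/proc/swaps}"
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$swaps_file"
}

# Report the current state when running on a Linux host.
if [ -r /proc/swaps ]; then
    if [ "$(active_swap_count)" -eq 0 ]; then
        echo "swap is off"
    else
        echo "swap is still active"
    fi
fi
```

If the check reports that swap is still active before a reboot, running sudo dphys-swapfile swapoff should turn it off for the current session.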
Give clustered Pis static IP addresses
I chose to defer this step until I had the Pi Router set up with DHCP configured to serve static IP addresses to the remaining Pis in the network. This will be covered in the next section.
rak8S Ansible playbook
I skipped this step since I wanted the experience of installing Kubernetes from scratch. To do otherwise would undermine my goal of learning more about how Kubernetes works, and for me, that includes the installation process. Installing Kubernetes will be covered in a later article.
3. Configure the Network
I stated earlier that I needed to provide a way to separate my Kubernetes Pi cluster (cluster network) from my home network (external network). This can be accomplished using a router. This type of router is also sometimes called a gateway, jump host, jump box, or jump server. This capability accomplishes several things. It provides:
- a way to configure a group of hosts into their own private network
- a DHCP server for the cluster
- some isolation of those hosts from other networks. (In my case, this router isn’t particularly secure, but it does provide the rudimentary capability I was looking for.)
- a secure way for hosts on an external network to gain access to hosts on the cluster network (i.e., via a reverse-ssh tunnel)
- a way for the hosts in the cluster network to gain access to the Internet. This capability is needed for basic administration tasks such as installing software.
Tim Downey has another excellent article on configuring a Raspberry Pi cluster titled Baking a Pi Router for My Raspberry Pi Kubernetes Cluster. As with the previous section, I’ll present an outline of his article highlighting any differences between our two setups. I’ll also explain some details described in his article that I initially didn’t understand.
This is already covered in the previous section.
Install Raspbian Linux
This is already covered in the previous section.
Static IP for Pi router on home network
Configure Ethernet interface on Pi router
The purpose of this section is to assign a static IP address to the Ethernet network interface facing the cluster network. Only a relatively simple change to the /etc/dhcpcd.conf file is required. In Baking a Pi Router for My Raspberry Pi Kubernetes Cluster, the cluster is using the address space of 10.0.0.1/8. As the 10.0.0.0/8 address space is already used for my home network, I chose to use 192.168.1.1/24 instead. In my configuration, I took the defaults for everything except the static IP address assignment block, which I modified to look like this:
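A minimal sketch of such a block, assembled only from the three settings this section goes on to explain. It writes the fragment to a scratch file (the file name is arbitrary) so it can be reviewed and then merged into /etc/dhcpcd.conf by hand; the DNS entries assume Google's public resolvers at 8.8.8.8 and 8.8.4.4.

```shell
#!/bin/sh
# Write a candidate static-IP block to a scratch file for review.
# Nothing under /etc is touched by this script.
fragment="${TMPDIR:-/tmp}/dhcpcd-eth0.conf"
cat > "$fragment" <<'EOF'
interface eth0
static ip_address=192.168.1.1/24
static domain_name_servers=8.8.8.8,8.8.4.4
EOF

# Show the result so it can be compared with the live file.
cat "$fragment"
```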
Taking each line in turn:
interface eth0 — this specifies that the following block pertains to the eth0 interface.
static ip_address=192.168.1.1/24 — In general, I understood that this line requests that the address 192.168.1.1 be assigned to this host; however, I wasn’t familiar with the notation. It turns out this notation is part of the Classless Inter-Domain Routing (CIDR) specification. Put simply, the /24 suffix specifies that the first 24 bits (three octets) of the address, i.e., 192.168.1, identify the network. The final octet identifies specific hosts on the subnet. Identifier 1, as in 192.168.1.1, is assigned to this host. 192.168.1.1/24 also implies that identifiers 1 through 254 are available for assignment to hosts on the network. Identifier 0 denotes the network itself, and identifier 255 is the broadcast address. See the References section below for more information on CIDR.
static domain_name_servers=8.8.8.8,8.8.4.4 — These addresses reference Google’s public DNS servers.
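The arithmetic behind the /24 notation can be checked directly. The sketch below uses only shell integer arithmetic; the function names cidr_info and to_dotted are made up for illustration, and it is meant for ordinary subnets rather than edge cases such as /31 or /32.

```shell
#!/bin/sh
# Convert a 32-bit integer to dotted-quad notation.
to_dotted() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print network, broadcast, and usable host range for an IPv4 CIDR string.
cidr_info() {
    addr="${1%/*}"       # address part, e.g. 192.168.1.1
    prefix="${1#*/}"     # prefix length, e.g. 24
    old_ifs="$IFS"
    IFS=.
    set -- $addr         # split the address into its four octets
    IFS="$old_ifs"
    ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
    net=$(( ip & mask ))
    bcast=$(( net | (4294967295 ^ mask) ))
    echo "network=$(to_dotted "$net")"
    echo "broadcast=$(to_dotted "$bcast")"
    echo "first_host=$(to_dotted $(( net + 1 )))"
    echo "last_host=$(to_dotted $(( bcast - 1 )))"
    echo "usable_hosts=$(( bcast - net - 1 ))"
}

cidr_info 192.168.1.1/24
```

For 192.168.1.1/24 this reports 192.168.1.0 as the network, 192.168.1.255 as the broadcast address, and 254 usable host addresses, from 192.168.1.1 through 192.168.1.254.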
From the dnsmasq website: “Dnsmasq provides network infrastructure for small networks: DNS, DHCP, router advertisement and network boot.”
Since my cluster is using a different address space, I needed to modify the /etc/dnsmasq.conf file. My changes are highlighted in red.
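As a rough sketch, the three dnsmasq settings discussed in this section could look like the fragment below. The addresses and the MAC address come from this article; the 12h lease time is an assumed value, and any other settings the real file needs are left out. The script only writes to a scratch file (the name is arbitrary) for review.

```shell
#!/bin/sh
# Candidate dnsmasq settings for the cluster network, written to a scratch
# file for review. The 12h lease time is an assumption, not from the article.
fragment="${TMPDIR:-/tmp}/dnsmasq-cluster.conf"
cat > "$fragment" <<'EOF'
# Listen on the router's Ethernet (cluster-facing) address.
listen-address=192.168.1.1
# Address pool handed out to cluster hosts: start, end, lease time.
dhcp-range=192.168.1.130,192.168.1.200,12h
# Pin one host to a fixed address by its MAC address.
dhcp-host=b8:27:eb:bf:3a:92,192.168.1.130
EOF

# Size of the pool: both ends of the range are included.
pool_size=$(( 200 - 130 + 1 ))
echo "pool size: $pool_size"
```

One dhcp-host line is needed per host that should always receive the same address.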
listen-address specifies the address the DHCP server will listen on. In this case, I set it to the host’s Ethernet (i.e., wired) network address. Recall this was specified in /etc/dhcpcd.conf as described above.
dhcp-range specifies that the address pool to be used for assignment is the range of 192.168.1.130 to 192.168.1.200. This range was chosen arbitrarily; it just sounded reasonable. It allows for up to 71 hosts in the network, which for my purposes is plenty.
dhcp-host is used to assign static IP addresses to specific hosts on the network. To accomplish this, you’ll need the MAC address of each host. For example, b8:27:eb:bf:3a:92 is the MAC address of the host to be assigned the IP address of 192.168.1.130. Obtaining the MAC address can be accomplished as follows:
The MAC address is identified by the highlighted ether entry. This needs to be run on each host that will have a static IP address assigned via DHCP.
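One way to pull out just that value, assuming the command in question is ifconfig eth0, is to filter its output for the ether line. The helper name mac_from_ifconfig is made up for illustration, and the sample input is abridged and illustrative rather than captured from a real host.

```shell
#!/bin/sh
# Print the MAC address from `ifconfig <interface>` output read on stdin.
# The address is the second field of the line whose first field is "ether".
mac_from_ifconfig() {
    awk '$1 == "ether" { print $2; exit }'
}

# Usage on a Pi: ifconfig eth0 | mac_from_ifconfig
# Illustrative input, abridged:
printf '%s\n' \
    'eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500' \
    '        inet 192.168.1.130  netmask 255.255.255.0  broadcast 192.168.1.255' \
    '        ether b8:27:eb:bf:3a:92  txqueuelen 1000  (Ethernet)' |
    mac_from_ifconfig
```

On Linux the same value can usually also be read from /sys/class/net/eth0/address.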
Forward internet from Wi-Fi (wlan0) to Ethernet (eth0)
This step is needed to allow internet access from the cluster network. I used this section as-is.
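For orientation, forwarding of this kind is commonly done with IP forwarding plus a NAT masquerade rule. The sketch below shows that general recipe and is not necessarily identical to the steps in the referenced article. The run helper and the DRY_RUN variable are made up for illustration: by default the script only prints the commands, and setting DRY_RUN=0 as root would execute them.

```shell
#!/bin/sh
# Print (DRY_RUN=1, the default) or execute a command.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Commands that let hosts on eth0 reach the internet through wlan0.
forward_rules() {
    # Allow the kernel to route packets between interfaces.
    run sysctl -w net.ipv4.ip_forward=1
    # Rewrite the source address of traffic leaving via Wi-Fi (NAT).
    run iptables -t nat -A POSTROUTING -o wlan0 -j MASQUERADE
    # Let replies to established connections back into the cluster network.
    run iptables -A FORWARD -i wlan0 -o eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT
    # Let cluster hosts open new connections to the outside.
    run iptables -A FORWARD -i eth0 -o wlan0 -j ACCEPT
}

forward_rules
```

Rules added this way do not survive a reboot on their own, so they still need to be persisted by whatever mechanism you prefer.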
Test it all out
With the exception of ssh-ing into each machine by hostname (I used IP addresses), I used this section as-is.
What didn’t work
Since all the above sections warned of issues along with specific recommendations for how to avoid them, my installation just worked.
I have the same advice. These exact steps may not work due to OS and hardware upgrades, but it should be relatively easy to find solutions to any problems you may encounter.
4. Set Up Keyless SSH Access Between Hosts
When this section is completed, you’ll be able to freely ssh between the hosts in your cluster, and from selected hosts in the external network, with ease. This section has two subsections:
- Set up keyless SSH access
- Set up a reverse-SSH tunnel
Set up keyless SSH access
Some of the information in this section was taken from “How to configure passwordless login in Mac OS X and Linux.” It is very important to note that these steps are directional, i.e., they give keyless access from a specific host to a remote host — for example, from the cluster host that you are performing these steps on to a remote host.
The public key of a host’s user is copied to a remote host. During the copy, ssh-copy-id will require you to complete the normal authentication process for the remote host. After copying has completed, the public key that was copied is considered authorized. For more information on how the public/private keys are used during keyless access, see “Understanding the SSH Connection and Encryption Process.”
From the host permitting access, the steps are as follows:
ssh-keygen -t rsa — This will generate a private/public key pair. If you already have a public/private key pair, you can skip this step.
The key pair is used to provide mutual authentication between the hosts involved in the SSH session.
-t specifies that the generated key type is RSA.
There will be several prompts. The first will be the location/name of the generated pair. The defaults are ~/.ssh/id_rsa (the private key) and ~/.ssh/id_rsa.pub (the public key). If you change these names, substitute the name you used for ~/.ssh/id_rsa wherever these names are used.
The next prompts will be for the key’s passphrase. Take the defaults (an empty passphrase) if you don’t want to have to enter one every time you ssh into the remote host (the point of this section is to avoid entering a password). Note: Empty passphrases are considered a security risk, but in some cases, like this one, they are acceptable.
The gist below shows an example of
ssh-copy-id -i ~/.ssh/id_rsa.pub user@remotehost — This will copy the public key, in this example id_rsa.pub, to the remote host.
In our cluster, you will be able to ssh-copy-id from a host in the cluster network to a host in the external network, but you won’t be able to ssh-copy-id from a host in the external network to a host in the cluster network yet. The next step, “Set up a reverse-SSH tunnel,” enables this.
Repeat the above steps for each host in the network that you want to enable keyless access for. Note that the remote host can also be in the cluster network.
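When several remote hosts are involved, the two steps can be scripted. The sketch below is one possible way to do it, not the author's procedure: the run helper, DRY_RUN, KEY, and REMOTES are names made up for illustration, and user@remotehost is a placeholder. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Print (DRY_RUN=1, the default) or execute a command. The printed form is
# for inspection only; empty arguments such as "" are not re-quoted.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

KEY="${KEY:-$HOME/.ssh/id_rsa}"
# Space-separated list of targets; replace the placeholder with real hosts.
REMOTES="${REMOTES:-user@remotehost}"

setup_keyless() {
    # Generate an RSA key pair with an empty passphrase unless one exists.
    if [ ! -f "$KEY" ]; then
        run ssh-keygen -t rsa -N "" -f "$KEY"
    fi
    # Copy the public key to each remote host. Each copy asks for that
    # host's password once.
    for remote in $REMOTES; do
        run ssh-copy-id -i "$KEY.pub" "$remote"
    done
}

setup_keyless
```

Run it with DRY_RUN=0 on the host that should gain access, once the printed commands look right.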
Here’s a pictorial illustration of the above:
Set up a reverse-SSH tunnel
Having a reverse-SSH tunnel from one or more cluster hosts to a host in the external network is incredibly useful, almost mandatory. I do all my cluster configuration from my MacBook. It’s possible to do this from the Pi router, but nowhere near as convenient, especially if you want multiple i2cssh windows into the cluster.
The previous step, Set up keyless SSH access, is a prerequisite for this step. You’ll have to have already created the keys and copied them with ssh-copy-id from the permitting hosts to the remote host(s) on the external network. The permitting hosts in my cluster include kube-node4. The host on the external network is located at 10.0.0.223.
Some of the information below was taken from “Start AutoSSH on Boot.”
The steps are as follows:
- sudo apt-get install autossh
autossh is a tool that will set up a configured reverse-ssh tunnel. autossh will also monitor the tunnel and re-establish it if it fails. This step installs autossh onto the Raspberry Pi. It’s possible to configure autossh to set up the tunnel at boot time. This is what we’ll do next.
- Edit your /etc/rc.local file, adding the following line (with changes as appropriate):
autossh -o StrictHostKeyChecking=no -i /home/pi/.ssh/id_rsa -fN -R <port>:localhost:22 <user>@<remotehost>
Here’s what the above line specifies:
-o flag is used to specify configuration options that have no command line analog. StrictHostKeyChecking=no tells autossh to ignore the ~/.ssh/known_hosts file, i.e., connect even if the host isn’t in known_hosts. See the Debian ssh_config and ssh man pages for more details.
-i tells autossh where to find the key file.
-f tells autossh to run in the background.
-N tells autossh that there is no command to be run on the remote host, only to set up the tunnel.
-R is the forwarding specification. In the above command, <port>:localhost:22 specifies which port on the remote host to use for the tunnel. The localhost:22 part tells autossh to use port 22 to terminate the connection on this host.
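To see how those pieces fit together, the line can be assembled from named parts. This is only a sketch: the variable names and the autossh_line function are made up for illustration, and the remote is left as a placeholder to fill in.

```shell
#!/bin/sh
# Assemble the autossh line for /etc/rc.local from named parts.
TUNNEL_PORT="${TUNNEL_PORT:-13003}"           # port opened on the remote host
REMOTE="${REMOTE:-<user>@<remotehost>}"       # placeholder, fill in your own
KEY_FILE="${KEY_FILE:-/home/pi/.ssh/id_rsa}"  # private key used for the tunnel

autossh_line() {
    # -f: background, -N: no remote command, -R: reverse forward to local sshd
    printf 'autossh -o StrictHostKeyChecking=no -i %s -fN -R %s:localhost:22 %s\n' \
        "$KEY_FILE" "$TUNNEL_PORT" "$REMOTE"
}

autossh_line
```

On a stock Raspbian image, /etc/rc.local typically ends with exit 0, so the generated line needs to go above that statement to be executed.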
On my cluster hosts, I use the following:
autossh -o StrictHostKeyChecking=no -i /home/pi/.ssh/id_rsa -fN -R 13003:localhost:22 <user>@10.0.0.223
13003 is arbitrary. It can be any unused port on the remote host, usually above 1024. Keep note of it, as it will be needed in a moment. Reboot the host to ensure autossh starts up as expected on boot. To verify the tunnel works as expected, on the remote host (in my case 10.0.0.223, which is my MacBook), run:
ssh pi@localhost -p 13003
-p specifies the port used by the reverse tunnel. (Note: localhost is used because we’re using the tunnel port on the current host, localhost, to connect to the remote host via the tunnel.)
You’ll be prompted for a password. What? The point of this is to be able to ssh without a password! Well, we haven’t yet copied the public key of the host in the external network to the host in the cluster network. We couldn’t do this since we didn’t have connectivity between the two networks yet. The reverse tunnel allows us to do that. So now we can enable keyless access from the external host to the cluster network:
ssh-copy-id -i ~/.ssh/id_rsa.pub -p <tunnelPort> pi@localhost
In this section the tunnel port is 13003, so the command will be:
ssh-copy-id -i ~/.ssh/id_rsa.pub -p 13003 pi@localhost
ssh into the cluster host. If all went well, you won’t be prompted for a password.
ssh pi@localhost -p 13003
Repeat these steps for each cluster host from which you want to enable autossh. You’ll need to choose a different tunnel port for each cluster host (e.g., kube-node2). This is because the host on the external network, e.g., my MacBook at 10.0.0.223, needs to have a unique tunnel port number for each cluster host it will ssh into. In my configuration, I use a range of ports ending at 13003 from my MacBook. So, for example, for kube-node3 I specify port 13002. And so on.
One other thing I did to simplify my life was to configure my ~/.ssh/config to allow me to use shortcuts to ssh into other hosts. Specifically, it provides a shorthand for ssh-ing over a reverse tunnel. Here’s my ~/.ssh/config:
Taking the first entry, Host master ..., here’s what each line means:
Host — This is the alias for the specification that follows.
Hostname — This is the IP address for the local host to connect to.
Port — This is the port number on the remote host. Typically, this is port 22, the standard ssh port and the port we specified in the autossh command.
User — This is the user to use for the login.
Here’s how you use it as a substitute for
ssh pi@localhost -p 13004:
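A sketch of what such an entry and its use might look like, assuming the master alias maps to tunnel port 13004 on localhost with user pi, as the ssh command above suggests. The script writes the entry to a scratch file (the name is arbitrary) rather than modifying ~/.ssh/config.

```shell
#!/bin/sh
# Write an example entry to a scratch file; merge it into ~/.ssh/config by hand.
entry="${TMPDIR:-/tmp}/ssh-config-master"
cat > "$entry" <<'EOF'
Host master
    Hostname localhost
    Port 13004
    User pi
EOF

# Show the entry for review.
cat "$entry"
```

With that entry in ~/.ssh/config, ssh master stands in for ssh pi@localhost -p 13004. To preview how ssh would resolve the alias without connecting, ssh -F "$entry" -G master prints the effective settings.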
5. Set Up i2cssh
This step will enable you to have terminal sessions open to several hosts at the same time. With i2cssh, you can choose to have commands you type replicated to every terminal window, or you can isolate the focus to a single terminal window. Setting up i2cssh isn’t strictly necessary, but I think you’ll find it useful.
gem install i2cssh
The above command assumes Ruby is already installed on your Mac, which it should be. You can use Homebrew to install it if it isn’t. If you’re not familiar with Homebrew, you should take the time to get to know it. It’s pretty much required for developing software on a Mac.
To start an i2cssh session, using the ~/.ssh/config we set up previously, type:
i2cssh -m master,node1,node2,node3,node4 -p Rich -C 5 -b
Here’s what each item in the above command means:
-m — specifies a comma-separated list of hosts to connect to
-p — specifies which iTerm2 profile to use. In this case, I have an iTerm2 profile named Rich. The most prominent feature of this profile is that it has a yellow background as shown in the picture below.
-C — specifies the number of columns to use for the terminal display. The above command opens sessions to five hosts, indicating I chose to have a column per host.
-b — specifies to start the sessions “broadcasting” input from stdin, i.e., the keyboard. This results in a command being entered into all terminal sessions simultaneously. “Broadcast” mode can be toggled on and off.
This (very small) image of the resulting sessions shows the result. It also shows the results of running pwd simultaneously, in “broadcast” mode, in each terminal session.
Since I don’t like typing, I set up a shell alias for this:
alias csshall="i2cssh -m master,node1,node2,node3,node4,pi-node1 -p Rich -C 6 -b"
Whew, that’s a lot to cover. But we now have a fully configured Raspberry Pi cluster. We accomplished the following:
- Performed the initial setup of each Raspberry Pi. This included installing the OS and setting the hostname.
- Configured the network. This included setting up DHCP so the cluster nodes would get static IP addresses assigned by the Pi router (kubemaster in this article).
- Set up keyless access to all nodes, including a reverse-SSH tunnel between hosts in the cluster network and a host(s) in the external network.
- Installed and set up i2cssh, a terminal multiplexer.
These are the primary references that I used for the initial setup of the Raspberry Pi hosts, as well as for configuring these hosts into a cluster.
- Unlimited Power! My Unstoppable Raspberry Pi Kubernetes Cluster
- Baking a Pi Router for My Raspberry Pi Kubernetes Cluster
These references have additional material covering the initial configuration of a Raspberry Pi:
- Raspberry Pi Basics: installing Raspbian and getting it up and running
- Install Raspbian Buster Lite in your Raspberry Pi
These references pertain to the network aspect of setting up the cluster:
- Configuring a Raspberry Pi as a wireless access point — This document has good background information useful when setting up a Pi as a router.
- Building a Raspberry Pi Kubernetes Cluster — Part 1 — Routing — This article is very similar to the Tim Downey article I used as the basis for this article. There is some additional information, however. It’s also part of a series that has some overlap with this series.
- Basic CIDR overview from Wikipedia — This is a good overview of Classless Inter-Domain Routing.
- Network configuration — General networking configuration documentation, which includes DHCP configuration. This was useful for me to better understand network configuration, as well as the contents of the
- DHCP Overview — This is a good overview of DHCP. I especially liked the part about the discovery and assignment message exchange, as I had some trouble with address assignment not working. This helped me troubleshoot my problem (which I traced to my Ethernet hub; restarting it resolved the issue).
- How to locate the DHCP Server — This reference describes several techniques for locating the DHCP server on the network. I found this useful in troubleshooting (as I indicated in the previous bullet). For Raspbian, this command shows the message exchange between the client and the server for DHCP discovery and assignment. In my particular case, the DHCPDISCOVER request failed. As a side note, the DHCPDISCOVER message is broadcast to the entire subnet. Per the standard, the DHCP server will be listening on port 67, so only the host listening on that port will respond. Note in the exchange below, as expected, my DHCP server with the IP address of 192.168.1.1 responded as follows:
These references cover keyless SSH access and reverse-SSH tunneling:
- How to configure passwordless login in Mac OS X and Linux — This covers the information needed to be able to SSH into a host without providing a password (keyless access).
- Start AutoSSH on Boot — This covers the information needed to automatically set up an SSH reverse-tunnel on boot.
- SSH jump host — This describes an alternative to setting up a reverse-SSH tunnel to access hosts in another subnet. I didn’t choose to do this in my cluster, but it is an option. One of the pros of this approach is it’s relatively simple when compared to the approach I described. The pro with my approach is that it could be argued that it’s more secure.
- Simplify Your Life With an SSH Config File — This covers the various ways you can customize your ~/.ssh/config file. I used this to configure my shortcuts.
- i2cssh and iTerm2