How to Set Up a Raspberry Pi Cluster

Everything you need to buy, install, and stand up your own data center

Richard Youngkin
Mar 16, 2020 · 19 min read
Image for post
Jonathan at Unsplash

This is the second article of the series described in Develop and Deploy Kubernetes Applications on a Raspberry Pi Cluster. After completing the steps outlined in this article, you’ll be ready to learn How to Install Kubernetes on a Raspberry Pi Cluster, the next article in the series.

As this article references a couple of other articles relevant to the task, you might find it helpful to have them open in another browser window so you can cross-reference between them and this one as needed.

There are five main sections to this article:

  1. My Target Cluster Topology — This section describes what we’ll build in this article.
  2. Initial Setup — This section describes what equipment is needed, how to install the OS, and how to complete the initial configuration of each Raspberry Pi.
  3. Configure the Network — This section describes the process to follow to configure the actual Raspberry Pi cluster. At the end of the section, we’ll have a working cluster.
  4. Set Up Keyless SSH Access Between Hosts — This section will cover how to configure keyless access between the nodes of the cluster. It will also describe how to set up a reverse tunnel to a host in the external network to allow any permitted host in the external network access into any of the permitting cluster nodes. This isn’t possible out of the box because the Pi router won’t forward IP traffic from the external network into the cluster network.
  5. Set Up i2cssh — This section covers how to have simultaneous terminal sessions open to several hosts. i2cssh is a terminal multiplexer similar to tmux. With i2cssh, you can have commands replicated to every terminal session. This is useful if you want to execute the same commands on multiple hosts at the same time.

I’ve also appended a list of references that I found helpful. They include information covered in this article, as well as supporting and background information.

This is going to take a while. So grab some coffee and dive in!

1. My Target Cluster Topology

Image for post
My cluster topology — images courtesy of HiClipart, Comcast, BlackBox, and Cleanpng

The above diagram illustrates the target cluster, excluding software like MySQL, that will be built in this article. The details shown and described below match my specific configuration.

I intend to use my Pi cluster to deploy Kubernetes-hosted applications and to make those applications’ services available and observable from an external network. As such, we need nodes with the following capabilities:

  • A router node that will provide DHCP and routing capability for the Kubernetes worker nodes deployed on what is essentially a private subnet. I call this the cluster network. It also has two network interfaces, one each on the private and public subnets.
  • A Kubernetes master node. This node provides all the Kubernetes control and administrative capabilities (e.g., cluster networking support, automated restart capabilities, and various administrative capabilities, such as deployment, application monitoring, and application management). In my topology, this node doubles as the router described above.
  • A Kubernetes node to host common application capability such as Prometheus, Traefik, MySQL, Grafana, Telegraf, and InfluxDB (used by Telegraf). These applications will be covered in more detail in a later article.
  • Kubernetes worker nodes. These are the nodes that host the application services and are actively managed by the Kubernetes master node.

The actual cluster topology is:

  • “kubemaster” is the hostname of the Pi router. Its roles include the first three described above, namely router, Kubernetes master node, and host for the common software applications such as MySQL. It has two network interfaces. The external-facing IP address is static and is 10.0.0.100 on my home network. Its internal, static, cluster IP address is 192.168.1.1.
  • The remaining hosts share a common hostname prefix followed by n, where n is a unique number used to discriminate the individual nodes. These are the Kubernetes worker nodes. They will also run a MySQL client and Telegraf agent software.

2. Initial Setup

This section will supply a parts list, describe the OS installation process, and cover the required configuration steps.

As stated in my initial article in the series, I will not be replicating readily available information. In keeping with this philosophy, I’ll be referencing Tim Downey’s Unlimited Power! My Unstoppable Raspberry Pi Kubernetes Cluster. This article is an excellent description of what to buy and how to install the Raspbian OS.

In this section, I’ll present an outline of his article highlighting any differences between our two setups. Most notably, unlike his approach, I decided to defer installing Kubernetes until later in the process. Other than using 5 vs. 8 Pis, and using a Raspberry Pi 4B for the Pi Router, our clusters are identical. The next subsections cover setting up each Raspberry Pi in the cluster.

Parts list and rationale

I pretty much followed his parts list with the exception of the Raspberry Pi 4B for the Pi Router. I also used this host as the kubemaster host. I initially tried using a Raspberry Pi 3B+ for this, but it was too underpowered to perform the role of kubemaster.

I did use a keyboard and monitor during the initial setup of all Pis except the Pi router (you can SSH to it from the external network). You’ll need these as the instructions assume you won’t initially be logging on to the non-router Raspberry Pis via SSH. You can get by without a keyboard and monitor if you use the Pi router to perform all admin activities. It may take a little work to determine which IP address belongs to which Pi, though. You’ll also have to disable Wi-Fi access on the non-router nodes later in the process.

Install Raspbian Stretch

I had some trouble using Etcher so I chose a different way to burn the OS image onto the SD cards. I used the native Mac command-line tools for accomplishing this, as documented on the Raspberry Pi image installation page. I found this approach easy to use. I also installed Buster Lite instead of Stretch Lite on the Pi 4B. The Pi 3Bs were set up with Stretch Lite as Buster Lite came out after I had an initial cluster set up.

As I was using a 64GB SD card on the Pi 4B, I had to take a different approach for formatting the card. SD cards over 32GB have to be formatted differently. I followed the instructions at How to Format SD Card to FAT32 for Pi 3 by “Moses” to accomplish this.

I also chose not to perform a “headless” installation, and instead connected a keyboard and monitor directly to the Pis and configured them after initial boot-up. I did this because, for the most part, I didn’t want any of the Pis to connect to my wireless network. I didn’t want this to happen because I wanted to keep the Pis on a separate, somewhat protected, network managed by my Pi Router host. The exception to this rule is the Raspberry Pi I intend to configure as a router. As it will live on both networks, it will have an address on my wireless network. For this host, the wireless network address is statically assigned.

Configure hostname and change password

I used different hostnames than those described in the article. See the cluster topology section above for details.

Disable swap

One key item that wasn’t mentioned in Tim’s article is absolutely critical when installing Kubernetes on the cluster. In addition to the steps listed in Tim’s article, you’ll also need to disable swap on every node if you’re going to proceed to installing Kubernetes.
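The original commands for this step didn't survive in this copy of the article, so I won't guess at them. On Raspbian, swap is normally managed by the dphys-swapfile service, so turning it off and disabling that service is the usual route (that's my assumption, not something taken from Tim's article). Whatever method you use, here's a minimal sketch for checking the result. It parses the contents of /proc/swaps, which lists one active swap area per line below a header. The function name is mine and purely illustrative.

```python
from pathlib import Path
from typing import List


def active_swap_devices(proc_swaps_text: str) -> List[str]:
    """Return the swap areas listed in the text of /proc/swaps.

    The first line of /proc/swaps is a header; every following
    non-empty line describes one active swap area, and its first
    column is the file or device name.
    """
    lines = proc_swaps_text.strip().splitlines()
    return [line.split()[0] for line in lines[1:] if line.strip()]


if __name__ == "__main__":
    swaps_file = Path("/proc/swaps")
    if swaps_file.exists():
        devices = active_swap_devices(swaps_file.read_text())
        if devices:
            print("Swap is still active on:", ", ".join(devices))
        else:
            print("No active swap found.")
    else:
        print("/proc/swaps not found (not a Linux host?)")
```

Run it on each node after disabling swap. An empty list means there is no active swap area left.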

Give clustered Pis static IP addresses

I chose to defer this step until I had the Pi Router set up with DHCP configured to serve static IP addresses to the remaining Pis in the network. This will be covered in the next section.

rak8S Ansible playbook

I skipped this step since I wanted the experience of installing Kubernetes from scratch. To do otherwise would undermine my goal of learning more about how Kubernetes works, and for me, that includes the installation process. Installing Kubernetes will be covered in a later article.

3. Configure the Network

I stated earlier that I needed to provide a way to separate my Kubernetes Pi cluster (cluster network) from my home network (external network). This can be accomplished using a router. This type of router is also sometimes called a gateway, jump host, jump box, or jump server. This capability accomplishes several things. It provides:

  • a way to configure a group of hosts into their own private network
  • a DHCP server for the cluster
  • some isolation of those hosts from other networks. (In my case, this router isn’t particularly secure, but it does provide the rudimentary capability I was looking for.)
  • a secure way for hosts on an external network to gain access to hosts on the cluster network (i.e., via a reverse tunnel).
  • a way for the hosts in the cluster network to gain access to the Internet. This capability is needed for basic administration tasks such as installing software.

Tim Downey has another excellent article on configuring a Raspberry Pi cluster titled Baking a Pi Router for My Raspberry Pi Kubernetes Cluster. As with the previous section, I’ll present an outline of his article highlighting any differences between our two setups. I’ll also explain some details described in his article that I initially didn’t understand.

Equipment

This is already covered in the previous section.

Install Raspbian Linux

This is already covered in the previous section.

Static IP for Pi router on home network

No changes.

Configure Ethernet interface on Pi router

The purpose of this section is to assign a static IP address to the Ethernet network interface facing the cluster network. Only a relatively simple change to the network configuration file is required. In Baking a Pi Router for My Raspberry Pi Kubernetes Cluster, the cluster uses an address space that is already used for my home network, so I chose to use the 192.168.1.x range instead. In my configuration, I kept the defaults for everything except the static IP address assignment block, which I modified to look like this:

Image for post

Taking each line in turn:

  • The interface line — this specifies that the following block pertains to the Ethernet interface
  • The static IP address line — In general, I understood that this line requests that the address of 192.168.1.1 be assigned to this host; however, I wasn’t familiar with the /24 notation. It turns out this notation is part of the Classless Inter-Domain Routing (CIDR) specification. Put simply, the /24 suffix in 192.168.1.1/24 specifies that the first 24 bits (three octets) of the address, i.e., 192.168.1, identify the network. The final octet identifies specific hosts on the subnet. Identifier 1, as in 192.168.1.1, is reserved for this host. 192.168.1.1/24 also means that identifiers 1 through 254 are available for assignment to hosts on the network. Identifier 0 is the network address itself, and identifier 255 is the broadcast address. See the References section below for more information on CIDR.
  • The DNS servers line — These addresses reference Google’s public DNS servers.
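If the CIDR arithmetic is new to you too, it can be checked in a few lines. This is a small sketch using Python's standard ipaddress module. Python isn't part of the cluster setup itself; it's just a convenient calculator here. The only input is the 192.168.1.1/24 address from my configuration.

```python
import ipaddress

# The router's cluster-facing address in CIDR notation.
router = ipaddress.ip_interface("192.168.1.1/24")
network = router.network

# hosts() yields the usable addresses, i.e. it leaves out the
# network address (.0) and the broadcast address (.255).
hosts = list(network.hosts())

print("network address:  ", network.network_address)
print("netmask:          ", network.netmask)
print("broadcast address:", network.broadcast_address)
print("first usable host:", hosts[0])
print("last usable host: ", hosts[-1])
print("usable host count:", len(hosts))
```

Changing the suffix (say, to /25) and rerunning is a quick way to see how the prefix length changes the number of hosts a subnet can hold.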

Install dnsmasq

From the dnsmasq website: “Dnsmasq provides network infrastructure for small networks: DNS, DHCP, router advertisement and network boot.”

Since my cluster is using a different address space, I needed to modify the dnsmasq configuration file. My changes are highlighted in red.

Image for post
  • The listen address setting specifies the address the DHCP server will listen on. In this case, I set it to the host’s Ethernet (i.e., wired) network address. Recall this was specified in the Ethernet interface configuration described above.
  • The DHCP range setting specifies that the address pool to be used for assignment is the range of 192.168.1.130 to 192.168.1.200. This range was chosen arbitrarily; it just sounded reasonable. It allows for up to 71 hosts in the network, which for my purposes is plenty.
  • The static host setting is used to assign static IP addresses to specific hosts on the network. To accomplish this, you’ll need the MAC address of each host. For example, b8:27:eb:bf:3a:92 is the MAC address of the host to be assigned the IP address of 192.168.1.130. Obtaining the MAC address can be accomplished as follows:
Image for post

The MAC address is identified by the highlighted entry. This needs to be run on each host that will have a static IP address assigned via DHCP.
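The configuration screenshot is missing from this copy, so here is a rough sketch of how the three settings fit together. It uses the values quoted above and checks the "71 hosts" arithmetic. The option names listen-address, dhcp-range, and dhcp-host are standard dnsmasq options, but that they match the original file line for line is my assumption. The helper names are illustrative only.

```python
import ipaddress
import re
from typing import Dict, List

# Six colon-separated pairs of hex digits, e.g. b8:27:eb:bf:3a:92
MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def pool_size(first: str, last: str) -> int:
    """Number of addresses in an inclusive DHCP range."""
    return int(ipaddress.ip_address(last)) - int(ipaddress.ip_address(first)) + 1


def dnsmasq_lines(listen: str, first: str, last: str,
                  static_leases: Dict[str, str]) -> List[str]:
    """Build dnsmasq settings: listen address, pool, and MAC-to-IP leases."""
    lines = [f"listen-address={listen}", f"dhcp-range={first},{last}"]
    for mac, ip in static_leases.items():
        mac = mac.lower()
        if not MAC_RE.match(mac):
            raise ValueError(f"not a MAC address: {mac}")
        ipaddress.ip_address(ip)  # raises ValueError on a malformed IP
        lines.append(f"dhcp-host={mac},{ip}")
    return lines


config_lines = dnsmasq_lines(
    listen="192.168.1.1",
    first="192.168.1.130",
    last="192.168.1.200",
    static_leases={"b8:27:eb:bf:3a:92": "192.168.1.130"},
)
print("\n".join(config_lines))
print("pool size:", pool_size("192.168.1.130", "192.168.1.200"))
```

Each additional node gets one more entry in static_leases, keyed by the MAC address collected on that node.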

Forward internet from Wi-Fi to Ethernet

This step is needed to allow internet access from the cluster network. I used this section as-is.

Test it all out

With the exception of SSH-ing into each machine by hostname (I used IP addresses), I used this section as-is.

What didn’t work

Since all the above sections warned of issues along with specific recommendations for how to avoid them, my installation just worked.

Concluding remarks

I have the same advice. These exact steps may not work due to OS and hardware upgrades, but it should be relatively easy to find solutions to any problems you may encounter.

4. Set Up Keyless SSH Access Between Hosts

When this section is completed, you’ll be able to SSH freely between the hosts in your cluster, and from selected hosts in the external network, with ease. This section has two subsections:

  • Set up keyless SSH access
  • Set up a reverse-SSH tunnel

Set up keyless SSH access

Some of the information in this section was taken from “How to configure passwordless login in Mac OS X and Linux.” It is very important to note that these steps are directional, i.e., they give keyless access from a specific host to a remote host — for example, from the cluster host that you are performing these steps on to a remote host.

The public key of a host’s user is copied to a remote host. During the copy process, you will be required to complete the normal authentication process to the remote host. After copying has completed, the public key that was copied is considered authenticated. For more information on how the public/private keys are used during keyless access, see “Understanding the SSH Connection and Encryption Process.”

From the host permitting access, the steps are as follows:

  1. Run ssh-keygen with the RSA key type — This will generate a private/public key pair. If you already have a public/private key pair, you can skip this step.

The key pair is used to provide mutual authentication between the hosts involved in the SSH session. The key type option specifies that the generated key type is RSA.

There will be several prompts. The first will be the location/name of the generated pair. The defaults for an RSA key are ~/.ssh/id_rsa (the private key) and ~/.ssh/id_rsa.pub (the public key). If you change these names, substitute the names you used wherever the defaults appear.

The next prompts will be for the key’s passphrase. Take the defaults (an empty passphrase) if you don’t want to have to enter a password every time you SSH into the remote host (the point of this section is to avoid entering a password). Note: Empty passphrases are considered a security risk, but in some cases, like this one, they are acceptable.

The gist below shows an example of generating the key pair:

Creating a public/private key pair

  2. Run ssh-copy-id, naming the remote host — This will copy the public key to the remote host.

In our cluster, you will be able to SSH from a host in the cluster network to a host in the external network, but you won’t be able to SSH from a host in the external network to a host in the cluster network yet. The next step, “Set up a reverse-SSH tunnel,” enables this.

Repeat the above steps for each host in the network that you want to enable keyless access for. Note that Remote Host can also be in the cluster network.

Here’s a pictorial illustration of the above:

Image for post

Set up a reverse-SSH tunnel

Having a reverse-SSH tunnel from one or more cluster hosts to a host in the external network is incredibly useful, almost mandatory. I do all my cluster configuration from my MacBook. It’s possible to do this from the Pi router, but nowhere near as convenient, especially if you want multiple SSH windows into the cluster.

The previous step, Set up keyless SSH access, is a prerequisite for this step. You’ll have to have already created the keys and copied them from the permitting hosts to the remote host(s) on the external network. The permitting hosts are the hosts in my cluster; the host on the external network is my MacBook.

Some of the information below was taken from “Start AutoSSH on Boot.”

The steps are as follows:

  1. Install autossh. autossh is a tool that will set up a configured reverse-SSH tunnel. autossh will also monitor the tunnel and re-establish it if it fails. This step installs autossh onto the Raspberry Pi. It’s possible to configure autossh to set up the tunnel at boot time. This is what we’ll do next.
  2. Edit the file that runs at boot, adding the following line (with changes as appropriate):

Here’s what the above line specifies:

  • The -o flag is used to specify configuration options that have no command line analog. The host key checking option passed with it directs ssh to ignore the known_hosts file, i.e., connect even if the host isn’t in known_hosts. See the Debian ssh_config and ssh man pages for more details.
  • The -i flag tells ssh where to find the key file.
  • The -f flag directs ssh to run in the background.
  • The -N flag tells ssh that there is no command to be run on the remote host, only to set up the tunnel.
  • The -R flag is the forwarding specification. In the above command, the first field of the specification is the port on the remote host to use for the tunnel. The final field, port 22, instructs ssh to use port 22 to terminate the connection on this host.
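The command line itself is missing from this copy of the article, so here is a hedged sketch of how those pieces fit together. It's written as a small Python helper that assembles the argument list. The flags are standard ssh options that autossh passes through. The tunnel port 2222, the remote user@external-host, and the key path are hypothetical placeholders. StrictHostKeyChecking=no is my assumption about which host key option was used, and any monitoring option from the original line is not recoverable and is left out.

```python
import shlex
from typing import List


def reverse_tunnel_command(tunnel_port: int, remote: str, key_file: str,
                           local_ssh_port: int = 22) -> List[str]:
    """Build the argument list for a reverse tunnel opened from a cluster host."""
    if not 1024 <= tunnel_port <= 65535:
        raise ValueError("pick an unprivileged tunnel port (1024-65535)")
    return [
        "autossh",
        "-o", "StrictHostKeyChecking=no",  # assumed option: skip the known_hosts check
        "-i", key_file,                     # private key used to authenticate
        "-f",                               # run in the background
        "-N",                               # no remote command, tunnel only
        # remote tunnel port -> the SSH port on this (cluster) host
        "-R", f"{tunnel_port}:localhost:{local_ssh_port}",
        remote,
    ]


# Hypothetical values, for illustration only.
command = reverse_tunnel_command(2222, "user@external-host", "/home/pi/.ssh/id_rsa")
print(" ".join(shlex.quote(arg) for arg in command))
```

The part to look at is the -R value: the first number is the port opened on the external host, and localhost:22 is where connections to that port end up, namely the SSH server on the cluster host that started the tunnel.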

On my cluster hosts, I use the following:

The tunnel port is arbitrary. It can be any unused port on the remote host, usually one outside the privileged range (i.e., 1024 or higher). Keep note of it, as it will be needed in a moment. Reboot the host to ensure autossh starts up as expected on boot. To verify the tunnel works as expected, on the remote host (in my case, my MacBook), run:

The port option specifies the port used by the reverse tunnel. (Note: localhost is used because we’re using the tunnel port on the current host, local host, to connect to the remote host via the tunnel.)

You’ll be prompted for a password. What? The point of this is to be able to SSH without a password! Well, we haven’t yet copied the public key of the host in the external network to the host in the cluster network. We couldn’t do this since we didn’t have connectivity between the two networks yet. The reverse tunnel allows us to do that. So now we can enable keyless access from the external host to the cluster network:

The command must use the tunnel port chosen earlier in this section:

Now SSH into the cluster host. If all went well, you won’t be prompted for a password.

Repeat these steps for each cluster host for which you want to enable a reverse tunnel. You’ll need to choose a different tunnel port for each cluster host. This is because the host on the external network (e.g., my MacBook) needs to have a unique tunnel port number for each cluster host it will SSH into. In my configuration, I use a contiguous range of ports from my MacBook, one per cluster host. So for example:

  • On the first cluster host, I specify the first port in the range in the command. On the second, I specify the next port. On the third, the one after that. And so on.

One other thing I did to simplify my life was to configure my SSH config file to allow me to use shortcuts to SSH into other hosts. Specifically, it provides a shorthand for SSH’ing over a reverse tunnel. Here’s my file:

Taking the first entry, here’s what each line means:

  • Host — This is the alias for the specification that follows.
  • HostName — This is the IP address for the local host to connect to.
  • Port — This is the port number on the remote host. Typically, this is port 22, the standard SSH port and the port we specified in the command above.
  • User — This is the user to use for the login.
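My actual file didn't survive in this copy, so here is a minimal sketch of what entries for the tunneled hosts could look like. It also covers the one-unique-port-per-host rule from above. The host aliases (worker1 and so on), the starting port 2222, and the pi user are hypothetical. The Host, HostName, Port, and User keywords are the standard ones from the SSH config format. For a tunneled host, HostName is localhost and Port is that host's tunnel port.

```python
from typing import Dict, List


def assign_tunnel_ports(hosts: List[str], first_port: int) -> Dict[str, int]:
    """Give every cluster host its own tunnel port, counting up from first_port."""
    return {host: first_port + offset for offset, host in enumerate(hosts)}


def ssh_config_entries(ports: Dict[str, int], user: str) -> str:
    """Render one SSH config entry per tunneled host."""
    entries = []
    for host, port in ports.items():
        entries.append(
            f"Host {host}\n"
            f"    HostName localhost\n"  # the tunnel ends on the local machine
            f"    Port {port}\n"         # this host's tunnel port
            f"    User {user}\n"
        )
    return "\n".join(entries)


# Hypothetical host aliases and starting port, for illustration only.
ports = assign_tunnel_ports(["worker1", "worker2", "worker3"], 2222)
print(ssh_config_entries(ports, "pi"))
```

With an entry like that in place, typing the alias after ssh has the same effect as spelling out the port, user, and localhost by hand.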

Here’s how you use it as a substitute for the full SSH command:

That’s it.

5. Set Up i2cssh

This step will enable you to have terminal sessions open to several hosts at the same time. With i2cssh, you can choose to have commands you type replicated to every terminal window, or you can isolate the focus to a single terminal window. Setting up i2cssh isn’t strictly necessary, but I think you’ll find it useful.

On my MacBook, I’m using iTerm2 with i2cssh. iTerm2 can be downloaded from the website and installed like any Mac application. To install i2cssh:

The above command assumes Ruby is already installed on your Mac, which it should be. You can use Homebrew to install it if it isn’t. If you’re not familiar with Homebrew, you should take the time to get to know it. It’s pretty much required for developing software on a Mac.

To start an i2cssh session, using the SSH config we set up previously, type:

Here’s what each item in the above command means:

  • The machines option — specifies a comma-separated list of hosts to connect to
  • The profile option — specifies which iTerm2 profile to use. In this case, I use a dedicated iTerm2 profile. The most prominent feature of this profile is that it has a yellow background as shown in the picture below.
  • The columns option — specifies the number of columns to use for the terminal display. The above command opens sessions to five hosts, indicating I chose to have a column per host.
  • The broadcast option — specifies to start the sessions “broadcasting” input from the keyboard. This results in a command being entered into all terminal sessions simultaneously. “Broadcast” mode can be toggled on/off.

This (very small) image shows the resulting sessions. It also shows the results of running a command simultaneously, in “broadcast” mode, in each terminal session.

Image for post

Since I don’t like typing, I set up a shell alias for this:

Conclusion

Whew, that’s a lot to cover. But we now have a fully configured Raspberry Pi cluster. We accomplished the following:

  1. Performed the initial setup of each Raspberry Pi. This included installing the OS and setting the hostname.
  2. Configured the network. This included setting up DHCP so the cluster nodes would get static IP addresses assigned by the Pi router (kubemaster in this article).
  3. Set up keyless access to all nodes, including a reverse-SSH tunnel between hosts in the cluster network and a host(s) in the external network.
  4. Installed and set up i2cssh, a terminal multiplexer.

References

These are the primary references that I used for the initial setup of the Raspberry Pi hosts, as well as for configuring these hosts into a cluster.

These references have additional material covering the initial configuration of a Raspberry Pi:

These references pertain to the network aspect of setting up the cluster:

  • Configuring a Raspberry Pi as a wireless access point — This document has good background information useful when setting up a Pi as a router.
  • Building a Raspberry Pi Kubernetes Cluster — Part 1 — Routing — This article is very similar to the Tim Downey article I used as the basis for this article. There is some additional information, however. It’s also part of a series that has some overlap with this series.
  • Basic CIDR overview from Wikipedia — This is a good overview of Classless Inter-Domain Routing.
  • Network configuration — General networking configuration documentation, which includes DHCP configuration. This was useful for me to better understand network configuration, as well as the contents of the file.
  • DHCP Overview — This is a good overview of DHCP. I especially liked the part about the discovery and assignment message exchange, as I had some trouble with address assignment not working. This helped me troubleshoot my problem (which I traced to my Ethernet hub; restarting it resolved the issue).
  • How to locate the DHCP Server — This reference describes several techniques for locating the DHCP server on the network. I found this useful in troubleshooting (as I indicated in the previous bullet). For Raspbian, this command shows the message exchange between the client and the server for DHCP discovery and assignment. In my particular case, the request failed. As a side note, the message is broadcast to the entire subnet. Per the standard, the DHCP server will be listening on port 67, so only the host listening on that port will respond. Note in the exchange below, as expected, my DHCP server with the IP address of 192.168.1.1 responded as follows:
Image for post

These references cover keyless SSH access and reverse-SSH tunneling:

  • How to configure passwordless login in Mac OS X and Linux — This covers the information needed to be able to SSH into a host without providing a password (keyless access).
  • Start AutoSSH on Boot — This covers the information needed to automatically set up an SSH reverse-tunnel on boot.
  • SSH jump host — This describes an alternative to setting up a reverse-SSH tunnel to access hosts in another subnet. I didn’t choose to do this in my cluster, but it is an option. One of the pros of this approach is it’s relatively simple when compared to the approach I described. The pro with my approach is that it could be argued that it’s more secure.
  • Simplify Your Life With an SSH Config File — This covers the various ways you can customize your SSH config file. I used this to configure shortcuts for my SSH commands.
  • i2cssh and iTerm2

Better Programming

Advice for programmers.

Thanks to Zack Shapiro

Richard Youngkin

Written by

Developing distributed apps since 1991 | youngkin.github.io
