Poor Man’s Proxmox Cluster

Published in

Moqume Blog

8 min readNov 16, 2013

I had written this elsewhere before, but thought I would share it on my own site as well. The idea here is to create a Proxmox VE cluster with limited resources, in particular a lack of a private network / VLAN. We address this by creating a virtual private network using a lightweight VPN provider, namely Tinc.

You could use something else, like OpenVPN or IPSEC. The former is a bit on the heavy side for the task, whilst the latter may not have all the features we need. Specifically, Tinc allows us to create an auto-meshing network, packet switching and use multicast. Multicast will be needed to create a Proxmox VE cluster, whilst the virtual switching ensures packets will eventually be routed to the right server and VM.

Create an additional vmbr

By default there should already be a vmbr0 bridge for Proxmox. We will need to create — or modify — an additional vmbr, which in this example we name vmbr1.

Warning: on many systems, vmbr0 bridge is used to make your server accessible over the public network — so do not edit that unless absolutely required!

You also need to think of what private IP block you would like to use, and assign each Proxmox VE server an IP from within that private IP block. For example, I use the IP range 192.168.14.0/23 (which is 192.168.14.1–192.168.15.254 and a netmask of 255.255.254.0). The 192.168.15.x range I assign to the Proxmox VE servers, whereas the 192.168.14.x range I assign to containers / VMs. Using that IP range, you would change the file as following:

# for Routing
auto vmbr1
iface vmbr1 inet static
    address 192.168.15.20/23
    bridge_ports dummy0
    bridge_stp off
    bridge_fd 0

You can force the changes using:

ifdown vmbr1 && ifup vmbr1

You will need to do this on each server, taking care to select a different IP address. Keep it simple, start at 192.168.15.1, and increment the last number for each new server.

Tinc

The next step would be installing Tinc and configuring it in such a way that Proxmox VE can use multicast over that virtual private network.

So on the server, install Tinc with:

apt-get install tinc -y

Next, create a directory where the configuration for the VPN will reside (you can have multiple configurations as such):

mkdir -p /etc/tinc/vpn/hosts

Next, we create a basic configuration, which tells Tinc to use a “switch” mode and what this server’s “name” is. For sake of simplicity, use the hostname for the “name” (use uname -n to determine it):

cat > /etc/tinc/vpn/tinc.conf <<EOF
Name = server1
AddressFamily = ipv4
Device = /dev/net/tun
Mode = switch
ConnectTo = 
EOF

The “ConnectTo” is currently left blank, but will be important once you have setup the other servers. More on this later.

Then we create a server-specific configuration. Note that the filename is the same as specified in “Name =” above.

cat > /etc/tinc/vpn/hosts/server1 <<EOF
Address = 123.4.5.6
Port = 655
Compression = 0
EOF

Obviously you should replace the “Address” line with the actual public IP address of your server.

Now we need to create a public/private key. The private key will remain exactly that: private. The public key will be appended to the file we just created (/etc/tinc/vpn/hosts/server1), which will eventually be distributed to the other servers.

tincd -n vpn -K4096

It will ask you to confirm two file locations. The default should be correct (with the last/2nd one the file as mentioned above).

Now we need an up/down script, to do some post configuration of the network when the VPN comes up (or goes away). This is a simple copy & paste, provided you have setup vmbr1 as outlined earlier:

cat > /etc/tinc/vpn/tinc-up <<EOF
#!/bin/bash# Attach the 'vpn' interface to vmbr1
/sbin/ifconfig vpn up
/sbin/brctl addif vmbr1 vpn# Set a multicast route over vmbr1
/sbin/route add -net 224.0.0.0 netmask 240.0.0.0 dev vmbr1# To allow VMs on a private IP to access the Internet (via vmbr0):
/sbin/iptables -t nat -A POSTROUTING -o vmbr0 -j MASQUERADE# To allow IP forwarding:
echo 1 > /proc/sys/net/ipv4/ip_forward# To limit the chance of Corosync Totem re-transmission issues:
echo 0 > /sys/devices/virtual/net/vmbr1/bridge/multicast_snooping
EOFcat > /etc/tinc/vpn/tinc-down <<EOF
#!/bin/bash
/sbin/route del -net 224.0.0.0 netmask 240.0.0.0 dev vmbr1
/sbin/brctl delif vmbr1 vpn
/sbin/ifconfig vpn down
echo 0 > /proc/sys/net/ipv4/ip_forward
EOFchmod +x /etc/tinc/vpn/tinc-up
chmod +x /etc/tinc/vpn/tinc-down

What the above does, is add the VPN tunnel to the vmbr1 bridge. Furthermore, it allows multicast messages over vmbr1. It also sets the use of masquerading, to allow a VM on a private IP to communicate successfully with the outside world — it will use the IP address of vmbr0 to do so.
Then, you need to tell Tinc that the contents in the “vpn” sub-directory should be started whenever it starts:

echo "vpn" >> /etc/tinc/nets.boot

You will need to do this on each server that needs to be part of the VPN. In addition, the files within the directory /etc/tinc/vpn/hosts/ needs to be distributed to all servers (so that all servers have the files from the other servers). Its simple enough to script this, if you want to go that route, but that’s beyond the scope here.

As mentioned earlier, you will need to edit the /etc/tinc/vpn/tinc.conf and provide the name of another server in the “ConnectTo” setting that was previously left blank. Which server you chose is entirely up to you, and you could chose a different one for each server — remember that Tinc is auto-meshing, so it will connect all servers over time.

Note: without making that change to /etc/tinc/vpn/tinc.conf, Tinc will not know what to do so you will not have a working VPN as a result.

Once you have edited the configuration as outlined, (re)start Tinc using the following command:

service tinc restart

And test your network by pinging another node on its private IP, ie:

ping -c3 192.168.15.32

Note I use the “-c3” here, to limit the amount of pings. If the VPN was not configured correctly, or a firewall is interfering, you may otherwise end up with a large number of “Host or destination is unreachable” errors.

Forcing the private IP address

We need to force Proxmox VE, or more specifically Corosync, to use the private IP addresses rather than the public IP address. This because the multicast needs to be done over our virtual private network.

The easiest, but also the “dirtiest” method is to simply change the /etc/hosts, which I will outline here.

The first step is to ensure that the /etc/hosts file is read before attempting to do a full DNS lookup:

cat > /etc/host.conf <<EOF
order hosts, bind
multi on
EOF

Next edit the /etc/hosts file, by commenting out the original line, and adding our own:

…
# Original:
#123.4.5.6 server1.myprovider.com server1
# Ours:
192.168.15.20 server1.myprovider.com server1

Make sure that the private IP address matches the one you assigned to vmbr1 (double check with ifconfig vmbr1).

Again, this is a “dirty” method and you may want to use your own DNS server instead that resolves IPs for a local network (say, “server1.servers.localnet”).

At this stage, reboot the server to ensure the changes get picked up and everything works as expected (that is, your server comes back up online — hmm!).

Create the cluster

If you do not yet have a cluster configured, you need to create one first. So pick a particular server that you would like to consider as a “main server” and perform the following:

pvecm create <arbitrary-name>

Where <arbitrary-name> is something of your own choosing. Keep the name short and simple, without spaces or other funny characters.

The “main server” is a loose term really, as any server within the cluster can manage other servers. But use it as a consistent starting point for adding other servers to the cluster.

You can check if things are working correctly with:

~# pvecm status
…
Node name: server1
Node ID: 1
…
Node addresses: 192.168.15.20

In particular, you’d want to make sure that the “Node addresses:” portion is the private IP address as on vmbr1.

Adding servers to the cluster

Adding a server (node) to the cluster will need a little preparation. Specifically, because we use private IP addresses for the cluster, we need to force other nodes to do the same when trying to contact another node. In other words, if server1 wants to contact server2, it should use the 192.x range instead of the public IP address.

So, based on the above example, on server1 we need to add a line to the /etc/hosts like this:

cat >> /etc/hosts <<EOF
192.168.15.21 server2.myprovider.com server2
EOF

Note the double “>>” brackets. If you use a single “>” one, you overwrite the entire file with just that line. You’ve been warned.

And on server2, we need to make sure server1 can be contacted using its private IP as well, so on that server, we perform:

cat >> /etc/hosts <<EOF
192.168.15.20 server1.myprovider.com server1
EOF

All of this can be made much fancier with your own DNS server and bindings, but again, this is beyond the scope and goes on the assumption you don’t mind doing this for the 2, 5 or 10 servers or so you may have. If you have a few hundred, then I wouldn’t expect you to be looking at a “Poor Man’s” setup.

On the server that you will be adding to the cluster, make sure that you can successfully ping that private IP address of the “main server”.
If tested OK, then still on that server (thus the one that isn’t yet part of the cluster), type:

pvecm add server1

Where “server1” is the “main server” (the one on which you first created the cluster). It will ask you for the root password for SSH for server1, and then does its thing with configuration.

Note: If you have disabled password-based root logins using SSH, you may have to temporarily enable it. Using SSH keys would be a preferred method over passwords.

After this has been done, the node should automatically on your web-based GUI and can be verified from the CLI using:

pvecm nodes

If the nodes show up in the “pvecm nodes” command and GUI, then you have successfully created the cluster.

Note: A note about a 2-node cluster and quorum can be found here.

Containers and VMs

You can now create containers and VMs that can be migrated between the nodes.

You can either assign the private IP address directly (venet, only on OpenVZ containers) or as a network device (veth) attached to vmbr1.

The private IP address should be within the range of your specified netmask on vmbr1. So going by the above example of using 192.168.14.0/23, that’s anything between 192.168.14.1 and 192.168.15.254. Make sure the IP isn’t already used by another VM or a node (see initial notes, re 192.168.14.x for VMs).

If you fire up the VM, its private IP address should be ping-able from any server, and from within the container / VM, you can ping any private as well as public IP address (the latter thanks to masquerading configured with the tinc-up script). If this is not the case, the network configuration was not done correctly.

Final notes

You should now have at least one container / VM with a private IP address. Its good and well if this VM doesn’t need to be accessed from the outside world, but if you want to give it such access, you will need to use NAT on the server. This will instruct the node that incoming traffic on a particular port will need to be forwarded to a particular VM/

For example, TCP port 25 on 123.4.5.6 is forwarded to VM on IP 192.168.14.1:

iptables -A FORWARD -p tcp -i vmbr0 -d 192.168.14.1 — dport 25 -m state — state NEW,ESTABLISHED,RELATED -j ACCEPTiptables -t nat -A PREROUTING -i vmbr0 -p tcp — dport 25 -j DNAT — to-destination 192.168.14.1:25

Note that this is just a simple guide to help you get started. More importantly, it doesn’t include any basic security measures such as a firewall (there are other articles about a firewall for Proxmox on this site [here and here], which I will update when I can).