Polkadot Hello World #4: Security and Availability Aspects of Your Validator Node

This is the fourth article of the series; the earlier articles have already been published on Medium.

In this article, I will address two topics: the security and the availability of your Validator Node. I’m aware that the tips presented only cover the tip of the “security and availability” iceberg for a PoS validator node. Nevertheless, I find them useful for bringing a minimal level of security and availability to your test instance.

Photomontage of an iceberg ( CC by SA-3.0)

Protecting Your Validator Node

Exposing blockchain or crypto services on the internet will always attract attackers who try to break into your system, so it is better to be prepared and take measures that reduce the risk of being compromised.

While running my Polkadot Validator, I observed a large number of attackers trying to brute-force my SSH password.

One of the first things I did was to install SSHGuard. So what is SSHGuard? As Andrew Schartzmeyer describes in his blog:

SSHGuard monitors servers from their logging activity. When logs convey that someone is doing a Bad Thing, sshguard reacts by blocking he/she/it for a bit. Sshguard has a touchy personality: when a naughty tyke insists disturbing your host, it reacts firmer and firmer.
Now the nice thing about SSHGuard is that, despite its name, it protects quite a different services, and I now use it for SSH, Dovecot, and vsftpd. Unfortunately, the documentation makes it seem that these will be set up automatically, but that is only the case for SSH.

In our case, we use it to protect our SSH port, so the installation is quite straightforward. By default, it will start checking /var/log/auth.log on an Ubuntu server (the file in which SSH login attempts, and thus the attacks, are logged).
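To get a feeling for what SSHGuard sees in that log, the failed attempts can be counted per source address. The following is only a small sketch: the function name failed_ssh_by_ip is made up for this example, and it assumes OpenSSH's usual "Failed password for ... from <ip> port <n>" wording in the log.

```shell
#!/bin/bash
# Sketch: count failed SSH password attempts per source IP.
# Assumption: log lines use OpenSSH's usual wording
#   "Failed password for <user> from <ip> port <n> ssh2"
# failed_ssh_by_ip is a hypothetical helper name, not part of SSHGuard.
failed_ssh_by_ip() {
  grep 'Failed password' |
    awk '{ for (i = 1; i < NF; i++) if ($i == "from") { print $(i + 1); break } }' |
    sort | uniq -c | sort -rn
}

# Usage on an Ubuntu server (reading the log usually requires root):
#   failed_ssh_by_ip < /var/log/auth.log | head
```

The output lists one line per address, with the number of failed attempts first and the most active address at the top.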

On an Ubuntu server, run the following commands to install SSHGuard, which will use iptables as its system firewall. iptables is a program used to configure and manage the kernel’s netfilter modules.

apt-get update
apt-get install -y sshguard

When sshguard blocks malicious users (by blocking their IP addresses), it uses the sshguard chain. Create the chain and make sure it is also consulted when new incoming connections are detected. Restart sshguard at the end.

iptables -N sshguard
ip6tables -N sshguard
iptables -A INPUT -j sshguard
ip6tables -A INPUT -j sshguard
service sshguard restart

In my case, SSHGuard started to block an attacker after 4 unsuccessful login attempts; it will increase the block time gradually should the attacker continue.

In order to get an overview of the blocked IP addresses run the following command:

iptables -nvL sshguard
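If only the blocked addresses are of interest, the listing can be filtered. This is a minimal sketch: blocked_ips is a hypothetical helper name, and it assumes that the entries in the chain use the DROP target and that the column layout is the one printed by the -nvL options.

```shell
#!/bin/bash
# Sketch: print only the source addresses of DROP rules.
# Assumption: input has the `iptables -nvL` column layout
#   pkts bytes target prot opt in out source destination
# so the target is field 3 and the source address is field 8.
blocked_ips() {
  awk '$3 == "DROP" { print $8 }'
}

# Usage (as root):
#   iptables -nvL sshguard | blocked_ips
```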

With this easy setup, your sensitive sshd port is protected against simple brute-force attackers trying to break into your system.

Inbound Ports of the Polkadot Validator

By the way, which ports are required by a Polkadot Validator? There are three inbound ports:

  • 30333 port for the peer2peer protocol
  • 9933 for RPC
  • 9944 for the WebSocket (WS) communication

Ideally, your validator node would expose just these three ports as well as the sshd port that allows you to log in to the system.
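Such an allow-list could be expressed with iptables roughly as follows. This is a sketch under assumptions, not a tested production rule set: it assumes sshd listens on the default port 22, the function name inbound_rules is made up, and the script deliberately only prints the commands instead of applying them, because a wrong rule order can lock you out of your own server.

```shell
#!/bin/bash
# Sketch: print (not apply) an inbound allow-list for a validator node.
# Assumptions: sshd on default port 22; ports 30333, 9933 and 9944 as listed above.
ALLOWED_TCP_PORTS="22 30333 9933 9944"

# inbound_rules is a hypothetical helper that emits one iptables command per line.
inbound_rules() {
  # Keep loopback traffic and replies to connections the node opened itself.
  echo "iptables -A INPUT -i lo -j ACCEPT"
  echo "iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT"
  for port in $ALLOWED_TCP_PORTS; do
    echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
  done
  # Everything else that arrives is dropped by the chain policy.
  echo "iptables -P INPUT DROP"
}

inbound_rules
```

Review the printed commands before running any of them as root. Since rules are evaluated in order, the jump to the sshguard chain has to sit in front of the rule that accepts port 22, otherwise blocked addresses would still be let through.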

Outbound Ports of the Polkadot Validator

Now some thoughts about protecting the outbound ports of a Validator as well. As described in this security.stackexchange thread:

… incoming traffic blocking can only prevent unsolicited traffic from reaching your internal network. However, if you get malware on an internal machine (via running an untrusted executable, or through an exploit) you can still be hit.
Blocking outgoing traffic helps limit the damage, by preventing the malware from connecting to a command & control server or exfiltrating data.

So for a production Validator node this may be considered vital, but be aware:

… the idea of outbound filtering would seem a natural course in a high-security environment. However, it is a very large and complex undertaking…

To illustrate this, I ran a quick exercise analyzing the communication pattern of a Validator node.

For that, I used the excellent tool Wireshark.

“Wireshark is the world’s foremost and widely-used network protocol analyzer. It lets you see what’s happening on your network at a microscopic level and is the de facto (and often de jure) standard across many commercial and non-profit enterprises, government agencies, and educational institutions.”

In order to get the input file for Wireshark, I had to run the tcpdump command on my validator node …

# tcpdump -w dump.out -i venet0 -c 1000 -vvv
tcpdump: listening on venet0, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
1000 packets captured
1125 packets received by filter
0 packets dropped by kernel

… and load it into Wireshark. As the capture shows, there is a plethora of outbound ports in use by the Validator.

This is expected, considering that the Validator uses a peer-to-peer communication scheme to talk to the roughly 45 other Validators in its network.
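The same observation can be made on the command line without Wireshark. The sketch below summarises the destination ports of all packets sent from the node itself. The function name outbound_ports and the address 10.0.0.5 are placeholders for this example, and the parsing assumes the one-line-per-packet text format that tcpdump prints for IPv4 with the -nn and -q options.

```shell
#!/bin/bash
# Sketch: count packets per destination port for traffic sent from a local IP.
# Assumption: input lines look like the output of `tcpdump -nn -q -r dump.out`, e.g.
#   12:00:00.000001 IP 10.0.0.5.30333 > 203.0.113.7.41000: tcp 120
# Only IPv4 lines (second field "IP") are considered.
outbound_ports() {
  local local_ip="$1"
  awk -v ip="$local_ip" '
    $2 == "IP" && index($3, ip ".") == 1 {
      n = split($5, parts, ".")   # destination is "a.b.c.d.port:"
      port = parts[n]
      sub(/:$/, "", port)         # strip the trailing colon
      count[port]++
    }
    END { for (p in count) print count[p], p }
  ' | sort -rn
}

# Usage (10.0.0.5 stands for the address of your validator node):
#   tcpdump -nn -q -r dump.out | outbound_ports 10.0.0.5
```

Each output line holds a packet count followed by a destination port, with the most used port first.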

Protecting the outbound ports as well would require detailed know-how of how the underlying P2P library works, which is out of scope for this article (which looks at securing a test node).

Nevertheless, it’s worth mentioning that the Payment Card Industry Data Security Standard asks exactly that of organisations which process credit card payments (also some kind of validators).

PCI DSS Requirement 1.2.1 focuses around organizations developing policies and procedures that restrict traffic to that which is absolutely necessary, both inbound and outbound, for business purposes. PCI Requirement 1.2.1 states, “Restrict inbound and outbound traffic to that which is necessary for the cardholder data environment, and specifically deny all other traffic.” The goal of PCI Requirement 1.2.1 is to limit traffic to only essential, required protocols, ports, or services and have the business justification for those required elements.

It will be interesting to see what the minimal requirements for running a Polkadot Validator node in production will be.

So let’s switch to the second topic of this article.

Increase Availability of your Validator Node

I’m running a Validator POC-2 node 24x7 and try to keep the slashing (due to unavailability of my node) to a minimum.

Nevertheless, from time to time the process gets killed (it terminates with a Killed message), which means that it must be started again.

In order to automate this task and reduce the slashing probability to a minimum, I wrote a small cron job script, which gets executed every minute.

The script to be triggered (monitorValidator.sh) checks whether a polkadot process is still running and starts one if there is none:

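#!/bin/bash
# Helper that prints the current time for log entries
timestamp() {
  date +"%Y-%m-%d %T"
}
# Count the listening sockets owned by a polkadot process
a=$(/bin/netstat -tulpn | awk '{print $7}' | grep polkadot | wc -l)
if test "$a" = "0"
then
  echo "$(timestamp): Polkadot Validator Down" >> /var/log/run.log
  /root/.cargo/bin/polkadot --name "$POLKADOT_NAME_POC2" --validator --key "$POLKADOT_KEY_POC2" &>> /var/log/run.log
fi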

To install the crontab entry, execute the following command (the trailing "-" tells crontab on Ubuntu to read the new table from standard input):

(/usr/bin/crontab -l ; echo " * * * * * bash -l -c '/root/monitorValidator.sh > /dev/null 2>&1'") | /usr/bin/crontab -

This results in a check every 60 seconds of whether a validator process is still running; if not, it will be restarted.
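The netstat-based check in the script only notices processes that have a listening socket. As a possible alternative, not what the script above uses, the check can be made directly on the process name with pgrep from the procps package. The helper name is_running is made up for this sketch.

```shell
#!/bin/bash
# Sketch: process-name based liveness check as an alternative to parsing netstat.
# pgrep -x matches the exact process name and exits 0 if at least one process matches.
# is_running is a hypothetical helper name.
is_running() {
  pgrep -x "$1" > /dev/null
}

if is_running polkadot; then
  echo "polkadot: running"
else
  echo "polkadot: not running"
fi
```

Inside monitorValidator.sh, the test on the netstat count could then be replaced by "if ! is_running polkadot", with the restart logic left unchanged.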

Future Outlook and Cosmos Approach

In the public repository used to coordinate the collaboration of teams working in the web3 space, an issue was opened to “Create and run a node cluster service for Polkadot” (https://github.com/w3f/Web3-collaboration/issues/43), which would address the availability issue.

Cosmos, which is more mature than Polkadot, has already addressed some of these topics, amongst others:

  • Provisioning of the Sentry Node Architecture, which is an infrastructure example for DDoS mitigation on Gaia / Cosmos Hub network validator nodes:
    “To mitigate the issue, multiple distributed nodes (sentry nodes) are deployed in cloud environments. With the possibility of easy scaling, it is harder to make an impact on the validator node. New sentry nodes can be brought up during a DDoS attack and using the gossip network they can be integrated into the transaction flow.”
  • A key management service (Tendermint KMS), described as:
    “a lightweight service intended to be deployed alongside the gaiad service (ideally on separate physical hosts) which provides the following:
    High-availability access to validator signing keys
    Double-signing prevention even in the event the validator process is compromised
    Hardware security module storage for validator keys which can survive host compromise”

That’s it for today. At least on the surface of your Validator iceberg, you have taken a first step towards securing your node and increasing its availability.

It will be interesting to watch how Polkadot will evolve over time in the context of security and availability.

Icebear (CC by SA-4)

Next week, we will look in more detail at the slashing process and how you can influence it.

Originally published at dev.cloudburo.net on October 31, 2018.