P2P Networking in NION

4 min readAug 30, 2021

One of the most important limitations of blockchain scalability is the ability to disseminate information across the network quickly and reliably. A good example is propagating new blocks, which can be viewed as a message starting from a random node, and ultimately propagates to the entire network.

One of the simplest yet most widely used algorithm is simply to floods the network with the new message. In order to terminate the flood, nodes keep a relatively short history of messages they already saw. A node will inform it’s neighbours about the message if it has not yet seen it, otherwise discard it. Most practical implementations also have a fan-out factor (number of neighbour nodes to inform) to prevent a denial of service and fine tune the speed vs redundancy.

Flood based algorithms are simple to implement and quite fast. Another way to think about it is, if you take all paths, you also take the shortest path. However, there is a lot of potentially redundant network utilization. This can be measured by how many times do nodes received a message they already knew about. The following is a step by step visualisation of the basic flood based algorithm in action. In this simple example, node A starts to propagate a new block by sending it to two randomly chosen neighbouring nodes. Once all nodes are informed, it is clear that even in small networks such as our example, some nodes (i.e. C) received the same message multiple times. Additionally, the addition of a fanout limitation does not guarantee the fasted path anymore.

Figure 1: Flood based message propagation with fanout =2

In general, it should be quite easy to construct a route that guarantees each node only receives a message once. However, in decentralized systems, a node generally only has a local view of the network instead of a global one. In other words, node A might not even know node F exists.
Luckily, proof of stake based blockchains are guaranteed to know a subset of all nodes, at least those in the validator set contributing to the consensus. Assuming every node knows about the same subset of other nodes, we can construct a binary tree, any binary tree will do as long as all nodes can reconstruct the same tree.

For this algorithm we assume the following

There is a subset S of nodes that form a full graph in the overlay network
Each node can be connected to nodes outside of S
Nodes can resolve Id’s to IP addresses using DHT
Latencies between nodes are comparable for simplicity

Using the message hash as a seed for RNG(random number generator), the originating node constructs a random binary tree of the validating nodes. The tree is effectively stored implicitly in the array, so this step only involves shuffling the array of known nodes for each message.

In order to compute the path, each node must:

pseudo randomly shuffle the array of known nodes with the same seed as other nodes
determine it’s position in the tree
compute the position of the neighbour node

Upon receiving a new message, the node shuffles the array using the message signature to obtain the routing tree. It then identifies it’s position in the array, and send the message to it’s two children as illustrated in Figure 2.

Using strictly the number of messages sent as a criteria, this is optimal. However, in practice, we have to worry about other criteria such as fault tolerance, security, guarantee of delivery, and speed.

Clearly, using a binary tree based routing does not tolerate faults very well, In the worst case scenario, the first child of the root drops, fails to inform it’s children, which causes half of the tree to not receive the message.

We address this issue by adding some cross links in the tree in order to check whether sibling nodes were informed, and take appropriate measures in case a fault has occurred. So with the added cross-links, the algorithm has much higher delivery guarantees. However, there is an additional cost in terms of messages sent in order to check those cross-links for failed nodes. Figure 3 shows the impact on the number of messages sent in networks with nodes randomly failing. We observe that as the number of faulty nodes increases, Tree based routing requires more messages. The tipping point seems to be pretty consistent at around 10%, which in practice is quite rare.

Figure 3: Performance comparison in case of faulty nodes

The following is a visualisation of one of the many simulations done before implementation. Blue nodes have not yet been informed, while the green ones have. Nodes experiencing byzantine behaviour are coloured red.

In this blog post, I will not address the other concerns such as security and un-synced validator sets or improvements such as latency weighted sorting. However Nion addresses them and improves upon the basic idea to achieve low network utilization while maintaining the desired properties of flood based algorithms.

Acknowledgment

I would like to thank Domen Vake for contributing his expertise in developing the simulation.

— Aleksandar

About Nion

Nion is a blockchain based cloud infrastructure with a lightweight protocol that aims to build a decentralized and global network.

Website | Telegram | Twitter | Github

P2P Networking in NION

Acknowledgment

About Nion

Written by Nion Network