Orchid Tech: DHTs and Eclipse Attacks

Gustav Simonsson
May 27, 2018 · 10 min read

The Orchid project is committed to open source and open R&D collaboration around technology to build a new, decentralized Internet. The crypto ecosystem is growing exponentially and, as our vision at Orchid is shared by many projects building for the new, decentralized Internet, we want to continuously share our research so that it can cross-pollinate with other projects in the space.

Forced Connections + Signed Routing

While the blockchain/crypto space today has an intense focus on new consensus algorithms, there is less attention paid to the security of the underlying P2P network.

Bitcoin’s P2P network is unstructured and Ethereum’s P2P network is intentionally not part of Ethereum’s formal specification. There are benefits to unstructured (and unspecified!) P2P layers as it can make it easier for a blockchain protocol to be adapted into other types of lower-level networks, such as mobile, mesh and IoT networks. However, with increased reliance on blockchain networks as critical infrastructure, it’s worth thinking about how we can also improve security and performance at the lower layers of the stack.

Two exciting ideas that came up during our research are “forced connections” and “signed routing” — mechanisms we believe can be used to make Eclipse Attacks much harder to carry out against decentralized P2P networks. This has potential applications not only for the Orchid Network, but also for other decentralized, open networks such as Ethereum, Althea and filecoin.

If you are already familiar with (blockchain) P2P networking, DHTs, and Eclipse Attacks, scroll down to find the algorithm drafts. If not, here is a quick intro:

Eclipse Attacks in Blockchain Networks

An Eclipse Attack is more general than a Sybil Attack and named after how it “eclipses” nodes in order to separate them from other nodes.

A Sybil Attack creates malicious nodes. An Eclipse Attack isolates nodes from benign ones.

In distributed networks such as blockchain networks, each node is connected to a number of other nodes. Those nodes are in turn connected to other nodes connected to further nodes, and so forth. Collectively, all nodes and all their connections form what we call the network, which can be seen as a graph of interconnections.

An Eclipse Attack is typically targeted against one or a few nodes to force them to stay connected only to nodes of the attackers choosing. This can be exploited by an attacker to more effectively double spend a transaction, as seen in “Eclipse Attacks on Bitcoin’s Peer-to-Peer Network” (Heilman, Ethan, et al).

Vulnerabilities allowing for Eclipse Attacks have also been present in the Ethereum network, as well as in many other networks (including, not surprisingly, networks based on Bitcoin or Ethereum code bases).

The root of such vulnerabilities stems from the very nature of these types of distributed networks, in which no node has a global view of the network. In blockchain networks the number of connected nodes can range widely, with defaults of 8 outgoing connections in Bitcoin and 25 in Ethereum.

If an attacker manages to force your node to disconnect from certain nodes and connect to others, then over time the attacker can manipulate exactly who you are connected to. If all your connections end up belonging to the attacker, they can censor any new information in the network, such as transactions or entire blocks. This is then used to effectively double spend or otherwise exploit your now censored “view of the world”.

DHTs

Distributed hash tables (DHTs) are frequently employed in distributed networks. They allow data to be stored across all nodes in the network in such a way that each node stores only part of the global data, but can access any global data by querying other nodes. Unlike centralized databases, DHTs scale very well to any number of nodes and can provide strong resilience in an environment of nodes leaving, joining and acting in malicious ways.

In Bitcoin, the network is unstructured and does not use a DHT. However, the Eclipse attacks on Bitcoin lead to the addition of countermeasures in the Bitcoin Core client, inspired by research on the security of DHTs.

Ethereum uses a Kademlia-like routing which, while lacking direct DHT features, has similarities to the original Kademlia DHT. It does not support directly storing and retrieving arbitrary data, but uses the bucket structure in combination with each node having a public-private key pair used to derive the node id and sign packets.

An excellent resource when attempting to mitigate Eclipse Attacks is this survey of DHT security techniques from 2011. It also discusses defenses against Sybil attacks, which we will reference in a future post.

The paper discusses how the most basic form of defense against Eclipse Attacks is to constrain the node identifiers used in routing tables — assuming that node ids are random and that malicious nodes are spread over the identifier space (further assuming there is some difficulty associated with re-creating a node id).

Several techniques based on such assumptions are discussed, including constrained routing, consensus-based node ids and use of multiple (possibly redundant) routing tables. Due to the incredible growth of the blockchain/crypto ecosystem we are seeing an explosion of research and practical tooling for the applied cryptography and higher-level crypto-economic primitives used in currently (and soon to be!) deployed networks. This makes today a great time to study prior research into DHT security and see where we can combine newer primitives around Proof-of-Work (Pow) and Proof-of-Stake (PoS) with prior DHT technology.

Orchid Network Formation

The Orchid Protocol connects peers using a highly structured P2P protocol. All peers (nodes) are equal and after connecting to their first neighboring node (such as a bootstrap server or entry nodes shared by people online), a node learns of other nodes and uses the Chord DHT to handle further connections.

Chord supports only one operation: given a key, map the key onto a node. Consistent hashing is used to simplify the addition and removal of keys — in the Orchid network this is used to store information about nodes (such as their IP, public keys and (bandwidth) market bids & offers). Analogous to a binary search tree, two nodes in Chord, A and B, can find each other using O(log N) communication steps where N is the total number of nodes in the network.

Each hop from A to B cuts the remaining distance in half (or better)

Chord provides keyspace partitioning where we can think of each node as having a position in a virtual keyspace, encoded as an integer between 0 and 2²⁵⁶. The is used to assign neighbors to nodes using a structure based on the keyspace — forming the network topology. This improves security and performance of the network by ensuring that nodes are uniformly distributed (in the keyspace), avoiding connectivity patterns arising from the real-world attributes such as geographical location or latency pertaining to specific networks. This increases robustness and allows the DHT to scale to any number of nodes, simplifying the entry of new nodes, departure of existing nodes and generally handle any node failures without causing disruption to other nodes (as their location in the key space remains unaffected).

The Chord DHT forms the base of the Orchid Forced Connections and Signed Routing algorithms.

Orchid Forced Connections

While Chord itself only maps keys to nodes, the virtual keyspace arising from its consistent hashing can also be used to enforce rules on how nodes are connected to each other. We can use this to derive a set of forced connections a node is required to maintain — that can also easily be verified by other nodes.

In Chord, the keyspace distance of two nodes is defined as the number of identifiers between the two nodes. If identifiers are random integers, we can calculate distance by simply subtracting one identifier from another.

In Orchid, the identifier is derived from a cryptographic hash over the node’s public key, similar to how addresses are derived in Bitcoin and Ethereum. This guarantees that node identifiers are random and uniform in the key space. Since it is trivial for a node to generate new keys until they find one that translates to a specific key space location, Sybil resistance is provided through separate mechanisms. We will discuss our approach to Sybil resistance in a future article.

In Chord, the maximum expected number of peers for a node is log2(N) where N is the total number of nodes in the network. We define the set of forced connections for a node n as:

FC(n) = minLog2 { dist(n + x, m) }

where x is in {1,2,4,..2²⁵⁵}, m is the identifier (address) of another, deterministically derived node (e.g. the closest Chord predecessor of n) and minLog2 returns the set of the smallest (for some defined sort order) log2(N) elements from dist with the least distance to m.

One intuition here is that we derive the connection set from the node’s Chord identifier, one another node and a deterministic function. We can use any function as long as it generates a uniform, deterministic set of neighbors based on the keyspace distance between the node and the neighbors. The deterministic property is important to allow other nodes to verify the forced connection set of another node. This is also a reason for why we need a mechanism for Sybil resistance — to ensure that while the forced connections are derived using a deterministic function they are still effectively random (uniform and unpredictable) and not intentionally generated. At minimum, we need to enforce a reasonable lower cost bound on node id re-generation.

While forced connections on their own can increase network robustness and make it harder for attackers to eclipse nodes, the real security gain comes when forced connections are combined with signed routing:

Orchid Signed Routing

At this point we have roughly the original Chord DHT with some minor modifications (node identifiers are derived from node public keys and we use SHA-3 instead of SHA-1 in the original Chord spec) combined with the forced connections. We now augment this with the following rules:

When node A seeks to establish a forced connection to node B, a random node R is selected (by e.g. querying the DHT for the node with an address closest to Hash(FC(A))). Then, A submits to B the following:

  1. Proof that all nodes in FC(A) can route to R.
  2. Proof that R can route to all nodes in FC(A).
  3. Proof that all nodes in FC(A) are established forced connections.

Each proof is a signed routing table chain, where each node signs a message claiming that it can route to some other node. Informally, we have the following signatures, collected by the new node and verified by the entry node:

  1. For each node n in FC(A) { n_priv.sign(R_pub) }
  2. For each node n in FC(A) { R_priv.sign(n_pub) }
  3. For each node n in FC(A) { for each node m between n and A { [m_priv.sign(A_pub), m_priv.sign(n_pub)] }

In practical implementations, as (3) is recursive, this proof would have to be bootstrapped by introducing a time bound during which B accepts intermediate connection from A in good faith while A is gathering signatures from other nodes. Furthermore, as currently described there is some redundancy in the signing of routing tables; if A signs that it can route to B and B signs the entire FC(A) routing table, then we can skip the B-routing-to-A signature.

If these rules are enforced every time a node joins the network, it forms an inductive proof of the soundness of all connections in the Orchid Network.

DHT ID Regeneration

In practical implementations we need additional rules to handle things like when nodes (within a FC set) leave the network. One solution is to require nodes to re-generate their node keys every b Ethereum blocks or according to some other verifiable timestamp. See upcoming articles for ideas we have around implementing a full (timestamped) ledger within the Orchid Network.

Requiring all nodes to re-generate their node key (or at least the key used to derive their position in the DHT keyspace) makes it even harder for attackers to eclipse a given node, as there is only a limited time window to position new nodes close to a target. If we need long-lived node keys for Sybil resistance (e.g. PoS) then, instead of directly using the PoS key, we can derive rotating DHT keyspace ids as:

H(n_pub || block_hash)

where H is a cryptographic hash function, n_pub is the public key of the node key used for its Sybil-resistant identity and block_hash is the hash of the most recent Ethereum block where the block number modulo b == 0.

Security Analysis

The combination of forced connections, signed routing and regular rotation of node ids in the DHT keyspace appears to make Eclipse Attacks much harder to pull off. Going further, we conjecture that it can be made arbitrarily difficult depending on the strength of node Sybil resistance and total number of nodes in the network.

  • If a node supplies a fake routing table, it will not route to R.
  • If a node is not member of the same network, then R will not be able to route back to the other nodes in FC(A).
  • If A lies about FC(A), node B can verify this and reject the connection.

Let’s say an attacker controls 10% of the network, with the forced connection set size defaulting to 20. If there is 10K nodes, then the probability of one node belonging to the attacker is 1-0.9²⁰ ~= 88%. The probability of the attacker controlling the majority (11) of connected nodes is 0.1¹¹ ~= 1 in 100 billion.

The size of the forced connection set becomes an important security parameter that trades computational and bandwidth overhead for exponentially decreasing probability of being eclipsed. The size of the set can be configurable by nodes. Combined with regular rotation of node ids, we believe this scheme can be made very resilient even if attackers control a significant portion of the network.

References

A good read on how the re-generation of node ids increases security in DHTs is “Induced Churn as Shelter from Routing-Table Poisoning.” by Condie, Tyson, et al. This paper also discusses DHT Eclipse Attacks and constrained routing tables in more depth.

Another good read on this type of defense (and many related techniques) against Eclipse Attacks see “Defending against eclipse attacks on overlay networks.” and A Survey of DHT Security Techniques.

Also see section 4.4 in the Orchid whitepaper where we first discussed these ideas.

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade