“Traversal” — A dive into distributed hash tables on the decentralized web

What has the internet come to? With internet giants constantly centralizing information and communications that we access daily, never before have we needed a distributed web. One where peers connect to peers, where your personal data isn’t stored on a server outside of your control. At the very foundation of the distributed web are Distributed Hash Tables (DHT).

If you haven’t heard about DHTs, you’re missing out! A DHT, at a vey basic understanding is a type of memory with simple PUT and GET commands. You can PUT hashed key/value pairs and GET values back. The DHT for the distributed web consists of key values which are hashed from individual Nodes. Where Nodes can be thought of as each person unique ID when traversing the internet. The values that belong to each specific hash can be are not limited to an individual’s IP address, port, location etc…

BitTorrent is a rather famous implementation of Distributed Hash Tables. Essentially, each person (referred to as a “peer”) is a tracker. This means that they store information regarding other peers in their memory. In the distributed web, peer information generally consists of peers that are near to you. Of course, all of this information is hashed and so relatively, any information is safe.

The idea of BitTorrent is to allow peer to peer data sharing regardless of content or file size. To do this, BitTorrent has to discover connections between peers with the fastest internet speeds which can also be the closest distance. So this is how it all works.

Once a peer connects to the net and begins to torrent a file, connections to a tracker are also established. The trackers then look for node hashes that are “nearby” and also seeding the file. If no node hashes are found, trackers redirect to the hash that is closest to one that is actually seeding the torrent. Again if no hashes are located, the peer is redirected again. This process continues until the starting peer has a stable connection which is either bottlenecked through their internet connection speed or by the limited numbers of peers seeding the torrent.

If this is difficult to imagine. Fundamentally, the traversal of a DHT is similar to a Trie data structure.


Trie / DHT Traversal

In this diagram, the starting peer requests peers from the tracker. If no tracker is found, the tracker redirects the connection to the peers in its network. This process continues until sufficient peers are found and connected to the starting peer.