Reimplementation of Chord: A Scalable Peer-to-peer Lookup System in Golang

Josh Zhang
Princeton Systems Course

This post is written by Josh Zhang and Jace Lu, May 14, 2019. It serves as a final project report for COS 518 Advanced Computer Systems, advised by Professor Michael Freedman. Hope you enjoy the read!

Background:

Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions workloads or tasks among peers. There is no centralized control in a peer-to-peer distributed application: every node (or server) is an equally privileged, equipotent participant in the system. High performance is achieved through high parallelism, because tasks can run on many peers (many CPUs, disks, and storage devices). The system as a whole is also more secure and robust because there is no single point of failure that can crash it.

Motivation:

The peers in the network should be able to interconnect with each other to collaborate on tasks or exchange information. That leads to a fundamental problem in a distributed peer-to-peer system: how do we look up other nodes in the system quickly and efficiently? Chord is a distributed lookup protocol that addresses this problem. It was introduced in 2001 in a paper by Ion Stoica and colleagues at MIT [1]. It was innovative in the sense that previous solutions either used a centralized lookup server holding O(N) state or flooded O(N) queries through the network, leading to high lookup latency. Chord also addresses the fact that in a large network many nodes may join and leave concurrently, in which case the system should be able to manage itself and stabilize quickly to keep the lookup service correct.

Introduction:

The Chord protocol essentially works like a distributed hash table. It provides only one operation: given a key, it maps that key onto a node. Chord is a scalable lookup protocol for dynamic peer-to-peer systems with frequent arrivals and departures. Scalability is a significant property of the protocol: each Chord instance stores O(log N) state about other nodes and sends O(log N) messages to other nodes for each lookup request. Load balancing is achieved by hashing keys with the SHA-1 algorithm, which spreads them roughly evenly over the identifier space. In this post, we present our reimplementation of the Chord protocol in Go and our evaluation results in terms of load balancing, lookup latency, and the number of messages each lookup invokes. We show that our system has the good scalability the paper describes.

Figure 1: Structure of an example Chord-based distributed storage system [1]

Chord Overview:

Identifier Circle & Consistent Hashing:

Chord provides fast, distributed computation of a hash function that maps keys to the nodes responsible for them. In our reimplementation, we used the SHA-1 hash function. Each node gets a 160-bit identifier by hashing its IP address with SHA-1, and each key gets an identifier by hashing the key itself. Identifiers are ordered on an identifier circle modulo 2^m (m = 160 for SHA-1). A key k is assigned to the first node whose identifier is equal to or follows k's identifier on the circle.
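
A minimal sketch of this hashing step in Go (with illustrative names, simplified from what the full implementation needs) looks like this:

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// hashID hashes a string (an IP:port address or a key) with SHA-1.
// The 160-bit digest is used directly as the identifier, so m = 160
// and the "modulo 2^m" step is implicit in the digest length.
func hashID(s string) []byte {
	h := sha1.Sum([]byte(s))
	return h[:]
}

func main() {
	fmt.Printf("node ID: %x\n", hashID("10.0.0.1:8000")) // hash of a node's address
	fmt.Printf("key ID:  %x\n", hashID("some-key"))      // hash of a key
}
```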

Finger Table:

First, we define some terminology. The successor of a key is the closest node whose ID is equal to the key's ID or follows it immediately on the circle. The predecessor of a key is the closest node whose ID immediately precedes the key's ID (and is not equal to it). Only a small amount of routing information is needed for a Chord node to successfully look up a key in the distributed network. Each Chord node maintains a finger table with m entries of other nodes' routing information (ID and IP address). The first entry is always the successor of the current node, i.e. the next closest node in the clockwise direction on the ID circle. The i-th entry (counting from 0) contains the routing information for successor(n + 2^i). The lookup algorithm is mostly based on the finger table.

Figure 2: An example of finger table content in our implementation
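
The routing state behind a finger table like the one in Figure 2 can be pictured with a couple of structs along these lines (the field and type names here are illustrative):

```go
const m = 160 // number of bits in an identifier (SHA-1 digest length)

// NodeInfo is the routing information we keep about a remote node.
type NodeInfo struct {
	ID   []byte // m-bit identifier, the SHA-1 hash of the node's address
	Addr string // IP:port used to reach the node over gRPC
}

// ChordServer holds the routing state of a single Chord instance.
type ChordServer struct {
	self        NodeInfo
	predecessor NodeInfo
	finger      [m]NodeInfo // finger[i] = successor(n + 2^i); finger[0] is the successor
}
```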

Lookup algorithm:

The essential goal of the lookup algorithm is to find the node responsible for a key, or more precisely for the key's ID, since every key is assigned an ID. The node that stores that ID is the ID's successor on the identifier circle. To find the successor of an ID, we first find the predecessor of that ID; the target node is then that predecessor's successor, because the ID must lie between the two. In our reimplementation, we modified the actual implementation quite a bit relative to the original pseudocode in the paper: there were many edge cases to handle, and we discovered them as we went through the development stage.
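
Many of those edge cases boil down to interval checks that have to wrap around the identifier circle. A small helper in this spirit (illustrative, not our exact code) captures the idea:

```go
import "bytes"

// between reports whether id lies strictly inside the interval (start, end)
// on the identifier circle, handling the wrap-around past zero.
func between(id, start, end []byte) bool {
	if bytes.Compare(start, end) < 0 {
		// Ordinary interval, e.g. (10, 50): no wrap-around.
		return bytes.Compare(start, id) < 0 && bytes.Compare(id, end) < 0
	}
	// Interval wraps past zero, e.g. (200, 20) on a 0..255 circle.
	return bytes.Compare(start, id) < 0 || bytes.Compare(id, end) < 0
}
```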

Nodes Arrival/Departure:

We implemented the stabilization and fix-finger protocols in the paper to handle concurrent joins and leaves.

Implementation:

Programming language: Go. About 2,300 lines of code

gRPC framework: https://grpc.io/

Github (final): https://github.com/jiashuoz/chord

Github (broken): https://github.com/jiashuoz/COS518_Project

System Architecture:

Figure 3: System architecture of one single Chord instance in our design

This diagram shows all the components of a single Chord instance, or a single Chord node; we use the two terms interchangeably in this post. Each Chord instance has a ChordServer, which contains all the routing information and the functions associated with the lookup algorithm. Each Chord instance also plays the role of both an RPC server and an RPC client: it can receive RPC requests from other nodes and send RPCs to other nodes. Once all the network connections have been established, the Chord instance launches the stabilization and fix_finger protocols, which run periodically in goroutines. We also injected a logger and a tracer for testing and debugging purposes. On top of the Chord service, we built a simple key-value storage module, mainly for testing.

Design of Lookup Algorithm:

We implemented the lookup algorithm in an iterative way. When a node receives a lookup request, it iterates through its finger table to find the closest preceding node it knows of for that key ID. If the closest preceding node in the current node's finger table is not the answer yet, the current node sends an RPC to that closer node and asks it for help finding the correct node. The remote node does the same thing with its own finger table and sends the result back to the original node. If the returned node is still not close enough, the current node asks that node in turn, and the process continues until the target node is found. We expose the same APIs as in the paper: Find_Successor, Find_Predecessor, and Find_Closest_Preceding_Finger. A code sketch of this iterative loop appears after Figure 4 below.

Figure 4: A basic lookup procedure
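
Building on the NodeInfo struct and the between helper sketched earlier, the iterative loop looks roughly like this; rpcGetSuccessor and rpcClosestPrecedingFinger are placeholders for the corresponding gRPC calls:

```go
// findSuccessor iteratively walks the ring to locate the node responsible
// for id, mirroring Find_Successor from the paper.
func (cs *ChordServer) findSuccessor(id []byte) (NodeInfo, error) {
	pred := cs.closestPrecedingFinger(id) // best guess from the local finger table
	for {
		succ, err := rpcGetSuccessor(pred.Addr) // ask pred for its current successor
		if err != nil {
			return NodeInfo{}, err
		}
		// If id lies in (pred, succ], then succ is the node responsible for id.
		if between(id, pred.ID, succ.ID) || bytes.Equal(id, succ.ID) {
			return succ, nil
		}
		// Otherwise ask that remote node for a closer preceding node and repeat.
		pred, err = rpcClosestPrecedingFinger(pred.Addr, id)
		if err != nil {
			return NodeInfo{}, err
		}
	}
}
```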

Stabilization and Fix_Finger Protocols:

The idea of the stabilization protocol is straightforward, and each Chord instance runs it in a goroutine. The node periodically sends an RPC to its successor to verify that it is still the correct successor. If a new node joins the network and sits between two existing nodes, the predecessor updates its successor pointer by running the stabilization protocol and notifies the new successor that it might be that node's predecessor.

The fix_finger protocol periodically iterates through the finger table, sending Find_Successor(n + 2^i) RPCs to keep the finger entries up to date.

Together, these two protocols guarantee that each Chord instance eventually has the correct successor and routing information.
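
In code, both protocols are simply periodic loops running in their own goroutines. A simplified sketch (the interval and helper names are illustrative):

```go
import "time"

// startMaintenance launches the two periodic protocols in background goroutines.
func (cs *ChordServer) startMaintenance() {
	go func() {
		for range time.Tick(500 * time.Millisecond) {
			cs.stabilize() // check the successor and notify it about ourselves
		}
	}()
	go func() {
		next := 0
		for range time.Tick(500 * time.Millisecond) {
			cs.fixFinger(next) // refresh finger[next] via Find_Successor(n + 2^next)
			next = (next + 1) % m
		}
	}()
}
```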

We registered and hosted all of our RPC calls using the gRPC framework. We defined the service APIs, with their arguments and responses, in the proto file under github.com/jiashuoz/chord/chordrpc. Using gRPC allowed us to quickly create and deploy new RPCs.
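
As an example, the client side of a lookup RPC takes only a few lines with gRPC; the ChordClient stub and message types below are stand-ins for whatever protoc generates from the proto file, so the exact names are illustrative:

```go
import (
	"context"
	"time"

	"google.golang.org/grpc"

	"github.com/jiashuoz/chord/chordrpc" // generated from the proto definitions
)

// callFindSuccessor dials a remote Chord node and asks it for the successor
// of id, giving up if the RPC takes longer than one second.
func callFindSuccessor(addr string, id []byte) (*chordrpc.Node, error) {
	conn, err := grpc.Dial(addr, grpc.WithInsecure())
	if err != nil {
		return nil, err
	}
	defer conn.Close()

	client := chordrpc.NewChordClient(conn)
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return client.FindSuccessor(ctx, &chordrpc.ID{Id: id})
}
```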

Figure 5: Stabilization protocol fixes succ and pred (only succ link shown here)
Figure 6: A newly joined node only updates its successor

Test:

We did a lot of manual testing, and we implemented a tracer and a logger to help with testing and debugging. Because gRPC took care of much of the connection setup, we didn't need to do much dirty work on the RPC plumbing ourselves; for example, it can time out RPC calls for us.

Before we implemented the stabilization and fix_finger protocols, our testing process was tedious: we manually populated the finger tables with different network topologies and then sent different lookup queries to the system, and the tracer sent back the results of the lookups. After we implemented the two protocols, we thoroughly tested our system under concurrent joins and failures.

We also built a key-value store module on top of the Chord service. It provides Put, Get, and Delete APIs. We used this key-value store mainly for testing: we sent requests to the Chord-backed store and then checked that it behaved correctly.
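
Conceptually, each key-value operation is one Chord lookup followed by one RPC to the responsible node. A sketch of Put under that assumption (rpcPut is a placeholder for the storage RPC):

```go
// Put stores a key-value pair on the node responsible for the key:
// hash the key, look up the successor of that ID, then send the pair there.
func (cs *ChordServer) Put(key, value string) error {
	id := hashID(key)                 // key ID on the identifier circle
	node, err := cs.findSuccessor(id) // node responsible for this ID
	if err != nil {
		return err
	}
	return rpcPut(node.Addr, key, value)
}
```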

Evaluation:

Load balancing:

Figure 7: Number of keys per node for a 10-node network
Figure 8: Number of keys per node for a 20-node network
Figure 9: Number of keys per node for a 100-node network

Setup: each server runs in its own goroutine with a local address. The evaluation was run on an EC2 instance on AWS.

The first property we evaluated is load balance. For this experiment, we measured the number of keys stored per node for different total numbers of keys and different network sizes. We tested our system with a total number of keys ranging from 1,000 up to 5,000, on 10, 20, and 100 servers respectively, which lets us observe how the system behaves as the number of nodes scales up. From the graphs above, we can see that the load balance of our system meets expectations, with a few outlier data points; each data point represents the number of keys stored on one server for a particular total number of keys. Even with 100 servers and 5,000 keys to store, most servers end up with a number of keys close to the average.

Latency:

The second property we evaluated is latency. We wanted to test the scalability of our system by watching how latency changes as we add more nodes. We tested our system both on a local network and on a simulated wide-area network; this is done with the network simulation libraries provided by gRPC, which inject a certain amount of latency into each RPC. One interesting factor is that our local computing resources could not handle 500 servers running at the same time, which gave us a lot of difficulty during testing. That is why we hosted and ran our program on an EC2 instance, where the computing resources are much more powerful.

Figure 10: Lookup latency for different network sizes (LAN simulation)
Figure 11: Lookup latency for different network sizes (WAN simulation)

Here we can observe that the latency of our system does not scale up linearly with the number of nodes in the system. This is a good sign that our system has the expected scalability with regard to lookup speed.

Hops of lookup request:

Lastly, we evaluated our system on path length, the number of hops it takes to find the correct node. We tested our system on network sizes ranging from 10 nodes up to 500 nodes, with 200 lookups each. From the average values in the graph, we can see that as the network size increases, the number of hops the system needs to find the correct node grows much more slowly. This further suggests that our system has good scalability, with approximately O(log N) lookup hops.

Figure 12: Lookup hops with different network sizes
Figure 13: Probability distribution for the number of hops each request took

Summary:

We thought it would be easy to build Chord, but it is NOT. Here is our wrap-up for this post. Go is a very powerful language, and gRPC is an amazing framework: if you ever need RPC in your project, we highly recommend gRPC, which can run in almost any environment. With Go's built-in concurrency model, we were able to solve many concurrency issues. Implementing this seemingly simple protocol took more time than we expected, and there are many things we could still do to improve this project; we will continue working on it in the future. In the end, the project was challenging, but we learned a lot from it and enjoyed the COS 518 course.

References:

[1]: Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan. "Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications." In: Proceedings of the 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM '01). San Diego, CA, USA, August 27–31, 2001, pp. 149–160.
