Opera Chain: Linear Scalability

A quick primer on a few Lachesis concepts

Event Blocks & 2n/3 Consensus

Event blocks are created when three nodes share transactions. Each time nodes exchange transactions, they create a new shared event block. The block contains all transactions shared by the participating nodes and is signed by the creator and the recipients.

Each node has a genesis event block. When a node receives data it will attempt to create an event block with other participants.

The node itself keeps track of specific network data, in particular the height vector (the current index of event blocks created by each node, which is also used in the cost function — explained later) and the in-degree vector (a structure that tracks edges from other event blocks to the top event block from that node).

Using each node's in-degree vector and height vector, we can determine a cost function result for each node.

In this particular instance, we can select two random nodes. We will explain more on the cost function soon.

Our random selection returns the green node and the yellow node as the other parents of our event block, and we initiate a synchronization event.

This procedure will continue asynchronously across all nodes as they receive data. As the event block DAG is populated, we want to identify event blocks that are shared by 2n/3 of the network. These event blocks can be considered finalized, and their ordering can be output.

To achieve this we would compare which event blocks are known by 2n/3 of the network. We can do this by testing whether a previous round's witness is known by the current event block. For our explanation above, the witnesses are the genesis event blocks. (We will explain more on witness selection soon.)

This illustration shows three rounds — G, G+j and G+k, each denoted by the first witness event block created.

In round G+j, blue can ‘see’ round G blue, green, yellow, red and purple.

In round G+j, green can ‘see’ round G blue, green, yellow, red, and purple.

In round G+j, yellow can ‘see’ round G blue, green, yellow, red, and purple.

In round G+j, red can ‘see’ round G blue, green, yellow, red, and purple.

In round G+j, purple can ‘see’ round G blue, green, yellow, red, and purple.

In the above example, for round G+j, we would have the following:

  • G+j’s blue knows of G’s blue, green, yellow, red, and purple
  • G+j’s green knows of G’s blue, green, yellow, red, and purple
  • G+j’s yellow knows of G’s blue, green, yellow, red, and purple
  • G+j’s red knows of G’s blue, green, yellow, red, and purple
  • G+j’s purple knows of G’s blue, green, yellow, red, and purple

All the events in G can be finalized (i.e., shared by 2n/3), but not until round G+k is reached.
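As a minimal sketch of that test (the 2n/3 share is the threshold described above; the function name is ours):

```python
import math

def known_by_supermajority(seen_count, total_nodes):
    # An event block is treated as shared once at least 2n/3 of the
    # n participating nodes know of it.
    return seen_count >= math.ceil(2 * total_nodes / 3)

# In the example above, all 5 round G+j witnesses know the round G
# witnesses, and 5 >= ceil(10/3) = 4, so round G can be finalized
# once round G+k confirms that knowledge.
print(known_by_supermajority(5, 5))  # True
```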

With these concepts discussed, let’s look at some physical limitations to vertical scalability.

Physical limitations of vertical scalability

  • Processing (CPU)
  • Memory
  • Network

Processing (CPU)

The most computationally expensive part of processing is transaction validation. A single core can validate several thousand transactions per second (~2.2K). With cryptographic optimizations for GPU, this is estimated to be 5x faster than CPU processing; a fully optimized GPU could validate 100K+ TPS on a single node.

Going to extremes, consider a 72-core instance. This can give us ~158.4K TPS (~2.2K x 72 cores) on a single instance. With GPU optimizations this could reach 792K TPS.
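The arithmetic, spelled out (using the per-core rate and GPU factor quoted above):

```python
per_core_tps = 2_200     # ~2.2K transaction validations per second per core
cores = 72               # a large single instance
gpu_speedup = 5          # estimated GPU factor over CPU

cpu_tps = per_core_tps * cores   # 158,400 TPS on one instance
gpu_tps = cpu_tps * gpu_speedup  # 792,000 TPS with GPU optimizations
print(cpu_tps, gpu_tps)          # 158400 792000
```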

While theoretically possible, this does not equate to a real-world implementation of 792K TPS. It would only be possible if a single high-throughput node received that many transactions; in practice, that is not feasible.

Memory

At 260 bytes per transaction and 100KB event blocks, each event block holds 393 transactions. 300K transactions would require ~74 MB of memory.
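The same figures as a quick calculation (taking 100KB as 100 × 1024 bytes):

```python
tx_size = 260                                  # bytes per transaction
block_size = 100 * 1024                        # 100KB event block
tx_per_block = block_size // tx_size           # 393 transactions per event block
mem_300k_tx = 300_000 * tx_size / (1024 ** 2)  # ~74.4 MiB for 300K transactions
print(tx_per_block, round(mem_300k_tx, 1))     # 393 74.4
```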

Network Bandwidth

With a transaction size of 260 bytes, 300K transactions would require a 600 Mbps network. Theoretically possible, but not practical.
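Spelled out, the estimate comes to roughly 600 Mbps (~624 Mbps):

```python
tx_size_bits = 260 * 8           # bits per transaction
target_tps = 300_000             # transactions per second
mbps = tx_size_bits * target_tps / 1e6
print(round(mbps))               # 624 -> roughly a 600 Mbps link
```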

Practical Implications

The size of each event block processed by the Lachesis Consensus Algorithm (LCA) is intended to be expanded up to 100KB.

Each transaction is 260 bytes, so a single event block can include up to 393 transactions. Each node takes 0.1 seconds to create an event block, so each node creates 7~10 event blocks per second. Assuming the number of transactions requested is unbounded (and equally distributed across all participating nodes) and that 100 nodes are participating, each node would asynchronously and simultaneously create 7~10 event blocks per second, or up to 3,930 transactions per second (393 transactions per event block x 10 event blocks per node).

When the number of event blocks reaches 2n/3, the Lachesis protocol adds and verifies another Main Chain Block (a fixed finality block that cannot be changed). If 100 nodes are available, around 700~1000 event blocks are created per second, and each stage processes approximately 700~1000 event blocks, or 275,100~393,000 transactions per second.
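Those parameters combine as follows (block capacity, per-node block rate, and node count are the figures quoted above):

```python
tx_per_block = 393                 # 100KB event block / 260B transactions
blocks_per_node_per_sec = (7, 10)  # event blocks created per node per second
nodes = 100

per_node_tps = [tx_per_block * b for b in blocks_per_node_per_sec]  # 2,751 .. 3,930
network_tps = [nodes * t for t in per_node_tps]                     # 275,100 .. 393,000
print(per_node_tps, network_tps)
```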

There are quite a few technical disclaimers to these numbers.

Simulation

Node participation effects on propagation

2 nodes
500 transactions every 500 milliseconds for each node 
Event blocks created every 100 milliseconds

Time: 18183
Transactions: 8375
Pending: 58
TPS: 2763.5703679260846
Data: 7.778 MiB
Time To Finality: 657

2 nodes
500 transactions every 100 milliseconds for each node 
Event blocks created every 100 milliseconds

Time: 14940
Transactions: 10334
Pending: 320
TPS: 4150.2008032128515
Data: 9.752 MiB
Time To Finality: 621

4 nodes
500 transactions every 100 milliseconds for each node 
Event blocks created every 100 milliseconds

Time: 22021
Transactions: 7257
Pending: 157
TPS: 1977.294400799237
Data: 6.811 MiB
Time To Finality: 1982

We note a decrease in overall TPS due to an increase in TTF (Time To Finality). As node participation increases, asynchronous processing can increase, but time to finality increases as well.

8 nodes
500 transactions every 100 milliseconds for each node 
Event blocks created every 100 milliseconds

Time: 19726
Transactions: 3563
Pending: 92
TPS: 1083.74733853797
Data: 3.393 MiB
Time To Finality: 2860

The greater the participation of nodes, the higher the asynchronous transaction throughput, at the cost of increased time to finality.

Graph 1: Time To Finality (ms) vs Node Participation
Graph 2: Transactions Per Second (To Finality) vs Node Participation
Graph 3: Asynchronous Transactions Per Second

We will discuss two core elements of the Lachesis Algorithm. The cost function, which achieves faster witness creation, and the flag table, which allows for faster witness selection.

Cost Function

A key difference in the Lachesis protocol is its height and in-degree vector based selection of nodes, known as the Cost Function.

To skip the explanation and see the results, search for [CF001].

The following is an excerpt from our technical paper:


We define the Cost Function (CF) for preventing the creation of isolated nodes and ensuring the even distribution of connections between nodes. When a node creates an event block, the node selects other nodes with a low cost function result, and we refer to the top event block of the referenced nodes. An equation (1) of CF is as follows,

CF = I/H (1)

where I and H denote values of the in-degree vector and height vector respectively. If the number of nodes with the lowest CF is more than three, two of the nodes are selected at random. If we select nodes that have the highest H, we have a high probability of creating a Witness (a high H indicates that the communication frequency of the node is high).


I = the In-Degree Vector — the number of edges from other event blocks created by other nodes to the top event block of this node

H = Height Vector — the number of event blocks created by the node; a measure of its communication frequency


Algorithm 2 shows the selection algorithm for reference nodes. The algorithm runs on each node to select its reference nodes. Lines 4 and 5 set min_cost and Sref to an initial state. Line 7 calculates the cost function cf for each node. In lines 8, 9, and 10, we find the minimum value of cf and set Sref accordingly. Line 13 randomly selects one node ID from Sref as the selected reference node. The time complexity of Algorithm 2 is O(n), where n is the number of nodes.

After the reference nodes are selected, three nodes communicate and share information. A node creates an event block by referring to the top event block of the referenced nodes. The Lachesis protocol communicates asynchronously.
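Below is a minimal sketch of that selection step, assuming dict-based height and in-degree vectors. It treats nodes with H = 0 as having the lowest cost, matching the worked example that follows; the helper name is ours, and the paper's Algorithm 2 remains the authoritative version.

```python
import random

def select_reference_node(height, in_degree, self_id):
    """Pick a reference node with the lowest cost function CF = I / H.

    height, in_degree: dicts mapping node id -> vector entry for that node.
    Nodes with H = 0 are given cost 0 here (as in the worked example below);
    ties for the minimum are broken at random, as in Algorithm 2.
    """
    min_cost, s_ref = float("inf"), []
    for node, h in height.items():
        if node == self_id:
            continue
        cf = in_degree[node] / h if h else 0.0
        if cf < min_cost:
            min_cost, s_ref = cf, [node]
        elif cf == min_cost:
            s_ref.append(node)
    return random.choice(s_ref)

# With all vectors at zero, every other node has CF = 0 and the
# choice is uniform over B, C, D and E (as in the example below).
height = dict.fromkeys("ABCDE", 0)
in_degree = dict.fromkeys("ABCDE", 0)
print(select_reference_node(height, in_degree, "A"))
```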


Let’s walk through the above in practice.

Here we have a set of nodes A through E, which form the Witness set from Generation G.

Height Vector for all Nodes

In-degree Vector for all Nodes

A node receives a set of transactions and wants to initiate a synchronization event.

CF = I/H (1)

The node has the following height and in-degree vector

Height vector for all nodes

In-degree vector for all nodes

For purposes of CF, it calculates Sref as:

A: 0 (0/0)
B: 0 (0/0)
C: 0 (0/0)
D: 0 (0/0)
E: 0 (0/0)

We do a random selection between B, C, D, and E.

The node synchronizes data to nodes A & B.

Height vector for node A

In-degree vector for node A

Height vector for node B

In-degree vector for node B

Node A created an event block (1); this increases its height vector by +1.

Node B’s event block (B) is referenced by event block (1) (edge from 1 to B); this increases B’s in-degree vector by +1.

So we end up with

Height vector
A+1

In-degree vector
B+1

And the events are synchronized between the three nodes.
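As a minimal sketch of that bookkeeping (dict-based vectors; the helper name is ours):

```python
height = dict.fromkeys("ABCDE", 0)     # event blocks created per node
in_degree = dict.fromkeys("ABCDE", 0)  # edges into each node's top event block

def record_event(creator, referenced):
    # Creating an event block bumps the creator's height; referencing the
    # top event block of another node bumps that node's in-degree.
    height[creator] += 1
    in_degree[referenced] += 1

record_event("A", "B")              # event block (1): height A+1, in-degree B+1
print(height["A"], in_degree["B"])  # 1 1
```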

Node C has the following height and in-degree vector

Height vector for node C

In-degree vector for node C

For purposes of CF, it calculates Sref as:

A: 0 (0/0)
B: 0 (0/0)
C: 0 (0/0)
D: 0 (0/0)
E: 0 (0/0)

We do a random selection between A, B, D, and E.

Height vector for node B

In-degree vector for Node B

The above is our starting point.

Node C created an event block (2); this increases its height vector by +1.

Node B’s event block (B) is referenced by event block (2) (edge from 2 to B); this increases B’s in-degree vector by +1.

So we end up with

Height vector
C+1

In-degree vector
B+1

Height vector for node B

In-degree vector for node B

Height vector for node C

In-degree vector for node C

Height vector for node D

In-degree vector for node D

For purposes of CF, it calculates Sref as:

A: 0 (0/0)
B: 0 (0/0)
C: 0 (0/0)
D: 0 (0/0)
E: 0 (0/0)

We do a random selection between A, B, C, and E.

Node D synchronizes data to node C, referencing event block 2

Node D synchronizes data to node C.

Height vector for node C

In-degree vector for node C

Node D created an event block (3); this increases its height vector by +1.

Node C’s event block (2) is referenced by event block (3) (edge from 3 to 2); this increases C’s in-degree vector by +1.

Height vector
D+1

In-degree vector
C+1

Node C & D

Height vector

In-degree vector

Node A synchronizes data to node D, referencing event block 1

Height vector for node A

In-degree vector for node A

For purposes of CF, it calculates Sref as:

A: 0 (0/1)
B: 0 (1/0)
C: 0 (0/0)
D: 0 (0/0)
E: 0 (0/0)

We do a random selection between B, C, D, and E.

Node A synchronizes data to node D, referencing event block 1

Height vector for node D

In-degree vector for node D

Node A created an event block (4); this increases its height vector by +1.

Node D’s top event block (D) is referenced by event block (4) (edge from 4 to D); this increases D’s in-degree vector by +1.

Height vector
A+1

In-degree vector
D+1

Height vector for nodes A & D

In-degree vector for nodes A & D

Node D synchronizes data to node E, referencing event block 3

Height vector for node D

In-degree vector for node D

For purposes of CF, it calculates Sref as:

A: 1 (2/2)
B: 0 (2/0)
C: 2 (2/1)
D: 2 (2/1)
E: 0 (0/0)

We do a random selection between B & E.

Node D synchronizes data to node E, referencing event block 3

Height vector for node D

In-degree vector for node D

Node D created an event block (5); this increases its height vector by +1.

Node E’s event block (E) is referenced by event block (5) (edge from 5 to E); this increases E’s in-degree vector by +1.

Height vector
D+1

In-degree vector
E+1

Height vector for nodes D & E

In-degree vector for nodes D & E

Node B synchronizes data to node A, referencing event block 4

Height vector for node B

In-degree vector for node B

For purposes of CF, it calculates Sref as:

A: 0 (0/1)
B: 0 (2/0)
C: 0 (0/1)
D: 0 (0/0)
E: 0 (0/0)

We do a random selection between A, C, D, & E.

Node B synchronizes data to node A, referencing event block 4.

Height vector for node A

In-degree vector for node A

Node B created an event block (6); this increases its height vector by +1.

Node A’s event block (4) is referenced by event block (6) (edge from 6 to 4); this increases A’s in-degree vector by +1.

Height vector
B+1

In-degree vector
A+1

Height vector for nodes A & B

In-degree vector for nodes A & B

Node C synchronizes data to node D, referencing event block 2, and 5

Height vector for node C

In-degree vector for node C

For purposes of CF, it calculates Sref as:

A: 0 (0/1)
B: 0 (2/0)
C: — (Self)
D: 0 (0/1)
E: 0 (0/0)

We do a random selection between A, B, D, & E.

Node C synchronizes data to node D, referencing event block 2, and event block 5

Height vector for node D

In-degree vector for node D

Node C created an event block (7); this increases its height vector by +1.

Node D’s event block (5) is referenced by event block (7) (edge from 7 to 5); this increases D’s in-degree vector by +1.

Height vector
C+1

In-degree vector
D+1

Height vector for nodes C & D

In-degree vector for nodes C & D

[CF001] We modify our simulation to use our cost function

Graph 4: Gossip Based Asynchronous Finality with Cost Function

Finality still degrades as node participation increases, but the degradation is significantly less severe. The cost function gravitates towards high-H nodes. Consider H as a count of event blocks: if node A has created 20 event blocks, then node A has an H of 20. The more event blocks a node has created, the higher its likelihood of being selected.

The implications here are:

  • Favor nodes with longer network participation
  • Not all nodes have the same DAG view (High H nodes will have higher probability)
  • Faster finality, less decentralization

What is the effect on transactions for those specific high H nodes?

Graph 5: Gossip Based Cost Function Transactions Per Second

This is not surprising, since we know throughput increases for the selected high-H nodes while the rest of the network is excluded.

Thus far, we have considered finality based on the standard 2n/3 propagation described by asynchronous Byzantine fault tolerant consensus algorithms.

Now, let’s consider another key component of the Lachesis system, specifically the Clotho search and flag table construction.

Flag Table

Flag tables are a way of finalizing transactions as soon as possible. They allow for fast creation of witnesses. Witness selection needs to occur to finalize a round, and a round needs to be finalized for transactions in previous witnesses to become finalized. This replaces the 2n/3 ‘known’ rule discussed above.

We will explain the implementation; if you simply wish to see the results, search for [FT001].

The far left witness event blocks are previous witnesses (or genesis events)

Nodes are denoted as A, B, C, D, & E

Flag table of genesis event in A

Height: 1
Parent: A

Flag table of genesis event in B

Height: 1
Parent: B

Event block 1 is created
Event block 1 has awareness of genesis events A, & B

Flag table of 1

Height A: 1 (synced block from A to B)
Height B: 2 (root parent is A)
Parent A
Event Parent B

Parent A 10000
Parent B 01000

Bitwise OR

10000
01000
11000

Add synced parent B 01000

Bitwise SUM

01000
11000
12000

Flag table

12000
11000
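As a minimal sketch of that merge, assuming the rule the walkthrough illustrates (the lower row is the bitwise OR of the two parents' visibility rows, and the upper row is the digit-wise sum of the synced parent's counter row and that new OR row; the helper name is ours):

```python
def merge_flag_table(parent_or_rows, synced_sum_row):
    # New visibility row: bitwise OR of the two parents' visibility rows.
    or_row = [1 if any(bits) else 0 for bits in zip(*parent_or_rows)]
    # New counter row: digit-wise sum of the synced parent's counter row
    # and the freshly computed OR row.
    sum_row = [s + o for s, o in zip(synced_sum_row, or_row)]
    return sum_row, or_row

# Event block 1: parents are the genesis events of A and B; the synced
# parent is B, whose counter row equals its one-hot genesis row.
genesis_a = [1, 0, 0, 0, 0]
genesis_b = [0, 1, 0, 0, 0]
print(merge_flag_table([genesis_a, genesis_b], genesis_b))
# ([1, 2, 0, 0, 0], [1, 1, 0, 0, 0])  -> the 12000 / 11000 table above
```

Applying the same rule to each later event block’s parents should reproduce the flag tables that follow.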

Event block 2 is created
Event block 2 has awareness of B, & C

Flag table of 2

Parent B 01000
Parent C 00100

Bitwise OR

01000
00100
01100

Add synced parent B 01000

Bitwise SUM

01000
01100
02100

Flag table

02100
01100

Event block 3 is created
Event block 3 has awareness of B, C, & D

Flag table of 3

Parent 2 01100
Parent D 00010

Bitwise OR

01100
00010
01110

Add synced parent 2 02100

Bitwise SUM

02100
01110
03210

Flag table

03210
01110

Event block 4 is created
Event block 4 has awareness of A, B, & D

Flag table of 4

Parent 1 11000
Parent D 00010

Bitwise OR

11000
00010
11010

Add synced parent D 00010

Bitwise SUM

11010
00010
11020

Flag table

11020
11010

Event block 5 is created
Event block 5 has awareness of B, C, D, & E

Flag table of 5

Parent 3 01110
Parent E 00001

Bitwise OR

01110
00001
01111

Add synced parent E 00001

Bitwise SUM

01111
00001
01112

Flag table

01112
01111

Event block 6 is created
Event block 6 has awareness of A, B, & D

Flag table of 6

Parent 4 11010
Parent B 01000

Bitwise OR

11010
01000
11010

Add synced parent 4 11020

Bitwise SUM

11010
11020
22030

Flag table

22030
11010

Event block 7 is created
Event block 7 has awareness of B, C, D, & E

Flag table of 7

Parent 2 01100
Parent 5 01111

Bitwise OR

01100
01111
01111

Add synced parent 5 01112

Bitwise SUM

01111
01112
02223

Flag table

02223
01111

Event block 8 is created
Event block 8 has awareness of B, C, D, & E

Flag table of 8

Parent E 00001
Parent 5 01111

Bitwise OR

00001
01111
01111

Add synced parent 5 01112

Bitwise SUM

01111
01112
02223

Flag table

02223
01111

Event block 9 is created
Event block 9 has awareness of A, B, C, D, & E

Flag table of 9

Parent 6 11010
Parent 7 01111

Bitwise OR

11010
01111
11111

Add synced parent 7 02223

Bitwise SUM

02223
11111
13334

Flag table

13334
11111

Event block 10 is created
Event block 10 has awareness of A, B, C, D, & E

Flag table of 10

Parent 6 11010
Parent 9 11111

Bitwise OR

11010
11111
11111

Add synced parent 9 13334

Bitwise SUM

13334
11111
24445

Flag table

24445
11111

[FT001] We apply this modification to our simulation.

Graph 6: Gossip Based Asynchronous Transactions Per Second
Graph 7: Gossip Based Asynchronous Time To Finality

The implementation of the cost function improves node selection to maximize witness creation.

The implementation of the flag table allows us to decrease the time to finality for witness creation.

The time complexity of the Lachesis algorithm, O(n log(n)), means that much faster performance can be achieved.

The performance according to the time complexities O(n²) and O(n log(n)) (where n is the number of nodes) is shown below.

n² = n × n
n log(n) = n × log(n)
Dividing both by n, comparing n² against n·log(n) amounts to comparing n against log(n). In the figures below, log is the natural logarithm.

If n²=10, nlog(n) ~ 3.64
If n²=100, nlog(n) ~ 23.03
If n²=1,000, nlog(n) ~ 109.22
If n²=10,000, nlog(n) ~ 460.52
If n²=100,000, nlog(n) ~ 1820.35
If n²=1,000,000, nlog(n) ~ 6907.76
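A quick reproduction of those figures (illustrative only; log is the natural logarithm):

```python
import math

for n_squared in (10, 100, 1_000, 10_000, 100_000, 1_000_000):
    n = math.sqrt(n_squared)
    print(f"n^2 = {n_squared:>9,}  ->  n*log(n) ~ {n * math.log(n):,.2f}")
```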