Elrond technical progress update — some talk about it, we build it #3
New whitepaper update, architecture highlight, Elrond node and wallet open source release, preparation for testnet launch
1 - New whitepaper update: A more refined technical paper detailing Adaptive State Sharding and Secure Proof of Stake
We’ve come a long way since the first technical paper introducing Elrond at the end of May 2018, which outlined the first viable approach to building not only a sharded blockchain architecture based on Adaptive State Sharding, but also one using a very efficient and robust consensus mechanism called Secure Proof of Stake.
Thus, after considerable time spent validating some of the core hypotheses presented there, we are quite happy to introduce the second, more refined version of our paper, which adds a few more details to make Elrond’s architecture more robust, based on the feedback we’ve received. — Whitepaper link.
Here are our main contributions and why they are crucial for the progress of the blockchain space.
a - Why sharding — limitations of vertical scaling
At this point we’ve come to understand that the performance of a blockchain system can be considerably increased in two fundamental ways: “vertically”, by increasing the performance of every node in the network, or “horizontally”, by splitting the workload among multiple nodes for parallel processing.
Vertical scaling has limitations: one can increase the performance of a single machine (node) by adding more CPUs, RAM or GPUs only up to a certain point. Since few nodes in the network will be able to upgrade to keep up, the overall performance increase will be only marginal; if, on the other hand, the minimum requirements for participating in the network become too high, centralization risks arise.
b - Types of sharding — transaction, network and data sharding
Sharding is a concept first used in databases to improve performance and efficiency. A shard is a horizontal partition of the database, which can be stored on a different server, thus spreading the load across multiple machines.
Sharding in blockchains is relatively new and comes in different flavours:
- network/communication sharding: improve the communication by propagating messages in smaller partitions;
- transaction/computation sharding: instead of having every single transaction processed by all machines in the network, split the network into subgroups and have each subgroup process different transactions in parallel;
- state/storage sharding: decrease the storage needs for every node, by splitting the blockchain and state data across shards.
The optimal approach to blockchain sharding needs to take into consideration the advantages of all three sharding types. Elrond’s approach to scalability and increased throughput, called “Adaptive State Sharding”, combines all three into a solution that scales almost linearly with the number of shards, improves communication inside the shards, increases performance through parallel processing and reduces storage.
As with every new approach, a few immediate problems come to mind, such as cross-shard communication overhead, or decreased security due to transactions no longer being verified by all nodes in the network.
But these problems have already been addressed: (1) potential communication overhead can be considerably optimized and (2) security can be maintained by having sufficiently large shards, using random sampling of the consensus group in each shard, and reshuffling of up to 1/3 of validators at the end of each epoch to prevent collusion.
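The “sufficiently large shards” argument can be made concrete with a quick hypergeometric-tail calculation: the chance that a randomly sampled consensus group ends up with a malicious supermajority, given the shard’s composition. A minimal Go sketch, where the consensus-group size and the 25% adversary fraction are illustrative assumptions, not protocol parameters:

```go
package main

import (
	"fmt"
	"math"
)

// lnChoose returns ln(C(n, k)) via the log-gamma function.
func lnChoose(n, k float64) float64 {
	a, _ := math.Lgamma(n + 1)
	b, _ := math.Lgamma(k + 1)
	c, _ := math.Lgamma(n - k + 1)
	return a - b - c
}

// captureProb estimates the probability that a uniformly sampled consensus
// group of size k, drawn from a shard of N nodes of which M are malicious,
// contains at least 2k/3 malicious members (a hypergeometric tail sum).
func captureProb(N, M, k int) float64 {
	threshold := int(math.Ceil(2.0 * float64(k) / 3.0))
	p := 0.0
	for i := threshold; i <= k && i <= M; i++ {
		if k-i > N-M {
			continue
		}
		p += math.Exp(lnChoose(float64(M), float64(i)) +
			lnChoose(float64(N-M), float64(k-i)) -
			lnChoose(float64(N), float64(k)))
	}
	return p
}

func main() {
	// Illustrative numbers: a 400-node shard, 25% malicious, group of 63.
	fmt.Printf("capture probability: %.3g\n", captureProb(400, 100, 63))
}
```

With a 400-node shard the probability is vanishingly small, which is the intuition behind keeping shards large and resampling the group every round.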
c - State sharding should be dynamically adaptive
Allowing the number of shards to change dynamically according to the available resources (validator nodes) and network usage has a huge impact not only on the throughput, but also on efficiency. In high demand periods, rewards would increase, more nodes could join the network, more shards could be created and more transactions could be processed in parallel. In periods of low demand, with lower rewards, some nodes could leave the network, thus saving energy.
Our Adaptive State Sharding mechanism is based on a binary tree structure: once the number of shards has been computed, this number is used for a deterministic mapping of account addresses to shards. Transaction dispatching to shards is likewise deterministic, through the mapping of the sender’s and receiver’s account addresses to their designated shards.
d - Intuitive demonstration of shard reorganization
For a short demonstration, let’s assume our protocol starts with a single shard, Shard 0, and that our entire account address space can be represented on 3 bits, meaning we have a maximum of 8 addresses. With 400 nodes per shard, we cannot split the network into 2 shards unless we have more than 800 nodes.
Once we have more than 1200 nodes, we could have 3 shards, and so on. The computation of the number of shards and the mapping of addresses to shards are detailed in Chapter IV - “Scalability via Adaptive State Sharding” of the whitepaper, but an intuitive demonstration of the account address mapping and transaction assignment to shards is presented below.
If we have a single shard, all addresses are assigned to this shard and all nodes process its transactions. Both the sender and the receiver of every transaction map to this shard. Once the number of nodes grows above 800, we can split the network into two shards.
The assignment of account addresses to shards takes the last bit into consideration: all addresses ending in 0 are assigned to Shard 0, while all addresses ending in 1 are assigned to Shard 1. The initial nodes from Shard 0 split in two: half remain in Shard 0, while the other half move to Shard 1. Both shards start from the same state, so the nodes in both shards are synchronized and can help bootstrap the newly registered nodes.
The nodes in the node pool are uniformly and randomly assigned to the 2 shards and require some time to synchronize. For every transaction, the sender’s and receiver’s shards are computed from the last bit of their addresses, and the transaction is sent to the corresponding shards for processing.
Once the transaction is processed in the sender’s shard, a receipt/proof can be sent to the destination shard in order to finalize the transaction. Going further, with 3 shards, Shard 0 splits again into two shards, Shard 0 and Shard 2: the nodes from the previous Shard 0 split in two, half remaining in Shard 0 and half moving to Shard 2.
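The mapping walked through above can be sketched in a few lines of Go. This is one way to implement the trailing-bit, binary-tree assignment (a sketch, not necessarily the whitepaper’s exact formula); `shardOf` is our own name:

```go
package main

import "fmt"

// shardOf maps an account address to one of n shards using its trailing
// bits, following the binary-tree splits described in the text: with a
// non-power-of-two n, addresses whose deeper-level shard id would exceed
// n-1 fall back to the shallower tree level.
func shardOf(addr uint64, n uint64) uint64 {
	if n == 1 {
		return 0
	}
	// k = smallest number of bits such that 2^k >= n (deepest tree level).
	k := uint(0)
	for (uint64(1) << k) < n {
		k++
	}
	maskHigh := (uint64(1) << k) - 1
	maskLow := (uint64(1) << (k - 1)) - 1
	shard := addr & maskHigh
	if shard >= n {
		shard = addr & maskLow
	}
	return shard
}

func main() {
	// All 3-bit addresses, for 1, 2 and 3 shards.
	for _, n := range []uint64{1, 2, 3} {
		fmt.Printf("n=%d: ", n)
		for addr := uint64(0); addr < 8; addr++ {
			fmt.Printf("%03b->%d ", addr, shardOf(addr, n))
		}
		fmt.Println()
	}
	// A transaction is dispatched to both the sender's and receiver's shard.
	sender, receiver := uint64(0b010), uint64(0b011)
	fmt.Printf("tx dispatched to shards %d and %d\n",
		shardOf(sender, 3), shardOf(receiver, 3))
}
```

For n=3 this reproduces the example above: addresses ending in 1 go to Shard 1, addresses ending in 00 stay in Shard 0, and addresses ending in 10 go to the newly split Shard 2.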
e - Proof of Stake — become sustainable or die
Proof of Work mechanisms successfully prevent double-spending, DDoS and Sybil attacks, but do so at the expense of unsustainably high energy consumption.
Hence Proof of Stake has been proposed as a promising alternative to the above-mentioned problems. In PoS, the node that proposes the next block is selected by a combination of stake (wealth), randomness and/or age. This mitigates PoW’s energy consumption problem, but puts a few other important issues on the table: the Nothing-at-Stake attack and the Long-Range attack. A variation of PoS called Delegated Proof of Stake has improved on the communication necessary for consensus, only to stumble into a different problem: it is heavily susceptible to bribery and collusion attacks, which could end in cartel wars on the networks that implement it.
f - Enter Secure Proof of Stake — making PoS secure and compelling
Understanding and acknowledging these problems, Elrond has proposed a novel approach to consensus called “Secure Proof of Stake”, combining eligibility through stake with rating, random validator selection and an optimal dimension for the consensus group.
The attacks listed above are mitigated or entirely removed by the stake locking that happens at the beginning of the validator registration process, and by punishment through partial or complete stake slashing in case of malicious behavior. To be more precise, each node wanting to participate in consensus has to send a transaction containing at least the minimum required amount to the registration smart contract.
After the transaction is processed, the node is added to an “unassigned node pool”, where it waits for the current epoch to finish. At the end of each epoch a node reshuffling phase occurs, where up to 1/3 of the nodes in the shards and the new nodes from the unassigned node pool will be randomly assigned to shards. The reshuffled nodes are first added to the shard’s waiting list, where they get to synchronize the new state. In the following epoch, the nodes become “eligible validators” and can participate in the consensus group and collect rewards.
The consensus protocol starts by randomly sampling a smaller consensus group out of all eligible validators in the shard (for reduced communication) using a randomness source derived from the previous block’s signature. The randomness source is unpredictable before the signing of the previous block. The sampling is deterministic once the previous block signature is known, meaning that every node can compute the list of validators in the consensus group and the first node to be selected is the block proposer. The block proposer aggregates transactions into a new block and sends this block to the validators in the consensus group for verification.
Each validator will verify the validity of the block, process the transactions and if everything checks out, will also participate in the pBFT consensus. The voting in the pBFT is done for every validator by sending a signature for a multisignature scheme. If the proposer collects more than 2/3 + 1 signatures from the consensus group members, the block is considered validated, the aggregated signature can be added to the block and the block disseminated in the entire shard. The next consensus group will be randomly sampled using the new signature.
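A minimal sketch of the sampling and signature-threshold logic, under the assumption that seeding a PRNG from a hash of the previous block signature is an acceptable stand-in for the protocol’s real randomness source:

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math/rand"
)

// sampleConsensusGroup deterministically samples a consensus group of size k
// from the shard's eligible validators, seeded by the previous block's
// signature. Every node can recompute the same group; group[0] acts as the
// block proposer.
func sampleConsensusGroup(validators []string, prevBlockSig []byte, k int) []string {
	seed := sha256.Sum256(prevBlockSig)
	r := rand.New(rand.NewSource(int64(binary.LittleEndian.Uint64(seed[:8]))))
	perm := r.Perm(len(validators))
	group := make([]string, k)
	for i := range group {
		group[i] = validators[perm[i]]
	}
	return group
}

// validated reports whether the proposer collected more than 2/3 of the
// consensus group's signatures, the pBFT finality threshold from the text.
func validated(sigs, groupSize int) bool {
	return sigs >= groupSize*2/3+1
}

func main() {
	validators := []string{"v0", "v1", "v2", "v3", "v4", "v5", "v6"}
	group := sampleConsensusGroup(validators, []byte("prev-block-signature"), 5)
	fmt.Println("proposer:", group[0], "group:", group)
	fmt.Println("validated with 4/5 signatures:", validated(4, 5))
}
```

Because the seed is fixed by the previous block, the sampling is unpredictable before that block is signed yet reproducible by everyone afterwards, which is exactly the property the paragraph above relies on.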
Moreover, the consensus protocol remains safe under DDoS attacks, by having a high number of possible validators and no way to predict the order.
The SPoS protocol improves cross-shard communication by running consensus on a composite block that contains:
- a ledger block with the transactions for the current shard, which will be added to the current shard’s blockchain;
- multiple mini-blocks, each containing cross-shard transactions for a specific shard;
- two inclusion proofs, one for valid transactions and one for invalid transactions.
The composite block header and inclusion proofs are then sent to a Metachain for notarization. In the next round, every shard gets the information from the Metachain, can check cross-shard transactions against the inclusion proofs and thus finalize transactions in the receiver’s shard.
2 - The worst prototype except for all the other ones
a - What it isn’t
A prototype is by definition unfinished and unpolished, but extremely useful. It requires rapid iteration and a lot of effort over a short time frame, in order to validate some core hypotheses. It is not meant to be a mainnet or even a testnet launch. Nor is it meant for measuring optimal performance; rather, it validates the assumptions on which the theoretical performance claims rest. And if the hypothesis validation succeeds, that is a great feat in itself, and the prototype has achieved its purpose.
b - What it is
In our case the prototype had a clear purpose: it had to validate the hypothesis that we can do state sharding and, in particular, validate our model for cross-shard transactions.
Thus, in it we’ve basically built the backbone containing:
- communication p2p module for direct messaging, information broadcast and channel communication;
- cryptographic module dealing with signing and verification, using the Schnorr scheme for transactions and Bellare and Neven’s multi-signature scheme for blocks;
- data module to define and describe how the different data types (blocks, transactions, accounts, etc) are connected to each other and stored in the system;
- execution engine used to process transactions and assemble blocks while replicating the state across the network to ensure consistency and security;
- chronology module used for synchronization and a consistent perspective on the passing of time, ensuring a well-structured bootstrapping process and proper timeouts;
- consensus module using a rudimentary consensus in a round-robin setting, allowing everyone to take turns in proposing and validating blocks, running independent consensus groups in each shard;
- sharding module using communication topics and enabling both intra-shard and cross-shard transactions.
This took an enormous amount of effort, and was done especially to validate our cross-shard transaction model. While we’ve heard lots of projects discuss sharding, to the best of our knowledge, we were the first to demonstrate that cross-shard transactions can actually work, which is an important milestone in itself.
c - Cross-shard transactions?
Elrond uses an asynchronous model for cross-shard transaction execution, which seems more elegant and efficient than the two-phase commit model. By removing the locking mechanism, communication between shards is improved and availability is gained. Atomic execution still needs to be ensured, but since full validation of transactions is the responsibility of the sender’s shard, the only thing left is to enforce execution in the receiver’s shard.
How it actually works:
- Transaction is dispatched to sender’s and receiver’s shards
- Transaction is validated and processed in sender’s shard, generating inclusion proof (accumulator)
- Sender’s shard block is notarized in Metachain with inclusion proof
- Receiver’s shard verifies the inclusion proof and adds cross-shard transactions into block
- Receiver’s shard block is notarized in Metachain with its own inclusion proof
The prototype used a slightly different approach to achieve the asynchronous processing of cross-shard transactions. Instead of going through the Metachain with inclusion proofs, we used inter-shard communication channels to send the transactions from the sender’s shard to the receiver’s shard.
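The steps above can be sketched with the Metachain modeled as a simple registry of notarized inclusion proofs. All types and names here are toys, not the protocol’s real data structures:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Tx is a toy cross-shard transaction.
type Tx struct {
	From, To string
	Amount   uint64
}

// proofOf stands in for the inclusion proof (accumulator) generated by the
// sender's shard; here it is simply a hash over the processed transactions.
func proofOf(txs []Tx) [32]byte {
	h := sha256.New()
	for _, tx := range txs {
		fmt.Fprintf(h, "%s->%s:%d;", tx.From, tx.To, tx.Amount)
	}
	var p [32]byte
	copy(p[:], h.Sum(nil))
	return p
}

// Metachain notarizes shard block headers together with inclusion proofs.
type Metachain struct{ notarized map[[32]byte]bool }

func (m *Metachain) Notarize(p [32]byte)      { m.notarized[p] = true }
func (m *Metachain) Verified(p [32]byte) bool { return m.notarized[p] }

func main() {
	meta := &Metachain{notarized: map[[32]byte]bool{}}

	// Steps 1-3: the sender's shard validates and processes the tx, builds
	// the inclusion proof, and the Metachain notarizes it.
	txs := []Tx{{From: "alice@shard0", To: "bob@shard1", Amount: 10}}
	meta.Notarize(proofOf(txs))

	// Steps 4-5: the receiver's shard checks the notarized proof before
	// including and finalizing the cross-shard transaction in its own block.
	if meta.Verified(proofOf(txs)) {
		fmt.Println("cross-shard tx finalized in receiver's shard")
	}
}
```

The asynchrony is visible in the structure: the receiver’s shard never waits on a lock held by the sender’s shard, it only checks a proof that is already final.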
d - Adding a wallet, benchmarks and explorer
In addition to the core blockchain part we have released, we’ve also built a very user-friendly wallet allowing transactions to be sent, some basic performance benchmarks to be run, and a block explorer allowing account balances to be checked. And we’ve been extremely excited to release the first version of the prototype at the end of July 2018.
As we approach the release of the first version of the testnet, we want to share what we’ve learned and make the prototype open source, allowing anyone to check, review and play with it. — Elrond github link
e - How to detect when someone is selling you BS about their blockchain performance
While our prototype is still in its infancy, and likely has many undiscovered bugs and many ways it can be improved, it has served as a significant first step and validation point for our architecture. Although we have not focused on measuring transaction performance, we’ve made sure to stay as close as possible to real-world scenarios, refusing to artificially inflate TPS as has been common with a non-negligible number of projects in the space.
If we want decentralization to become more than a mere marketing ploy, it is critical that the protocol run on average consumer-grade computers (dual core, 4 GB RAM) with realistic internet speeds (10 MB/s).
Our initial testing environment used T2.medium AWS instances, where, with 10 shards and 250 nodes, we exceeded 1000 TPS.
To understand how ridiculous the latest performance claims have become, let’s assume that a transaction is 0.3 KB in size and that we could process 1 million transactions per second.
Each second we would generate 300MB of transaction data that would need to be synchronized across shards/the network. Additionally, other data like blocks (headers, receipts, signatures, tx hashes), consensus messages, requests/responses for synchronization of new nodes would impose even more overhead on communication.
Without sharding, every node would have to synchronize all this information (>300 MB) each second. Newly added nodes would need bandwidth much greater than 300 MB/s in order to catch up to the current state.
An approach that does not combine all sharding types (network, transaction and state) would bring only a small improvement in communication and storage requirements. 300 MB/s translates to 17.58 GB/min, ~1054.6 GB/hour and ~25312.5 GB/day. This means at least a 3 Gbps network is required, and for desynchronized nodes definitely more. Such huge requirements for network speed, processing power and storage lead to centralization.
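The arithmetic above, reproduced as a sanity check (all figures are the article’s illustrative assumptions; note the text converts MB to GB with a factor of 1024):

```go
package main

import "fmt"

func main() {
	// A 0.3 KB transaction at 1 million TPS.
	const txKB = 0.3
	const tps = 1_000_000

	mbPerSec := txKB * tps / 1000    // 300 MB of raw tx data per second
	gbPerMin := mbPerSec * 60 / 1024 // ≈ 17.58 GB/min
	gbPerHour := gbPerMin * 60       // ≈ 1054.7 GB/hour
	gbPerDay := gbPerHour * 24       // ≈ 25312.5 GB/day
	gbps := mbPerSec * 8 / 1000      // ≈ 2.4 Gbps raw; ≥3 Gbps with protocol overhead

	fmt.Printf("%.0f MB/s, %.2f GB/min, %.1f GB/hour, %.1f GB/day, %.1f Gbps\n",
		mbPerSec, gbPerMin, gbPerHour, gbPerDay, gbps)
}
```

The raw payload alone already demands a multi-gigabit link per node; blocks, signatures and sync traffic only push the requirement higher, which is the centralization point being made.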
3 - How we’ve built things — moving from prototype to testnet
“If I had asked people what they wanted, they would have said faster horses”. — Henry Ford
The task of creating something new, opening the eyes of people to something they thought was essentially unsolvable until recently, is extremely difficult. Indeed, moving one step further and transforming the theoretical experiment into a working product, now that’s a different game altogether.
In the process of building our testnet we’ve researched literally hundreds of papers, debated and discussed countless potential solutions, reviewed and rewritten our code multiple times and stumbled into numerous roadblocks. But we are excited, indeed very excited to be so close to releasing our first version of the testnet.
But before telling you more about what we’ve gained by rewriting everything for our testnet, let me share some of the pains of building the first blocks.
a - Some context on the prototype
Creating our prototype has been a tricky and at times frustrating task. After deciding everything and publishing the first version of the whitepaper, our focus moved to implementing the data structures for the blockchain, disseminating these structures over the network, reaching consensus through Secure Proof of Stake and executing cross-shard transactions in our Adaptive State Sharding approach.
We chose Java as the preferred language for building our prototype. The reason behind this decision was that we wanted a mid-to-high-level language with cross-platform capabilities and native support for threads and web servers; perhaps most importantly, we already had experience with it, so we could iterate really fast.
Another reason was the availability of reliable libraries for components out of the scope of the prototype like: peer-to-peer networking (TomP2P), database (levelDB), cryptography (spongycastle) etc.
In hindsight this turned out to be somewhat counter-productive due to some problems we ran into while trying to complete our first iteration.
While the integration of the database and cryptography libraries went pretty smoothly, things changed with the TomP2P library. Sending and receiving blocks and transactions between tens of nodes in our testing environment worked fine, but after adding the functionality for a proto-consensus and sharding, the limitations of TomP2P became obvious.
For weeks in a row our whole focus was on fixing bugs and bypassing problems in external libraries, to the point of trying to get in touch with Tom, the developer in Zurich who had built the TomP2P library for his Master’s thesis a few years back.
But our efforts have paid off! We solved the problems and presented a demo of our prototype with cross-shard transactions at an average of 120 TPS per shard, bringing our combined throughput to over 1000 TPS with 10 shards and 250 nodes.
All in all we learned a good deal through this implementation and are happy to conclude that most of our assumptions proved right. More importantly, we’ve seen that our network model significantly decreases communication and storage needs, and dramatically increases throughput.
Having gained all these valuable insights, we’ve already moved to the next phase of our project: rewriting everything from scratch in Go for our testnet launch!
While Java was useful for building the prototype, we’ve learned that it is not the best-suited tool for our testnet task.
Preliminary benchmarks already show a 2–5x increase in performance, achieved by using more optimized Go libraries for p2p networking, cryptography, database, etc.
Things have changed a lot since our first paper release in May, which presented the first viable solution for Adaptive State Sharding and Secure Proof of Stake.
Since then, we’ve refined things and released the first version of our prototype at the end of July, demonstrating cross-shard transactions.
We are very pleased to finally make this prototype open source today, as we continue our intensive work on the next implementation in Go.
Based on latest results and feedback we’re also publishing a more refined version of our technical paper.
As exciting as this all is, it’s only the beginning. We cannot wait to share our next milestone: the testnet launch which should hopefully happen sometime in December.
We are strong believers in blockchain technology and its future impact, so despite the current market sentiment, we’re even more focused on innovating and developing Elrond.
Huge gratitude goes to our whole core team for putting in the effort and seeing things through.
We’d love to hear your thoughts and feedback.