How can the cost of running a blockchain full node be reduced? Here is a new light node solution

ULTRAIN
May 13, 2020

As blockchain networks keep running, the amount of node data keeps growing, and the cost of running a full node keeps rising. A full node today often stores hundreds of gigabytes, or even terabytes, of block data. On the one hand, this directly increases the storage cost of the machine running the full node: if we want to run a node on an ordinary PC, devoting most of its hard disk to blockchain data alone is clearly hard for people to accept. On the other hand, it greatly limits the application scenarios of blockchain technology: neither mobile phones nor IoT devices can currently carry several terabytes of storage.

To solve this problem, different blockchain platforms have proposed their own light node solutions; typical examples are Bitcoin’s SPV scheme and Ethereum’s state verification scheme. Both have shortcomings. Bitcoin’s SPV scheme can verify that a transaction has actually occurred, but it cannot verify the specific value of an account at a given moment. Ethereum’s state verification scheme can verify both that a transaction has occurred and the specific value of an account at a given moment, but because it requires state data to be written into the block header, it reduces the performance and robustness of consensus.

Building on Bitcoin’s SPV scheme and Ethereum’s state verification scheme, Ultrain proposes a new light node scheme. It meets the functional requirements of verifying that a transaction occurred and verifying the specific value of an account at a given time, and it also meets the consensus requirements for performance and robustness; its performance is more than 10 times that of the Ethereum scheme.

Nodes in light node mode do not need to synchronize huge amounts of block data. With Merkle verification of the world state, a node can check the correctness of a state value without replaying historical blocks, which enables more application scenarios (such as running light nodes on resource-constrained embedded devices). This article introduces the world state verification problem, the Bitcoin SPV scheme, the Ethereum world state verification scheme, and the scheme adopted by Ultrain.

1、 Amount of data a node must synchronize

A blockchain node network can be understood as a distributed state machine. Every node starts from the same genesis state and updates it block by block according to the transactions contained in each block, forming a constantly updated world state. Over time, the data stored locally on each node keeps growing; it consists mainly of historical block data and the constantly updated world state data. The blocks Bitcoin produced in the 135 months since its genesis (2009/01 to 2020/03) total 263.6 GB, and the chain is now growing by more than 4 GB per month.
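The state-machine view can be illustrated with a small sketch. The Python below is illustrative only: the `Tx` type, the `apply_block` and `replay` helpers, and the balances are hypothetical and are not taken from any real client. It shows why a full node that rebuilds the world state must process every historical block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    """Hypothetical transfer transaction used only for this illustration."""
    sender: str
    receiver: str
    amount: int


def apply_block(state: dict, block: list) -> dict:
    """Return the world state obtained by applying every transaction in `block`."""
    new_state = dict(state)
    for tx in block:
        if new_state.get(tx.sender, 0) < tx.amount:
            raise ValueError(f"insufficient balance for {tx.sender}")
        new_state[tx.sender] -= tx.amount
        new_state[tx.receiver] = new_state.get(tx.receiver, 0) + tx.amount
    return new_state


def replay(genesis: dict, blocks: list) -> dict:
    """Rebuild the current world state by replaying all blocks from genesis."""
    state = dict(genesis)
    for block in blocks:
        state = apply_block(state, block)
    return state


genesis = {"A": 10, "B": 10}
chain = [[Tx("A", "B", 1)], [Tx("B", "A", 3)]]
final_state = replay(genesis, chain)
print(final_state)  # {'A': 12, 'B': 8}
```

The cost of `replay` grows with the length of the chain, which is exactly the burden a light node tries to avoid.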

When an Ethereum full node synchronizes from the network in full mode (the node downloads all block headers and block bodies and replays the transactions in each block to generate the world state data), the data to be synchronized currently amounts to 200 to 300 GB. If historical world states are archived as well, the data volume exceeds 4 TB. Ethereum nodes also support a fast synchronization mode. It differs from full mode in that the transactions in the blocks are not replayed to reconstruct the world state; instead, the state data is downloaded from the network. In addition, Ethereum nodes support a light synchronization mode, in which the node synchronizes only block headers, not block bodies or state data, and fetches a block or state value from the network only when it is needed.

From the Bitcoin and Ethereum figures above, we can see that the amount of data needed to deploy and run a full node keeps increasing, which raises the requirements on node hardware and network bandwidth. A new node needs several days to synchronize the historical data before it can start participating in consensus and producing blocks, so the cost of running a full node is relatively high. The Bitcoin white paper describes the SPV scheme for light nodes, which allows payment verification without synchronizing all block data. Payment verification here only verifies that the payment transaction has occurred; it does not verify the legitimacy of the transaction (for example, whether the account balance is greater than the transfer amount).

2、 Bitcoin SPV scheme

SPV (Simplified Payment Verification) is described in the Bitcoin white paper. It is a technical scheme that lets a light node verify a payment without running a full node, by storing only the block headers rather than full blocks. A Bitcoin block is divided into a block header and a block body. The block header contains the essential attributes of the block and is only 80 bytes in size. The block body contains all transactions in the block. A block generally contains hundreds of transactions, and each transaction generally takes more than 400 bytes.

Bitcoin uses a Merkle tree to organize the transactions in a block: hash every transaction, group the hashes in pairs, hash each pair, and repeat the process on the resulting hashes until only one hash is left, which is called the Merkle root. The leaf nodes at the bottom of the Merkle tree are the hashes of the transaction data, and each parent node is the hash of its two child nodes combined, so the root is computed layer by layer from the bottom up. The Merkle root is stored in the Bitcoin block header.
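The construction can be sketched in a few lines of Python. This is a simplified illustration, not Bitcoin's exact serialization: it uses double SHA-256 and duplicates the last hash when a level has an odd number of nodes, as Bitcoin does, but it ignores Bitcoin's byte-order conventions, and the transaction bytes are placeholders.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses for transactions and Merkle nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(tx_hashes: list) -> bytes:
    """Compute the Merkle root from a list of transaction hashes (the leaves)."""
    if not tx_hashes:
        raise ValueError("a block contains at least one transaction")
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd number of nodes: pair the last node with a copy of itself.
            level.append(level[-1])
        # Each parent is the hash of its two children concatenated.
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Placeholder transactions, not real Bitcoin transaction encodings.
txs = [b"tx-a", b"tx-b", b"tx-c"]
leaves = [dsha256(tx) for tx in txs]
root = merkle_root(leaves)
print(root.hex())
```

Changing any single transaction changes its leaf hash and therefore every hash on the path up to the root, which is what makes the root a compact commitment to the whole transaction set.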

Once a light node has obtained all block headers on the longest chain from the blockchain network and stored them locally, it can perform SPV payment verification. The light client first computes the hash of the payment transaction to be verified. It then locates the block containing that transaction and obtains from the network the sequence of hashes (the Merkle path) needed to rebuild the Merkle root for that transaction. From the transaction hash and this hash sequence it computes a Merkle root and compares it with the Merkle root in the block header. If the two are equal, the transaction is proven to be included in the block.
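The verification steps above can be sketched as follows. This is a minimal illustration under stated assumptions: `merkle_path` stands in for what a full node would compute and send, `verify_payment` is what the light node runs, both names are hypothetical, and Bitcoin's byte-order and network message formats are omitted.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin Merkle nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_path(leaves: list, index: int):
    """Full-node side: return (root, path) for the leaf at `index`.

    Each path entry is (sibling_hash, sibling_is_right).
    """
    level = list(leaves)
    path = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        sibling = index ^ 1          # neighbour within the same pair
        path.append((level[sibling], sibling > index))
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], path


def verify_payment(tx_hash: bytes, path: list, header_root: bytes) -> bool:
    """Light-node side: recompute the root from the path and compare it."""
    node = tx_hash
    for sibling, sibling_is_right in path:
        node = dsha256(node + sibling) if sibling_is_right else dsha256(sibling + node)
    return node == header_root


leaves = [dsha256(f"tx-{i}".encode()) for i in range(5)]  # placeholder transactions
header_root, path = merkle_path(leaves, 4)
print(verify_payment(leaves[4], path, header_root))          # True
print(verify_payment(dsha256(b"forged"), path, header_root))  # False
```

The path contains only about log2(n) hashes for a block of n transactions, so the light node never needs the block body.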

A Bitcoin block header is always 80 bytes. With about 6 blocks per hour, or 52,560 blocks per year, headers add only about 4.2 MB of storage per year, so the SPV scheme greatly reduces the storage a light node needs. As the process above shows, however, Bitcoin’s SPV scheme only guarantees that the verified transaction has happened; it cannot be used to verify the changes to the world state caused by that transaction. Ethereum stores the Merkle root of the transactions in the block header, which supports an SPV scheme similar to Bitcoin’s; in addition, Ethereum stores the Merkle root of the world state in the block header, which can be used to verify world state values.
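The header storage estimate is simple arithmetic and can be checked directly; the constant names below are only for illustration.

```python
HEADER_BYTES = 80     # fixed size of a Bitcoin block header
BLOCKS_PER_HOUR = 6   # one block roughly every 10 minutes

blocks_per_year = BLOCKS_PER_HOUR * 24 * 365
header_bytes_per_year = HEADER_BYTES * blocks_per_year

print(blocks_per_year)                                 # 52560
print(header_bytes_per_year)                           # 4204800
print(f"{header_bytes_per_year / 1_000_000:.1f} MB")   # 4.2 MB
```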

3、 Ethereum world state verification scheme

An Ethereum account has four attributes: nonce, balance, storageRoot and codeHash. Ethereum uses a stateObject to manage the state of an account, and each account is uniquely identified by its address. All account objects are inserted one by one into a Merkle Patricia Trie (MPT) to form the state trie. The root field in the block header stores the root of the state trie, which is the hash of the world state.

Assume there are two accounts, A and B, on the Ethereum blockchain, each with an initial balance of 10 ether. In block number 1000 there is a transaction in which account A transfers 1 ether to account B, changing the balance of account A to 9 ether and the balance of account B to 11 ether (ignoring miner fees). In an Ethereum light client similar to Bitcoin SPV, we can use the Merkle root of the transactions stored in block 1000 to verify that account A did transfer 1 ether to account B. Using only the Merkle root of the transactions, however, we cannot verify whether the balance of account A is 9 ether (or whether the balance of account B is 11 ether); that is, we cannot check specific values of the world state.

The Merkle root of the world state in the Ethereum block header (the root field) can be used to verify specific values of the world state. This field is the hash of the RLP-encoded root node of the “state trie” in the Ethereum stateDB. Within a block, each account is represented by a stateObject and uniquely identified by its address, and its information is modified as related transactions execute. Inserting all account objects one by one into a Merkle Patricia Trie (MPT) forms the “state trie”.

EIP-1186 adds the eth_getProof interface to Ethereum. It returns the information (Merkle paths) required for Merkle verification of an account and its storage values, so that world state values can be verified. eth_getProof has three input parameters:

1) DATA, 20 bytes: address of the account (externally owned account or contract account)

2) ARRAY, 32 bytes each: storage keys whose values are to be proven

3) QUANTITY|TAG: the block number, or the string “latest” or “earliest”

The data returned by this interface is as follows:

1) balance: account balance

2) codeHash: hash of the contract account’s code; a fixed value is returned for an externally owned account

3) nonce: the nonce of the account, indicating how many transactions it has sent or how many contracts it has created

4) storageHash: Merkle root of the account’s storage (state values)

5) accountProof: array of Merkle tree nodes required for account verification

6) storageProof: array of the information required for state value verification, with the following fields:

- key: storage key of the state value

- value: specific value of the state

- proof: array of Merkle tree nodes required for state value verification
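As an illustration of the interface described above, the following sketch builds the JSON-RPC request body for eth_getProof and checks that a response carries the listed fields. It does not contact a node; the helper names `build_get_proof_request` and `has_expected_fields`, the all-zero address, and the all-zero storage key are placeholders chosen for this example.

```python
import json

# Fields of the eth_getProof result that the article lists.
EXPECTED_FIELDS = {
    "balance", "codeHash", "nonce", "storageHash", "accountProof", "storageProof",
}


def build_get_proof_request(address: str, storage_keys: list,
                            block: str = "latest", request_id: int = 1) -> str:
    """Return the JSON-RPC 2.0 request body for eth_getProof."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getProof",
        # Parameter order: account address, storage keys, block number or tag.
        "params": [address, storage_keys, block],
    }
    return json.dumps(payload)


def has_expected_fields(result: dict) -> bool:
    """Check that a decoded eth_getProof result contains the expected fields."""
    return EXPECTED_FIELDS.issubset(result)


placeholder_address = "0x" + "00" * 20      # 20-byte address (placeholder)
placeholder_key = "0x" + "00" * 32          # 32-byte storage key (placeholder)
request_body = build_get_proof_request(placeholder_address, [placeholder_key])
print(request_body)
```

A light node would POST this body to a node's JSON-RPC endpoint and then verify the returned proofs locally against a block header it already trusts.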

After the light node obtains the data above through the eth_getProof interface, it first builds the account object from the four returned attributes [nonce, balance, storageHash, codeHash], RLP-encodes it, and hashes the encoding. Together with accountProof, the resulting value can be verified against the Merkle root of the world state stored in the block header. Passing this check also confirms the correctness of the storageHash field, which is then combined with the Merkle sequence in storageProof to verify the state value. The relevant example verification code is as follows:

4、 Ultrain world state verification scheme

Ethereum uses an MPT (Merkle Patricia Trie) to manage all account objects. The state trie stores the information of all accounts, such as the balance, the number of transactions initiated, and the virtual machine code. As each transaction executes, the state trie is in fact constantly changing. After all transactions in a block have been executed, the root of the state trie, which commits to the current state of all accounts, is recorded in the block header.

Ultrain uses chainbase to store all world state information. Executing the transactions in a block updates the state values of the related objects in chainbase, so chainbase holds the real-time world state. When a client queries chainbase through the get table records interface, it is in effect trusting the interface provider and therefore the correctness of the data it returns; moreover, the interface can only provide the real-time state and cannot return a state value at a specified block height. The Ultrain world state verification scheme can query state information at a specified block height and provides the Merkle path needed to verify that information, which removes the trust dependence on the interface provider.

In the Ultrain world state verification scheme, at the end of transaction execution for a block, each miner node serializes all changes that the block’s transactions made to the world state (that is, the changes to the data stored in chainbase), arranges them in a fixed order as the leaf nodes of a Merkle tree, and computes layer by layer the Merkle root of the block’s world state changes. This Merkle root is kept in memory. When the block height reaches a certain interval (for example, every 100 blocks), the node computes another Merkle root over the world state changes of all blocks in the interval, and writes the result (called the cumulative Merkle root of world state changes) to the file system, together with the height and Merkle root of each block in the interval. To avoid affecting the performance of the main thread, this logic runs in a separate thread. Every miner node writes the cumulative Merkle root of world state changes into the system contract through an on-chain transaction, and the value takes effect only when more than 2/3 of the miners report the same Merkle root.
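The miner-side flow described above can be sketched as follows. This is not Ultrain's implementation: the article does not specify the hash function, the serialization, or the leaf ordering, so SHA-256, raw byte strings, list order, and all function names here are assumptions made for illustration.

```python
import hashlib
from collections import Counter


def h(data: bytes) -> bytes:
    # SHA-256 is an assumption; the article does not name the hash function.
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list) -> bytes:
    """Merkle root over serialized leaves; odd levels duplicate the last node."""
    if not leaves:
        return h(b"")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def block_change_root(changes: list) -> bytes:
    """Root over one block's serialized world state changes."""
    return merkle_root(changes)


def cumulative_root(block_roots: list) -> bytes:
    """Root over the per-block roots of one interval (e.g. 100 blocks)."""
    return merkle_root(block_roots)


def accepted_root(reports: dict, total_miners: int):
    """Return the root reported by more than 2/3 of all miners, else None."""
    if not reports:
        return None
    root, votes = Counter(reports.values()).most_common(1)[0]
    return root if 3 * votes > 2 * total_miners else None


# A two-block "interval" with made-up state changes.
interval_changes = [
    [b"balance:alice=9", b"balance:bob=11"],
    [b"balance:bob=10", b"balance:carol=1"],
]
block_roots = [block_change_root(c) for c in interval_changes]
checkpoint = cumulative_root(block_roots)

reports = {"miner1": checkpoint, "miner2": checkpoint,
           "miner3": checkpoint, "miner4": h(b"bad")}
result = accepted_root(reports, total_miners=4)
print(result == checkpoint)  # True: 3 of 4 miners agree, which exceeds 2/3
```

Because only one cumulative root per interval goes on chain, the per-block consensus path does not have to carry a state root in every block header.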

To provide the query function for world state verification, nodes that serve world state queries must be set up. A non-miner node (that is, a node that only receives transactions and blocks and does not participate in consensus) is selected to store all changes each block makes to the world state in the file system after transaction execution completes, to respond to data requests from the verification query node, and to send the state change data to the query node for storage. The query node uses RocksDB to store each block’s modifications to the world state, including block_num, sequence, and the modification content (the specific state content that was modified). While storing each block’s data, the query node, like a miner node, computes the Merkle root of the corresponding world state changes of each block and stores it in RocksDB. In addition, the query node provides a query interface that answers requests from light nodes.

When a light node queries and verifies a specific world state, it needs to go through the following three steps:

1) First, the get table rows interface is used to query the specific world state value. The interface takes as input the data table (code, scope, table) that holds the world state to be queried, together with the block height.

The returned data includes the specific state information (such as the account balance in the data field in the example) and the binary bytes corresponding to that information (the raw field in the example, which the light node can also generate itself from the ABI file of the contract). In addition, the interface returns the block_num and sequence corresponding to the modification of the world state.

2) Using the block height and sequence number returned by the get table rows interface, obtain the Merkle path needed to verify the state. The light node calls the following interface:

The paths field returned by this interface is the Merkle path:

3) Using the raw data returned by get table rows and the paths data returned by get table row proof, the light node can call the following interface to compute the Merkle root (the light node can also implement its own logic for computing the Merkle root from the raw and paths data):

The interface returns the computed Merkle root, which can be compared with the corresponding value that the miner nodes wrote into the system contract through a transaction. If the values are the same, the verification passes. (In this example, the height of the block containing the world state change is 34186; if the block interval chosen by the miner nodes is 100, the Merkle root corresponding to block height 34200 should be read from the system contract for the comparison.)
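The mapping from a block height to the checkpoint whose cumulative Merkle root should be read can be sketched as below. The function name is hypothetical, and the sketch assumes that intervals end at exact multiples of the interval length and that a block at such a multiple belongs to the interval ending there; the article only gives the 34186 to 34200 example.

```python
def checkpoint_height(block_height: int, interval: int = 100) -> int:
    """Round a block height up to the next multiple of `interval`."""
    if block_height <= 0 or interval <= 0:
        raise ValueError("block_height and interval must be positive")
    # Ceiling division using integers only.
    return -(-block_height // interval) * interval


print(checkpoint_height(34186))  # 34200
```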

To sum up, the interaction process between light nodes and query service nodes is shown in the following figure:

This article has described Bitcoin’s SPV scheme, Ethereum’s state verification scheme, and Ultrain’s state verification scheme. With these schemes, a light node does not need to synchronize a large amount of data to verify that a transaction actually happened or to verify the specific value of the world state at a given time, so the storage and computing resources it requires are greatly reduced. Running light nodes on resource-constrained devices, such as Internet of Things devices, combines the Internet of Things with blockchain technology and enables more business scenarios.
