XuperChain RD
Jun 28, 2019

XuperChain: A blockchain system that supports smart contract parallelization

Blockchain technology is no longer used only for cryptocurrency transactions; it also plays an important role in fields such as evidence preservation and commodity traceability. However, many existing blockchain systems execute smart contracts sequentially, so their performance falls short of business needs. We’ve created a new blockchain system that supports parallel execution and verification of smart contracts.

Source code: https://github.com/xuperchain/xuperunion

1. Introduction

Satoshi Nakamoto launched Bitcoin in 2009 [1], the first large-scale successful application of blockchain technology. In 2014, Vitalik Buterin invented Ethereum [2]. Unlike Bitcoin, Ethereum supports smart contracts, allowing developers to build blockchain-based applications in the Solidity language. In 2015, the Linux Foundation launched Hyperledger [3], which allows developers to write smart contracts in popular languages such as Go. At Baidu, we previously used Ethereum to develop blockchain applications, but found that its performance could not meet our business needs. The underlying reason is that smart contracts in Ethereum are both executed and verified serially, so a single node uses only one CPU core. In this paper, we propose a novel blockchain data model, XuperModel. Based on XuperModel, our blockchain system XuperChain can use multiple cores to execute and verify smart contracts.

2. Performance problem of smart contracts

In Ethereum [2], a miner first extracts an ordered array of transactions from the transaction pool, then executes the smart contracts invoked by those transactions one after another, and finally packages the transactions into a block. A verification node that receives this block must also execute the smart contracts one by one, in the same order in which the miner packaged the transactions, so that the miner node and the verification node arrive at the same final state.

This serial execution mode limits the performance of the blockchain system, because it cannot make full use of the computing power of multiple cores.

Hyperledger Fabric [3] proposes an approach in which the smart contract is first executed in advance on multiple endorsement nodes to obtain the read-write set and the signatures of the endorsement nodes. Then, the ordering node sorts the transactions and packages them into blocks on the chain. The advantage of this approach is that part of the transaction execution can be fully parallelized, but it also has limitations: the output of an unconfirmed transaction is not immediately visible, and a new transaction can only read data written by transactions that are already confirmed on the blockchain. This introduces a delay between the initiation of a contract call and the data update taking effect.

3. XuperChain makes smart contracts parallel

3.1 Architecture Overview

XuperChain is mainly composed of a virtual machine layer, a data bridge layer, and a model layer. The virtual machine layer interprets the bytecode of the contract; we currently support Solidity and WebAssembly. The bridge layer handles system calls made in the contract code, such as Get, Put, and Iterator. The model layer handles transaction commit, rollback, and data queries.

Figure 1 Architecture of XuperChain

The whole process goes like this. Stage-1: the client triggers the pre-execution of the smart contract. The virtual machine parses and executes the contract’s bytecode, and the bridge layer intercepts system calls such as Get and Put made during execution, records the read and write sets, and finally returns them to the client. Stage-2: the client assembles a transaction containing the read-write set, attaches his or her signature, and submits it to the model layer. The model layer verifies that the versions of the variables in the read-set match the local state, and then writes the values in the write-set to the state database.

3.2 XuperModel

To describe read-write sets, we define a new transaction model called XuperModel. This model is an evolution of Bitcoin’s UTXO model. In the UTXO model, each transaction must reference the output of an earlier transaction in an input field to prove the source of the funds. Similarly, in the XuperModel, the data read by each transaction must reference the data written by an earlier transaction. The input of a transaction represents the source of the data read during the execution of a smart contract, that is, which transaction’s output it came from. The output of a transaction represents the data that the transaction writes to the state database, which future transactions will reference when they execute a smart contract.

To illustrate the XuperModel, consider two transactions, tx1 and tx2, in which tx1 assigns the value 1 to the variable a and tx2 assigns the value 2 to the variable b. A third transaction, tx3, then calls a contract that swaps the values of the two variables, so its output is a=2 and b=1. The input of tx3 therefore refers to tx1 and tx2, because the previous values were assigned by tx1 and tx2.

Figure 2 An illustration of XuperModel
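The tx1/tx2/tx3 example can be written down as plain data. The following is a minimal sketch, not XuperChain's actual transaction format: the class names `TxInput`, `TxOutput`, `Tx` and the helper `swap` are hypothetical, and transaction ids are short labels rather than real digests.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TxInput:
    key: str         # variable name that the contract read
    ref_txid: str    # transaction whose output last wrote this variable
    ref_offset: int  # position of that write in the referenced outputs


@dataclass
class TxOutput:
    key: str    # variable name that the contract wrote
    value: int  # new value of the variable


@dataclass
class Tx:
    txid: str
    inputs: List[TxInput] = field(default_factory=list)
    outputs: List[TxOutput] = field(default_factory=list)


# tx1 and tx2 only write, so they have outputs and no inputs.
tx1 = Tx("tx1", outputs=[TxOutput("a", 1)])
tx2 = Tx("tx2", outputs=[TxOutput("b", 2)])


def swap(txid: str, prev_a: Tx, prev_b: Tx) -> Tx:
    """Build a transaction that swaps the values written by prev_a and prev_b."""
    a_out, b_out = prev_a.outputs[0], prev_b.outputs[0]
    return Tx(
        txid,
        # Inputs point at the outputs that supplied the values read.
        inputs=[
            TxInput(a_out.key, prev_a.txid, 0),
            TxInput(b_out.key, prev_b.txid, 0),
        ],
        # Outputs carry the swapped values.
        outputs=[
            TxOutput(a_out.key, b_out.value),
            TxOutput(b_out.key, a_out.value),
        ],
    )


tx3 = swap("tx3", tx1, tx2)
for out in tx3.outputs:
    print(out.key, "=", out.value)
```

Just as a Bitcoin input names the earlier output it spends, each input of tx3 names the earlier output it read, which is what later makes conflicts detectable.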

3.3 Smart cache of XuperModel

To obtain the read-write sets of a contract at runtime, a smart cache is provided for each contract while it is being pre-executed. The cache has read-only access to the state DB, and it produces the read-write sets and results of the contract pre-execution. It is also used for contract verification. The cache consists of four parts: a write-set instance, a read-set instance, a state DB reference, and a penetration flag that indicates whether queries may fall through to the DB.

When pre-executing a contract, the penetration flag is set to true, and the bridge obtains a cache for the contract. Data in the state DB can be read through this cache, and the data queried is cached in the read-set of the cache. The contract can also write data, and the data written takes effect only in the write-set of the cache. After pre-execution, the read-write sets are fetched together from the cache and returned to the client.

When verifying a contract, the penetration flag is false, and the verification node generates a new cache instance initialized from the transaction’s read-write sets. The node executes the contract again, but the contract can only read data from the prepared read-set. Similarly, data written takes effect only in the write-set of the cache instance.

To illustrate the cache, consider a contract call. In XuperChain, a contract is called by launching a transaction. During pre-execution, the smart cache acts as a three-level storage object. When the contract invokes a Get operation with a variable name as its parameter, the cache first looks for the data in its write-set; if it is not found there, the cache looks in its read-set; and if it is still not found, the cache reads it from the DB and records the variable name and version in the read-set. During verification, the cache acts as a two-level storage object. When the contract being verified invokes a Get operation, the cache first looks for the latest data in the write-set and, if it is not found there, in the read-set.

Figure 3 An illustration of XuperModel’s Cache
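The lookup order of the cache can be sketched as follows. This is an illustration under stated assumptions rather than the project's implementation: the class name `SmartCache` and its method names are hypothetical, and the state DB is modelled as a plain dict that maps a variable name to a `(value, version)` pair.

```python
class SmartCache:
    """Sketch of the smart cache: write-set, read-set, DB reference, flag."""

    def __init__(self, state_db, penetrate, read_set=None):
        self.state_db = state_db               # reference only, never written here
        self.penetrate = penetrate             # True: pre-execution, False: verification
        self.read_set = dict(read_set or {})   # key -> (value, version)
        self.write_set = {}                    # key -> value

    def get(self, key):
        # Level 1: the contract's own pending writes.
        if key in self.write_set:
            return self.write_set[key]
        # Level 2: values that were already read (or supplied by the transaction).
        if key in self.read_set:
            return self.read_set[key][0]
        # Level 3: the state DB, reachable only during pre-execution.
        if not self.penetrate:
            raise KeyError(f"{key!r} is not in the transaction's read-set")
        value, version = self.state_db.get(key, (None, None))
        self.read_set[key] = (value, version)  # remember name and version
        return value

    def put(self, key, value):
        # Writes never reach the state DB; they stay in the write-set.
        self.write_set[key] = value


state_db = {"a": (1, ("tx1", 0)), "b": (2, ("tx2", 0))}

# Pre-execution of a swap contract: three-level lookup.
pre = SmartCache(state_db, penetrate=True)
a, b = pre.get("a"), pre.get("b")
pre.put("a", b)
pre.put("b", a)

# Verification: two-level lookup, seeded with the read-set from above.
ver = SmartCache(state_db, penetrate=False, read_set=pre.read_set)
print(ver.get("a"), pre.get("a"), state_db["a"][0])
```

Because the DB is only ever read, discarding a cache instance discards every effect of the call, which is what allows many calls to be pre-executed side by side.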

3.4 Transaction conflict processing

As mentioned earlier, the smart cache extracts the read-write set generated by the pre-execution of a smart contract, and this set is an important part of the transaction information. In XuperChain, a read-write set consists of a read-set and a write-set. The read-set is composed of tuples {variable name, data version}, which reflect the state of each variable read while the contract was running. The write-set is composed of tuples {variable name, data value}, which represent the changes made to the state database after the contract is executed.

The data version is a tuple {RefTxid, RefOffset}, giving the id of the last transaction that modified the variable and the output offset within that transaction. The id of a transaction is the SHA-256 digest of all the fields of the transaction. The data version in Hyperledger Fabric [3] is also a tuple, but it is made up of {BlockHeight, TxNumber}, which exist only once a transaction has been included in a block, so Fabric cannot make the output of an unconfirmed transaction immediately available.
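A short sketch shows why a version of the form {RefTxid, RefOffset} is available before confirmation. The names `DataVersion` and `make_txid` are hypothetical, and hashing a canonical JSON encoding is an assumption made only for illustration; the real system defines its own serialization of transaction fields.

```python
import hashlib
import json
from typing import NamedTuple


class DataVersion(NamedTuple):
    ref_txid: str    # id of the last transaction that modified the variable
    ref_offset: int  # output offset within that transaction


def make_txid(tx_fields: dict) -> str:
    """SHA-256 digest over the transaction's fields.

    Assumption: canonical JSON (sorted keys, no extra whitespace) stands in
    for the real wire format.
    """
    payload = json.dumps(tx_fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# The id depends only on the transaction's own content, so the version of
# variable a is known as soon as tx1 exists, before any block contains it.
tx1_fields = {"inputs": [], "outputs": [{"key": "a", "value": 1}]}
tx1_id = make_txid(tx1_fields)
version_of_a = DataVersion(ref_txid=tx1_id, ref_offset=0)
print(version_of_a)
```

A {BlockHeight, TxNumber} pair, by contrast, cannot be computed until the ordering step has placed the transaction in a block.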

When a node receives a transaction, it first constructs a temporary cache from the read-write set carried by the transaction and uses it to verify that the result of the smart contract execution is correct. After this check passes, the node verifies that the version of each variable in the read-set is consistent with the version recorded in the local state database; if not, it rejects the transaction. Note that XuperChain requires every variable name in the write-set to have a corresponding data version in the read-set; if the variable had no value before, the version field is left blank. In addition, if there is a read-write set conflict between a transaction in the local unconfirmed transaction pool and a confirmed transaction in a block, the unconfirmed transaction is rolled back. Because the read-write set is explicit, the rollback simply restores the variables in the write-set to the previous versions declared in the read-set.
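The version check and the rollback can be sketched with a toy state database. Everything named here (`StateDB`, `apply`, `rollback`) is hypothetical; the sketch keeps, for each variable, the version of its latest write and, for each version, the value written, and it represents a blank version as `None`. It only handles rolling back the most recent writer of each variable.

```python
from typing import Dict, List, Optional, Tuple

Version = Optional[Tuple[str, int]]  # (RefTxid, RefOffset); None means blank


class StateDB:
    def __init__(self) -> None:
        self.latest: Dict[str, Version] = {}            # key -> latest version
        self.values: Dict[Tuple[str, int], object] = {}  # version -> value

    def get(self, key: str):
        version = self.latest.get(key)
        return None if version is None else self.values[version]

    def apply(self, txid: str, read_set: Dict[str, Version],
              write_set: List[Tuple[str, object]]) -> bool:
        # Every written variable must declare its previous version.
        if any(key not in read_set for key, _ in write_set):
            return False
        # Every declared version must still be the latest one locally.
        if any(self.latest.get(key) != ver for key, ver in read_set.items()):
            return False
        for offset, (key, value) in enumerate(write_set):
            self.values[(txid, offset)] = value
            self.latest[key] = (txid, offset)
        return True

    def rollback(self, txid: str, read_set: Dict[str, Version],
                 write_set: List[Tuple[str, object]]) -> None:
        # Restore each written variable to the version named in the read-set.
        for offset, (key, _) in enumerate(write_set):
            self.values.pop((txid, offset), None)
            previous = read_set[key]
            if previous is None:
                self.latest.pop(key, None)
            else:
                self.latest[key] = previous


db = StateDB()
ok1 = db.apply("tx1", {"a": None}, [("a", 1)])
ok2 = db.apply("tx2", {"b": None}, [("b", 2)])
tx3_reads = {"a": ("tx1", 0), "b": ("tx2", 0)}
tx3_writes = [("a", 2), ("b", 1)]
ok3 = db.apply("tx3", tx3_reads, tx3_writes)
# tx4 was pre-executed against the old value of a, so its read-set is stale.
stale = db.apply("tx4", {"a": ("tx1", 0)}, [("a", 9)])
after_swap = (db.get("a"), db.get("b"))
db.rollback("tx3", tx3_reads, tx3_writes)
after_rollback = (db.get("a"), db.get("b"))
print(ok3, stale, after_swap, after_rollback)
```

No undo log is needed in this sketch: the read-set of the transaction being rolled back already names the state to return to.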

3.5 Parallel pre-execution of smart contracts

With the XuperModel, the smart cache, and the versioned data described above, we can execute smart contracts in parallel. The bridge generates a new context for each contract call. This context contains an instance of the model cache and is valid only for the lifetime of the pre-execution. Read and write operations during pre-execution take effect only on this cache, so pre-executions are independent of each other and can be carried out in parallel.

Figure 4: Parallel pre-execution of smart contracts

The figure above illustrates how contracts run in parallel. Suppose that contract1, contract2 and contract3 are initiated simultaneously. XuperBridge initializes three cache instances, one per contract. Each cache records the read-write sets during contract execution and returns them to the user.
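The isolation between concurrent pre-executions can be sketched with a thread pool. The names `Context`, `pre_execute`, `swap_ab` and `double_c` are hypothetical, and the state DB is a shared dict of `(value, version)` pairs that is only read. The sketch demonstrates independence of the calls, not multi-core speedup, since Python threads do not run bytecode on several cores at once.

```python
from concurrent.futures import ThreadPoolExecutor

# Shared, read-only state: key -> (value, version).
STATE_DB = {"a": (1, ("tx1", 0)), "b": (2, ("tx2", 0)), "c": (3, ("tx0", 0))}


class Context:
    """Per-call context with a private read-set and write-set."""

    def __init__(self, state_db):
        self._db = state_db
        self.read_set = {}   # key -> (value, version)
        self.write_set = {}  # key -> value

    def get(self, key):
        if key in self.write_set:
            return self.write_set[key]
        if key not in self.read_set:
            self.read_set[key] = self._db[key]
        return self.read_set[key][0]

    def put(self, key, value):
        self.write_set[key] = value


def swap_ab(ctx):
    a, b = ctx.get("a"), ctx.get("b")
    ctx.put("a", b)
    ctx.put("b", a)


def double_c(ctx):
    ctx.put("c", ctx.get("c") * 2)


def pre_execute(contract):
    """Run one contract in a fresh context and return its read-write set."""
    ctx = Context(STATE_DB)
    contract(ctx)
    versions = {key: ver for key, (_, ver) in ctx.read_set.items()}
    return versions, dict(ctx.write_set)


# Three calls start together; each one gets its own context.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(pre_execute, [swap_ab, double_c, swap_ab]))

for read_versions, writes in results:
    print(read_versions, writes)
```

The two swap calls return identical read-write sets because neither sees the other's writes. Both pre-executions succeed, but only the first transaction accepted afterwards passes the version check.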

3.6 Parallel verification of smart contract

Once the contract has been pre-executed, the user on the client side receives the read-write sets from the XuperChain node. The user can then assemble a complete transaction locally from the read-write sets and a signature, and submit it to XuperChain immediately. The XuperChain node then verifies the contract. The verification process is shown in the following figure.

Figure 5: Parallel verification of smart contracts

The verification of a contract consists of three steps:

Step 1: XuperBridge initializes a new context and fills the cache in that context according to the read-set of the submitted transaction. If a version of the data specified in the read-set is not found in the local state DB, the data has already been changed by another contract; the initialization of the cache fails, and so does the contract.

Step 2: The node re-executes the contract to check that the resulting write-set is the same as the write-set contained in the transaction.

Step 3: If the two write-sets are the same, the transaction is confirmed; otherwise a failure is returned.

Conflicting contracts fail in Step 1. Since each contract is verified inside a separate cache, the verifications of different contracts are independent of each other, so Step 2 can run in parallel. In Step 3, we also double-check that the versions of the variables in the read-set are still valid.
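The three steps can be sketched in a single function. The function name `verify`, the dict layout of a transaction and the `contracts` registry are hypothetical; the state DB is again a dict of `(value, version)` pairs, and a contract is modelled as a Python function that receives `get` and `put` callbacks.

```python
def verify(tx, contracts, state_db):
    """Return True if the transaction's declared read-write set checks out."""

    def versions_current():
        return all(
            state_db.get(key, (None, None))[1] == version
            for key, version in tx["read_set"].items()
        )

    # Step 1: fill the cache from the read-set; fail on any stale version.
    if not versions_current():
        return False
    cached_reads = {key: state_db[key][0] for key in tx["read_set"]
                    if key in state_db}
    writes = {}

    def get(key):
        # Two-level lookup: write-set first, then the prepared read-set.
        if key in writes:
            return writes[key]
        return cached_reads[key]  # KeyError if the read was never declared

    def put(key, value):
        writes[key] = value

    # Step 2: re-execute the contract against the cache only.
    try:
        contracts[tx["contract"]](get, put)
    except KeyError:
        return False  # unknown contract or undeclared read

    # Step 3: compare write-sets and re-check that the versions still hold.
    return writes == tx["write_set"] and versions_current()


def swap(get, put):
    a, b = get("a"), get("b")
    put("a", b)
    put("b", a)


state_db = {"a": (1, ("tx1", 0)), "b": (2, ("tx2", 0))}
contracts = {"swap": swap}

good = {"contract": "swap",
        "read_set": {"a": ("tx1", 0), "b": ("tx2", 0)},
        "write_set": {"a": 2, "b": 1}}
forged = dict(good, write_set={"a": 100, "b": 1})               # wrong result
stale = dict(good, read_set={"a": ("tx0", 0), "b": ("tx2", 0)})  # old version

print(verify(good, contracts, state_db),
      verify(forged, contracts, state_db),
      verify(stale, contracts, state_db))
```

In this sketch Step 2 touches only the per-call `cached_reads` and `writes`, so many transactions could be re-executed at once; the repeated version check in Step 3 covers state changes that land while Step 2 is running.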

3.7 Limitations

With the XuperModel described above, our system can run contracts in parallel, and the output of an unconfirmed transaction is immediately visible. However, the model also has some limitations. First, each access to the state DB during contract execution adds a record to the read-set, so if a user accesses a large amount of data during a contract call, the read-set becomes very large in bytes. In addition, the storage requirement of this model is large and needs further optimization.

4. Conclusion

This paper proposes a new blockchain architecture for the parallel execution of smart contracts. It divides the execution of a smart contract into two distinct phases, pre-execution and verification, and resolves transaction conflicts in a parallel environment through XuperModel, a data model evolved from the UTXO model [1]. There is currently a lot of academic research on how to improve the performance of smart contracts. Some of it starts from the underlying storage engine [4], and some uses STM technology [5] to parallelize contract execution. Our approach based on read-write sets is deterministic and easy to implement. In industry, Hyperledger Fabric [3] has also proposed a model based on read-write sets. The main disadvantage of Fabric’s model is that the data version is bound to the block height, so the output of an unconfirmed transaction cannot be used immediately. In conclusion, we believe that the architecture proposed by XuperChain is a useful step in exploring the parallelization of smart contracts. In the future, we will continue to address the limitations of this model and improve its performance.

5. References

[1] Satoshi Nakamoto, Bitcoin: A Peer-to-Peer Electronic Cash System, 2008.

[2] Ethereum White Paper. A Next Generation Smart Contract & Decentralized Application Platform, available: https://github.com/ethereum/wiki/wiki/White-Paper, November 12, 2015.

[3] Fabric docs. available: https://github.com/hyperledger/fabric/blob/release-1.4/docs/source/whatis.md, 2015.

[4] S. Wang, T. T. A. Dinh, Q. Lin, Z. Xie, M. Zhang, Q. Cai, G. Chen, B. C. Ooi and P. Ruan. An Efficient Storage Engine for Blockchain and Forkable Applications. PVLDB 11(10): 1137–1150.

[5] T. Dickerson, P. Gazzillo, M. Herlihy, and E. Koskinen. Adding concurrency to smart contracts. In PODC, 2017.