Three months into “a very long bear market”, the rally led by EOS that began in mid-April opened up the prospect of a broad bull market in digital currency. However, hackers then exploited a loophole in the BeautyChain (BEC) smart contract to obtain tokens illegally and at will, a similar loophole was exploited in SmartMesh (SMT), and large-scale abnormal transactions took place on OKEx. The entire market immediately began to fluctuate dramatically. Why has such a small loophole caused such a huge fuss? It is hard not to ask this question once the incident is understood.
Technical flaws in smart contracts and their solutions
The two technical flaws in smart contracts
In fact, this incident has exposed two flaws in blockchain 2.0 technology, as represented by Ethereum:
- Smart contracts are not smart enough;
- Smart contracts lack both a mechanism for ensuring security and the security tools to support it.
Smart contracts are at the core of blockchain 2.0. When hackers can easily exploit a loophole in a smart contract and do as they please, it is akin to shaking the foundation of a whole building, and it inevitably causes panic in the market and across the rest of the network.
The additive overflow loophole: a bloody incident caused by simple addition!
The lesson of the SMT loophole can be summed up in one sentence: an additive overflow was exploited to bypass the security check and obtain a huge token balance. First, look at this section of code; the crucial point is line 206 (Figure 1):
Etherscan link is as follows:
The attack by the hackers and the results are as follows:
The wealth obtained by the hacker:
It can be seen that the hacker acquired a huge balance out of thin air. This number exceeds the total amount of money issued in the whole world. What great wealth! As a result, however, the SmartMesh money supply collapsed instantly, because this balance far exceeded the cap on the total supply of SMT.
The SMT incident can be summarized in a few words: a bloody incident caused by simple addition.
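The additive overflow described above can be reproduced in a few lines. The following is a minimal Python sketch, not the SMT contract itself: the function and account names are hypothetical, and the shape of the balance check is an assumption based on the description in this section. Python integers do not overflow, so the 256-bit wrap-around of the EVM is simulated with a mask.

```python
UINT256_MAX = 2**256 - 1  # largest value an EVM uint256 can hold


def add_uint256(a: int, b: int) -> int:
    """Unchecked addition as the EVM performs it: the result wraps modulo 2**256."""
    return (a + b) & UINT256_MAX


def transfer_with_fee(balances: dict, sender: str, to: str, relayer: str,
                      value: int, fee: int) -> None:
    """Hypothetical fee-paying transfer guarded only by an unchecked sum."""
    # Flawed check: value + fee can wrap around to a small number.
    if balances[sender] < add_uint256(value, fee):
        raise ValueError("insufficient balance")
    balances[to] = add_uint256(balances[to], value)
    balances[relayer] = add_uint256(balances[relayer], fee)
    balances[sender] -= add_uint256(value, fee)


# The attacker owns nothing, yet picks value and fee so that their sum is 2**256,
# which wraps to 0 and passes the balance check.
balances = {"attacker": 0, "accomplice": 0, "relayer": 0}
transfer_with_fee(balances, "attacker", "accomplice", "relayer",
                  value=2**255, fee=2**255)
print(balances["accomplice"] == 2**255)  # the check was bypassed
```

The specific values used here are chosen only for illustration; any pair whose sum wraps past the 256-bit limit behaves the same way.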
The multiplicative overflow loophole: a bloody incident caused by simple multiplication!
BEC is a similar case. In line 257 of the code in Figure 2, a multiplication overflows the integer range:
Contract code address: https://etherscan.io/address/0xc5d105e63711398af9bbff092d4b6769c82f793d#code
The attack launched by the hacker is as follows, along with transfer records:
Instantly, the whole world belonged to this hacker. Again, a bloody incident caused by simple multiplication.
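The multiplication case can be sketched the same way. This is an illustrative Python model under stated assumptions, not the BEC source: `batch_transfer`, the ledger layout and the account names are hypothetical, and the 256-bit wrap-around is simulated with a mask.

```python
UINT256_MAX = 2**256 - 1  # largest value an EVM uint256 can hold


def mul_uint256(a: int, b: int) -> int:
    """Unchecked multiplication as the EVM performs it: wraps modulo 2**256."""
    return (a * b) & UINT256_MAX


def batch_transfer(ledger: dict, sender: str, receivers: list, value: int) -> None:
    """Hypothetical batch transfer that sends `value` to every receiver."""
    cnt = len(receivers)
    amount = mul_uint256(cnt, value)  # flawed: the product may wrap to 0
    if cnt == 0 or value == 0:
        raise ValueError("nothing to transfer")
    if ledger[sender] < amount:
        raise ValueError("insufficient balance")
    ledger[sender] -= amount
    for receiver in receivers:
        ledger[receiver] = ledger.get(receiver, 0) + value


# 2 receivers * 2**255 = 2**256, which wraps to 0, so an empty account passes.
ledger = {"attacker": 0}
batch_transfer(ledger, "attacker", ["wallet_a", "wallet_b"], 2**255)
print(ledger["wallet_a"] == 2**255 and ledger["attacker"] == 0)
```

The sender is debited the wrapped amount (zero) while each receiver is credited the full value, which is why the balances appear without basis.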
It can be seen from this that the security of smart contracts dramatically affects the foundation of blockchain 2.0 as a whole.
From a user’s perspective, current smart contracts are unattended programs that execute mechanically to provide automatic guarantees. They can simply and automatically release and transfer funds when particular conditions are met. Technically speaking, a smart contract is a kind of network service: to execute the program for a particular contract, network consensus must be reached on the blockchain. Because it is based on consensus, any smart contract code on the blockchain, together with its state, must be public and open to historical review. As a result, any hacker can calmly and quietly study every line of code that might be exploited, like lions roaming deep in the grasslands until they find the poor antelopes (in this case, contracts) and catch them by surprise. Even after contracts have been exploited by hackers, poorly written contracts are in many cases still shamefully on display. Some viewers feel pity, some poke fun, some give a deep sigh, and one or two of the younger ones even say: “This is how it should be!”
As we know, open-source code contains roughly one security loophole per 1,000 lines. Even Linux kernel 2.6, one of the best in this regard, has a security bug rate of 0.127 per thousand lines of code. Smart contracts, however, are a new thing. The programmers who write them have not gone through strict training and testing, so their code is reliable only to a very limited degree. In Table 1, we have compiled statistics on the internal function calls of over 8000 contracts deployed on Ethereum from January to April 2018.
Table 1 Statistics on function calls for smart contracts on Ethereum
It can be seen that only a small portion of contracts use the safe functions for addition, subtraction, multiplication and division, while nearly every contract is able to perform transfers. In all probability, the best days for hackers are yet to come, and security breaches in the digital currency market could arrive more regularly than subway trains. Ethereum is only a blockchain that records the results of Dapp execution; it does not itself provide the UTXO model needed for double-entry bookkeeping of cryptocurrency. Ethereum instead uses a balance field to express account balances, and this is essentially single-entry bookkeeping, the most primitive way of keeping books since ancient times.
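The safe arithmetic functions mentioned above are small wrappers that refuse to return a wrapped result. A minimal Python sketch of the idea follows; the function names are hypothetical and only mirror the general style of checked-math helpers, and the 256-bit limit is modelled explicitly because Python integers are unbounded.

```python
UINT256_MAX = 2**256 - 1  # largest value an EVM uint256 can hold


def safe_add(a: int, b: int) -> int:
    """Addition that fails instead of wrapping past 2**256 - 1."""
    result = a + b
    if result > UINT256_MAX:
        raise ArithmeticError("addition overflow")
    return result


def safe_sub(a: int, b: int) -> int:
    """Subtraction that fails instead of wrapping below zero."""
    if b > a:
        raise ArithmeticError("subtraction underflow")
    return a - b


def safe_mul(a: int, b: int) -> int:
    """Multiplication that fails instead of wrapping past 2**256 - 1."""
    result = a * b
    if result > UINT256_MAX:
        raise ArithmeticError("multiplication overflow")
    return result


def safe_div(a: int, b: int) -> int:
    """Integer division that fails on a zero divisor."""
    if b == 0:
        raise ArithmeticError("division by zero")
    return a // b


# With checked helpers, the overflowing inputs are rejected outright.
try:
    safe_mul(2, 2**255)
    rejected = False
except ArithmeticError:
    rejected = True
print(rejected)
```

A transfer routine that computes its totals through helpers like these stops at the arithmetic step, before any balance check can be fooled.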
How can we change the status quo? As Mr. Lu Xun said, “a real, brave warrior dares to face a miserable life.” Specializing in blockchain, we firmly believe that smart contracts are a concept that goes beyond this era, but the existing methods of its realization need to change.
The three challenges facing smart contracts
Existing smart contracts need to have three issues resolved:
1. The issue of security;
2. The issue of reliability;
3. The issue of ease of use.
For the issues of reliability and ease of use, we can rely on artificial intelligence and other relevant technologies for solutions, and this article mainly deals with how to solve the issue of security.
Intelligent contracts, the solution to smart contracts
If you really want to solve the security issue of smart contracts, you have to design a complete, comprehensive protective system and keep improving it. It must include:
1. Pre-protection: standardization of the code-writing process and loophole detection at code release;
2. In-process verification: code execution and dynamic security checks are carried out in the smart contract virtual machine;
3. After-incident compensation: the results of smart contract execution are audited to ensure that no errors occurred during execution and that the results fall within a reliable range. Stakeholders can file a complaint at any time, which will be ruled upon with proper consideration.
We call these kinds of smart contracts, which support a complete protective system, intelligent contracts.
Had BEC and SMT been deployed as intelligent contracts, the contracts would have received protection at several levels, and with it several opportunities for “starting over again, thanks to the kindness of the Gods.” Typical opportunities include:
1. Verification and examination when code is written and released: whether the programmer is willing or not, every code release goes through automatic, routine verification and examination, so that the static code must pass the examination and typical overflow loopholes have nowhere to hide.
2. Dynamic verification by nodes during contract execution: this covers the contract and its associated contracts, and the state of the execution process is examined so as to catch any loophole that shows up at run time. Even if a hacker exploits a loophole, all the contract executors carry out strict examinations and suspend the process.
3. Rational judgment after contract execution: the results of executing a contract are judged against certain rules, and artificial intelligence is introduced to analyze the reasonable range for the execution results, so as to decide on the final output and whether the intent was met. For example, accounts can be examined at a higher level for double-spend scenarios.
4. A mechanism for stakeholders to file complaints, with automatic judgment: each node on which intelligent contracts are deployed has a built-in judging mechanism and an artificial intelligence examination mechanism, and supports vote-based decision-making, so as to preserve some opportunity for recovering a loss.
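The rule-based part of the post-execution judgment in item 3 can be illustrated with a simple ledger audit. This is a minimal sketch under stated assumptions: `audit_execution`, the account names and the two rules (transfers conserve the total, and no balance may exceed the supply cap) are hypothetical examples of "certain rules", and the AI-based range analysis is not modelled.

```python
def audit_execution(before: dict, after: dict, supply_cap: int) -> list:
    """Compare balances before and after a transfer and list rule violations."""
    findings = []
    # Rule 1 (assumed): a transfer moves tokens, it never creates or destroys them.
    if sum(after.values()) != sum(before.values()):
        findings.append("total of all balances changed during a transfer")
    # Rule 2 (assumed): every balance stays within [0, supply cap].
    for account, balance in after.items():
        if balance < 0 or balance > supply_cap:
            findings.append(f"balance of {account} is outside [0, supply cap]")
    return findings


before = {"attacker": 0, "wallet": 0, "honest": 1000}

# An ordinary transfer passes the audit.
normal = audit_execution(before, {"attacker": 0, "wallet": 400, "honest": 600}, 1000)

# A balance that appears without basis, as in the overflow incidents, is flagged.
suspicious = audit_execution(before, {"attacker": 0, "wallet": 2**255, "honest": 1000}, 1000)

print(normal)
print(len(suspicious))
```

A node that finds a non-empty list could withhold the result or escalate it to the complaint mechanism in item 4.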
In fact, the following technologies are essential for the basic framework of intelligent contracts:
➢ Syntactic checks, based on a rule library
➢ Transaction model identification and security check based on semantic analysis
➢ Security check of smart contracts based on AI formal verification
➢ Dynamic verification and security optimization based on deep neural networks
Realization of the advanced technology of MATRIX intelligent contracts
Syntactic check, based on rule library
Given a smart contract program, MATRIX’s built-in compiler constructs a BNF-based AST as an internal representation. For smart contracts that have already been compiled into bytecode files, MATRIX first disassembles the binary code and then produces a corresponding BNF. Based on our rule library, which is built from domain knowledge and historical experience, the compiler uses recursive descent parsing to check the AST for security vulnerabilities.
At the syntactic level, MATRIX’s compiler identifies the respective finite state machines and data flow graphs from the program. It then performs rule-based checking and code revision.
Typical examples include:
1. Supplementing all conditional clauses to prevent execution problems due to incomplete conditions;
2. Analyzing all public members and functions that are called to determine the level of exposure in contracts;
3. Checking whether transaction steps are complete to make sure that conditions are met and complete.
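To give a feel for a rule library, the sketch below scans contract source text line by line. It is deliberately simplified: regular expressions stand in for the AST walk described above, and the two rules, their names and the sample source are hypothetical illustrations, not MATRIX’s actual rule library.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    name: str
    pattern: "re.Pattern"
    message: str


# Hypothetical rule library: each rule is a name, a pattern and advice.
RULES = [
    Rule("unchecked-arithmetic",
         re.compile(r"\b\w+(\([^()]*\))?\s*[+*]\s*\w+"),
         "raw + or * on values; consider a checked helper"),
    Rule("unchecked-send",
         re.compile(r"^\s*\w[\w.\[\]]*\.send\("),
         "return value of send() is ignored"),
]


def check_source(source: str) -> list:
    """Return (line number, rule name, message) for every rule that matches."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        code = line.split("//", 1)[0]  # ignore trailing line comments
        for rule in RULES:
            if rule.pattern.search(code):
                findings.append((lineno, rule.name, rule.message))
    return findings


SAMPLE = """uint256 amount = uint256(cnt) * _value;
require(balances[msg.sender] >= amount);
msg.sender.send(amount);
require(msg.sender.send(amount));
"""

findings = check_source(SAMPLE)
for lineno, name, message in findings:
    print(lineno, name, message)
```

A real checker would work on the parsed tree rather than on text, which avoids false matches inside strings and lets a rule reason about types and scopes.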
Transaction model identification and security check based on semantic analysis
At the semantic level, MATRIX’s compiler provides contextual checks to identify operations that violate rules or are unsafe. Typical examples include:
1. Checking objects and methods that have to be exposed to external environments to check their necessity and potential flaws;
2. Validating whether contract branches or processing of ORACLE are completed and whether there are other abnormal operations when a contract is called;
3. Checking the same conditions in different options to avoid anomalies as a result of different call sequences.
With the above static semantic analysis, MATRIX can generally eliminate the surface-level logic flaws in human-written smart contracts, but it still cannot solve the various logic problems that occur during dynamic execution. These problems include:
1. Failure to deal with combined contract conditions resulting from inaccurate and incomplete code;
2. The relatively large difference between personal, intended contract objectives and real, written code;
3. Because a contract is executed in a distributed manner, the sequence in which code is executed differs between nodes. As a result, when any abnormality happens to a local contract, other contracts can call the contract or change its various modes, giving rise to all kinds of insecurity.
At the core of MATRIX is AI-aided computing, with built-in AI features at all levels. For contract verification, MATRIX therefore uses AI-aided formal verification together with dynamic constraint checking to solve the above security problems. The core steps include:
1. Using pattern matching to obtain the user’s real demand constraints: basic pattern matching is performed on the rule-compliant abstract syntax tree produced by semantic analysis, to obtain the basic transaction models the user may have intended. This method can statically obtain local matches for most branches of the abstract syntax tree. MATRIX selects candidate models, or a combination of models, according to how closely they match, and then adds transaction closure and transaction assertions based on the model.
2. Classifying the abstract syntax tree produced by static semantic analysis with the Bayes classifier, which is MATRIX’s AI engine, to assign every branch of the tree to its corresponding category. In MATRIX, each transaction category has corresponding static and dynamic constraints.
3. Collecting all the static and dynamic constraints of the current contract from the results of pattern matching and AI classification, then generating assertions for the contract code based on these constraints, and carrying out formal verification and dynamic verification based on these results.
For the contracts where model matching fails or classification fails, MATRIX will give a security warning against the contract’s reliability to users, and carry out stricter checks during execution.
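The classification step, including the case where classification fails, can be sketched with a small naive Bayes model over the identifiers found in a syntax-tree branch. Everything here is an assumption for illustration: the text names a Bayes classifier but gives no details, so the multinomial model with Laplace smoothing, the two categories, the training tokens and the margin threshold are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: identifiers seen in branches of known categories.
SAMPLES = [
    (["balances", "transfer", "require", "sub", "add"], "token_transfer"),
    (["transfer", "balances", "allowance", "approve"], "token_transfer"),
    (["bid", "highestBid", "auctionEnd", "withdraw"], "auction"),
    (["bid", "bidder", "auctionEnd", "refund"], "auction"),
]


def train(samples):
    """Count categories and tokens; return the counts and the vocabulary."""
    class_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, category in samples:
        class_counts[category] += 1
        token_counts[category].update(tokens)
        vocab.update(tokens)
    return class_counts, token_counts, vocab


def classify(model, tokens, min_margin=1.0):
    """Return the best category, or None when the evidence is too weak."""
    class_counts, token_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for category, count in class_counts.items():
        denom = sum(token_counts[category].values()) + len(vocab)
        score = math.log(count / total)  # log prior
        for token in tokens:
            if token in vocab:  # unseen tokens carry no evidence
                score += math.log((token_counts[category][token] + 1) / denom)
        scores[category] = score
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if ranked[0][1] - ranked[1][1] < min_margin:
        return None  # classification fails: caller should warn the user
    return ranked[0][0]


model = train(SAMPLES)
print(classify(model, ["balances", "transfer", "add"]))
print(classify(model, ["foo", "bar"]))
```

Returning `None` for a weak match corresponds to the case above in which the contract receives a security warning and stricter checks during execution.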
MATRIX supports semantic examination at the Bytecode level. At the core is the process of disassembling. Then, an abstract syntax tree is generated, before AI can be utilized for the matching of the syntax tree.
Security examination of smart contracts with formal, AI-based verification
The MATRIX blockchain is equipped with a formal verification framework to validate the security properties of smart contracts. Based on a functional programming language, the framework integrates an SMT solver and has a multitude of models and tools. It has been used for the verification of various software and encryption programs.
Figure 3 shows the formal verification flow for smart contracts. The verification tool-chain is capable of processing contracts at both the source-code and bytecode levels. The source code will be translated into an equivalent program in a functional programming language. The adoption of the functional programming model is to expose the hidden logic and ease the succeeding formal operations. The bytecode will be run on the MATRIX virtual machine, disassembled and transformed into an equivalent functional program. An equivalence checking operation and other consistency checking will be accomplished on the two functional programs.
With the functional programs, a set of property checkers and theorem provers will be applied to validate various security properties (e.g., whether the return value of the send() function has been checked).
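What a property checker does can be shown on a scaled-down model. The sketch below is not an SMT solver and not the MATRIX tool-chain: it brute-forces every input of a 4-bit machine word as a stand-in for solver-based search, and it checks an overflow property rather than the send() example. The function names and the property itself are assumptions chosen for illustration.

```python
from itertools import product

BITS = 4                  # tiny word size so that exhaustive search is instant
MASK = (1 << BITS) - 1    # largest value of the scaled-down unsigned word


def flawed_check(balance: int, value: int, fee: int) -> bool:
    """Balance check whose sum wraps around, as in an unchecked addition."""
    return balance >= ((value + fee) & MASK)


def fixed_check(balance: int, value: int, fee: int) -> bool:
    """Balance check that rejects sums too large for the word size."""
    total = value + fee
    return total <= MASK and balance >= total


def find_counterexample(check):
    """Property: whenever the check passes, value + fee really is <= balance.

    Returns the first (balance, value, fee) violating it, or None if it holds.
    """
    for balance, value, fee in product(range(MASK + 1), repeat=3):
        if check(balance, value, fee) and value + fee > balance:
            return balance, value, fee
    return None


print(find_counterexample(flawed_check))
print(find_counterexample(fixed_check))
```

A solver reaches the same kind of counterexample symbolically for full 256-bit words, where enumeration is impossible.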
One key characteristic of MATRIX is its use of AI to automatically identify program syntax to detect typical models and then automatically produce properties that satisfy security requirements. Given the smart contract program, MATRIX’s AI engine will automatically detect partial matching and complete matching to predict the behavior model of codes. Based on such models, the AI engine will produce a set of relevant constraints for in-depth, formal verification.
These patterns can be syntax or structure patterns (or a combination of the two). The former usually contains grammar and function features, while the latter contains structural features.
Dynamic verification and security optimization based on the deep neural network
Table 2 lists the Ethereum smart contracts’ vulnerabilities at three levels, namely: high-level programming language, bytecode, and blockchain.
Table 2. Vulnerabilities of Ethereum smart contracts
To address these issues, MATRIX has resorted to two dynamic approaches:
1. Generative adversarial network (GAN) based security verification;
2. Distributed concurrence-based dynamic model verification.
Security verification based on GAN
MATRIX only requires users to input the core elements (e.g., input, output, and transaction conditions) of a contract with a scripting language. Then, a code generator based on a deep neural network is able to automatically convert the script into an equivalent program. MATRIX adapts the recently developed generative adversarial network to accomplish dynamic security verification. As illustrated in Figure 5, the dynamic verification procedure can be coupled with the code generation framework in a closed loop. The GAN framework consists of two RNNs. One RNN is used to revise existing programs for smart contracts, while the other learns to generate hacker programs from random samples of a given probability distribution. After smart contract programs are generated, they will be deployed in the “sandbox simulation network (one that simulates a blockchain and where experiments can be conducted in a controlled manner)”, together with the corresponding hacker codes. The cost functions of these two networks are tied together so that the overall optimum result is achieved when the whole system reaches a Nash-Equilibrium. At this point, the revised program for smart contracts will have the highest level of security.
For the code generation step in Figure 5, a code-generating tool based on a recurrent neural network converts the script into smart contract code. The recurrent neural network is trained using existing smart contract programs and their input and output results as templates.
Distributed concurrence-based dynamic model verification
Besides the above general security verification and enhancement techniques, MATRIX also deploys customized tools for attacks, as follows:
(1) Contract sequence attack
This attack takes advantage of the fact that the execution of smart contracts is asynchronous and subject to dynamic change. Even if a contract is statically secure, it is still prone to dynamic attacks, unless the contract is designed as dynamically immutable. MATRIX uses machine learning techniques to defend contracts against such attacks. These techniques include relation-checking of the contract, to identify relational contract transactions. MATRIX also provides an asynchronous simulator to help identify “anomaly indicators” in this type of attack.
(2) Timestamp dependence attack
The root cause of this type of attack is the excessive discretion of miners. MATRIX uses AI to dynamically check timestamp dependence or random number dependence to avoid such behaviors.
(3) Mishandling of exceptions and reentrancy attacks
These attacks are in essence caused by anomalies triggered by the function calls of smart contracts. MATRIX uses a deep neural network to find the coding patterns leading to such vulnerabilities, obtain codebook signature libraries similar to hacking methods, and conduct static and dynamic reviews of the code base. The dynamic review is based on the constraints in the formal verification, the dynamic production of feature vectors, and the detection of targeted defects.
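The mechanics of a reentrancy attack are easiest to see in a small simulation. The sketch below models the attack in plain Python under stated assumptions: the two bank classes, the callback standing in for an external call, and the amounts are all hypothetical, and it illustrates the vulnerability itself rather than the neural-network detection described above.

```python
class VulnerableBank:
    """Pays out before clearing the balance, so a callback can re-enter."""

    def __init__(self):
        self.balances = {}
        self.vault = 0

    def deposit(self, who: str, amount: int) -> None:
        self.balances[who] = self.balances.get(who, 0) + amount
        self.vault += amount

    def withdraw(self, who: str, on_receive) -> None:
        amount = self.balances.get(who, 0)
        if amount == 0 or self.vault < amount:
            return
        self.vault -= amount
        on_receive(amount)       # external call happens before the state update
        self.balances[who] = 0


class SafeBank(VulnerableBank):
    """Clears the balance first, then makes the external call."""

    def withdraw(self, who: str, on_receive) -> None:
        amount = self.balances.get(who, 0)
        if amount == 0 or self.vault < amount:
            return
        self.balances[who] = 0   # state update happens before the external call
        self.vault -= amount
        on_receive(amount)


def attack(bank) -> int:
    """Deposit 10, then re-enter withdraw from inside the payment callback."""
    received = []

    def on_receive(amount: int) -> None:
        received.append(amount)
        bank.withdraw("attacker", on_receive)  # re-entrant call

    bank.deposit("victim", 90)
    bank.deposit("attacker", 10)
    bank.withdraw("attacker", on_receive)
    return sum(received)


print(attack(VulnerableBank()))  # drains the whole vault
print(attack(SafeBank()))        # receives only the attacker's own deposit
```

The only difference between the two classes is the order of the state update and the external call, which is exactly the kind of coding pattern a static or dynamic review would look for.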
As competition becomes increasingly fierce, all kinds of user demands are changing dramatically, and each new technology has a very short shelf life. From the perspective of blockchain industry development, smart contracts perfectly represent this “fast-changing world,” and no one knows what will happen next. What we do know is that the way to deal with our “fast-changing world” is to find what doesn’t change in a changing world, so as to calmly face the challenges that may arise at any time. At the core of this solution are intelligent contracts — a safe method of managing risks, based on artificial intelligence and proven through conventional financial systems.
About Bill Li:
A top Chinese expert in chip design who holds several chip patents, Li served as chief designer of China’s first WiFi chip. Later, as a member of the chief engineering team and chief engineer of the baseband project, he designed the communications, dispatching and commanding system for China’s first large-scale naval aircraft carrier. He has played a leading role in designing commercial chips that have gone into mass production, and has won several awards for scientific advancement at the provincial and ministerial levels. He authored the book “Communication IC Design”, which ranks as the bestseller in its category and was chosen by first-class universities, such as Beijing University of Posts and Telecommunications, as teaching material for graduate students in chip and communications systems design. He is an IoT expert and is currently technical director for a major national 5G IoT project. Bill serves as chief network architect for MATRIX AI Network.