Program Synthesis — Smart Contract Security 101
After reading this post you’ll know about — smart contract security, an application of Program Synthesis developed by Synthetic Minds and prospects of this industry. My previous post describes in details what Program Synthesis is and why you should care.
To recap the first post — Program Synthesis allows developers to create programs that in turn create other programs. The promise is very appealing but there are challenges that limit the applicability of Program Synthesis. One way is to find problems where Programs Synthesis adds significant business value. One such problem is smart contract security.
You have probably heard about blockchain technology — distributed systems technology that allows participating entities to transact with each other without a central authority. Without going into details, the main breakthrough of this technology is that participants do not have to trust each other in order to transact. The validity of actions is supported by math, encryption, and economic incentives. One of the most promising features within many blockchains is “smart contracts”. Smart contracts are perpetually running programs, i.e., apps, providing services independent of any controlling entity.
Smart contracts are de-facto computing scripts, written in Ethereum Virtual Machine (EVM) code or higher languages built on top of EVM, and typically implement specific functionality. For instance, a distributed computing platform on blockchain, such as Golem or Sonm, is a smart contract that governs the distribution of computation power for those who need it and rewards the sharers. Many things, from a small reminder app to a platform such as AWS, can theoretically be built as a smart contract.
In the most simplified case (permissionless blockchain, all-digital assets, etc) smart contracts have clear advantages. Because, IF properly developed, they are:
- Self-executable–smart contracts can be triggered by events, e.g., “IF event X happens, THEN do Y”. For instance for a decentralized stock exchange, “If I send money to a certain account, then change the owner of a certain stock to my name”. This might have some challenges if an external data provider is involved, but ideally, all data would be trustless and can sit on the blockchain. In this case, self-execution ensures that contracts execute independently of any third-parties, i.e., the stock exchange in the case above.
- Unequivocal–since the set of IF and THEN statement is clearly defined and encoded in the software, there should be no ambiguity about what happens when.
- Immutable–once the contract is deployed on the network, it cannot be changed. An exceptional cases is the underlying blockchain hard-forks in which the majority of network participants agree to implement a change.
The main challenge is that big and bold “IF properly developed”. One part of this challenge is that it is a “contract”. There should be clear business logic. Sometimes it is not trivial to design a contract that accounts for all possibilities. But the other part of the challenge is that it is “smart”, meaning that it is a program and therefore it brings a whole layer of digital security requirements.
The bugs in smart contracts of popular systems such as DAO and Parity (1st and 2nd attacks) alone led to more than $350m being stolen or frozen. Also, with smart contracts, because their executions are immutable, money can just become stuck and inaccessible, as has happened in the past.
Smart contract security
Software developers have been testing their software in different forms for years. So a natural question is–if smart contracts are pieces of software, why do we need a specific solution for smart contract security? Wouldn’t regular software testing suffice? There are a few things that differentiate software development from smart contract development.
- Smart Contract code is permanent. The modern software development practices such as CI/CD allow developers to release updates almost on a daily basis. Smart contract execution is immutable, and a developer cannot just release a patch to fix bugs. Theoretically, it is possible to have the entire network come to consensus about reverting, but it is a huge pain. For instance, the DAO held 14% of all ether when it crashed, motivating the Ethereum community to hard-fork. In the non-blockchain world, the analogy is to think about asking Amazon to change AWS’ underlying infrastructure to fix bugs in a customer’s cloud machines. Thus, the software development philosophy of continuous improvements is not entirely applicable to smart contract development.
- Smart Contracts’ code is public. In traditional software, customers only see the user-facing elements of applications and therefore have a limited-to-no understanding of the internal structure of the software. The deployed code for smart contracts is publicly visible, allowing decentralized execution. But this also adds more capabilities to hackers, as they have more information to find and exploit vulnerabilities. For many smart contracts, in addition to the deployed code, the source code is also public. Of course, open source is not a blockchain-specific phenomenon as many traditional products from Linux to Drupal are open-sourced. In contrast to those mature systems, standards and best practices are still being developed in this nascent domain, alongside the additional moving target of development happening at the infrastructure layer.
- A high value of dollars at risk per one line of code ($/LOC). The DAO was about 1200 lines of code and this is an average-size smart contract. The DAO breach was pegged at $180m lost, which represents $150k per line of code. Compared to non-blockchain software (e.g., Intuit Quickbooks is 10 million lines of code) the value of assets per line of code differs by orders of magnitude and therefore smart contracts require much more stringent security checks.
Companies addressing smart contract security
Within 3 last weeks, I have spoken to 16 companies from a broad range of industries from a security token platform in the US to a marketplace for classified goods in Singapore. There was unanimous agreement about the need and desire for smart contract-security, but the reason and methods differed from company to company.
There are two main ways how companies go about smart contract security — use smart contract security internally or hire audit firms. Most companies that I spoke to used some combination of both.
- Security tools. There are a plethora of tools that developers could use internally to check the security of developed smart contracts. The frequency of checks might range from every major build to every major release based on the complexity of smart contracts and needs. These tools are mostly open-sourced. Main tools that frequently came up in conversations are Mythril.ai, Securify, Oyente, QuantStamp (marketplace), KLab, Trail-of-bits, Octopus, Solidified. A common thread across these tools is that they check for known vulnerabilities. Depending on the method it might be more or less sufficient, but definitely not the ultimate security companies want. The output of such tools is usually the type of mistake and the line where it occurs. Companies frequently complain about the high level of false positives. False positives arise due to over-approximating the check against known patterns of bugs and not considering the intent amongst other issues.
- Audit firms. A gold standard in the blockchain industry is to hire auditors that do the code review. These firms might be: 1) smart contract security specific companies such as Hosho, 2) traditional big house such as EY or 3) blockchain native enterprises such as Consensys. Besides focused audit, the main value that customers get is a stamp of approval, i.e., an audit certificate, that augments the code’s credibility. One of the clients, that I talked too, was struck by the idea that in the case of a breach, audit firms bear little responsibility other than reputational risk. The financial burden lies with the developers. The result of the audit is usually a report that describes discovered flaws and classifies them by severity. Both DAO and Parity had (allegedly) passed audits before the attacks happened.
What about a different approach?
All companies that I talked to use at least some security tools. These tools are useful, but the tools do not understand the meaning and semantics of the code in sufficient depth. Hence the need for an expert in smart contract security to review the output of the tools. Very few companies have such experts in-house, and that is why audit firms thrive. Moreover, audit firms’ employees have seen hundreds of different cases and can conduct tests that are scoped a bit beyond known bugs. But still, it is not the authoritative security that companies desire. What if we had another method that combined the best of both worlds and did automated end-to-end verification exploring the attack space?
End-to-end verification for general software is close to intractable, and very hard without extensive annotations. However, smart contracts run within a restricted execution environment, e.g., EVM. This makes their verification/synthesis tractable. Even with upcoming blockchain scaling solutions, smart contract executions will always be restrictive. The reason is slightly technical, but it has to do with the fact that the stability of the entire network, i.e., the ability of miners to make progress, will be under question without restrictions. What this means is that the property that makes smart contracts possible on the blockchain, i.e., truncated executions using gas limits or similar techniques, also enable proofs about them.
I spent this winter at Synthetic Minds as employee #2. This is a YC company that raised $5.5m from Khosla Ventures and uses Program Synthesis to provide a managed service for smart contract security. In a nutshell, Synthetic Minds’ product is able to automatically read smart contract code, understand it, and then explore the space of potential attacks, both known and unknown. Moreover, using Program Synthesis the system synthesizes attacker’s code to allows customers to understand and examine how vulnerabilities might be exploited. Smart contract security is a vivid example of an area where Program Synthesis might solve the problem better and in a much more usable way than with other means.
As a side note, Synthetic Minds’ way of doing synthesis is different from the baseline CEGIS described in Post 1.
As with any applications of Program Synthesis, writing specifications for the smart contract’s functionality might be complex. The insight here is that specifications becomes simpler when written assuming an adversary. For instance: “don’t leak balance”, “keep user metadata disjoint”, “non-admins cannot update identities” can be stated in simpler ways by describing what the attacker can and cannot do. Once these requirements are stated, the automated system proves whether the conditions are met, or if they are not (=bug) it synthesizes an adversary’s code. Such method yields significant business advantages:
- Higher confidence because the system explores potential attackers, including those that we don’t yet know. With such verification, companies get guarantees which they don’t get with testing/auditing.
- The system doesn’t require specifications for each function and instead does end-to-end verification. To start using the system requires very little manual work.
- The output is explicit attacker’s code rather than a set of functions and labels in the code. Possible exploit scenario allows developers to speed up the debugging process, and use them as future test cases.
- Fewer false positives that helps developers focus on the important issues.
The system takes as input a smart contract in Solidity and either 1) assertions in the code or 2) plain English description of the properties e.g. “keep metadata disjoint” (which the Synthetic Minds team converts into assertions). Source languages besides Solidity will be added soon.
Geeky stuff or where is Program Synthesis?
Under the hood, the system works as follows. For every smart contract A and a number of properties (x,y,z) Synthetic Minds create a generic user X that is given access to functions within A. The system then uses symbolic execution over unknown programs to automatically analyze A across potential transactions and time. It then produces a number of theorems and proves them using Z3 theorem solver.
This solution explores the attack space and evaluates to either 1) “no violation”, in which case it generates an independently (machine) checkable proof-certificate or 2) synthesized code for an adversary. Synthetic Minds’ solution is looking for property conformance from first principles, rather than checking against past exploits. This is one of the major differences from other solutions. If there are no flaws, then Synthetic Minds produces a proof-certificate that companies can independently verify. The proof-certificate is mechanically check-able, and not just a PDF.
Program Synthesis is a very exciting space that has a huge potential to change not only specific applications such as smart contract security but the whole Software Development industry. For instance, another problem that Program Synthesis could solve is the scarcity of blockchain-focused software developers. From late 2017 to late 2018 the number of openings for blockchain-skilled developers rose by 400% to 12,000+. Moreover, the salaries are higher by an average of 10%–20% versus non-blockchain fields. Use of Program Synthesis could not only augment blockchain developers but also give capabilities of smart contract development to people without coding skills.
But even outside of blockchain there are a plethora of opportunities for Program Synthesis. Many applications from the pricing of derivatives to the training of robots are being tackled. I am sure that in the next years we’ll see many great products released by Synthetic Minds and other emerging companies in the space.
Synthetic Minds is a fast-growing company that has paying customers and huge backlog waiting to be on-boarded. They raised $5.5m in Oct 2018 from Y Combinator, Khosla Ventures, and Pantera Capital and are hiring engineers.