CertiK DeepSEA — a Talk by Dr. Vilhelm Sjöberg at ETHIS
Hello, I’m Vilhelm Sjöberg from CertiK.
First, I’ll start with… what is CertiK?
CertiK is a startup that was founded from a research group at Yale University, led by Professor Zhong Shao. We have been doing Formal Verification in academia, and now we’re applying it to smart contracts and blockchains.
Why is Formal Verification important?
The proposition of Formal Verification and the idea of smart contracts are actually very much related — they both intend to move trust from people to programs. The technology of blockchain allows the same program to be run on many independent computers, leveraging consensus algorithms to create an immutable record in which transactions can execute upon the occurrence of certain business logic rules. This is because blockchains enable trustless applications — -for example distributed exchanges or sharing-economy cloud computing — -which are based around smart contracts. The customers can audit the contract source code and know exactly what is going on, and the company can not cheat us by changing the rules after the fact.
However, if you move the trust from people to programs, then the programs need to be very trustworthy. And right now they are not. We all know lots of examples where a very tiny fault — -for example an integer overflow or a reentrancy error — -caused a contract to be completely insecure and led to huge losses. The root cause here, of course, is that once you have uploaded a smart contract, there is no way to fix errors in it, and it’s also fully open for hackers to search and uncover problems. This is not a coincidence — -auditability and immutability are the fundamental reasons of using smart contracts in the first place. But it means that the security demands in this environment are really high. Ethereum contracts developers are not particularly bad programmers, but the industry-standard software development methods do not provide high enough quality for these applications.
In particular, if you consider traditional quality assurance methods, there are a few common ones in industry. You can do manual code audits. But this is expensive, because you need an expert developer to look at each line of every program, and there are no guarantees that you find all errors.
Or you can use various forms of testing. Most industry software is tested this way, but the problem is that you never have any guarantee that your test cases have good enough coverage to find all bugs.
Sometimes you can use a runtime monitor, such as an operating system, to check each operation as it happens. In the blockchain setting this is not so easy, both because the runtime checks may be too expensive, and because many security problems can’t be detected at runtime, due to the fact that the monitor doesn’t have enough information to know if a given action is good or not. In general, standard software development methods do not provide the high level of quality that we need.
Fortunately, computer science actually has an method that can help here, which is called Formal Verification. The idea is that you would use a mathematical proof to show that your program behaves correctly, and then run software checks on the proof. If you prove correctness as a theorem, then you know that it will hold for absolutely all inputs, not just for a few specific test cases. Among all the methods of ensuring that software works correctly without bugs or security flaws, Formal Verification is the gold standard, providing the most reliable assurance. However, although Formal Verification has been studied for decades in academia, it requires very specialized skills and has not yet been widely used in industry. With the rise of fully automated smart contracts, the ideal use-case seems to have been created.
The foundation of CertiK began when our research group at Yale used Formal Verification to develop a multi-core operating system kernel. This was one of the most complex pieces of software that had ever been Formally Verified, and it is on the cutting edge of computer science research. It has led to quite a few conference papers and is widely considered a very impressive accomplishment.
The key technical innovation that let us verify the CertiKOS kernel was something we call “deep specifications and abstraction layers.” The idea is that, when you write a piece of code, you also write a specification, which states exactly what the code is doing. Then, you write a proof that shows that, indeed, the code correctly implements this specification. But because the specification is “deep,” you only have to write such a proof once, and afterward, all the work can be done using only the specification, without having to deal with the code again.
We then have a library written in the Coq proof assistant, which allows us compose such layers together. For example, you can take two layers and put them next to each other, in order to get a module which implements a larger interface. Or you can stack two layers on top of each other, in order to satisfy the imports of the upper layer from the lower layer. As you do this, the layers library will also join together the proofs for the individual layers, so you automatically get a correctness proof for the entire layer.
We used this method to verify the CertiKOS kernel. The kernel is made up of hundreds of these layers, which are composed together to form the full kernel.
The beginnings of this project go back to 2015, when we started developing the deep specification concept to use for OS verification and worked on programming language support to do this kind of development. In 2016, this method of deep specification had started having an impact in academia. Also in 2016, we began thinking about applying Formal Verification in industry, and started a startup to do so. And from 2017 onwards, we decided that, currently, the most important applications for Formal Verification are in smart contracts and blockchains, and have been focusing the CertiK efforts on that.
CertiK is already a commercial company, and you can contract with us to apply Formal Verification to your contracts. But at the same time, we also intend to do more forward-looking research, which will be published in research papers to advance state-of-the-art blockchain verification.
For example, one of our research projects, which is currently in progress, is called the DeepSEA programming language.
To set the stage, let me start by saying that there is a whole range of methods that fit under the label “Formal Verification.” If we zoom in on that circle, there is actually a spectrum where you can put in more effort in order to get more perfect assurances.
In all these cases of Formal Verification, the software will give you a guarantee that a security property holds for all possible inputs. But depending on how much manual work you do, it is possible to prove more interesting properties.
Firstly, at what I would call the most shallow end, is verification of generic correctness properties. A good example is integer overflows. This is a good property to prove, because it doesn’t require any context. If a particular operation in your program may overflow, that’s probably a problem, so you can say that that particular line is bad, without knowing very much about what it’s supposed to do. These kind of properties can often be proved completely automatically, using tools such as SMT solvers.
For example, CertiK has a tool in which you can enter a program written in Solidity, and it will either prove that there is no possibility of integer overflows, or it highlights the lines of code that look suspicious because overflow may happen. This can run without any user interaction at all, you simply upload your Solidity program and click a button.
Second, at an intermediate point, there is the idea of pre- and post-conditions. This is the idea that you write down comments in your program saying “this is what should be true before calling this method, and this is what will be true after calling it.” CertiK also has a tool that can do this: you add comments to your Solidity contracts, and then the tool proves that they are satisfied. In this case, you need a person to think hard about what properties should be true for the program, but once you have annotated it, the proofs are again done automatically at the click of a button. I think this kind of technology is very useful for proving the correctness of individual contracts and individual methods, so they are suitable for cases where your application is implemented by one stand-alone contract.
Lastly, at what I would call the “deep” end of the spectrum, we might want to prove arbitrary, complex functional specifications. This is the case in which the specifications you are proving are sufficiently complicated, so not only do you need a human developer to think about what you want to prove, but the proofs themselves are so non-trivial that you can’t expect a computer program to do them automatically. In this space, you use interactive proof assistant tools, the most famous examples are Coq and Isabelle. Those are the kind of tools that we used to verify the CertiKOS kernel. This method is suitable for verifying entire systems.
By “systems” I mean this: if you open any exciting white paper, you will find diagrams like this, where you do not have just a single contract, but many contracts calling each other, off-chain participants occasionally calling the on-chain contracts, multiple interacting chains.
And of course, this is also the type of diagram that you see in operating systems. We believe that, once you start getting into these kind of applications, you need to use interactive proof assistants, as well as the technology that we developed for the CertiKOS kernel.
One of those technologies is DeepSEA. It is a way to import contracts into the Coq interactive proof assistant, so you can work with it. The example program on the slide is a very simple payment channel contract. You probably know how those work: the contract stores a deposit, two parties can interact off-chain, and in order to tear down the channel, you call the “trigger” method to begin the settlement phase, which presents the latest off-chain message.
To write this in the DeepSEA language, we write the contract as an “object”, and each object has some fields which will be put in Ethereum storage. Then, we have have some contract methods. This is very similar to a Solidity program. But the key point is that for each method, the DeepSEA compiler will automatically generate a corresponding function written in Coq’s native programming language. The generated method is a pure function that takes some contract state and returns the updated state. This provides a model of the contract, which you can directly load into Coq and write proofs about.
Furthermore, the implementation of DeepSEA contains a verified compiler, which proves that the EVM byte code generated by the compiler corresponds to those Coq functions. In the end, you can reason about your contract as if it was written in a high-level functional language, but rather than having to trust that the compiler for that language was implemented correctly, the only thing you need to trust is a very small and simple model of the EVM virtual machine itself.
Once you have the program loaded in Coq, you can prove theorems about it. So for instance, for the payment channel, you may want to prove that no matter what your counterparty does, there is always some way for you to force the contract into a good state where you get your deposit back. This is actually a nice example of why deep Formal Verification is useful, because if you think about it, this property is somewhat subtle. You expect that the settlement phase will usually be very simple: you just call Trigger, update with your last message, and then you are done. But you don’t actually know what your counterparty will do: at any point in this transaction she may also call the contract, or try to send you messages, and so on. As a result, you don’t know exactly how many steps the settlement will take, and you just want to prove that it will finish after some finite number of steps. The Coq language is rich enough that you can directly write down that kind of property saying that “this process will eventually finish.”
DeepSEA is one ongoing research project. We are also interested in several other research projects. First, we want to apply our technology to verify more interesting contracts. We particularly think that our technology would be suitable for ambitious applications involving a mix of on- and off-chain interactions.
We also want to reuse the work on verified compilation, both by reusing our verified EVM backend to implement verified compilers for more programming languages, as well as porting our tools to more targets than just the EVM. And finally, we are also interested in more general projects to develop an authoritative model of the smart contract environment, which we, and others, could build on to build verified contracts.
Of course we cannot do all of these tasks alone, so if you want to collaborate with us on any of these initiatives, please come talk to us!
To request the audit/verification of your smart contracts, please send email to email@example.com or visit our official website to submit your request today.
CertiK is a blockchain and smart contract verification platform founded by top Formal Verification experts from Yale and Columbia University and former senior software engineers from Google and Facebook. Different from the traditional testing approaches, CertiK attempts to mathematically prove blockchain ecosystem and smart contracts are hacker-resistant and bug-free. CertiK’s key features include a layer-based decomposition approach, pluggable proof engine, machine-checkable proof objects, certified dApp libraries, and smart labelling.