What is Program Analysis?
If you are interested in blockchain technology, you’ve likely heard of many incidents in which large amounts of funds were stolen due to an attacker exploiting some vulnerability in the application code. For example, the infamous DAO hack from 2016 allowed attackers to withdraw more money than they were eligible for by exploiting a vulnerability known as “reentrancy”. Another more recent incident is a flash loan attack that occurred as recently as April 17, 2022 and resulted in $182 million dollars in losses. While all of these attacks are enabled by an underlying security vulnerability in the source code, the good news is that there is program analysis technology that can be used to detect such vulnerabilities. In the next few blog posts, we explain what program analysis is and how it can help catch security vulnerabilities before deployment.
A Brief Intro to Program Analysis
Program analysis refers to a class of techniques that can be used to detect security vulnerabilities (as well as other kinds of logical errors) in programs. Program analysis comes in two main flavors, namely dynamic and static. Dynamic program analysis aims to detect problems by executing the code, while static analysis analyzes the source code without actually running the program. However, among these techniques, only static analysis can be used to guarantee the absence of a vulnerability in the program. In contrast, dynamic analysis can prove the presence of a problem but, unlike static analysis, it can never disprove it.
At first glance, static analysis may sound very mysterious: on the face of it, static analysis seems to violate a fundamental principle summarized by Rice’s theorem, which states that every non-trivial semantic property of a program is undecidable. Here, a semantic property is one about the program’s behavior (as opposed to its syntax), and a non-trivial property is one that holds for some programs but not others. More relevant to our topic at hand, the presence of a security vulnerability is a prime example of a non-trivial semantic property. Thus, given the question “Does this program have a security vulnerability?”, Rice’s theorem tells us that there is no algorithm that always terminates and correctly answers this question.
So, how is static analysis even possible then? The answer lies in the following observation: It’s true that we cannot have an algorithm that will always give the correct yes/no answer but we can have an algorithm that always answers “yes” if the program has a security vulnerability but it may sometimes also answer “yes” if the program does not have a vulnerability. In other words, as long as we are willing to tolerate some false positives, we can get around Rice’s theorem and undecidability!
Fundamentals of Static Analysis
Let’s now take a look at how static analysis works from a very high level. The fundamental principle underlying static analysis is to over-approximate the set of states that the program can be in. Let’s think of a program state as a mapping from variables to values. In general, it is impossible to have an algorithm that can determine the exact set of program states that might arise in any program execution. However, it is possible to approximate this set as illustrated below:
Here, the blue irregular shape corresponds to the actual set of states that can arise in some program execution, and the red region corresponds to “bad states” that indicate an error or security vulnerability. Due to undecidability, we can never have an algorithm that can determine exactly what the blue region is, but we can design algorithms that over-approximate this blue region in a systematic way, as indicated by the regular green region above. As long as the intersection between the green and red region is empty, we have a proof that the program doesn’t do something bad. However, if our over-approximation is not precise enough, it may overlap the red region even though the intersection of the blue and red regions is empty, as illustrated below:
This situation results in so-called “false positives”, which are spurious errors reported by the analysis that do not correspond to real problems. Generally speaking, the holy grail of static analysis is to construct over-approximations that (1) are sufficiently precise so we don’t get a lot of false positives in practice, and (2) are reasonably efficient to compute so the analysis scales to real-world programs we care about.
As a side note, it is also possible to design static analysis algorithms that under-approximate program behavior as depicted below:
In this case, the green region (computed by the static analysis) is contained inside the blue region (depicting the actual states) as opposed to the other way around. Such an analysis is unsound, meaning that it can miss real programs errors: as we see in the diagram above, the intersection of the green and red region is empty, so the analysis will not report problems even though the program is actually buggy. This results in so-called false negatives, which are real bugs that are missed by the static analysis.
In general, if we want to have provable security, we want sound static analyzers that never suffer from false negatives, while also being precise enough to not report too many false positives in practice. However, the good news is that decades of formal methods research has shown that designing such static analyzers are possible. In the next blog post, we will look into how static analyzers work in a little bit more detail!
Program analysis is a useful technology for catching security vulnerabilities in all kinds of programs, including blockchain applications. Furthermore, sound static analyzers that over-approximate program behavior can guarantee the absence of entire classes of vulnerabilities.
Veridise offers thorough and comprehensive security audits for blockchain applications. Leveraging our expertise in automated program analysis, Veridise provides state-of-the-art solutions for ensuring security of decentralized finance applications. If you’re interested in learning more information, please consider following us on social media or visiting our website: