Concolic Execution and Code Coverage with Triton

Published in

Salesforce Engineering

8 min readJan 10, 2018

In this post, we wanted to introduce Salesforce RElabs group, one of our teams in charge of finding 0-days in third party applications that we use every day. We will discuss how we use Triton, a dynamic binary analysis framework, to execute code coverage analysis in compiled programs that help us to find low level vulnerabilities. We’ll examine the basic concepts of Triton and discuss the limitations and successes of Triton.

Code coverage

What is code coverage? According to Wikipedia:

…code coverage is a measure used to describe the degree to which the source code of a program is tested by a particular test suite. A program with high code coverage has been more thoroughly tested and has a lower chance of containing software bugs than a program with low code coverage. Wikipedia — Code Coverage

Optimally, you want to maximize code coverage for any given program. In the case of Triton, there is no source code, so one should maximize code coverage with a compiled program.

How does one maximize code coverage? There are different techniques. For example, you can use fuzzing to try to reach as much code as possible. Below is an example of fuzzing for a simple code.

void f(int x, int y) {
  int z = 2*y;
  if (x == 100000) {
    if (x < z) {
      assert(0); /* error */
    }
  }
}

One does not have many odds to satisfy the two conditions randomly fuzzing the inputs x and y. In fact, there are just 1/4.294.967.296 (232in a 32-bit processor) odds to satisfy just the first condition.

Symbolic Execution

There is a better approach called symbolic execution:

In computer science, symbolic execution (also symbolic evaluation) is a means of analyzing a program to determine what inputs cause each part of a program to execute. An interpreter follows the program, assuming symbolic values for inputs rather than obtaining actual inputs as normal execution of the program would, a case of abstract interpretation. It thus arrives at expressions in terms of those symbols for expressions and variables in the program, and constraints in terms of those symbols for the possible outcomes of each conditional branch. Wikipedia — Code Coverarge

The idea is to be able to detect the different conditions in the code and solve the formulas to satisfy all the possible paths. Using the previous example, a symbolic execution engine would detect the first condition (x == 100000), solve the condition, and assign to the variable x the value 100000, all in just one execution.

But how is this symbolic execution engine able to solve this equation or complex ones? It uses a SMT solver (Satisfiability modulo theories). Some of the most popular SMT solvers are Z3 from Microsoft and STP. Wikipedia has more information.

Concolic Execution

Symbolic execution is not new. Since 1975 there has been references to this concept, but it has become more popular in the last ten years with the new SMT solvers and evolved versions of Symbolic Execution. One of them is called Concolic execution (a portmanteau of CONCrete and symbOLIC execution) or Dynamic Symbolic Execution.

What is the difference between Symbolic Execution and Concolic execution?

Implementation of traditional symbolic execution based testing requires the implementation of a full-fledged symbolic interpreter for a programming language. Concolic testing implementor noticed that implementation of a full-fledged symbolic execution can be avoided if symbolic execution can be piggy-backed with the normal execution of a program through instrumentation. This idea of simplifying implementation of symbolic execution gave birth to concolic testing*.

The idea is use instrumentation with Pin or DynamoRIO to concretely execute a program and only symbolicly execute the part we are interested in, saving a lot of resources.

Triton

What is Triton?

A Pin-based dynamic symbolic execution (DSE) framework. Although Triton is a DSE framework, it also provides internal components like a taint engine, a snapshot engine, translation of x86 and x86–64 instructions into SMT2-LIB, a Z3 interface to solve constraints and, the last but not least, Python bindings. Based on these components, you are able to build program analysis tools, automate reverse engineering and perform software verification*.

Triton is an Open Source project sponsored by Quarkslab and maintained mainly by Jonathan Salwan that lets you perform Concolic execution with compiled programs for Linux, OSX and Windows for Intel x86 and x86–64 architectures. Originally Triton did only support x64 binaries. At RELabs we worked to support x86 binaries. As well we worked in the Windows compatibility and now Triton can be used in Windows systems. The Triton project has some example tools to show its effectiveness. For example, it has a tool to show how to find format-string vulnerabilities using the taint engine, other tool to find use-after-free vulnerabilities or a tool to do code coverage in a program.

A code coverage tool

This last tool is explained in detail in the following post. (recommended reading)

That example uses a basic crackme that looks for the string “elite” in the first argument after a basic arithmetic operation.

char *serial = "\x31\x3e\x3d\x26\x31";
int check(char *ptr)
{
  int i = 0;
  while (i < 5){
    if (((ptr[i] - 1) ^ 0x55) != serial[i])
      return 1;
    i++;
  }
  return 0;
}

After six executions (all the possible paths), Triton is able to find the winning string, as shown below

[+] Model (bound 4) :
  - SymVar_0 (argc):0x2
  - SymVar_1 (argv[1][0]):0x65        e
  - SymVar_2 (argv[1][1]):0x6c        l
  - SymVar_3 (argv[1][2]):0x69        i
[+] Restoring snapshot
!!!!!!!!!!!!!!!!!!! WE'RE AT THE MAIN FUNCTION !!!!!!!!!!!!!!!!!!!!!
****** Bound of the current input is: 4
[+] Memory set with the SMT results
[+] Tainting argc (2) and argv 0x006cb7a0
[+] argv parameters tainted
[+] Branch detected at 0x401058 (crackme_xor.exe) [0]: 0x40105a or 0x40105f, Taken:Yes
[+] Branch detected at 0x40102e (crackme_xor.exe) [1]: 0x401030 or 0x401037, Taken:Yes
[+] Branch detected at 0x40102e (crackme_xor.exe) [2]: 0x401030 or 0x401037, Taken:Yes
[+] Branch detected at 0x40102e (crackme_xor.exe) [3]: 0x401030 or 0x401037, Taken:Yes
[+] Branch detected at 0x40102e (crackme_xor.exe) [4]: 0x401030 or 0x401037, Taken:No
fail[+] Model (bound 5) :
  - SymVar_0 (argc):0x2
  - SymVar_1 (argv[1][0]):0x65        e
  - SymVar_2 (argv[1][1]):0x6c        l
  - SymVar_3 (argv[1][2]):0x69        i
  - SymVar_4 (argv[1][3]):0x74        t
[+] Restoring snapshot
!!!!!!!!!!!!!!!!!!! WE'RE AT THE MAIN FUNCTION !!!!!!!!!!!!!!!!!!!!!
****** Bound of the current input is: 5
[+] Memory set with the SMT results
[+] Tainting argc (2) and argv 0x006cb7a0
[+] argv parameters tainted
[+] Branch detected at 0x401058 (crackme_xor.exe) [0]: 0x40105a or 0x40105f, Taken:Yes
[+] Branch detected at 0x40102e (crackme_xor.exe) [1]: 0x401030 or 0x401037, Taken:Yes
[+] Branch detected at 0x40102e (crackme_xor.exe) [2]: 0x401030 or 0x401037, Taken:Yes
[+] Branch detected at 0x40102e (crackme_xor.exe) [3]: 0x401030 or 0x401037, Taken:Yes
[+] Branch detected at 0x40102e (crackme_xor.exe) [4]: 0x401030 or 0x401037, Taken:Yes
[+] Branch detected at 0x40102e (crackme_xor.exe) [5]: 0x401030 or 0x401037, Taken:No
fail[+] Model (bound 6) :
  - SymVar_0 (argc):0x2
  - SymVar_1 (argv[1][0]):0x65        e
  - SymVar_2 (argv[1][1]):0x6c        l
  - SymVar_3 (argv[1][2]):0x69        i
  - SymVar_4 (argv[1][3]):0x74        t
  - SymVar_5 (argv[1][4]):0x65        e
[+] Restoring snapshot
!!!!!!!!!!!!!!!!!!! WE'RE AT THE MAIN FUNCTION !!!!!!!!!!!!!!!!!!!!!
****** Bound of the current input is: 6
[+] Memory set with the SMT results
[+] Tainting argc (2) and argv 0x006cb7a0
[+] argv parameters tainted
[+] Branch detected at 0x401058 (crackme_xor.exe) [0]: 0x40105a or 0x40105f, Taken:Yes
[+] Branch detected at 0x40102e (crackme_xor.exe) [1]: 0x401030 or 0x401037, Taken:Yes
[+] Branch detected at 0x40102e (crackme_xor.exe) [2]: 0x401030 or 0x401037, Taken:Yes
[+] Branch detected at 0x40102e (crackme_xor.exe) [3]: 0x401030 or 0x401037, Taken:Yes
[+] Branch detected at 0x40102e (crackme_xor.exe) [4]: 0x401030 or 0x401037, Taken:Yes
[+] Branch detected at 0x40102e (crackme_xor.exe) [5]: 0x401030 or 0x401037, Taken:Yes
Win
No more inputs, exiting...

Improvements

There are some improvements that can be added to this tool including:

Detect string/buffer length restrictions In the previous example the tool needs to know how big is the string the program is using — six characters, in this case. But it’s possible to detect when a program is checking for a string terminating null byte (\0) and use this to generate variable length strings.
Only evaluate tainted conditions Currently the tool is trying to solve all the detected conditions. This does not make sense if a condition is not tainted by user input; these conditions are avoidable.
Only generate symbolic expressions for tainted instructions If the goal is only to solve tainted conditions, you can omit all non-tainted instructions and conditions, saving a substantial amount of RAM
Simplifies expressions Triton is generating large expressions, but in the next release, some simplifications will be added.
Track every possible user input Use all the user-controlled data, like the arguments, the file reads, or the socket read.
Avoid overloads Do not translate code inside external functions without tainted arguments.

Use cases

Concolic execution can be used for different purposes including:

Code coverage to generate templates for an input, generating the maximum code coverage possible. You can use these templates to fuzz a program.
Taint all the user-controlled data and detect when a format string vulnerability appears.
With some big improvements, it can be used to detect stack overflows and heap overflows.

Source: http://triton.quarkslab.com/files/sthack2016-rthomas-jsalwan.pdf

Limitations

It’s important to understand the limitations of Concolic execution to know when it is appropriate to use it and when it is not.

The main disadvantage with Concolic execution is the path explosion. Here is an example:

#include<stdio.h>int main(int argc, char *argv[])
{
  int help_flag, file_flag, write_flag = 0;
  char *option = argv[1];
  while (*option != '\0')
  {
    switch (*option)
    {
      case 'h': help_flag = 1; break;
      case 'f': file_flag = 1; break;
      case 'w': write_flag = 1; break;
      default: printf("Invalid option\n");
    }
    option++;
  }
  return 0;
}

There are (string length)4 possibilities. For an argument 10 characters long, there are 10,000 possible paths. There are some algorithms used to approach this problem like path merging, RWSet or prioritizing certain paths to cover the maximum possible code.

Another problem is loop management, when the loop iteration length is user controlled. For example in the following code:

int seqSearch(const T list[], int listLength, T searchItem) {
    int loc;
    for (loc = 0; loc < listLength; loc++)
        if (list[loc] == searchItem)
            return loc;
    return -1;
}

If the user controls listLength there are 232 possible paths (all possible lengths). A technique to approach this problem known as “Loop-extended Concolic Execution“ is explained in this paper.

Success Stories

There are some similar tools using concolic execution with very good results including:

SAGE

SAGE is the Microsoft implementation with these statistics since 2007:

200+ machine years (in largest fuzzing lab in the world)
1 Billion+ constraints (largest SMT solver usage ever!)
100s of apps, 100s of bugs (missed by everything else…)
Ex: 1/3 of all Win7 WEX security bugs found by SAGE
Bug fixes shipped quietly (no MSRCs) to 1 Billion+ PCs — Millions of dollars saved (for Microsoft and the world)
SAGE is now used daily in Windows, Office, etc

Mayhem

Mayhem is a project originati/rom Carnegie Mellon University. Now it is the main product from the company For All Secure:

In June 2013, we downloaded 22,678 programs available through Debian Linux stable’s package management system and ran Mayhem for only 5 minutes per program. Mayhem generated over 5.5 million testcases automatically and found 4,801 bugs. ForAllSecure — Mayhem

Mayhem can detect automatically vulnerabilities and it is also capable to exploit them automatically.

Mayhem played and won the DARPACGC competition, an automated CTF challenge hosted by DARPA with a final prize of $2 million.

If you are interested in DARPACGC, you can watch the youtube videos from DEFCON 24.

Conclusion

As we have seen, Concolic Execution is a very powerful technique, and everyday it’s becoming easier to use with open-source projects like Triton.

In the next posts, we will talk about how we implemented some of the proposed improvements and we will show some real use cases.

(This post was originally published on 8–12–2016)