A review of “Return-Oriented Programming: Systems, Languages, and Applications.”
This post is a short review of the paper by Roemer et al. published in ACM Transactions on Information and System Security in March 2012.
The authors introduce a new attack named return-oriented programming. It allows adversaries to induce arbitrary behavior in the targeted system and evades many defenses like the W+X protection model because the attack does not introduce new code.
The paper makes the following three contributions:
- algorithms are provided to find snippets essential to designing return-oriented programming attacks in the machine’s existing code.
- a catalog of such snippets is provided for two architectures (Intel x86 and SPARC)
- an exploit language and compiler is offered through a programming interface
The W+X model marks memory either as writable or executable but never both simultaneously. It was implemented in various operating systems and processors to mitigate attacks like buffer overflows [Aleph One 1996] and its variants (e.g., format string vulnerabilities). In these attacks, the strategy is to subvert the program’s control flow before redirecting its execution to injected malicious code. Return-oriented programming was designed to evade code injection defenses like W+X protection.
Return-oriented programming is a generalization of the return-into-libc attacks, which only focus on redirecting the program’s execution to existing code instead of injecting new code: return-into-libc attacks used routines already found in the standard C library libc (e.g., wrapper for system calls). Thus, they evade defenses that are based on preventing adversaries from injecting new malicious code in the program’s runtime environment through attacks like buffer overflow.
A moderately effective defense against return-oriented programming is address space layout randomization. The most effective defenses have to prevent the program’s control flow itself from being hijacked. They are known as control flow integrity systems and induce limited overhead on the runtime of programs [Erlingsson et al. 2006, Abadi et al. 2009].
The attacks are constructed by linking together short code snippets found in the program’s address space. The linking is performed by controlling the stack and using the return instructions found at the end of each snippet. Using linked groups of such snippets, the adversary constructs gadgets that perform a well defined task when invoked.
An attacker capable of constructing a Turing-complete collection of gadgets is able to execute any arbitrary behavior on the machine.
Algorithms are designed specifically for the purpose of finding the snippets of code (cf. supra) used to build gadgets. Two strategies are presented to find snippets ending with a return instruction. The first is to perform static analysis on a disassembled library code. Each return or restore termination is used as a starting point for a reverse search of useful instruction sequences. The second is to find unintended instruction sequences by jumping into the middle of existing instructions and finding useful sequences terminating with a return instruction. Instructions found are stored in a prefix tree.
The paper presents implementations of return-oriented programming attacks on two different architectures: Intel x86 and SPARC. The catalog available for each architecture includes gadgets to perform: load/store (or memory and assignment), arithmetic, logic, control flow, system call, function calls, shell code. Using this catalog, attackers can force arbitrary behavior.
Finally, the authors designed a gadget exploit language and compiler to simplify the creation of return-oriented programs. The programming interface allows a programmer to assemble an exploit payload made up of as many gadgets as desired. The payload is then placed in a string buffer overflow payload on which a vulnerable application is invoked.
Since they re-use existing code, return-oriented programming attacks are difficult to defend against. Instead, the most prominent avenue for defenses is to stop a program’s control flow from being subverted, for instance with control flow integrity systems.