Security Analysis of Graphics Drivers
…and a research opening in my team at Imperial College London
I’m really excited about a new project that we just started, sponsored by Google’s Chrome University Research Program, on looking for possible security-related defects in graphics drivers.
I am hiring a postdoctoral researcher to work in this area (and will consider predoctoral candidates who have enough experience). Please spread the word to anyone you think might be interested!
To complement the regular job ad, I thought I’d write a few notes on why I think this is an interesting and challenging area for research.
Drivers are hard, graphics drivers are harder
It’s well known that getting drivers right is difficult, and a lot of energy has been put into improving driver quality. A really notable example is Microsoft’s Static Driver Verifier project, which allows one to check that drivers written against Microsoft APIs satisfy a number of basic, but important, properties.
Graphics drivers — for programming models such as DirectX, OpenGL and Vulkan — are arguably the toughest sorts of driver to get right. A graphics device is really complex, so interaction between user applications, the OS and the graphics driver is intricate. Furthermore, in most systems the display is critical to system usability. This means that operating systems and drivers have to include complex logic to defend against rogue use of the GPU, to avoid display freezes and corruption.
Another issue is that video memory often contains personal data, so that driver issues that compromise this memory are particularly serious.
The web makes drivers an open attack surface
Did you click the link and play with the page a bit? What happened?
WebGL is a really exciting technology that brings 3D graphics to the browser. It’s an enabler for next-generation applications.
The problem with WebGL, though, is that it creates a pathway through which arbitrary, untrusted GPU code from the web can run directly on an end user’s system, interacting with their GPU driver and hardware.
The link is to a simple WebGL page that tries to run a graphics shader that will execute a very long-running loop nest. What happens when a shader runs for a long time is pretty much undefined. If you reloaded the page a few times you may have experienced some browser hangs, perhaps some glitches in your display, and (I experienced this when writing this story) in the worst case a system freeze.
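To give a flavour of what such a shader might look like, here is a minimal sketch that generates GLSL ES fragment shader source containing a loop nest. It is not the shader used on the linked page; the function name, the default depth and the loop bound are all my own assumptions for illustration. The loops use constant bounds, as WebGL 1 shaders require, and the bound is kept small so that it fits in a low-precision integer.

```python
def long_running_shader(depth: int = 4, bound: int = 1000) -> str:
    """Return fragment shader source with `depth` nested loops.

    Hypothetical helper: `depth` and `bound` are illustrative defaults,
    not values taken from any real test page.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    lines = [
        "precision mediump float;",
        "void main() {",
        "  float acc = 0.0;",
    ]
    # Open one loop per nesting level, each with a constant bound.
    for level in range(depth):
        indent = "  " * (level + 1)
        lines.append(
            f"{indent}for (int i{level} = 0; i{level} < {bound}; i{level}++) {{"
        )
    # Innermost body: trivial work that still feeds the output colour.
    lines.append("  " * (depth + 1) + "acc += 1.0;")
    # Close the loops from the innermost level outwards.
    for level in reversed(range(depth)):
        lines.append("  " * (level + 1) + "}")
    lines.append("  gl_FragColor = vec4(fract(acc), 0.0, 0.0, 1.0);")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(long_running_shader(depth=3))
```

The total iteration count grows as the bound raised to the power of the depth, so even modest values ask the GPU for far more work per pixel than a driver watchdog is likely to tolerate.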
We have found in our past work that graphics driver issues can lead to some bad stuff, including:
- A remote information leak security issue affecting Samsung Galaxy S6 phones (with ARM Mali GPUs); Google awarded a bug bounty for this
- Rendering of garbage (on Apple iPhone): read the story; check out Apple’s fix in CVE-2017-2424
- System freezes (on NVIDIA): read the story; check out NVIDIA’s fix in CVE-2017-6259
- A blue screen of death (on AMD)
We found these issues via a project trying to find shader compiler errors (see this OOPSLA paper), rather than by explicitly going after security-related problems.
The plan now is to look at security directly.
Let me finish by saying a few words about some of the research challenges that I think make this an exciting problem to work on:
Determining when a graphics driver issue has occurred ranges from easy to very hard. If the driver crashes then that’s easy to see. For open source drivers, or open source enclosing APIs, we can look at using sanitizers, such as AddressSanitizer (ASan) and ThreadSanitizer (TSan), as partial oracles.
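The easy end of that range can be sketched in a few lines. The following is a rough illustration, not the project’s tooling: it runs a test command and classifies the outcome as a hang, a sanitizer report, a crash, an error exit or a clean run. The function name, the verdict strings and the timeout are assumptions of mine. The two marker strings are the prefixes that ASan and TSan print in their reports.

```python
import subprocess
import sys


def classify_run(cmd, timeout_s=10.0):
    """Run `cmd` and return a coarse verdict string (hypothetical helper)."""
    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            errors="replace",  # tolerate non-UTF-8 bytes in tool output
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "hang"
    # Sanitizer reports go to stderr; treat them as a failure even if
    # the process then exits normally.
    if ("ERROR: AddressSanitizer" in proc.stderr
            or "WARNING: ThreadSanitizer" in proc.stderr):
        return "sanitizer-report"
    # On POSIX a negative return code means the process was killed by a
    # signal; other platforms report crashes differently.
    if proc.returncode < 0:
        return "crash"
    if proc.returncode != 0:
        return "error-exit"
    return "ok"


if __name__ == "__main__":
    print(classify_run([sys.executable, "-c", "pass"]))
```

An oracle like this only sees failures that surface in the process under test. A display glitch or a whole-system freeze needs observation from outside that process.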
The oracle is a lot harder if we want to go after information leak security bugs, where stuff from one part of the system gets rendered in another part of the system. There’s plenty of room to be creative here, and we have some ideas about approaching that problem that we’d like to explore in the project!
Understanding and managing undefined behaviour
When testing a program such as a compiler for regular bugs, one wants to steer well clear of undefined behaviour, which can confound the results of testing.
However, when testing graphics compilers and drivers with security in mind, we need to think about undefined behaviour as something that will arise in untrusted code. We need to be able to specify reasonably rigorously what the worst-case impacts of this undefined behaviour are, so that we can detect when the overall system has failed in defending against undefined effects. The edge cases of graphics APIs are often pretty ill-specified, so there’s room for formal or semi-formal modelling here to bring clarity.
Compositional API and program fuzzing
Fuzzing is a really promising method for finding bugs in driver and API implementations. The trouble is that (a) all fuzzers kind of look the same, and (b) all fuzzers end up being a great big mess.
We’d like to be able to fuzz a bunch of different APIs for graphics, including Vulkan, OpenGL and DirectX. These APIs are very different in their details but have a lot in common. Can we devise some elegant abstractions that make it possible to build fuzzers for a collection of related yet distinct APIs under one umbrella? Could we conceive of a domain-specific language for writing an API fuzzer at a high level of abstraction, coupled with mappings that allow this language to target the fuzzing of a number of concrete APIs?
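As a toy illustration of the kind of abstraction I have in mind (a sketch under heavy simplifying assumptions, not a design we have settled on), imagine a tiny abstract vocabulary of operations, a random generator that works only in that vocabulary, and per-API mappings that lower each abstract operation to a concrete entry point. The class names, the `lower` function and the choice of boundary values are all hypothetical. The entry point and parameter names are real, but a real backend would need many more calls around them (buffer binding for OpenGL, command buffer recording for Vulkan).

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateBuffer:
    size: int


@dataclass(frozen=True)
class Draw:
    vertex_count: int


def random_program(rng, length):
    """Generate an API-independent sequence of abstract operations."""
    ops = []
    for _ in range(length):
        if rng.random() < 0.5:
            # Mix ordinary and boundary sizes (illustrative choices).
            ops.append(CreateBuffer(size=rng.choice([0, 1, 4096, 2**31 - 1])))
        else:
            ops.append(Draw(vertex_count=rng.choice([0, 3, 65536])))
    return ops


# Each backend maps an abstract operation to (entry point, parameters).
OPENGL = {
    CreateBuffer: lambda op: ("glBufferData", {"size": op.size}),
    Draw: lambda op: ("glDrawArrays", {"count": op.vertex_count}),
}
VULKAN = {
    CreateBuffer: lambda op: ("vkCreateBuffer", {"VkBufferCreateInfo.size": op.size}),
    Draw: lambda op: ("vkCmdDraw", {"vertexCount": op.vertex_count}),
}


def lower(program, backend):
    """Translate an abstract program into one backend's call descriptions."""
    return [backend[type(op)](op) for op in program]


if __name__ == "__main__":
    program = random_program(random.Random(0), 5)
    for backend in (OPENGL, VULKAN):
        print(lower(program, backend))
```

The generator never mentions a concrete API, so one generated program can be replayed against each backend. That gives a cheap form of differential testing across APIs.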
Similarly, if we want to apply fuzzing to the shading languages of graphics programming models — to test shader compilers — is there something clever we can do to avoid having to write and maintain separate, yet very similar, fuzzers for a range of languages?
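One simple possibility, sketched below under my own assumptions, is to generate expressions in a small language-neutral tree form and only choose concrete spellings at emission time. The tree classes, the `emit` function and the random generator are hypothetical. The name tables rely on the fact that GLSL’s `fract`, `mix` and `vec3` correspond to HLSL’s `frac`, `lerp` and `float3`.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Call:
    func: str    # abstract function name, a key of the tables below
    args: tuple  # sub-expressions


# Per-language spellings of the abstract function names.
GLSL = {"fract": "fract", "mix": "mix", "vec3": "vec3"}
HLSL = {"fract": "frac", "mix": "lerp", "vec3": "float3"}


def emit(expr, names):
    """Render an expression tree as source text for one target language."""
    if isinstance(expr, Const):
        return repr(float(expr.value))
    if isinstance(expr, Var):
        return expr.name
    args = ", ".join(emit(arg, names) for arg in expr.args)
    return f"{names[expr.func]}({args})"


def random_expr(rng, depth):
    """Generate a random scalar float expression over the variable `t`."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice([Const(0.5), Var("t")])
    if rng.random() < 0.5:
        return Call("fract", (random_expr(rng, depth - 1),))
    return Call("mix", tuple(random_expr(rng, depth - 1) for _ in range(3)))


if __name__ == "__main__":
    expr = random_expr(random.Random(1), 3)
    print(emit(expr, GLSL))
    print(emit(expr, HLSL))
```

Renaming built-ins is the easy part, of course. The interesting research questions start where the languages genuinely differ, for example in their type systems or in which constructs have undefined behaviour.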
These problems of how to raise the abstraction level of API and language fuzzing tools are, in my view, really fundamental and have application way beyond the domain of graphics. But I always think it is good to start concrete, and graphics provides a great domain in which to study a collection of rich, yet related, APIs.
Want to apply?
I hope the info and links above are interesting, and if they excite you to the point that you’d like to apply for the postdoc position I have open then please get in touch!
Alastair F. Donaldson (Ally) is a Reader in Computing at Imperial College London, where he leads the Multicore… (multicore.doc.ic.ac.uk)
Experience in relevant areas, such as testing, security, graphics, software engineering or verification, would be great, but the main thing I’m looking for is someone who is passionate about research, with solid programming skills, and lots of enthusiasm for and curiosity about testing.