Crashes, Hangs and Crazy Images by Adding Zero
Fuzzing OpenGL shader compilers
Imagine: you visit a web page expecting to see this:
Instead, you get this:
What went wrong?
This is the first in a series of stories in which I’m going to talk about a technique we’ve been designing at Imperial for automatically finding bugs in graphics drivers. Specifically, our technique aims to find bugs in shader compilers — the components of graphics drivers that allow custom programs to run on graphics hardware.
This work has been done jointly with my PhD students Andrei Lascu and Paul Thomson. It has been supported by an Impact Acceleration Award from the UK EPSRC, a TETRACOM Technology Transfer Project in partnership with dividiti, and the EPSRC-funded HiPEDS CDT.
After a little background, I’ll delve into some of the interesting issues we’ve found when testing shader compilers from AMD, ARM, Imagination Technologies, Intel, NVIDIA and Qualcomm, including blue screens, machine freezes, wrong images and device reboots.
What is a shader?
Modern OpenGL requires the software developer to program the way their scene is transformed and rendered by writing shaders — programs that run across the cores of the GPU. OpenGL provides a C-like language, GLSL, for writing shaders. You’re probably using sophisticated graphics shaders all the time if you play video games or use a phone. In these posts we restrict attention to fragment shaders, which are responsible for pixel colouring.
The WebGL example above was rendered by shading a rectangle, with the colour of each pixel being determined according to its coordinates using a fragment shader.
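To make this concrete, here is a minimal sketch of such a shader. This is not the exact shader behind the example above; it is an illustrative WebGL (GLSL ES 1.00) fragment shader that colours each pixel purely according to its coordinates:

```glsl
precision mediump float;

// Viewport size in pixels, supplied by the host web page (name assumed).
uniform vec2 resolution;

void main() {
    // Normalise this pixel's coordinates to the range [0, 1].
    vec2 uv = gl_FragCoord.xy / resolution;
    // Derive the colour entirely from the normalised coordinates.
    gl_FragColor = vec4(uv.x, uv.y, 1.0 - uv.x, 1.0);
}
```

The shader runs once per pixel, in parallel across the GPU’s cores, and writes that pixel’s colour to gl_FragColor.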
What is a shader compiler?
A shader program isn’t GPU-specific: it should work, and have the same overall effect, on any GPU that supports the associated OpenGL version. Each GPU designer — NVIDIA, AMD, Intel, etc. — has to provide a shader compiler to translate a shader from GLSL into machine code for the particular GPU. The shader compiler is part of a GPU’s device driver, and shaders are usually compiled at runtime.
Our approach
Robust, high-quality graphics matter, so shaders need to be reliable. Unfortunately, shader compilers are hard to test: how do we validate the image rendered via a shader when there is no reference image to compare it against?
Our proposed solution is inspired by a recent method for testing C compilers called equivalence modulo inputs (EMI), described in this paper from Vu Le, Mehrdad Afshari and Zhendong Su, and also in this blog post from John Regehr. In a nutshell, what we do is:
- Render an image from an existing fragment shader, e.g. a shader from GLSLsandbox or shadertoy — call this the original image.
- Apply some transformations to the shader that should have essentially no impact on what the shader renders. As two simple examples, we can add dead statements, or insert “+0.0” into an expression (see the sketch after this list).
- Render the resulting image — call this the variant image.
- Compare the original and the variant. They might differ a little (due to slightly different floating-point optimisations), but if they differ a lot then something is likely wrong in the shader compiler — a “zero impact” change should not lead to a big image difference.
- Reduce the difference between the original shader and the variant shader to find a minimal change that causes a big difference in rendering.
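To make steps 2 and 3 concrete, here is a sketch of what a variant of the illustrative shader above might look like after both example transformations have been applied. The injected code here is hypothetical; GLFuzz chooses and applies such transformations automatically:

```glsl
precision mediump float;

uniform vec2 resolution;

void main() {
    vec2 uv = gl_FragCoord.xy / resolution;

    // Injected dead statement: a real viewport never has a negative width,
    // so this block should never execute and the image should be unchanged.
    if (resolution.x < 0.0) {
        uv = vec2(0.0);
    }

    // Injected "+ 0.0": adding zero should leave the red channel unchanged.
    gl_FragColor = vec4(uv.x + 0.0, uv.y, 1.0 - uv.x, 1.0);
}
```

Because resolution is a uniform whose value is unknown at compile time, the compiler cannot simply fold the dead branch away, so the transformed code still exercises its optimisation and code-generation paths. If the variant image differs wildly from the original, the prime suspect is the compiler, not the shader.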
You may be surprised to learn that the above examples of adding dead statements and inserting “+0.0” have both led to significant differences in the rendered images! We will cover these examples in upcoming stories.
We’ve implemented this in a tool, GLFuzz. A preliminary write-up of what we do is here, and we’re working on a full paper. Our technique is an example of metamorphic testing.
Our aim is to help!
Shader compilers are particularly complex pieces of software, so it’s unsurprising that they exhibit bugs. The aim of GLFuzz is to help GPU designers in testing and — most importantly — debugging their shader compilers.
The Plan
We’ve been testing shader compilers from the “Big Six” GPU designers, in alphabetical order: AMD, ARM, Imagination Technologies, Intel, NVIDIA and Qualcomm. I’m going to post six stories initially, one per designer, each discussing some of the issues we’ve found when testing devices that incorporate that designer’s GPUs.
[Edit: we found out that Apple write their own drivers for Imagination’s PowerVR GPUs, so we’re covering Apple separately too.]
We’ll link to GitHub for the ingredients necessary to reproduce each issue, with the exception of issues that we believe could be exploited in a malicious way. So please have a go at reproducing them (at your own risk), and let us know how you get on!
Stops so far: