NVIDIA: System Freeze via WebGL

and the mystery of the disappearing squares

[Part of a series of stories on GPU shader compiler bugs.]

[Previous stop: Intel]

We’ll talk here about our experiences applying GLFuzz to two NVIDIA systems: an Ubuntu PC with a GTX Titan, and an NVIDIA Shield TV box with a Tegra X1, testing NVIDIA’s proprietary drivers in both cases. (We also tested a Windows platform with a GTX 770, and an Ubuntu machine with a GTX 1050, with similar results.)

Ubuntu, freeze!

In a similar vein to what we observed when testing AMD devices under Windows, we encountered various WebGL shaders that cause system instability when rendering with NVIDIA devices under Ubuntu. One example, on our Ubuntu GTX Titan platform, is shown in this video:

Interestingly, this issue did not bring down the machine completely: it could still be accessed remotely via ssh, which suggests that the X server, rather than the whole system, had frozen. We have also found WebGL shaders that log the user out, killing all applications and returning to the login screen (although the system then seemed to be stable after logging in again).

When you visit a web page, a WebGL shader hosted by the page can run without warning on your GPU, so these bugs make it possible for a malicious web page to freeze the user’s display or force a logout. In the interest of responsible disclosure, we will not reveal the shader but have reported the issue to NVIDIA:

https://github.com/mc-imperial/shader-compiler-bugs/issues/46

NVIDIA have responded: “We have filed an internal bug to resolve this issue and we will keep this thread updated for further progress.”

The squares have gone!

On our NVIDIA Shield TV box, we found an example where adding three redundant ternary expressions to a shader significantly changed the rendered image, from this:

Image rendered by an original shader, obtained from GLSLsandbox.com.

to this:

(Non-)image rendered on the Shield TV after a GLFuzz mutation that should not have any effect on rendering.

In other words, the squares have gone!

The following code fragment illustrates the change that GLFuzz made to the original shader. The added code is the three (false ? (--s) : …) wrappers; the last operand of each wrapper is the expression that was there originally:

vec3 hsbToRGB(float h, float s, float b) {
    return b * ((false ? (--s) : 1.0) - s)
        + (b - (false ? (--s) : b * (1.0 - s)))
        * clamp(abs(abs((false ? (--s) : 6.0) * [...];
}

This is the only change made to the shader, and it should clearly be semantics-preserving: each occurrence of (false ? (--s) : X) should evaluate to X. The (--s) expressions should never be evaluated, because they are guarded by false, and so should have no side effects.
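To make the expected behaviour concrete, here is a minimal sketch in GLSL ES 1.00, the shading language used by WebGL 1. It is not a reproduction of the bug, and the function names plain and guarded are hypothetical; they do not come from the original shader and only isolate the pattern. GLSL evaluates only the selected operand of ?:, so both functions should return the same value for every input.

```glsl
precision mediump float;

// Hypothetical helper: the unmodified expression, 1.0 - s.
float plain(float s) {
    return 1.0 - s;
}

// Hypothetical helper: the same expression with a GLFuzz-style dead branch.
// The condition is the constant false, so (--s) is never evaluated and
// s still holds its original value when it is subtracted.
float guarded(float s) {
    return (false ? (--s) : 1.0) - s;
}

void main() {
    float s = 0.25;
    // A conforming implementation should compute exactly the same value twice.
    float diff = abs(plain(s) - guarded(s));
    // Black if the two results agree, red if the dead branch changed anything.
    gl_FragColor = vec4(diff > 0.0 ? 1.0 : 0.0, 0.0, 0.0, 1.0);
}
```

A shader this small will most likely be constant-folded correctly by any driver; the real bug only showed up inside the much larger original shader.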

If you’ve got a Shield box then see whether you can reproduce the bug:

  • this web page should cause the original image (with the squares) to be rendered;
  • this web page should show the same image, unless your system is prone to the bug.

We have reported the issue to NVIDIA:

https://github.com/mc-imperial/shader-compiler-bugs/issues/12

but have not yet had a response; we’ll update this post if we get more info.

The fog has come!

You should find that this WebGL page leads to the following image being rendered:

Image rendered via a fragment shader before mutation using GLFuzz.

Try it out and let us know if it doesn’t!

Now, if you visit this WebGL page instead, you should see exactly the same rendered image.

But if you have an NVIDIA GPU in a machine running Ubuntu then, depending on your driver version (ours is 370.28), you might instead see this:

Applying a semantics-preserving transformation leads to this foggier image being rendered instead, when there should be no difference in rendering.

The fog has come.

To induce this bug, GLFuzz changed this statement:

vec3 p3 = … / 0.5;

to this:

vec3 p3 = … / (injectionSwitch.x > injectionSwitch.y ? 0.5 : 0.5);

Notice that the ternary expression is actually compile-time redundant! It so happens that at runtime (as discussed in previous posts) we set injectionSwitch to (0.0, 1.0), but that’s irrelevant here since both results of the ternary operation are identical in any case.
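The equivalence can be written out as a small self-contained GLSL ES 1.00 sketch. Several things here are assumptions rather than parts of the original shader: the uniform is declared as a vec2 because the post only uses its x and y components, the parameter numerator is a hypothetical stand-in for the elided left-hand operand, and the function names divideOriginal and divideVariant are made up for illustration.

```glsl
precision mediump float;

// Assumed declaration: only .x and .y are used in the post, so a vec2
// uniform is the simplest type that fits. The host page sets it to (0.0, 1.0).
uniform vec2 injectionSwitch;

// Hypothetical stand-in for the original statement: ... / 0.5
vec3 divideOriginal(vec3 numerator) {
    return numerator / 0.5;
}

// The mutated form. Both arms of the ternary are the literal 0.5, so the
// divisor is 0.5 regardless of what the uniform holds at run time.
vec3 divideVariant(vec3 numerator) {
    return numerator / (injectionSwitch.x > injectionSwitch.y ? 0.5 : 0.5);
}

void main() {
    vec3 n = vec3(0.1, 0.2, 0.3);
    // The per-channel difference should be zero, giving a black pixel.
    gl_FragColor = vec4(abs(divideOriginal(n) - divideVariant(n)), 1.0);
}
```

As with any reduced example, this sketch is unlikely to trigger the driver bug on its own; it only spells out why the two statements must produce the same value.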

We reported this issue to NVIDIA on their forum; here is the corresponding issue in our GitHub repo:

https://github.com/mc-imperial/shader-compiler-bugs/issues/56

We haven’t had a response yet, but will update this post if we get one.

Next up: Qualcomm.