Intel: locking up Safari, bluescreening Windows…

…and the impact of adding zero

[Part of a series of stories on GPU shader compiler bugs.]

[Previous stop: Imagination Technologies]

In this story we report on some issues GLFuzz turned up when applied to Intel’s proprietary drivers on Windows and Mac systems. We haven’t yet tested Mesa drivers for Intel GPUs.

The perils of adding zero

You might remember from my first story that I promised a 14-character change that would cause this image difference:

Desired image, on the left, turned into the nice-but-wrong image on the right, when “+ 0.0” was added to an expression. Fixed in recent Intel drivers.

We’ve been reporting bugs to Intel from the start of the GLFuzz project, and they’ve been very receptive. Because Intel have fixed bugs identified by GLFuzz, the rate at which our tool finds bugs in their later drivers has gone down, as happens with any fuzzer once the bugs it finds get fixed.

The issue illustrated by the above image pair was one of the first ones we reported. This was with an Intel HD Graphics 520 GPU, and driver version 20.19.15.4326, under Windows 10.

GLFuzz found that changing this line of code:

vec3 ray = normalize(cSide * p.x + cUp * p.y + cDir * targetDepth);

to:

vec3 ray = normalize(cSide * p.x + cUp * (p + vec2(0.0)).y + cDir * targetDepth);

i.e., changing p to (p + vec2(0.0)), led to a beautiful-but-wrong image being rendered. Interestingly, the wrong image looks sort of similar to a wrong image we looked at in our story about ARM — here they are side-by-side:
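To make the expected equivalence concrete, here is a minimal sketch of a WebGL 1 (GLSL ES 1.00) fragment shader that evaluates both forms of the line and compares them. It is not the shader from the bug report: the resolution uniform, the camera basis vectors and the targetDepth value are placeholders assumed for illustration. GLFuzz itself compares images rendered by two separate shaders, so the single-shader check here is a simplification.

```glsl
// Sketch only: placeholder values stand in for the original shader's camera setup.
precision mediump float;

// Hypothetical uniform: viewport size in pixels, set by the host application.
uniform vec2 resolution;

void main() {
    // Placeholder camera basis and depth; the original shader computes its own.
    vec3 cSide = vec3(1.0, 0.0, 0.0);
    vec3 cUp = vec3(0.0, 1.0, 0.0);
    vec3 cDir = vec3(0.0, 0.0, 1.0);
    float targetDepth = 1.3;

    // Map the pixel position to roughly [-1, 1].
    vec2 p = (gl_FragCoord.xy * 2.0 - resolution) / min(resolution.x, resolution.y);

    // The original line and the GLFuzz variant, evaluated side by side.
    vec3 rayOriginal = normalize(cSide * p.x + cUp * p.y + cDir * targetDepth);
    vec3 rayVariant = normalize(cSide * p.x + cUp * (p + vec2(0.0)).y + cDir * targetDepth);

    // Green where the two rays are equal, red where they differ.
    gl_FragColor = (rayOriginal == rayVariant)
        ? vec4(0.0, 1.0, 0.0, 1.0)
        : vec4(1.0, 0.0, 0.0, 1.0);
}
```

With a correct compiler, and with resolution set to a non-zero viewport size, every pixel should come out green, since adding vec2(0.0) should not change p.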

The modern art of bugs: wrong images rendered by Intel (left) and ARM (right) GPUs, from different original shaders, have a similar visual appeal!

The above “adding zero” issue is fixed in Intel’s more recent drivers for the HD Graphics 520. But the latest driver, version 21.20.16.4542, exhibits what we believe is a related problem, documented by this issue on GitHub. This fragment shader leads to the following image being rendered:

Image rendered by an original shader on an Intel HD Graphics 520 GPU

Try it for yourself via WebGL:

http://htmlpreview.github.io/?https://github.com/mc-imperial/shader-compiler-bugs/blob/master/Intel-HD-Graphics-520-Windows/wrong_images/large-v440-2f9974b474297737_inv_variant_52/webgl_viewer_recipient.html

This time, GLFuzz changes the following line of code:

return ObjUnion(obj0(p), obj1(p));

to:

return ObjUnion(obj0(p + vec3(0.0, injectionSwitch.x, 0.0)), obj1(p));

This is another example of “adding zero”, because we set injectionSwitch to (0.0, 1.0) at runtime, so that injectionSwitch.x is 0.0. So the change should have no effect (modulo floating-point round-off).

On our HD Graphics 520 GPU, we find that a black image is rendered instead:

Making the small change of adding a zero, in such a way that the compiler does not know statically that zero is being added, causes this black image to be rendered instead

If you’ve got a similar Intel GPU setup then try this out via WebGL:

http://htmlpreview.github.io/?https://github.com/mc-imperial/shader-compiler-bugs/blob/master/Intel-HD-Graphics-520-Windows/wrong_images/large-v440-2f9974b474297737_inv_variant_52/webgl_viewer_variant.html

Under Windows I get the black image using Firefox and Chrome (which use ANGLE), as well as using Edge (which doesn’t use ANGLE). This is interesting because it shows that the bug can be exposed via DirectX (recall that ANGLE translates OpenGL calls to DirectX on Windows, as this is typically more reliable than using OpenGL drivers).

We have reported the issue to Intel.

What’s the key difference between the original issue, where we added vec2(0.0), and this issue, where we added vec3(0.0, injectionSwitch.x, 0.0)?

The use of injectionSwitch means the compiler cannot optimise away the addition. It is possible that the bug that caused the first issue was not actually fixed, but rather that the compiler learned to optimise away “+ vec2(0.0)”. The compiler cannot optimise away “+ vec3(0.0, injectionSwitch.x, 0.0)”, because it doesn’t know statically what injectionSwitch will hold at runtime.
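The difference between the two kinds of zero can be sketched in a small WebGL 1 (GLSL ES 1.00) fragment shader. This is an illustration under assumptions, not the shader from the issue: the sphere function is a hypothetical stand-in for the real distance functions, and the resolution uniform is also assumed. The only thing taken from the text above is that the host sets injectionSwitch.x to 0.0 before drawing.

```glsl
// Sketch only: contrasts a compile-time zero with a zero known only at runtime.
precision mediump float;

// Set by the host before drawing, with injectionSwitch.x equal to 0.0.
// The compiler sees only an unknown vec2.
uniform vec2 injectionSwitch;

// Hypothetical uniform: viewport size in pixels, set by the host application.
uniform vec2 resolution;

// Hypothetical stand-in for the distance functions in the real shader.
float sphere(vec3 p) {
    return length(p) - 0.5;
}

void main() {
    vec2 uv = (gl_FragCoord.xy * 2.0 - resolution) / min(resolution.x, resolution.y);
    vec3 p = vec3(uv, 0.0);

    // Constant zero: a compiler may fold this addition away at compile time.
    float withConstantZero = sphere(p + vec3(0.0));

    // Runtime zero: injectionSwitch.x is unknown until the draw call,
    // so the compiler cannot simply delete the addition.
    float withRuntimeZero = sphere(p + vec3(0.0, injectionSwitch.x, 0.0));

    // White where both results agree, black where they differ.
    float shade = (withConstantZero == withRuntimeZero) ? 1.0 : 0.0;
    gl_FragColor = vec4(vec3(shade), 1.0);
}
```

If the host sets injectionSwitch.x to 0.0 and resolution to a non-zero viewport size, a correct compiler should produce an all-white image. A fix that merely pattern-matches the constant form would leave the second computation exposed to the underlying bug.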

More problems with control flow

Got a Windows setup with an Intel HD Graphics 520 GPU, with driver version 21.20.16.4542? Then when you visit this URL you will probably see a black image rendered:

http://htmlpreview.github.io/?https://github.com/mc-imperial/shader-compiler-bugs/blob/master/Intel-HD-Graphics-520-WebGL-ANGLE/wrong_images/large-v100-webgl-e404f04495815667_inv_variant_71/webgl_viewer_variant.html

when really you should see something similar to this image:

What small change do you have to reverse in the fragment shader linked above to get this, instead of a black image?

If you’ve got a similar Intel GPU then you might also see this bug, and it may well be present with other drivers.

Challenge: based on our posts so far, can you spot the one-line, zero-impact change that GLFuzz has introduced to cause this issue?

Clue: the issue is related to statements that affect control flow.

Answer: Check out our issue on GitHub to find out.

Security issues

During our testing of Intel GPUs, we found three security-relevant issues. We’ll give a brief tour of these, but with responsible disclosure in mind we won’t provide instructions on how to reproduce! We have reported these issues to Intel.

Locking up Safari via WebGL

In this video, Paul shows how rendering shaders in Safari using WebGL causes complete system freezes:

As usual, the shader he is attempting to render is one that GLFuzz has constructed from a human-written shader by applying semantics-preserving transformations. While these transformations can lead to a large and long-running shader that might miss its rendering deadline, one would hope that it should not lead to this kind of lock-up behaviour.

Safari UI glitches via WebGL

Here, Paul shows how rendering GLFuzz-generated shaders in Safari can cause glitches to appear in parts of the Safari interface that are outside the WebGL drawing area:

At 00:47 you can see evidence that this is definitely a rendering issue, because parts of the rendered image appear elsewhere in the Safari window.

Bluescreening Windows (but not via WebGL)

In this video, I demonstrate how GLFuzz’s injections lead to a bluescreen under Windows 10:

Bluescreening a Windows machine with an Intel HD Graphics 520 GPU by rendering a shader generated by GLFuzz

Interestingly, the desired image is actually rendered before the bluescreen occurs! The error code associated with the bluescreen is WHEA_UNCORRECTABLE_ERROR. According to this Microsoft documentation: “This bug check indicates that a fatal hardware error has occurred”. We tried this on two different machines with Intel HD Graphics 520 GPUs, and got the bluescreen on both, so it seems unlikely to be due to a broken GPU.

Unlike the bluescreen we reported in our post about AMD, we could not reproduce this one using WebGL.

We’ve been having a fun time doing automated reduction of these bluescreen bugs, which is challenging and somewhat unnerving! I’ll write another story about that sometime. But although we found this and the AMD bluescreen using very large variant shaders, we were able to minimise these shaders to really small examples triggering the same issues.

Next stop: NVIDIA (but not until after the holidays!)