Sora Thompson
Dec 23, 2016

Virtual reality is the, or rather one of the, new high-tech gold rushes on the tech investment scene. With it come new engineering challenges, and one of the most cited at the moment is the need to increase the resolution of today’s VR headsets, and consequently the rendering resolution used to display VR content as well.

The standard resolution today is 2160×1200, used by both the Oculus Rift consumer version 1 and the HTC Vive. Unfortunately that’s about the equivalent of viewing a 17″ monitor at 320×240 from an average distance. For perspective, it’s been more than 20 years since QVGA monitors were common. Obviously, if people are to buy the experience of virtual reality as truly “real”, we’ll need to increase resolution by a lot. But by just how much ends up being a tricky question.

First let’s check what math we’ll be using to see what resolution we need. To save time we’ll skip the biology of the human eye and just refer to this helpful and well-referenced article from Clarkvision: http://www.clarkvision.com/articles/eye-resolution.html

Accordingly, we (assuming 20/20 vision) have a visual resolution of 0.3 arcminutes per “pixel”. Or rather, at the very limits of normal human vision we can distinguish “line pairs”, i.e. a black pixel from a white pixel side by side at some high level of contrast, every 0.3 arcminutes. For further reference, an arcminute is a measure of angle equal to 1/60th of a degree. If you are sensing that the human eye is a very high resolution camera, you are dead on. As an ideal, we would need 198 pixels per degree (ppd).
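The conversion from angular acuity to pixels per degree is simple enough to sanity check in a few lines of Python (a minimal sketch; the 0.3 arcminute figure is taken from the Clarkvision article above):

```python
# Convert an angular acuity limit (arcminutes per "pixel") into pixels per degree.
ARCMIN_PER_DEGREE = 60.0

def pixels_per_degree(arcmin_per_pixel):
    """Pixels per degree needed to match a given angular acuity limit."""
    return ARCMIN_PER_DEGREE / arcmin_per_pixel

# 0.3 arcmin per "pixel" at the limits of 20/20 vision:
print(pixels_per_degree(0.3))  # 200.0, in line with the ~198 ppd ideal above
```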

Now let’s look at the typical VR headset today and see how it compares. While the exact FOV for each VR headset varies, and will vary from person to person (see here as to why), we’ll take an ideal of the Vive: 110° horizontal binocular and 113° vertical FOV gets us a 158° diagonal FOV. Over the 2160×1200 panel this nets us a diagonal PPD of very nearly 15.6. That is roughly 1/13th of our resolvable resolution, meaning we need around a 20k by 20k display to match human vision perfectly. Ooouch.
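To reproduce those headset numbers, here’s a rough sketch that follows the article’s arithmetic, treating the diagonal FOV as the hypotenuse of the horizontal and vertical FOV angles:

```python
import math

def diagonal_ppd(width_px, height_px, h_fov_deg, v_fov_deg):
    """Approximate diagonal pixels per degree of a panel spread over a given FOV.
    Treats the diagonal FOV as the hypotenuse of the two FOV angles, a rough
    approximation matching the arithmetic above."""
    diag_px = math.hypot(width_px, height_px)
    diag_fov_deg = math.hypot(h_fov_deg, v_fov_deg)
    return diag_px / diag_fov_deg

# HTC Vive: 2160x1200 total panel over ~110 x 113 degrees:
print(diagonal_ppd(2160, 1200, 110, 113))  # ~15.7 ppd, roughly 1/13th of ~198
```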

For reference, a top-of-the-line $2,000 5K monitor (as of this writing) has, at a typical viewing distance giving it a 50° FOV, about 117ppd, still below the resolvable resolution of the human eye. For various reasons mass-manufactured displays are limited more by total resolution than by PPI, which tells us not only that single-panel/single-view displays aren’t up to matching human eye resolution, but that for VR we’re still quite far away from even that.

But we’ll leave that aside for now to focus on rendering. The first thing we might want to ask when facing rendering is: “Do we really need 21,780 × 22,374 ≈ 487 megapixels of resolution?!” The first instinct might be to anti-alias a much lower resolution target and call it good enough.
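Where do those numbers come from? Just the FOV multiplied out at the ideal PPD; a back-of-the-envelope sketch:

```python
PPD_TARGET = 198  # the ideal pixels per degree from above

h_px = 110 * PPD_TARGET  # 21,780 pixels over 110 degrees horizontal
v_px = 113 * PPD_TARGET  # 22,374 pixels over 113 degrees vertical
print(h_px, v_px, h_px * v_px / 1e6)  # 21780 22374 ~487 megapixels
```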

Unfortunately the answer here is yes: for a naive implementation that truly matches the limits of the typical human eye, we do actually need all that resolution. Our initial estimate of human eye resolution tests individual element contrast detection; the second we start blurring these contributions together, no matter how smartly, is the second we lose this resolvable detail from individual elements. Consider a single pixel: it doesn’t matter how perfectly it’s anti-aliased, the resulting signal is still one pixel, and, to make the obvious point, humans can see more than one of these.

Now there is a fortunate part, because you’ll notice the word “naive” appeared there. That’s because the human eye doesn’t have the same resolution over our entire field of view. In fact it gets progressively worse, along a complex slope. For a fun view, XKCD has a handy eye chart for you: https://xkcd.com/1080/

But for our purposes we need something a bit more mathematical, so we’ll turn to this curve of visual acuity versus angle from the center of gaze:

At twenty degrees from the center of gaze, resolution drops to just one tenth of our original target, or rather to just under 20 pixels per degree! The (approximated) integral of this curve shows that, taking a wide field of view into account, the total resolution the eye can perceive across the visual field is just around 20% of what it would be if we had the same resolution over our whole vision as we do at the center of it.
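To get a feel for that figure, here is a minimal sketch using a standard hyperbolic acuity falloff, acuity(e) = e2/(e2 + e) with e2 ≈ 2.3° (chosen because it reproduces the “one tenth at 20°” data point above). The model, the 55° half-FOV, and the 1D averaging are all my assumptions, so treat the result as a ballpark rather than a derivation of the exact 20% number:

```python
# Crude estimate of average perceivable resolution across a wide FOV, relative
# to foveal resolution everywhere. Acuity model (an assumption, not from the
# article): a(e) = E2 / (E2 + e), with eccentricity e in degrees.
E2 = 2.3         # degrees; gives a(20) ~= 0.10, matching "one tenth at 20 degrees"
HALF_FOV = 55.0  # degrees from the center of gaze out to the edge (assumed)

def acuity(e_deg):
    return E2 / (E2 + e_deg)

# Average acuity across the field (simple 1D numerical integration):
steps = 10000
avg = sum(acuity(i * HALF_FOV / steps) for i in range(steps)) / steps
print(avg)  # ~0.13 under these assumptions, the same ballpark as the ~20% above
```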

The above is the reason for what is known as foveated rendering, or rather: attempting to match the rendering resolution of a VR/AR HMD to the resolution of the human eye depending on where the eye is pointed; the farther from the center of gaze, the less you technically have to render!

Initially this seems a brilliant answer to rendering far less for an HMD. As long as the gaze direction of the user can be measured, you can cut rendering time dramatically by matching the final render target to the above curve. But problems appear as soon as we take a closer look. The first is that number: 20 pixels per degree at 20° from center. Looking at our individual-eye FOV from above, we see field-of-view numbers in the mid 40s to low 50s at most. Meaning that today most of the HMD’s field of view is still below our perceivable resolution, even accounting for the steep loss of perceivable resolution away from the center of our gaze.

The above also does not take into account another factor: in order to widen the field of view from a reasonably sized head-mounted screen, today’s HMDs use lenses to distort the screen, trading off edge resolution for a higher FOV. This is beneficial in one sense, giving the center of each eye’s view the most PPD and thus, as long as the user is looking straight ahead, a kind of foveated rendering anyway. But it also further lowers the PPD at the edges of our screens. The approximate formula for the Rift lens distortion can be found here. The point is that, thanks to these lenses, even if we stare directly at the center of our HMD the edges can still fall below our target PPD.
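The exact Rift formula isn’t reproduced here, but HMD lens distortion is commonly modeled as a radial polynomial, so a generic sketch of the idea looks like the following (the coefficients are illustrative placeholders, not the Rift’s actual values):

```python
def barrel_distort(r, k1=0.22, k2=0.24):
    """Generic radial distortion: r' = r * (1 + k1*r^2 + k2*r^4).
    r is the distance from the lens center in normalized units; k1 and k2
    are illustrative placeholders, not actual headset coefficients."""
    return r * (1.0 + k1 * r**2 + k2 * r**4)

# The farther from the lens center, the more the image is stretched,
# i.e. the fewer pixels per degree survive toward the edges:
for r in (0.0, 0.5, 1.0):
    print(r, barrel_distort(r))
```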

In other words: foveated rendering doesn’t help at our current resolution at all!

But there’s a potentially very important question we have to take into account: at what point is the PPD “good enough”, such that we needn’t worry about it anymore? This, as it turns out, is a tough question to answer. Or rather, the answer depends on a lot of variables, including what display environment you’re looking at and what content you’re viewing.

A quick example: upon the switch from 1080p screens in cutting-edge smartphones to 1440p screens of the same size, most reviewers reported little observable difference between the two. If we take the average viewing distance of a 12.7cm (5″) smartphone to be about 31cm (1ft), we get a PPD of around 110. Aha, we have our number!
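That figure is easy to sanity check; a sketch of the arithmetic for a 5″ (12.7cm diagonal) panel viewed head-on at 31cm, comparing the two resolutions:

```python
import math

def phone_ppd(width_px, height_px, diag_cm, dist_cm):
    """Approximate pixels per degree of a flat screen viewed head-on."""
    diag_px = math.hypot(width_px, height_px)
    diag_deg = 2 * math.degrees(math.atan((diag_cm / 2) / dist_cm))
    return diag_px / diag_deg

print(phone_ppd(1920, 1080, 12.7, 31))  # ~95 ppd (1080p)
print(phone_ppd(2560, 1440, 12.7, 31))  # ~127 ppd (1440p); ~110 sits in between
```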

Unfortunately this magic number is (a) based on a not-so-random sampling of anecdotes and (b) specific to viewing text on a smartphone. Text has a maximum density of visual information: the spacing between each letter can (depending on text scaling) end up fixed regardless of resolution. So while looking at a smartphone without touching the text scaling might not reveal any difference, shrink the text on both screens and you would (and should) end up revealing an obvious difference in PPD between the two.

There’s also the question of viewing environment. A VR HMD is going to look, and feel, very different from glancing at the screen of a phone, nor will it be displaying text primarily. So the question of where, if anywhere, a dramatic reduction in PPD can be gotten away with is simply unanswerable from casual observation.

But there’s another question: whether the PPD we care about also drops off moving outward from the center of the fovea. That is, can we go even lower in (rendered) PPD than our human eye acuity curve shows? Some excellent research from a foveated-renderer prototype suggests that this is so! We can indeed decrease, if not visibility frequency, then at least shading frequency as we move away from the center of gaze, relative to our ideal PPD curve.
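As a toy illustration of the idea (my own sketch, not the cited prototype’s actual method), a renderer could pick a coarser shading rate per region based on eccentricity while still resolving visibility at full rate; the thresholds below are invented for illustration:

```python
def shading_rate(ecc_deg):
    """Toy eccentricity-to-shading-rate mapping (illustrative thresholds only).
    Returns how many pixels share one shading sample in each dimension;
    visibility (geometry and edges) would still be sampled per pixel."""
    if ecc_deg < 5.0:
        return 1  # full-rate shading in the fovea
    elif ecc_deg < 20.0:
        return 2  # one shading sample per 2x2 pixels
    else:
        return 4  # one shading sample per 4x4 pixels in the periphery

for e in (2.0, 10.0, 30.0):
    print(e, shading_rate(e))
```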

But there are some unfortunately large assumptions made in the above. Namely, it is assumed that shading frequency will not contribute as much contrast difference as visibility frequency, an assumption that appears to have held true for their specific scene. Yet this is not an assumption that can be made for all scenes. As soon as visibility frequency overwhelms their pre-filtering strategies, the proposed system will break and the unwanted aliasing artefacts will return. Another example is when shading frequency matches visibility frequency; a common scenario here would be any mirror-like material. So while the above may be useful for constrained, low-end scenes, it cannot be applied to all scenes.

This gets us straight back to needing to match our ideal PPD for both shading and visibility frequency. But perhaps, as with the phone example, we can lower the entire PPD without the average end user caring (much)? This question is answerable, but will have to be tested carefully. Scenes with high frequency, high contrast detail in terms of both shading and visibility will have to be tested in order to ensure that the end user has a smooth experience without the developer needing to limit themselves in yet another way in the already tricky task of VR rendering.

And now, drawing to our conclusion: that curve of a person’s natural foveation puts us at an 80% shading AND visibility reduction! Matching gaze and rendering quality to this curve will preclude the current technique of inter-frame temporal reprojection, used to keep head-tracking response smooth when rendering falls below 90fps (the lowered resolution would cause obvious flickering upon eye saccades). But even with that, the savings on rendering can obviously be immense! Unfortunately this lowered rendering quality is measured against our ideal PPD, and with HMDs’ current extremely low PPD, foveated rendering is simply not worth it at the moment.

Nevertheless, foveated rendering could be well worth the development time, HMD cost, and rendering overhead on future HMDs. Without a practical (or cheap) way to shrink accurate, low-latency gaze trackers yet anyway, foveated rendering need not be an immediate concern for anyone shipping VR titles. But once the hardware challenges are overcome, HMDs should be able to scale to vastly higher resolutions without scaling to vastly higher rendering costs, and given the extremely low effective resolutions currently targeted, that is exciting indeed!

Edit: please note, the above assumes that 90fps is also fast enough to account for eye saccades without revealing unwanted artefacts. If it is not, the benefits of foveated rendering will drop accordingly. For example, beyond 360fps the entire idea of foveated rendering becomes moot from the point of view of saving on rendering time.
