How we made the Audi AI:ME’s flying VR experience — Part 3
VR performance challenges and creating realistic fog in Unity
This is part three of a three-part series in which we explain some of the larger challenges we faced while making the Photogrammetry VR experience for Audi’s CES 2020 demo.
In this article, I will go over our goals, the performance considerations we had to take into account, and finally the implementation of a realistic fog system.
Aims of the Project
The main objective was to create an immersive VR experience where the passenger of the Audi autonomous concept vehicle could wear a VR headset to escape the noise of city life and enter a beautiful, relaxing, nature-filled environment. To ensure that no motion sickness was introduced, it was essential that the car’s navigation system and the VR system were in sync, so that the movement in VR matched the G-forces felt as the car navigated. We could access this information through a prototype SDK provided by Audi that is now being further developed for the mass market by Holoride.
Having no set path for the autonomous vehicle to follow, we couldn’t just film a 360 video and map a 1:1 flight path of how the car would maneuver through a set course. Photogrammetry scanning entire mountain ranges and valleys to create the VR experience in Unity gave us the flexibility to build an experience that works with almost any navigation path.
Given how large the Photogrammetry captures were, maximizing geometry and texture quality while minimizing performance overhead was a critical challenge. Luckily, through continual performance testing on the end hardware, we gained a better understanding of what specs to process our scans to.
Typically, Photogrammetry scenes can look quite static (as they are static meshes). Having just come from capturing these landscapes and experiencing the beautiful fog rolling through the valleys, we knew we needed a performant way to render realistic fog to make the landscape come to life.
Since I knew the end hardware and had it on my desk, I did tests to optimize for that spec.
The first thing was getting the number of draw calls right. A high draw call count was lethal in our case, since the fog required a depth pre-pass, which effectively doubles the number of draw calls. It should be noted that the draw call count by itself is a poor indicator of overhead, since the real cost lies in the driver’s validation after each state change. Looking at our render loop, the state we changed most frequently was the texture uniforms and the meshes.
The relationship between the number of draw calls and overall performance wasn’t as simple as it seemed. Our main bottleneck was the Render Thread in Unity, which is responsible for communicating with the underlying graphics API. Since this communication had to happen on one thread, multithreading and Graphics Jobs didn’t improve our frame time. A draw call in our case corresponded directly to a mesh or a batch of meshes. We could create bigger chunks in our pipeline, and hence fewer draw calls per scene; however, this increased the amount of data that needed to be passed to the GPU per frame, as well as the fragment overdraw.
You can picture it like this: as chunks get bigger, frustum culling is applied to bigger and bigger bounding volumes. Since CPU frustum culling is per mesh and not per triangle, bigger meshes mean more unwanted vertices are passed to the GPU. These vertices are later clipped, but since clipping happens after each vertex has been transformed into clip space, the vertex shader still runs for these unwanted vertices.
The second problem is fragment overdraw. Given a mesh, the GPU renders the triangles in the order that the index buffer, or the triangle strips, dictate. This order is typically optimized for the vertex cache and has nothing to do with which triangle is in front of another with respect to the camera. Meshes, on the other hand, are sorted by the distance of their bounding volumes to the camera. When each mesh is a separate draw call, the GPU uses the contents of the Z-buffer to avoid fragment overdraw. Given meshes A, B and C, the CPU first sorts them front to back (say B is in front), and as the GPU renders A and C, it skips the fragments that are occluded by B. However, if we combine meshes A, B and C into one, there is no guarantee that the triangles of mesh B are rendered first.
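The effect of submission order on overdraw can be illustrated with a toy depth test. This is a minimal CPU-side sketch, not how a GPU or Unity actually works internally: the one-row framebuffer, the constant per-mesh depths and the function name `count_shaded_fragments` are all assumptions made for illustration.

```python
def count_shaded_fragments(draw_order, depths, coverage):
    """Count fragment shader invocations for a toy one-row framebuffer.

    draw_order: mesh names in the order they are submitted.
    depths: mesh name -> constant depth (smaller is closer to the camera).
    coverage: mesh name -> iterable of pixel indices the mesh covers.
    """
    zbuffer = {}  # pixel index -> closest depth written so far
    shaded = 0
    for name in draw_order:
        for pixel in coverage[name]:
            # Early depth test: skip fragments behind what is already drawn.
            if pixel in zbuffer and zbuffer[pixel] <= depths[name]:
                continue
            zbuffer[pixel] = depths[name]
            shaded += 1
    return shaded


# Three hypothetical meshes covering the same 8 pixels; B is closest.
depths = {"A": 3.0, "B": 1.0, "C": 2.0}
coverage = {"A": range(0, 8), "B": range(0, 8), "C": range(0, 8)}

# Sorted front to back: only B's fragments are shaded.
front_to_back = count_shaded_fragments(["B", "C", "A"], depths, coverage)

# Worst case inside a merged mesh: triangles happen to arrive back to front.
back_to_front = count_shaded_fragments(["A", "C", "B"], depths, coverage)
```

In this toy setup the unsorted order shades three times as many fragments for the same final image, which is the cost a merged chunk risks when its internal triangle order ignores the camera.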
By reducing the number of draw calls through bigger chunks (as described in Part 2), there is a point where the Render Thread stops being the bottleneck and the bottleneck moves elsewhere, for the two reasons stated above. Through experimentation on the final hardware, I tried to find this sweet spot. I tested the cost of fragment overdraw, the increase in VBO size and the increase in the number of vertex shader invocations, in isolated scenes as well as in the final scene we wanted to render.
After extensive testing, we settled on what effectively became around 500 draw calls and 2 GB of textures per scene. The 2 GB already included the mip chain and was streamed in as required.
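To get a feel for how much of a texture budget the mip chain accounts for, the memory of a full chain can be estimated as below. This is a rough sketch assuming an uncompressed format with a fixed number of bytes per texel; the actual formats and compression used in the project are not stated here, and `mip_chain_bytes` is a hypothetical helper.

```python
def mip_chain_bytes(width, height, bytes_per_texel):
    """Total size of a texture including every mip level down to 1x1."""
    total = 0
    while True:
        total += width * height * bytes_per_texel
        if width == 1 and height == 1:
            break
        # Each mip level halves both dimensions (never below one texel).
        width = max(1, width // 2)
        height = max(1, height // 2)
    return total


base = 4096 * 4096 * 4                 # one 4096x4096 RGBA8 texture, no mips
with_mips = mip_chain_bytes(4096, 4096, 4)
overhead_ratio = with_mips / base      # approaches 4/3 for large textures
```

The full chain costs roughly a third more than the base level alone, which is why it matters whether a quoted budget includes it.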
Dealing with textures was a challenge of its own, as Photogrammetry environments typically use the same texel resolution across the entire scene, which can amount to gigabytes of texture data.
If left unattended, we would have needed to load around 80 GB of textures for all the scenes, and to keep 10 GB in GPU memory to render the entire landscape at the highest resolution. Texture loading caused frame time spikes, and the device memory couldn’t hold that much data at once, which resulted in very poor performance.
We tackled this problem from two sides. The first solution was the texture streaming provided by Unity, which we tuned to match our fog parameters so that the fog hid any artifacts caused by streaming.
The second was developing an adaptive, gradual unwrap in our pipeline (as described extensively in Part 2), which preserves texel quality in areas of interest and gradually decreases it for the surrounding geometry, which is partly obscured by fog anyway. This reduced the texture memory footprint and improved cache coherency in the texture fetches.
Photogrammetry meshes are special in this respect: the typical fragment shader of an unlit mesh has no arithmetic with which to hide the latency of texture fetches through pipelining, so the bottleneck is the texture fetch itself. Improving the unwrap therefore improves the execution time of the forward pass through better cache coherency.
I also made some general improvements, such as splitting the fog effect into several passes to reduce register pressure. Here too there was a tipping point, between the overhead of the extra draw calls and the time gained through better thread occupancy.
The final performance measure was to move arithmetic (such as color adjustment) from screen space to the fragment shaders of the objects wherever possible, since pipelining lets these values be calculated during the fetch latency without added time.
Rendering Realistic Fog in VR
Learning From References
A typical implementation of fog consists of a linear interpolation between the rendered scene color and a uniform fog color, based on the distance between the camera and the pixel. This implementation creates the feeling of aerial perspective with a few lines of code, but fails to represent the scattering nature of fog.
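As a point of comparison for what follows, the classic distance fog can be sketched in a few lines. This is a CPU-side illustration rather than shader code, and the `fog_start` and `fog_end` distances are hypothetical parameters chosen for the example.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def classic_fog(scene_color, fog_color, distance, fog_start, fog_end):
    """Blend the scene color toward a uniform fog color based on distance."""
    t = (distance - fog_start) / (fog_end - fog_start)
    t = min(1.0, max(0.0, t))  # clamp the blend factor to [0, 1]
    return tuple(lerp(s, f, t) for s, f in zip(scene_color, fog_color))


# A red pixel halfway through the fog range is pulled halfway toward white.
halfway = classic_fog((1.0, 0.0, 0.0), (1.0, 1.0, 1.0), 50.0, 0.0, 100.0)
```

Each pixel is treated in isolation here, which is exactly why this approach cannot blur edges or produce halos.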
Looking at the reference images we took on site, it was clear to me that I needed a different solution. In particular, there are three visual properties of fog which I wanted to have in our implementation.
Below is an image captured on site by our capture team. You will quickly notice that the fog blurs the details of whatever is behind it. This blurring affects the edges of objects as well as the texture of the geometry.
A striking scene in the landscape was the movement of height fog close to the ground. Its shape consists of both low and high frequency details and its movement is like that of fluids along a river.
The last of the visual properties which I wished to capture in the fog was the spreading of light in the form of a halo, as it travels through the fog.
Understanding the Physical Properties of Fog
The underlying physics behind how light works is a vast topic. For those who want to dive deep into the topic, I highly recommend reading the blogs and papers I have cited in the further reading section below.
Here, I will try to go over the basics necessary to understand how to reproduce the desired properties of fog.
When light travels through the air, it collides with particles, which causes it to scatter. The direction in which it scatters is determined by a phase function. The exact direction in which each beam scatters is a matter of probability. For example, a phase function might scatter the light uniformly in all directions (isotropic scattering) or favor a certain direction, such as the direction of the incoming light (anisotropic scattering).
Scattering in fog favors the forward direction. It is more probable for light to scatter in the general direction of the incoming light. This is the reason why we observe halos around light sources in fog. The width and falloff of this halo have been measured in different weather conditions, and there are real-time approximations which model this spread function. The more scattering events happen, the blurrier the objects behind the fog become.
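To make the idea of a forward-favoring phase function concrete, here is a small sketch using the Henyey-Greenstein function. This is a commonly used approximation that I am swapping in purely for illustration; it is not the function used in this project, and the anisotropy value `g = 0.8` is an arbitrary example.

```python
import math


def henyey_greenstein(cos_theta, g):
    """Henyey-Greenstein phase function.

    cos_theta: cosine of the angle between incoming and scattered direction.
    g: anisotropy in (-1, 1); 0 is isotropic, positive favors forward.
    """
    denom = (1.0 + g * g - 2.0 * g * cos_theta) ** 1.5
    return (1.0 - g * g) / (4.0 * math.pi * denom)


isotropic = henyey_greenstein(0.3, 0.0)   # same value for every direction
forward = henyey_greenstein(1.0, 0.8)     # straight ahead
backward = henyey_greenstein(-1.0, 0.8)   # straight back
```

With a positive `g`, far more light continues roughly straight ahead than bounces back, which is the behavior behind the halos described above.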
As light scatters out of its path, it can also scatter back into the direction in which we are looking. For every direction we look in through fog, the light that reaches our eyes is composed of two parts: beams which have not been scattered and travel directly from the source to our eyes, and beams which have been scattered into the path. Some of the light along this path is lost, due to either absorption or out-scattering. This attenuation of light as it travels through fog can be modeled using Beer’s Law, which uses an exponential of distance and two decay constants, one for out-scattering and one for absorption.
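Written out, the attenuation described above takes the following form. The symbol names are my own choice for illustration: σ_a is the absorption constant, σ_s the out-scattering constant, d the distance traveled through the fog, and L_0 the light leaving the source.

```latex
T(d) = e^{-(\sigma_a + \sigma_s)\, d}, \qquad L(d) = L_0 \, T(d)
```

As a worked example with made-up numbers: if σ_a + σ_s = 0.01 per meter and the light travels d = 100 meters, then T = e^{-1} ≈ 0.37, so roughly 37% of the original light arrives unscattered.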
As the density of the fog increases, or the total distance traveled through it increases, so does the number of scattering events. As this number gets large enough, the anisotropy of fog loses relevance, and light will come uniformly from all directions. This is the reason behind the uniform white color of fog in far distance. Interestingly enough, this is the only property of fog which the classic fog technique in the fragment shader correctly imitates.
TL;DR (too long; didn’t read)
In summary, fog scatters the light which travels through it. This causes objects in fog to become blurry. Fog scattering favors the forward direction, which causes a distinct halo around light sources, with a specific falloff. If enough scattering takes place, everything will become white.
Now that we know why and how fog blurs things, how do we imitate it? The classic linear interpolation fog in the fragment shader wouldn’t work, because the desired blurriness is a global effect which goes beyond a single fragment, triangle or even draw call. I did play around with the idea of baking the fog into the mip chain of the texture and cleverly using that, but that still left me with the object edges I needed to blur.
The first naive suggestion might be to try to imitate the physical phenomenon. This is impractical even for offline renderers, since there is no closed analytical solution for a given eye direction, and a numeric solution requires trillions of calculations. Keeping in mind that the application needs to render vast landscapes in VR, another approach was necessary.
Another possibility is raymarching only a few of the scattering events. This is still too slow for VR, although there are some cool tricks for imitating Mie scattering without actually doing it, or cheap integrals for height fog and such.
There are hybrid techniques which games such as Horizon Zero Dawn or Sea of Thieves have implemented for their clouds. While those techniques correctly recreate the fluffiness of the clouds, they don’t blur the geometry behind them and their implementations are not trivial.
A clever solution is what was used in Assassin’s Creed IV. Using low-resolution, frustum-aligned 3D textures and async compute shaders, you can calculate the in- and out-scattering amount per voxel, and accumulate the contribution of fog by raymarching through the 3D texture in a screen space pass. This has several advantages, such as acceptable render costs, only one 3D texture for both eyes, easy filtering, volumetric shadows by integrating the shadow map, and the possibility of an inhomogeneous (varying) fog density. However, I knew from experience that implementing techniques from AAA games typically requires a large overhead for resource management, which I wanted to avoid.
The last technique, which is the method I used in the end, is a screen space approximation. This technique has been used in various forms and degrees in different games, such as The Witcher 3. The idea is to imitate the scattering in a screen space pass. Screen space effects have the advantage of being simple to implement and requiring little resource management, and their performance depends only on resolution, not scene complexity. However, they also have a serious disadvantage: they lack information about objects that are not visible on the screen. Implementing physically accurate screen space scattering is, to my knowledge, impossible; however, the imitation was good enough for our purposes.
I went back to the underlying physical principles but this time looked at it from the perspective of my application and attempted to build an imitation.
Every pixel on the screen represents a path from the camera into the fog. For each path, I would like to determine the incoming color, which consists of the light that has been scattered in and the decayed light that has traveled from the source to the camera. I treated each pixel as a light source of its own.
finalColor = originalColor * decay(dist, dConst) + inScatteredLight;
After the scene has been rendered normally, and the frame buffer has been grabbed for our screen pass, each pixel on it is a light source. So the originalColor represents the incoming light that has not been scattered.
For the decay function I used Beer’s law; a good reference was Alan Wolfe’s code for his raytracer or Inigo Quilez’s for his raymarcher. The decayConstant imitates both absorption and out-scattering. I separated these constants, since the out-scattering constant was being used somewhere else to calculate the inScatteredLight amount.
float decay(float distance, float decayConstant) { return exp(-distance * decayConstant); }
The exp(-x) is e^-x. Looking at its graph, one can see that as distance gets larger, the original scene color’s contribution to the final color of the pixel approaches zero, and the pixel color is determined by the inScatteredLight.
So far, so good. The main question for me was how to approximate the inScatteredLight. One thing I knew was that after enough scattering happens, fog’s scattering becomes isotropic. Basically, only white or gray would be visible in all directions. I imitated this behavior like this:
inScatteredLight = fogColor*(1.- decay(dist, dConst));
finalColor = originalColor * decay(dist, dConst) + inScatteredLight;
So as the originalColor decays to black, the final pixel color slowly goes to the fog color. This color is white in my case, since fog particles absorb all wavelengths more or less equally, and their scattering is not wavelength dependent.
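The two lines above can be checked numerically with a small CPU-side sketch. It mirrors the formulas in Python rather than shader code; the function name `apply_fog` and the sample values are assumptions made for the example.

```python
import math


def decay(distance, decay_constant):
    """Beer's law: fraction of light that survives the given distance."""
    return math.exp(-distance * decay_constant)


def apply_fog(original_color, fog_color, distance, decay_constant):
    """finalColor = originalColor * decay + fogColor * (1 - decay)."""
    t = decay(distance, decay_constant)
    return tuple(o * t + f * (1.0 - t) for o, f in zip(original_color, fog_color))


scene = (0.2, 0.4, 0.6)
white = (1.0, 1.0, 1.0)

near = apply_fog(scene, white, 0.0, 0.05)      # no fog in between
far = apply_fog(scene, white, 10000.0, 0.05)   # effectively only fog
```

Because the two weights always sum to one, the pixel moves smoothly from the scene color to the fog color without gaining or losing brightness along the way.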
What I had above is basically the classic implementation of fog. What I wanted to have is for the neighboring pixels to contribute to the inScatteredLight. Given a pixel at position (x,y), I wanted to collect the contribution from all neighboring pixels. Ideally, a spread function could be used which approximates what actually happens. This function would take things such as density of fog (scattering constant) or distance traveled through medium into account.
Reading and accumulating all the neighboring pixels at native resolution in a single pass would result in poor performance. The trick is to build a hierarchy of blurred screen textures. The filter used is a Gaussian whose standard deviation is determined by the spread function. Then, based on the amount of fog the pixel should receive, the matching texture can be sampled from the hierarchy (or two neighboring textures, blended by interpolation) and added to the inScatteredLight. How exactly the originalColor, the fogColor and the contribution from neighboring pixels are combined is up to the implementation. Since an exact physical representation is impossible, I exposed some artistic control through different parameters, which we adjusted until it looked “good” and as close to the reference footage as we could get. The only thing to keep in mind was energy conservation: the total amount of light is supposed to stay the same (accounting for absorption). If the original color decays faster than the in-scattered light and the fog color replace it, the screen will look unnaturally dark.
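The blur hierarchy idea can be sketched on a single row of grayscale pixels. This is a heavily simplified illustration and not the project’s implementation: it blurs at full resolution instead of downsampling, the sigma values are arbitrary, and `scatter_mix` is a hypothetical artistic parameter that balances uniform fog color against light gathered from neighbors.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian weights covering about three sigma."""
    radius = max(1, int(math.ceil(3.0 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_row(row, sigma):
    """Gaussian blur of one pixel row, clamping reads at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(row) - 1
    out = []
    for x in range(len(row)):
        acc = 0.0
        for k, w in enumerate(kernel):
            sx = min(last, max(0, x + k - radius))
            acc += row[sx] * w
        out.append(acc)
    return out


def build_blur_hierarchy(row, sigmas):
    """Level 0 is the sharp row; each further level is blurrier."""
    return [list(row)] + [blur_row(row, s) for s in sigmas]


def sample_hierarchy(levels, x, fog_amount):
    """Pick a blur level from fog_amount in [0, 1], blending two levels."""
    position = min(1.0, max(0.0, fog_amount)) * (len(levels) - 1)
    lower = int(math.floor(position))
    upper = min(lower + 1, len(levels) - 1)
    frac = position - lower
    return levels[lower][x] * (1.0 - frac) + levels[upper][x] * frac


def foggy_pixel(levels, x, distance, decay_constant, fog_color, scatter_mix):
    """Combine decayed original light with in-scattered light."""
    t = math.exp(-distance * decay_constant)
    neighbours = sample_hierarchy(levels, x, 1.0 - t)
    in_scattered = (1.0 - t) * (fog_color * (1.0 - scatter_mix)
                                + neighbours * scatter_mix)
    return levels[0][x] * t + in_scattered


# A single bright pixel (a light source) in an otherwise black row.
row = [0.0] * 7 + [1.0] + [0.0] * 7
levels = build_blur_hierarchy(row, [1.0, 2.0, 4.0])

# The black pixel next to the light picks up a halo once fog is present.
halo = foggy_pixel(levels, 6, 10.0, 0.1, 0.0, 1.0)
```

The weights on the original and the in-scattered light sum to one in this sketch, which is one simple way to respect the energy consideration mentioned above.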
I won’t go into the specifics of the implementation, since that would require a few extra blog posts. The standard depth of field screen space effect in Unity can be used as a reference, and the fog can be implemented on top of it. This paper is a good reference, since it covers artifacts such as illumination leaking in detail and offers suggestions for dealing with them. You can also look at this implementation on GitHub, which is based on Unity’s bloom. It suffers from some artifacts that the approach in Elek’s two papers avoids, but it will be good enough for most use cases.
One final element to consider is Rayleigh scattering. Rayleigh scattering is still happening, but the visual characteristics of the scene are mainly dominated by the wavelength-independent scattering in the fog, especially as the number of scattering events increases. However, the reference photos show a tint of blue under certain conditions: where the fog is not too thick, and in areas where it is transitioning to its uniform isotropic form. This can be modeled in the same equation, either by using a different decay constant per color channel or by simply adding blue light and controlling its amount artistically with a custom curve, which is what we did.
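The first of the two options, a different decay constant per color channel, can be sketched as follows. Note that this is the alternative we did not ship, and the constants below are made-up values chosen only so that blue scatters slightly more than red.

```python
import math


def apply_fog_per_channel(original_color, fog_color, distance, decay_constants):
    """Same fog equation as before, but with one decay constant per channel."""
    result = []
    for o, f, k in zip(original_color, fog_color, decay_constants):
        t = math.exp(-distance * k)
        result.append(o * t + f * (1.0 - t))
    return tuple(result)


# Hypothetical constants: blue decays (and scatters in) a little faster.
constants = (0.010, 0.011, 0.013)
black = (0.0, 0.0, 0.0)
white = (1.0, 1.0, 1.0)

thin_fog = apply_fog_per_channel(black, white, 50.0, constants)
thick_fog = apply_fog_per_channel(black, white, 5000.0, constants)
```

At moderate distances the blue channel fills in first, giving a bluish tint, while at large distances all channels converge to the same fog color.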
With the above technique, I finally had the softness I wanted. However, the height fog still had to be addressed. Since the world space position is typically reconstructed in the screen space pass anyway, adding a smooth base line in height where the fog starts was not hard. Alternatively, the analytical approach suggested by Quilez, based on the ray direction, is also valid.
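One possible way to express such a smooth base line is a smoothstep over the reconstructed world height. This is only a sketch under my own assumptions: `fog_top` and `falloff` are hypothetical parameters, and the reconstruction of the world position from depth is not shown.

```python
def smoothstep(edge0, edge1, x):
    """Hermite interpolation from 0 to 1 between edge0 and edge1."""
    t = min(1.0, max(0.0, (x - edge0) / (edge1 - edge0)))
    return t * t * (3.0 - 2.0 * t)


def height_fog_factor(world_height, fog_top, falloff):
    """1.0 deep inside the height fog, fading smoothly to 0.0 at fog_top."""
    return 1.0 - smoothstep(fog_top - falloff, fog_top, world_height)


valley_floor = height_fog_factor(0.0, 100.0, 20.0)     # fully in the fog
transition = height_fog_factor(90.0, 100.0, 20.0)      # halfway through falloff
above = height_fog_factor(150.0, 100.0, 20.0)          # clear of the fog
```

The resulting factor could then scale the fog amount or the distance fed into the decay function, depending on how the rest of the effect is set up.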
However, that is not enough. Looking at the references, there were two essential visual properties that I wanted to have:
- The irregular shape of the height fog, which contains both low and high frequency details (rough and fine details)
- The fluid-like motion of the fog
For the former, I already knew the solution. Whenever I need variation in a visual detail, the first thing to try is Perlin noise. And whenever I need frequency variations in those variations, the answer is usually fractal Brownian motion (fbm). I ended up baking an fbm texture based on Perlin noise; however, if you generate the noise per frame, you can use fwidth to ensure your noise octaves don’t go past the Nyquist limit. I simply modified the Z-buffer readings with the noise before passing them on for further calculations.
The second property is a bit harder. Best, of course, would be if I could analytically model fog as a fluid. But once again, this was not possible in real time (at least for our use case). The second option was to simply pan the noise in one direction. Whilst this does add movement, the movement lacks the swirl and rotation of a fluid, and its directionality and monotony become very obvious.
At this point, I remembered the water rendering technique used in Portal 2, presented by Alex Vlachos. Since it imitated fluids so well, I decided to try it for my fog. It added the swirl I wanted and got rid of the monotony.
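The core of that flow technique, as I understand it from the Vlachos talk, is to sample the noise twice with offsets that advance along a flow vector, half a cycle apart, and to crossfade so that each layer is invisible at the moment it resets. The sketch below shows this on the CPU; where the flow vector comes from (a flow map or per-scene parameters) is left to the caller, and all names are assumptions.

```python
def flow_sample(sample_noise, u, v, flow, time, cycle=1.0, strength=1.0):
    """Two-phase flow sampling with a crossfade between the layers.

    sample_noise: function (u, v) -> value, e.g. a noise texture lookup.
    flow: (fx, fy) flow direction and speed at this location.
    """
    phase0 = (time / cycle) % 1.0
    phase1 = (time / cycle + 0.5) % 1.0  # second layer runs half a cycle apart

    n0 = sample_noise(u - flow[0] * phase0 * strength,
                      v - flow[1] * phase0 * strength)
    n1 = sample_noise(u - flow[0] * phase1 * strength,
                      v - flow[1] * phase1 * strength)

    # Weight of layer 1: it is 1 when layer 0 resets (phase0 == 0) and
    # 0 when layer 1 itself resets (phase0 == 0.5), hiding both jumps.
    blend = abs(2.0 * phase0 - 1.0)
    return n0 * (1.0 - blend) + n1 * blend


# Usage with a trivial stand-in "texture" that returns its u coordinate.
value = flow_sample(lambda u, v: u, 0.0, 0.0, (1.0, 0.0), 0.25)
```

Because the distortion never accumulates beyond one cycle, the noise keeps moving along the flow without stretching indefinitely, which is what breaks up the monotony of simple panning.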
Since the topology is static, I could have baked a flow map, or even rendered particles into a render texture to manipulate the flow map for dynamic objects, similar to the water rendering technique in Far Cry.
However, since our valleys were quite wide, I realized I could give the illusion that the fog respects the topology as a body of fluid by simply changing some run-time parameters per scene.
Here is the final look of the fog compared to the reference image of the scene, which we were quite happy with.
I hope you enjoyed reading this series of How we made the Audi AI:ME’s flying VR experience!
Part 1: Planning and executing a large-scale aerial Photogrammetry project
Part 2: Photogrammetry processing and optimization
Part 3: Performance challenges and creating realistic fog in VR (this article)
Download the free Realities app on Steam
Explore the results of our Photogrammetry workflow in beautiful and detailed environments from around the world in VR.
Download from Steam for Oculus Rift / HTC Vive.
Follow the Realities.io Team
The Realities.io team works on the cutting edge of VR+Photogrammetry. Follow us on Twitter.
Resources and Further Readings
- Gamasutra article by Bartlomiej Wronski on atmospheric scattering in Assassin’s Creed 4: https://www.gamasutra.com/blogs/BartlomiejWronski/20141208/226295/Atmospheric_scattering_and_volumetric_fog_algorithm__part_1.php
- Siggraph slides for Wronski’s talk on the AC4 atmospheric scattering solution: https://bartwronski.files.wordpress.com/2014/08/bwronski_volumetric_fog_siggraph2014.pdf
- Oskar Elek et al., Real-Time Screen-Space Scattering in Homogeneous Environments, 2013: https://cgg.mff.cuni.cz/~oskar/projects/CGA2013/Elek2013.pdf
- Oskar Elek et al., Real-time Light Transport in Analytically Integrable Quasi-heterogeneous Media, 2018: https://cgg.mff.cuni.cz/~oskar/projects/CESCG2018/Iser2018.pdf
- S.G. Narasimhan and S.K. Nayar, Shedding Light on the Weather, 2003: http://www.cs.columbia.edu/CAVE/projects/ptping_media/
- Wojciech Jarosz, Efficient Monte Carlo Methods for Light Transport in Scattering Media, 2008: https://cs.dartmouth.edu/~wjarosz/publications/dissertation/
- Alex Vlachos, Water Flow rendering in Portal 2, Siggraph 2010: https://alex.vlachos.com/graphics/Vlachos-SIGGRAPH10-WaterFlow.pdf
- Alan Wolfe, Raytracing Reflection, Refraction, Fresnel, Total Internal Reflection, and Beer’s Law, 2017: https://blog.demofox.org/2017/01/09/raytracing-reflection-refraction-fresnel-total-internal-reflection-and-beers-law/
- Wikipedia article on the Beer–Lambert law: https://en.wikipedia.org/wiki/Beer%E2%80%93Lambert_law
- Jensen et al., A Practical Model for Subsurface Light Transport, 2001: https://graphics.stanford.edu/papers/bssrdf/bssrdf.pdf
- Premoze et al., Path Integration for Light Transport in Volumes, 2003: https://www.researchgate.net/publication/220853010_Path_Integration_for_Light_Transport_in_Volumes
- Premoze et al., Practical rendering of multiple scattering effects in participating media, 2004: https://cseweb.ucsd.edu/~ravir/HRPIEG.pdf
- Quilez, Better Fog: http://www.iquilezles.org/www/articles/fog/fog.htm
- OCASM, Screen Space Multi Sampling: https://github.com/OCASM/SSMS
- Schneider & Vos, The Real-time Volumetric Cloudscapes of Horizon Zero Dawn, Siggraph 2015: https://www.guerrilla-games.com/read/the-real-time-volumetric-cloudscapes-of-horizon-zero-dawn
- Valentin Kozin, Sea of Thieves: Tech Art and Shader Development, GDC 2019: https://www.youtube.com/watch?v=KxnFr5ugAHs&t=1631s