How To Play Back Volumetric Point Cloud Animations In UE4

Ross Beardsall
XRLO — eXtended Reality Lowdown
8 min read · Jun 16, 2021


For a recent virtual production project, the Magnopus UK team was tasked with building a workflow for supporting the playback of animated volumetric point cloud data. The dataset in question was a 45-second volumetric capture encoded at 60 frames per second, with 4 million points per frame, to make up a total of 10.8 billion unique points of data. All of this was required to play back in real-time for use on a virtual production stage.

For this article, I have synthesised some animated point cloud data via a dynamic particle simulation in Houdini, as open-source animated point cloud data at 4 million points per frame is, unsurprisingly, hard to come by. Although it is not truly animated volumetric point cloud data, it should demonstrate the approach all the same.

4 million points per frame of synthesised animated point cloud data, using a photogrammetry scan of the Natural History Museum in London as a basis, played back in real time in full HD at ~11ms on a GTX 1080. Hintze Hall, NHM London [point cloud] by Thomas Flynn on Sketchfab.

Thought experiments

Before we even begin to consider how we could achieve real-time playback, we first need to better understand the structure of the point cloud dataset.

There is, for all intents and purposes, a consistent point count of 4 million points per frame throughout the entire animation. Although every point is unique to its frame, we never need to render more than 4 million of them at any given time. Therefore, we only need to spawn an initial burst of 4 million particles, which we can then treat as a ‘pool’.

A pool of 4 million particles, let’s dive in!

Then, for each frame of the point cloud animation, we can read in the position and colour data for each point, and update each of our particles to construct the resolved point cloud.

Example of resolving a frame of the point cloud data

As each point is completely unique between frames (neither a point’s position nor its colour has any relationship to the previous frame), there can be absolutely no interpolation, as this would result in sporadic and undesirable particle movement between keyframes.

With UE4’s GPU-accelerated particle system, Niagara, Unreal has no problem spinning up 4 million particles. The real bottleneck is always going to be data throughput, so we need to consider ways to speed up the transfer of the point data to the particle system.

Encoding point data into a texture

4 million points sounds like a lot, but that figure instantly becomes more digestible when you consider that a 2000 x 2000 pixel texture has exactly 4 million pixels. In fact, you’re pushing nearly 8.3 million pixels just reading this article, if you’re on a 4K monitor!

A pixel in a texture can store 3 values (4, if you have an Alpha channel) — one value for the red channel (R), one for the green channel (G), and one for the blue channel (B) — so it stands to reason that we could serialise the positional data (X, Y and Z) for each point into each pixel of a 2000 x 2000 pixel texture.

Let’s simplify the problem, before applying the same rationale to our larger dataset.

A point cloud cube made up of 8 points

Taking the above-pictured point cloud, we can easily represent these 8 points as 8 XYZ coordinates, as per the below data table.

Now, taking these 8 XYZ coordinates, we can serialise this data as 8 RGB colour values, and write these to pixels in a texture.

An example of how point locations can be serialised as colours in a texture

We can write this data to a signed 16 bit EXR, so that we have a good amount of precision and can support both positive and negative pixel values. This operation is handled as an offline process in Houdini.
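As a rough illustration of that offline step (a minimal sketch, not the production tool), the snippet below writes one frame of point positions into the RGB channels of a half-float EXR using the PyPI OpenEXR bindings; the positions array is assumed to come from Houdini, for example via geo.pointFloatAttribValues("P").

```python
# Illustrative sketch: write one frame of point positions into the RGB
# channels of a 16 bit (half float) EXR. `positions` is assumed to be an
# (N, 3) numpy array pulled from Houdini (e.g. pointFloatAttribValues("P")).
import numpy as np
import OpenEXR
import Imath

def write_position_exr(positions, path, size=2000):
    assert positions.shape[0] <= size * size
    # Pad the point list up to a full size x size grid of pixels.
    padded = np.zeros((size * size, 3), dtype=np.float32)
    padded[:positions.shape[0]] = positions

    half = Imath.Channel(Imath.PixelType(Imath.PixelType.HALF))
    header = OpenEXR.Header(size, size)
    header['channels'] = {'R': half, 'G': half, 'B': half}

    exr = OpenEXR.OutputFile(path, header)
    exr.writePixels({
        'R': padded[:, 0].astype(np.float16).tobytes(),  # X
        'G': padded[:, 1].astype(np.float16).tobytes(),  # Y
        'B': padded[:, 2].astype(np.float16).tobytes(),  # Z
    })
    exr.close()
```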

That’s the point position data taken care of, but our volumetric capture data set also requires colour data.

We could enlarge our texture to 2k by 4k, and sample it twice for each point we wish to render (once for position, once for colour), but this would double the memory footprint of each of our frames, and have a significant impact on runtime performance.

Another option would be to write the colour data to a second 2k by 2k texture and read from both concurrently to resolve the particles, but this is arguably worse than the 2k by 4k solution: it incurs the same memory footprint, and when playing back multiple frames the two concurrent streams could drift out of sync, resulting in a mismatch between the positional and colour data.

As noted earlier, we do have another channel available to us (the Alpha channel), so would it be possible to encode the RGB colour into a single channel? It’s time to get a bit shifty…

Encoding 16 bit colour into the Alpha channel

As mentioned, we are encoding our location data into 16 bit EXRs. This gives us 16 bits per channel. The RGB channels of the 16 bit EXR are reserved for the positional data, so we are left with one 16 bit Alpha channel to play with.

Example of how an RGB colour can be composed of three 8 bit values

Our raw colour data is 8 bits per channel (24 bits of colour data in total), which is 8 bits more than our single 16 bit Alpha channel can hold. Instead, we can use bit-shifting to ‘resize’ the colour data to fit into a single 16 bit float.

We can drop the lowest 3 bits of the red and blue channels and the lowest 2 bits of the green channel, which lets us pack the data into a single 16 bit value

We can reduce each colour channel from 8 bits to 5 bits, requiring only 15 bits in total to store our RGB colour value. As we have 16 bits available, we can throw the extra bit of precision at the G channel (as the human eye is receptive to more shades of green than red or blue).

Example of how the 16 bit Alpha channel is composed of the 3 reduced bit-depth RGB values
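As a concrete sketch of the packing arithmetic described above (a hypothetical helper, not code from the project):

```python
# Pack an 8 bit RGB colour into a single 16 bit integer in R5G6B5 layout:
# drop the low 3 bits of red and blue and the low 2 bits of green, then
# shift the remainders into place (bit layout: RRRRR GGGGGG BBBBB).
def pack_r5g6b5(r8: int, g8: int, b8: int) -> int:
    r5 = r8 >> 3   # 8 bits -> 5 bits
    g6 = g8 >> 2   # 8 bits -> 6 bits (the spare bit goes to green)
    b5 = b8 >> 3   # 8 bits -> 5 bits
    return (r5 << 11) | (g6 << 5) | b5
```

Exactly how the resulting integer is then written into the half-float Alpha channel (for example, normalised to the 0-1 range) is a pipeline detail I'm glossing over here.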

All that’s left to do is unpack it back into three RGB colour values in Unreal.
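The unpacking is simply the inverse arithmetic. In Unreal it lives in the Niagara graph (or a custom HLSL expression), but the maths is identical; here is a Python sketch for clarity:

```python
# Recover an approximate RGB colour (normalised to 0-1) from the packed
# 16 bit R5G6B5 value read out of the Alpha channel.
def unpack_r5g6b5(packed16: int):
    r = ((packed16 >> 11) & 0x1F) / 31.0
    g = ((packed16 >> 5) & 0x3F) / 63.0
    b = (packed16 & 0x1F) / 31.0
    return r, g, b
```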

Of course, this approach is lossy, but for our use case, it is preferable to using two concurrent 16 bit datasets. Another alternative would be to encode the EXRs at 32 bit floating point precision, giving us ample precision to fit our 24 bit colour into. However, Unreal does not currently support 32 bit EXR streams out of the box, and this would also double the memory footprint.

Using Unreal’s Image Sequence Media plugin to stream the data textures

At around 30MB per frame, it’s not realistic to expect to load all of our frames wholesale into video memory. We will need an efficient means of streaming this data from disk, pre-caching upcoming frames into memory before discarding them and freeing up valuable resources.
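As a quick sanity check on that figure: a 2000 x 2000 RGBA texture at 16 bits per channel works out to 2000 x 2000 x 4 x 2 bytes, roughly 32MB per frame, and at 60 frames per second that is close to 2GB of texture data for every second of animation.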

This is where Unreal’s Image Sequence Media plugin comes in. It is typically used to play back image sequences for video-style playback, but we can hijack this system to play back our datasets that have been encoded into textures.

An example of a single frame representing all of the positional and colour data of 4 million particles

Now, we only need to index into the media texture in our Niagara particle system to resolve the point location and colours of each frame.

Indexing into the data stream

So, now that we have formatted the data, how do we go about reading it?

We can use Niagara’s Sample Texture module to directly sample the media texture.

For each particle, we index into a pixel in the texture sample and read its RGBA values.

We construct the UV coordinate from each particle’s persistent ID, which amounts to a particle index. Starting at the top left of the texture, we traverse horizontally until our persistent ID is larger than the width of our texture, at which point, we wrap back to the beginning, traverse vertically by a single pixel, and start over again.
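In Python-flavoured pseudocode (the names here are illustrative, not the actual Niagara module inputs), the index-to-UV mapping looks something like this:

```python
# Map a particle's persistent ID to the UV of 'its' pixel in the data texture.
def particle_index_to_uv(persistent_id: int, tex_width: int = 2000, tex_height: int = 2000):
    x = persistent_id % tex_width    # walk left to right along the row
    y = persistent_id // tex_width   # drop down one row each time we wrap
    # Sample at the texel centre (a half-texel offset), which also sidesteps
    # the bilinear filtering issue described further down.
    u = (x + 0.5) / tex_width
    v = (y + 0.5) / tex_height
    return u, v
```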

This kind of logic was made trivial by Niagara’s powerful interface, and the whole indexing process as described above is pictured below.

Example of how we construct the UV arguments programmatically inside the Sample Texture node using Niagara.

Now that we have the pixel values, we interpret the RGB values as the XYZ particle position, and unpack the R5G6B5 value from the A channel to recover an R8G8B8 particle colour.

One important ‘gotcha’ is that textures can come in as bilinearly filtered.

Bilinear texture filtering vs unfiltered — interpolation between points does not make sense, so unfiltered would be preferable

Unreal doesn’t support unfiltered media textures as of 4.26, but the data is still there: by sampling at the centre of each texel (a half-texel offset), the filtered texture still returns the correct, raw value.

From left to right — zoomed-in bilinear filtered pixel, unfiltered pixel, an example of unfiltered pixel data preserved at texel centroid

Putting it all together

Now that we have a data stream and a particle system that reliably indexes into it, it’s just a case of playing back the data stream, sitting back and watching those points ‘do their thing’! This works well for our animated point cloud data, but it could be equally effective for baked, computationally heavy particle simulations that could not realistically be resolved at runtime.

Example of scrubbing the animation data in Unreal’s Sequencer

Scalability

The process of writing each frame of data to an EXR, although relatively quick, isn’t exactly what you would call ‘real time’, taking roughly one second per frame to export. Performance testing shows that we are bound by how quickly we can stream the images from disk, so our route to optimisation is to reduce the size of these textures (which in turn reduces the amount of point data, and thereby the number of particles).

Example of how cropping the data set reduces the perceived density of the point cloud

Rather than having to re-export the dataset at a new resolution, we can simply crop the pre-rendered full-resolution dataset to a square just large enough to hold the number of particles we want to render.

If we can ensure that the points are written to the texture in a random order for each frame, any crop is an unbiased subsample of the full cloud, so we can simply crop each texture dataset to the maximum resolution that our target platform can support, as sketched below.
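As a tiny illustration of that scaling rule (the function name is hypothetical), the crop size for a given particle budget is just the integer square root of the budget, clamped to the full texture size:

```python
import math

def crop_size_for_budget(max_particles: int, full_size: int = 2000) -> int:
    # e.g. a budget of 1,000,000 particles gives a 1000 x 1000 crop.
    side = math.isqrt(max_particles)
    return min(side, full_size)
```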

Future work/conclusion

Although workable for our use case, the 16 bit EXRs used do result in a loss of precision for the points furthest away from the origin. The most logical solution for this would be to extend Unreal to support the playback of 32 bit EXRs, which would also negate the need for using bit shifting to serialise the colour data.

To summarise, we were able to leverage Unreal’s Image Sequence Media plugin and EXR textures to play back baked volumetric point cloud animations at real-time speeds in UE4.
