Capturing 360° video from a VR DataViz application: trial and error
This post is inspired by my experience building a 360° video for Progress Telerik’s AR-VR web page. The purpose of the video was to show our “Sales Dashboard Data Visualization” demo through the eyes of several participants in a virtual room who are discussing the presented 3D charts. As this visualization displays a lot of text and graphics, the rendering must be sharp enough for the participants to easily read the presented information. Although our demo application is written in Unity and renders with good quality on Oculus VR devices, it turns out that preserving this quality in a 360° video is not an easy task with the existing capturing tools for Unity. The rest of this post walks through the pros and cons of the different approaches I tried.
Why 360° video?
First of all, let’s see what the benefits of a 360° video are and why we wanted to provide one together with the source code of our demo application.
- Providing the source code of the demo is not enough. Although we have uploaded it with detailed documentation on how to use it and build it for different VR devices, it certainly takes time for newcomers to download it and get acquainted with the development workflow.
- Providing prebuilt APK and EXE files is still not enough. Although this makes it easier to install the demo on an Oculus device, you still have to follow some steps to switch the device to developer mode before you can install and run the application.
- Even if you manage to successfully install and run the application on a VR device, you will be the only participant in a virtual room. In order to see the full functionality of the demo, you should install the app on several devices and have a group of people that will join the virtual room together.
- A 360° video overcomes the drawbacks of the approaches listed above and is practically the fastest way for the visitors of our web page to immerse themselves in the demo room and see what functionality is implemented in it.
- A 360° video also makes the demo accessible to a larger audience. You are not required to have a VR device: you can easily watch it on your smartphone, and even visitors using a desktop browser can drag the mouse over the player to look around the virtual room in any direction.
How to record 360° video?
First, let’s say a few words about the scenario we wanted to record. It is a sample collaboration between several people in a virtual conference room. My colleagues Panayot and Georgi and I ran the demo application on Oculus devices and joined the virtual room over a WiFi network. During this collaboration, one of us is always in “presenter mode” and points out different aspects of a chart data visualization. At the same time, the others are in “viewer mode” and see in front of them whatever is currently presented. When one of the viewers requests control using the Oculus controller, that viewer becomes the presenter. We wanted to record the video in such a way that the visitors of our web site can look around from the position of each of the presenters in the virtual room.
Recorder and Replayer
Most of the tools for capturing 360° video need a lot of time to capture each frame. In practice, this means that the video cannot be captured in real time. Instead, we need to record the state of all moving objects in the scene so that the movement can later be replayed frame by frame. We created two Unity scripts, one responsible for recording and one for replaying the changes that occur in our scene during the collaboration. We recorded the following object properties:
- Chart transform. It changes every time the presenter moves or rotates the data visualization.
- Input transform. It changes every time the presenter moves a hand to point at some aspect of the data.
- Avatar transforms and avatar packets. These are the properties describing the current state of a participant’s avatar. Whenever someone moves a hand or their head, new avatar packets are received, and we should record this data in order to replay it later.
- Camera transform. When we project our demo on a 2D screen, we use a spectator camera implementation that lets you see through the eyes of the currently presenting user. On a 2D display, it is fine if the camera moves together with the presenter's head. However, if we capture such movement in a 360° video and then play it on a VR device, the VR user is very likely to experience motion sickness from the intense camera movement. That is why, instead of moving the camera with the head, we made it static and placed it at the position of the currently presenting participant. The camera position changes only when the presenter changes.
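To make the idea concrete, here is a minimal sketch of frame-by-frame recording and of the static camera rule. It is written in Python for brevity and is not our actual Unity scripts. All names in it (Pose, Frame, Recording, camera_track, the "head:<name>" keys) are hypothetical, and it assumes that the camera is pinned to the presenter's head position at the moment that participant takes control.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]


@dataclass
class Pose:
    position: Vec3
    rotation: Quat


@dataclass
class Frame:
    presenter: str          # name of the participant presenting in this frame
    poses: Dict[str, Pose]  # e.g. "chart", "input", "head:<name>" (hypothetical keys)


@dataclass
class Recording:
    frames: List[Frame] = field(default_factory=list)

    def record(self, presenter: str, poses: Dict[str, Pose]) -> None:
        # Store a copy of the dictionary so later changes do not alter past frames.
        self.frames.append(Frame(presenter, dict(poses)))


def camera_track(recording: Recording) -> List[Vec3]:
    """Return one camera position per frame.

    The camera stays where it is while the presenter is unchanged and jumps to
    the new presenter's head position only when the presenter changes.
    """
    track: List[Vec3] = []
    current_presenter = None
    camera_position = None
    for frame in recording.frames:
        if frame.presenter != current_presenter:
            current_presenter = frame.presenter
            camera_position = frame.poses["head:" + current_presenter].position
        track.append(camera_position)
    return track
```

A replayer would then walk over `recording.frames` at the capture tool's own pace, apply each stored pose to the matching scene object, and place the camera at the corresponding entry of `camera_track`.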
I will not get deeper into the implementation details of the recorder and replayer scripts in this post. We plan to publish a separate post on this matter in our Telerik AR VR series, so if you are interested in the implementation details, stay tuned for one of our next blog posts.
Tools for recording 360° video in Unity
Now that we have set up our scenario, we are ready to replay it and try the different tools for Unity to see which one best suits our needs for recording a 360° video.
The first approach — Unity built-in capturing API
At first glance, the built-in capturing functionality seems very tempting. It uses the Camera.RenderToCubemap method to render the scene onto two cubemaps, one for the left eye and one for the right. The resulting RenderTextures can then be converted to an equirectangular image, and a sequence of such images can easily be recorded by the Unity frame recorder into a 360° video.
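For readers unfamiliar with the cubemap-to-equirectangular step, the sketch below shows the geometry behind it in plain Python. It is not Unity's implementation: the function names and the axis convention (+z forward, +y up) are assumptions chosen for illustration. Each pixel of the equirectangular image corresponds to a longitude and latitude, which gives a view direction, and that direction selects the cube face to sample.

```python
import math


def equirect_to_direction(u, v):
    """Map normalized equirect coordinates (u, v in [0, 1]) to a unit direction.

    Assumed convention: (0.5, 0.5) looks straight ahead (+z), v = 0 is straight up (+y).
    """
    lon = (u - 0.5) * 2.0 * math.pi  # -pi .. pi, 0 means forward
    lat = (0.5 - v) * math.pi        # +pi/2 (up) .. -pi/2 (down)
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


def cube_face(direction):
    """Pick the cube face hit by a direction: the axis with the largest magnitude."""
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return "+x" if x > 0 else "-x"
    if ay >= az:
        return "+y" if y > 0 else "-y"
    return "+z" if z > 0 else "-z"


# The center of the equirect image looks forward, so it samples the front face.
center_face = cube_face(equirect_to_direction(0.5, 0.5))
```

Because every output pixel is resampled from a cube face in this way, the sharpness of the final image depends on the resolution of the cubemap as well as on the resolution of the equirectangular target.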
However, while testing this approach with our scenario we encountered the following issues:
- The graphics were a bit blurry even at higher capturing resolutions. This may not be an issue for a dynamic scene that does not require pixel-perfect rendering, but our chart data visualization has a lot of text elements that become unreadable (even the ones that are close to the camera).
- The RenderToCubemap method has a known limitation: it does not capture any UI elements. As our visualization shows some 2D graphics positioned in 3D space, these graphics turned out to be missing from the video. We managed to find a possible workaround in the Unity forums; however, it did not entirely solve the missing graphics issue in our scenario, and we needed to extend its implementation for some of the TextMesh Pro instances in our scene.
As the built-in Unity capturing API did not work well in our case, we decided to try some of the paid plugins in the Asset Store.
The second approach — VR Panorama 360 PRO Renderer
This paid asset is one of the popular assets for rendering videos from a Unity scene. It creates an image for every frame of the video by taking camera snapshots in several directions and then stitching these snapshots into a 360° image. After all frames are generated, you can render a video file from the images. The video should then be injected with specific metadata that helps video players recognize it as 360° content.
The VR Panorama asset provides a large set of options in its VRCapture script related to video format and video quality. One important option is the Capture Type. Other quality-related options are the Sequence Format (JPG or PNG), the Resolution, and a Speed vs Quality parameter that controls the anti-aliasing. Let’s see what we found after testing different combinations of these options.
As you may see in the picture above, there are two stereo options (the first uses a top-bottom layout, while the second uses a side-by-side layout). As we target our 360° video at VR devices, we would like to benefit from their stereoscopic displays for a realistic 3D visualization. That is why our first try was to render the video with one of the stereo options.
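As a quick illustration of what the two stereo layouts mean for a single frame, the sketch below computes the region of a packed frame that belongs to each eye. The function name, the 4096 x 4096 example size, and the ordering (left eye on top or on the left) are assumptions for illustration and are not taken from the VR Panorama asset.

```python
def eye_regions(width, height, layout):
    """Return the (x, y, w, h) rectangle of each eye inside a packed stereo frame.

    Assumed ordering: the left eye is on top (top-bottom) or on the left (side-by-side).
    """
    if layout == "top-bottom":
        half = height // 2
        return {"left": (0, 0, width, half), "right": (0, half, width, half)}
    if layout == "side-by-side":
        half = width // 2
        return {"left": (0, 0, half, height), "right": (half, 0, half, height)}
    raise ValueError("unknown stereo layout: " + layout)


# Hypothetical 4096 x 4096 frame: top-bottom keeps a 2:1 equirect image per eye.
top_bottom = eye_regions(4096, 4096, "top-bottom")
side_by_side = eye_regions(4096, 4096, "side-by-side")
```

With the same packed frame size, the top-bottom layout halves the vertical resolution available to each eye, while the side-by-side layout halves the horizontal one.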
As we wanted to upload the video to YouTube with the best possible quality, we selected the YouTube 8k resolution preset and set the Sequence Format to PNG (instead of the default JPG). At first glance, the resulting frame image looks really good, without any blurriness in the graphics.
However, on closer inspection we noticed that only the text and graphics in the center of the view are rendered perfectly. Looking left or right, one may notice that the text and the chart bars appear doubled, and this is reproducible in both the left-eye and the right-eye part of the image.
This result feels particularly unpleasant when viewing the video with a VR headset and is totally unacceptable for our use case, where the viewers should be able to easily read the labels in the chart. We tried several options, changing the resolution and the sequence format, but whenever the Capture Type is set to one of the stereo options, this double vision issue persists. Most probably it results from an error in the stitching algorithm that combines images from different view angles into a single 360° image.
With these issues in the stereoscopic capture types, our only option with the VR Panorama asset was to try monoscopic rendering. With this option, we cannot benefit from the sense of depth provided by VR devices; however, people are still able to enter our virtual room and look around from the presenters' positions in all directions. Here is what our test with monoscopic rendering produced for a single frame of the video, using the maximum quality setting for both resolution and anti-aliasing:
Generally, the graphics were captured really sharp, and all texts were easily readable without any double vision effects. When tested on an Oculus VR device, the video also looks sharp and of good quality. The only issues we had during the full video capture were related to the performance and memory consumption of the VR Panorama tool: it took more than 10 hours to capture a 4-minute monoscopic video and, what is worse, the capturing process often crashed partway through. Generating every frame as a high-resolution PNG image is also very demanding on hard drive space: for a 4k monoscopic video, the images took up about 100 GB of disk space. For an 8k video, the needed resources are more than twice as large.
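To put these numbers in perspective, here is a rough back-of-the-envelope calculation in Python. The frame rate of 30 frames per second is an assumption (it is not stated above), and the totals of 10 hours and 100 GB are the approximate figures from our runs, so the per-frame values are only indicative.

```python
def capture_budget(duration_s, fps, total_bytes, total_capture_s):
    """Break total capture cost down into per-frame figures."""
    frames = int(duration_s * fps)
    return {
        "frames": frames,
        "bytes_per_frame": total_bytes / frames,
        "seconds_per_frame": total_capture_s / frames,
    }


# 4-minute video, assumed 30 fps, about 100 GB of PNG frames, about 10 hours of capture.
budget = capture_budget(
    duration_s=4 * 60,
    fps=30,
    total_bytes=100e9,
    total_capture_s=10 * 3600,
)
print(budget["frames"])                          # number of PNG images on disk
print(round(budget["bytes_per_frame"] / 1e6, 1)) # megabytes per frame
print(budget["seconds_per_frame"])               # capture seconds per frame
```

Under these assumptions, the capture produces 7200 images of roughly 14 MB each and spends about 5 seconds on every frame, which is why real-time capture is out of reach with this approach.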
So, let’s sum up the pros and cons of the VR Panorama asset in our use case.
- We managed to record an 8k monoscopic 360° video that looks fine on YouTube on desktop, on mobile, and in VR headsets.
- The tool is easy to use and has good documentation on the different provided options.
- Recording stereoscopic video has stitching issues which lead to unacceptable quality in our scenario.
- Capturing the video takes a lot of time, as it requires rendering one big image for every frame.
- The process of capturing the video requires a lot of disk space as the frame images are kept on the hard drive before the video generation.
- The capturing process often crashes the Unity editor for longer videos. In our case, the crash occurred when the captured video was longer than 2 minutes.
The third approach — AVPro Movie Capture
As we had not achieved our goal of recording a stereoscopic video so far, we decided to test another paid asset from the Unity Asset Store. The AVPro asset provides a free version with a 10-second recording limit, which was enough for us to try its capabilities. In general, this asset uses an approach similar to VR Panorama and generates high-resolution PNG images for each frame. Here is what we found when trying AVPro and comparing it to the VR Panorama asset.
- AVPro managed to render sharp and high-quality images when rendering in stereoscopic mode.
- AVPro frames did not have the VR Panorama issue with double-vision text and graphics when looking away from the central view direction.
- When viewing the resulting 360° video with a VR device, the chart labels were blurry and hardly readable. This is probably related to an issue with correctly overlapping the left and right eye images, and the result was unacceptable for our use case.
- The rendering of the frames seemed even slower compared to the VR Panorama asset.
As we could not create a stereoscopic video with acceptable text quality, we decided not to test this asset further. We were already able to create a good-quality monoscopic video with the VR Panorama tool.
The fourth and last approach — Facebook 360 Capture SDK
As a last attempt to capture stereoscopic video, we tested the FBCapture SDK. It looked very promising in terms of speed; however, it turned out that the produced quality was not acceptable for us (even for the monoscopic scenario). It uses the same RenderToCubemap method as our first approach but provides an easy-to-use FBCapture prefab with a variety of options for controlling the output quality. Because of its ease of use and really fast rendering, I still believe it is worth mentioning the pros and cons we found while testing this SDK.
- Real-time capturing. As it uses the RenderToCubemap method combined with shaders for creating the video frames, it does not take hours to render the video, which is a good advantage compared to the previous two approaches.
- Small memory consumption. The resulting MP4 files are also several times smaller than the ones produced with the VR Panorama and AVPro assets.
- Easy to use: simply drag and drop the FBCapture prefab and use its hotkeys to start and stop encoding the video.
- The SDK restricts the maximum size for video capturing to 4k resolution. Both VR Panorama and AVPro allow 8k resolution.
- The captured graphics are not sharp enough which makes the labels blurry and hardly readable. This is reproducible for both monoscopic and stereoscopic videos.
- The use of the RenderToCubemap method comes with the limitation on capturing UI elements. As in the first approach, we had to try the workaround from the Unity forums, with some additional modifications to make it work for our scenario.
- We could not get the stereoscopic video to work in any of the 360° video players. In order to capture stereoscopic video with this Facebook SDK, you should select the RGBD_Capture texture format, which creates a gray depth image alongside the original monoscopic 360° image. However, neither the YouTube nor the Facebook video player was able to recognize this video format, and we could not find any metadata injector that could make it recognizable. All metadata injectors seem to work with top-bottom or side-by-side image formats, and none of them provides an option for a depth channel.
Although our demo application manages to show text and graphics with sharp edges on VR stereoscopic displays, we were not able to find a suitable 360° capturing tool that preserves this visualization quality in a stereoscopic video. That is why we ended up creating a monoscopic video, which loses the sense of depth but at least allows the viewers to easily read the text and graphics presented in our demo application.
I hope this shared experience will be helpful and time-saving for someone who is about to start their own challenge of capturing a 360° video. Feedback is welcome — feel free to share your thoughts about our or your own experience with the available tools for Unity.