Cinematic VR with spatial audio, a minimal workflow

Compatible with YouTube, Facebook and Vimeo

Adriano Farina
Mar 21, 2017
This is getting silly

Minimal panoramic video

Currently, the cheapest and easiest way to get good-enough image quality is to use a dedicated consumer camera, like the Samsung Gear 360, the LG 360 Camera or the Nikon KeyMission. They tend to hover around €300 including taxes, they’re widely available, and you can use your existing tripod. They’re also very easy to control, both with and without a phone, and come with complimentary video processing software for both Android and Windows.

In all cases, the video quality is perfectly acceptable in good lighting conditions, but degrades quickly as soon as it gets dark. The angular resolution is also a limiting factor: it’s best to keep the subjects in the sweet spot, a bit over 2 meters away.
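To get a feel for why subjects much farther away lose detail, here is a rough worked example. It assumes a 3840-pixel-wide equirectangular frame and a subject about 0.5 m wide standing 2 m from the camera; both numbers are illustrative assumptions, not measurements from any of the cameras above.

```latex
\[
\text{pixels per degree} \approx \frac{3840}{360} \approx 10.7
\]
\[
\theta = 2\arctan\!\left(\frac{0.25\ \text{m}}{2\ \text{m}}\right) \approx 14.3^\circ
\quad\Longrightarrow\quad
\text{subject width} \approx 14.3 \times 10.7 \approx 150\ \text{pixels}
\]
```

Under the same assumptions, a subject at 4 m covers roughly half as many pixels.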

Another important consideration is the thermal envelope: they’re all subject to overheating, to different extents. Try not to use long shots, or at least don’t use wireless controls if you’re shooting for a long time. They all have airplane modes, which don’t change the battery life very much, but dramatically reduce overheating.

Ambisonics the easy way

The second easiest thing to do is to buy a Zoom H2n, update the firmware, and enable spatial audio; that’s it. It lacks vertical information, but for most videos that’s just fine, and the output is already in today’s preferred format. YouTube will accept it without question.

Just make sure you point the Zoom and the camera in the same direction. In case you get this wrong, you can rotate either the video, using Premiere Pro’s offset tool, or the audio, using kronlachner’s “Ambisonics first order rotator” VST plugin in your DAW of choice.
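For reference, rotating first-order Ambisonics around the vertical axis is a simple operation, which is why fixing a misaligned recording after the fact is cheap. The sketch below assumes the usual W, X, Y, Z component names and a rotation angle \theta measured counter-clockwise when seen from above; the sign convention varies between tools, so treat the direction as an assumption and check it by ear.

```latex
\[
\begin{aligned}
W' &= W \\
X' &= X\cos\theta - Y\sin\theta \\
Y' &= X\sin\theta + Y\cos\theta \\
Z' &= Z
\end{aligned}
\]
```

Only the two horizontal components change, so the missing vertical information from the H2n doesn’t get in the way.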


The one at the bottom is a cylindrical microphone array with 32 capsules


For most real-life scenarios, I’d advise getting your hands on a copy of Adobe Premiere.

Editing the video

ffmpeg -i movie.mp4 -ss 00:00:03 -t 00:00:08 -async 1 cut.mp4

If you shot with the LG and already have an aligned audio track, your best bet is to use ffmpeg, making sure you specify the audio channels appropriately with the -channel_layout 4.0 option. Just export all your clips, concatenate them to obtain your edit, and skip to the section on injecting the metadata.

Next, we’re going to obtain a synced-up version of the Ambisonics audio for each video clip.

Editing the audio

Import the audio from one of the clips you obtained in the previous step into your DAW, and open the relevant audio source from your Ambisonics recording alongside it. You will now have to sync them up and export the matching trim of the Ambisonics audio.
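If you’d rather do the trim itself from the command line, something like the following could work once you know where the clip starts inside the long recording. This is a minimal sketch, assuming you found the start offset by lining up the waveforms by hand; the function name, file names and timings are all hypothetical.

```shell
# trim_ambisonics SOURCE START DURATION OUTPUT
# Keeps DURATION seconds of SOURCE, starting START seconds in.
# The audio is stream-copied, so the channels are left untouched.
trim_ambisonics() {
  ffmpeg -i "$1" -ss "$2" -t "$3" -c:a copy "$4"
}

# Hypothetical example: the clip starts 12.5 s into the recording
# and lasts 8 s.
#   trim_ambisonics ambisonics.wav 12.5 8 audioTrim.wav
```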

Once you’ve done this for all clips, you’ll have to mux each video clip with its audio, like this:

ffmpeg -i videoClip.mp4 -i audioClip.mp4 -map 0:v:0 -map 1:a:0 -channel_layout 4.0 -c:a copy -c:v copy -shortest muxedClip.mp4
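Before moving on, it can be worth checking that each muxed clip really carries four audio channels. Here is a minimal sketch using ffprobe, which ships with ffmpeg; the two helper names are hypothetical, and the check only looks at the first audio stream.

```shell
# audio_channels FILE
# Prints the channel count of the first audio stream in FILE.
audio_channels() {
  ffprobe -v error -select_streams a:0 \
    -show_entries stream=channels -of csv=p=0 "$1"
}

# has_four_channels FILE
# Succeeds only if the first audio stream has exactly four channels.
has_four_channels() {
  [ "$(audio_channels "$1")" = "4" ]
}

# Hypothetical usage:
#   has_four_channels muxedClip.mp4 || echo "muxedClip.mp4: not 4 channels" >&2
```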

Finally, concatenate the resulting files.
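One way to do the concatenation is ffmpeg’s concat demuxer, which reads the clips from a small text file. The sketch below assumes all clips share the same codecs and parameters, so they can be joined without re-encoding; the helper name and file names are hypothetical, and file names containing single quotes would need extra escaping.

```shell
# write_concat_list LIST CLIP...
# Writes one "file '<clip>'" line per clip, in the order given,
# which is the format the concat demuxer expects.
write_concat_list() {
  list=$1
  shift
  : > "$list"
  for clip in "$@"; do
    printf "file '%s'\n" "$clip" >> "$list"
  done
}

# Hypothetical usage:
#   write_concat_list clips.txt muxedClip1.mp4 muxedClip2.mp4
#   ffmpeg -f concat -safe 0 -i clips.txt -c copy edit.mp4
```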

Injecting the metadata

Open the metadata injection tool, select your video, inject the metadata, save the resulting video, and you’re good to go.

YouTube and Facebook should accept the file with no problem. Vimeo will accept it too, but it doesn’t currently support any spatial audio, so it will play the audio back in stereo, with no spatial information.

Please let me know about any mistake or inaccuracy!
