Spatial audio: How to record for VR

The first time I met someone who did spatial audio for VR experiences, I immediately apologized.

It was at a VRLA Expo a year or so ago, and it was while shaking her hand that I realized I — and the entire 360/live-action industry — had butchered the craft we tell our students is the backbone of storytelling.

I was so focused on stitching together good visuals that audio was completely forgotten, only to be replaced by the “voice of god” narration.

Since then I had been trying to figure out how to pull off spatial audio, but couldn’t crack that nut outside of using Unity.

Well, thanks to a great conversation with Aurelia Soundworks co-founder John Hendicott, I get it now.

And it’s not as complex or scary as I had initially thought.

Below are some production processes and notes, as I best as I can understand them, that are the result of Hendicott’s knowledge sharing. (I also asked him to review and make sure I got it right.)

Step 1: The Gathering
Many of us have seen the wide range of (typically expensive) microphones to capture sound in 360.

Honestly, it’s been intimidating and expen$ive.

Hendicott explained that, based on my goal to produce journalistic work, I would be fine working in the First Order Ambisonic format.

(If you are like me, you immediately thought of Star Wars: The Force Awakens.)

He explained that it’s basically 4-channel audio: an omnidirectional channel plus left/right, front/back, and up/down.

He works in the Third Order Ambisonic, which is 16 channels.
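A quick way to remember those channel counts: a full-sphere ambisonic mix of order N carries (N + 1)² channels. A tiny Python sketch (the function name is mine, just for illustration):

```python
def ambisonic_channels(order):
    """Number of channels in a full-sphere ambisonic mix of a given order."""
    return (order + 1) ** 2

print(ambisonic_channels(1))  # First Order  -> 4 channels
print(ambisonic_channels(3))  # Third Order  -> 16 channels
```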

The H2n Handy Recorder by Zoom.

He also said that the Zoom H2n was good enough to record with and, while the H2n does not capture elevation, it’ll be good enough for our purposes.

A few of us have heard about this mic, but what typically stumped us has been the post-production side: how the hell do you merge the spatial audio and spherical video?

He continued.

Step 2: The Editing
Reaper. That’s the software that handles multi-channel audio and, great news, it has a free, fully functional evaluation. You should buy a license, though, to support the great work they are producing.

This is where you start building your audio experience, which could include the voice of god narration too.

But what about placing the audio in space, dude?

Step 3: The Placement
Thanks to the great work from passionate and talented people, there are a few options to do this.

The one Hendicott recommended for first-time users is the Ambisonic Toolkit (ATK), which is a plug-in for Reaper (and SuperCollider).

And it’s free!

This Reaper plug-in works with Kolor Eyes and allows you to position the audio so it lines up with your video.
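For the curious, placing a mono source in First Order Ambisonics boils down to a handful of gain weights per channel. Here is a minimal Python sketch using the traditional B-format (FuMa) convention; the function name and the choice of convention are my own illustration, not part of the ATK’s interface:

```python
import math

def foa_encode(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into First Order Ambisonic B-format
    (FuMa convention: the omni W channel carries a 1/sqrt(2) gain)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2)                 # omnidirectional
    x = sample * math.cos(az) * math.cos(el)  # front/back
    y = sample * math.sin(az) * math.cos(el)  # left/right
    z = sample * math.sin(el)                 # up/down
    return w, x, y, z

# A source hard left (90 degrees) lands almost entirely in Y.
print(foa_encode(1.0, 90.0))
```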

Image: The Ambisonic Toolkit (www.ambisonictoolkit.net)

Below is a video tutorial that demonstrates this using a synchronization tool, though Hendicott takes a slightly different approach.

Tutorial by Spook.fm, which created a synchronize tool (SpookSyncVR v0.4) for Kolor Eyes.

That said, another option is Facebook’s Spatial Workstation (also free).

You then export the audio file.

Great… but that audio file isn’t attached to the video.

Step 4: The Muxing
Here is the magic that pulls it all together: iFFmpeg (and, yes, it’s free to start).

What iFFmpeg does is let you export different video types/codecs for the different platforms you’ll be publishing on… and as you do that, you attach your spatial audio to the video as part of the export.

It turns out the act of combining audio and video streams into a single file is called muxing (short for multiplexing).
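Since iFFmpeg is a front end for the command-line FFmpeg tool, here is a Python sketch that builds an equivalent FFmpeg muxing command. The filenames are placeholders of my own, and whether a given player accepts the result still depends on the platform:

```python
# Sketch of the muxing step: build an FFmpeg command that copies the
# video stream untouched and attaches the spatial audio track.
# All filenames below are placeholders, not from the article.
def mux_command(video_in, ambisonic_wav, video_out):
    return [
        "ffmpeg",
        "-i", video_in,       # stitched 360 video
        "-i", ambisonic_wav,  # 4-channel First Order Ambisonic mix
        "-map", "0:v",        # take video from the first input
        "-map", "1:a",        # take audio from the second input
        "-c:v", "copy",       # don't re-encode the video
        "-c:a", "copy",       # keep the ambisonic channels as-is
        video_out,
    ]

print(" ".join(mux_command("tour.mp4", "mix_ambix.wav", "tour_spatial.mov")))
```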

And that, my friends, is it.

Those spatial-capable 360 players take over from there.

Now, as you can see, there are no screen grabs of me doing spatial audio productions along with this text.

Why?

Because I haven’t done this yet. But why should that stop you from trying?

Once I get the H2n, I’ll be putting this to the test and will report back.

But if you try it, please let me — and the community — know how it went.

A big thanks to Hendicott for allowing us to respect the craft of audio in 360.