7 Lessons I Learned from Audio Directing Chiaro and the Elixir of Life
In September 2018, I shipped my first VR title, Chiaro and The Elixir of Life.
12 Environments. 6 Characters. 1 Interactive Soundtrack. 1000 cups of coffee.
What’d I learn?
1. Music is the soundtrack to the player’s life.
Of all the game’s reviews on Steam, the soundtrack has been one of the most mentioned aspects.
In approaching the design for the music system, I collaborated closely with the Creative Director/Composer on an outline of the important gameplay and narrative milestones the player would progress through. This outline helped me think about how to trigger music both as a reward for the player at pivotal moments in the story and as a thread representing Chiaro’s life.
For example, in the game’s first level, the player builds their AI buddy, Boka, by collecting his parts off the floor and assembling them. Once the player successfully builds Boka, I trigger a mysterious music cue (tremolo strings and a melodic buildup as the FX smoke clears), which signals to the player that something magical is about to occur — Boka comes to life.
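The milestone-to-cue idea can be sketched as a simple one-shot trigger map. This is a hypothetical model, not the game’s actual code; the milestone and cue names are invented, and a real project would route `play_cue` into FMOD Studio rather than a callback.

```python
# Hypothetical milestone -> music cue mapping (names invented for illustration).
MILESTONE_CUES = {
    "boka_assembled": "mx_boka_awakens",   # tremolo strings + melodic buildup
    "scoria_bridge":  "mx_scoria_song",
}

_triggered = set()

def on_milestone(milestone, play_cue):
    """Fire the milestone's music cue exactly once, so replays don't re-reward."""
    cue = MILESTONE_CUES.get(milestone)
    if cue and milestone not in _triggered:
        _triggered.add(milestone)
        play_cue(cue)
        return cue
    return None
```

The once-only guard matters: story cues are rewards, and re-triggering them when the player wanders back through an area cheapens the moment.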
Not only does the music underscore important story moments, but musical themes also develop over time as characters grow and change. A leitmotif is a recurring musical theme associated with a person, place, or idea. Scoria (whom the player may or may not meet in the first level) has a theme that evolves from her singing on a bridge to an epic fanfare underscoring her fight for independence.
The music changes not only as the characters’ arcs unfold, but also as the player progresses spatially through more complicated puzzles. One of the final levels is a temple (à la Legend of Zelda), in which the player chooses to enter either the Wind Trial or Fire Trial. I knew I wanted to program music loops that changed as the player progressed deeper into each trial, in order to reflect the intensity of the challenge.
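One way to sketch the deepening-trial music: map the player’s progress (0.0 at the entrance, 1.0 at the final chamber) to how many loop layers are active. The layer names and the one-layer-per-quarter rule are assumptions for illustration, not the game’s actual implementation.

```python
# Invented layer names; in practice these would be FMOD event/parameter layers.
LAYERS = ["mx_trial_base", "mx_trial_perc", "mx_trial_strings", "mx_trial_brass"]

def active_layers(progress):
    """Enable one additional loop layer per quarter of the trial."""
    progress = max(0.0, min(1.0, progress))          # clamp to valid range
    count = 1 + int(progress * (len(LAYERS) - 1))    # base layer always on
    return LAYERS[:count]
```

In FMOD terms this is usually a single event with a “progress” or “intensity” parameter that fades layers in, which keeps the transitions beat-synced instead of abrupt.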
2. [Important] dialogue needs to be heard wherever you go.
I’ll add a caveat that this really depends on the type of experience you’re aiming to make. How the player moves (teleportation, free locomotion) and what type of app it is (seated, standing only) are factors in the design conversation between the audio and narrative teams.
For our narrative game, the player has the ability to teleport around freely (a lot of VR enthusiasts are asking for smooth locomotion — keep that in mind). So I enabled virtualization for all NPC dialogue, which lets a sound keep playing at 0 volume instead of stopping entirely. I also created custom attenuation curves for my NPC characters: within the max distance, the voice is precisely positioned in space, and once the player passes the max distance, it holds at a fixed minimum volume rather than cutting out.
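The shape of that attenuation curve can be modeled in a few lines. The distances and floor volume here are made-up numbers, and a real project would author this curve in FMOD rather than code it by hand — this just illustrates the “never fully silent” behavior.

```python
def npc_dialogue_volume(distance, max_distance=20.0, floor=0.35):
    """Linear falloff inside max_distance, then hold at a floor volume
    so dialogue stays audible wherever the player teleports."""
    if distance <= max_distance:
        t = distance / max_distance          # 0.0 up close, 1.0 at the edge
        return 1.0 - t * (1.0 - floor)       # full volume -> floor
    return floor                             # beyond max distance: clamp, don't cut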
3. Sound effects must be spatialized.
It’s only once you’re in VR standing next to a robot that you realize how much sound needs to exist to make the person forget reality and believe in your world. For instance, I expect footstep sounds to Play Attached to the corresponding foot bone, robot squeaks to play attached to their sockets, maybe even breathing sounds attached to their noses for idle animations!
Here’s an example of a timeline for a single animation for Boka. The blue rectangles are PlayFMODEvent animation notifiers I’ve manually added that trigger spatialized sound to play on different parts of Boka:
To go even further, it’s important to ask: do we need to hear all these sounds at different distances away from the characters? The Magic Leap audio design documentation refers to this idea as Audio LOD. We can think of spatial sounds in terms of filmic framing. At a “wide shot” (i.e. further away from the character) do we hear the footsteps or just the voice? What types of sounds do we hear at a “close-up” of the character?
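The Audio LOD idea can be sketched as distance bands that gate which sound categories are audible. The category names and thresholds below are assumptions for illustration — think of each band as one of the “shots” described above.

```python
# Hypothetical distance bands (meters) -> audible sound categories.
LOD_BANDS = [
    (3.0,  {"voice", "footsteps", "servo_squeaks", "breathing"}),  # close-up
    (10.0, {"voice", "footsteps", "servo_squeaks"}),               # medium shot
    (30.0, {"voice"}),                                             # wide shot
]

def audible_categories(distance):
    """Return which sound categories should play at this distance."""
    for max_dist, categories in LOD_BANDS:
        if distance <= max_dist:
            return categories
    return set()  # out of range: cull everything
```

Beyond believability, culling the fine-detail sounds at a distance also saves voices and CPU — a real concern when several characters animate at once.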
4. Give the player sound options.
A sound menu helps players who may have trouble hearing, helps players who want to listen to their own music, and gives players that modern feeling of customization. For Chiaro, my sound options control the SFX, Music, and NPC vox volumes, as well as enable/disable the Player Character’s internal voice. (The overall goal: how can I make the player’s audio experience more comfortable in VR?)
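A minimal sketch of that options model, assuming bus paths that stand in for the project’s real FMOD buses:

```python
class SoundOptions:
    """Hypothetical sound-options model; bus paths are invented stand-ins."""
    def __init__(self):
        self.volumes = {"bus:/SFX": 1.0, "bus:/Music": 1.0, "bus:/NPC_Vox": 1.0}
        self.inner_voice_enabled = True

    def set_volume(self, bus, value):
        # Clamp slider input to a sane 0..1 range.
        self.volumes[bus] = max(0.0, min(1.0, value))

    def apply(self, set_bus_volume):
        """Push every setting to the mixer via a supplied callback
        (in-engine this would call the FMOD bus volume setter)."""
        for bus, vol in self.volumes.items():
            set_bus_volume(bus, vol)
```

Keeping the settings in one object also makes it trivial to serialize them, so the player’s mix survives between sessions.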
5. Prototype early.
(a) because someone will inevitably say “we need a new line for when the AI character jumps vertically into the air”, or “we need to test a line that clearly tells the player they need to go outside” but you haven’t scheduled the recording time for that, and you’re two weeks away from shipping!
We recorded about 450 lines with Taylor (Chiaro), and about 1,000 lines for AI characters (story and gameplay situations). Recording time at a studio is expensive in both money and time, so if you can get a developer to record some lines, implement them into the game, and test quickly, you can see whether the dialogue fits in terms of tone and length. In the future, I would probably use a text-to-speech editor to generate files in order to cut down on the time spent recording placeholder lines. I would also probably design a tool that shows how many times each line of dialogue plays — crucial info for your narrative designers as they iterate on the script.
(b) because there is no loudness standard across VR platforms, which ties into #6.
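The play-count tool mentioned in (a) could be sketched like this. The line IDs and the “overused at 10+ plays” threshold are invented for illustration; the point is just to surface both extremes to the narrative team.

```python
from collections import Counter

class DialogueTelemetry:
    """Hypothetical sketch: count dialogue triggers during playtests."""
    def __init__(self):
        self.plays = Counter()

    def on_line_played(self, line_id):
        self.plays[line_id] += 1

    def report(self, script_line_ids, overuse_threshold=10):
        """Return (overused, never_played) for the known line IDs."""
        overused = [l for l, n in self.plays.most_common() if n >= overuse_threshold]
        never = [l for l in script_line_ids if l not in self.plays]
        return overused, never
```

Lines that fire constantly are candidates for extra recorded variations; lines that never fire point at triggers the player isn’t reaching.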
6. Mix early.
Mix as you go. This is helpful for a lot of reasons.
Test your audio on the different target HMD devices, using headphones. Oculus’ audio documentation suggests mixing at -18 LUFS (LUFS is a measure of perceived loudness) and -16 LUFS for standalone apps. Sony’s standard is -23 LUFS (+/- 2). Mixing at a target of -18 LUFS using Rift headphones, then deploying to HTC Vive, the game sounds… quieter? Uh-oh.
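Gathering those targets into one lookup makes the per-platform differences concrete. The values below are the ones quoted in this article — always confirm against each platform’s current documentation before shipping.

```python
# Loudness targets (LUFS) as quoted above; verify against current platform docs.
LOUDNESS_TARGETS_LUFS = {
    "oculus_rift": -18.0,
    "oculus_standalone": -16.0,
    "playstation": -23.0,   # Sony allows +/- 2 LU tolerance
}

def within_sony_tolerance(measured_lufs, tolerance=2.0):
    """Check a measured integrated loudness against the PlayStation target."""
    return abs(measured_lufs - LOUDNESS_TARGETS_LUFS["playstation"]) <= tolerance
```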
Sure, you can take a break for a little bit and listen on speakers, but you want to assess as much as possible on the target hardware using the built-in headphones or your own.
Additionally, set up your mixer bus structure early. Categorically, I worked from small to big FMOD event groups. That meant one bus for the rowboat sound effects, a bus for the fishing rod sound effects, one for the Peakoo birds. Then, I could group these busses by Lv_number_SFX, Lv_number_Ambience, etc. This let me solo, for example, only the rowboat sounds in Lv_03_SFX if I needed to compare the levels between the rowing and the splashes.
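The small-to-big grouping can be modeled as a path scheme, mirroring the Lv_number_category convention described above. The specific bus names here are invented examples, not the project’s actual mixer.

```python
# Invented bus paths following a small-to-big Lv_<number>_<category> scheme.
BUSES = [
    "bus:/Lv_03_SFX/Rowboat",
    "bus:/Lv_03_SFX/FishingRod",
    "bus:/Lv_03_SFX/PeakooBirds",
    "bus:/Lv_03_Ambience/Lake",
]

def buses_in_group(group):
    """Find every child bus under a group, e.g. to solo just Lv_03_SFX."""
    prefix = "bus:/" + group + "/"
    return [b for b in BUSES if b.startswith(prefix)]
```

Because group membership falls out of the path naming, soloing or muting a whole category is one prefix match instead of a hand-maintained list.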
By organizing your mixing structure as you go, you’re setting yourself up for more success and less anxiety when you approach shipping. Another advantage to mixing as you go is that you can more easily ship great sounding builds of levels to investors, companies, demo nights, conventions, etc.
7. Give voice talent material to work with (other than just the script).
Give your talent enough visual materials to work with (if you have them). If you can’t get a VR-ready PC and headset to the recording studio for your talent to experience the game, then take screenshots from the engine so they can see the key moments and understand the geography of the level, where the characters are standing (helps the talent know if they should speak or shout the line), and who they’re speaking to in the scene. Another idea: take 360 photos and bring a few Google Cardboards. Every piece of visual material helps your talent understand what your VR experience is all about.
These lessons are important to think about when making your own VR experience, though I admit that the needs of your project may differ from mine. It may not rely on an orchestral soundtrack, or you may not have the budget to hire voice talent. I’ve left out some other important topics: performance profiling, audio volumes, dynamic mixing, and tools for speeding up workflows and debugging. But hopefully this article helps as a starting point for your VR ideas.
Q’s or comments? Feel free to give me a shout on Twitter @murascotti