Immersive Storytelling Beyond Virtual Reality: Adapting Audio Techniques for VR

Coming from a background in radio, I’ve always viewed audio storytelling as an incredibly immersive medium. Putting on a pair of earphones and listening to my favorite podcasts always meant that I would be experiencing something new; I’d visit places I’d never been to or heard of, meet interesting characters and more, all from the comfort of my room. Sound, mixed creatively and powerfully, meant that the imagining, meeting, and traveling all happened inside my mind. To me, this was the epitome of immersive storytelling.

But when I joined Contrast, Al Jazeera’s immersive media studio, as an intern three months ago, I was introduced to 360/virtual reality journalism and filmmaking, and the world of immersive storytelling suddenly expanded. My understanding of 360 video, prior to getting involved with Contrast, was pretty simple: it was all about the location. Through a static camera, VR filmmakers can take you to places difficult to access.

I quickly realized that there was more to it than taking viewers where they couldn’t go. While it holds the visual upper hand, virtual reality, like audio stories, has the power to make you feel like you’re a part of another world where, for a moment, everything around you fades away as you become a part of someone else’s story.

For example, a few months ago, Contrast released ‘The Curse of Palm Oil’, a virtual reality documentary that captures the impact of deforestation on the unique culture, lifestyle, and livelihood of the Orang Asli, Peninsular Malaysia’s indigenous people. The 360 shots of the staggering scale of deforestation are well suited to a virtual reality film, and the visuals are integral to showing the Orang Asli way of life authentically.

The Curse of Palm Oil, however, also employs spatial audio to create a more powerful, immersive experience. The audio in the film acts as a guide for the viewer, from the homes of the Orang Asli, to the forests that are still standing, to the cleared lands and then back home. The rich sounds of the forests, when juxtaposed with the hollowness of cleared lands, provide a shocking clue to viewers of what’s already been lost — and what could be lost if deforestation continues in the region.

The presence and creative use of audio throughout the virtual reality documentary help shape a deeper sense of immersion and a more instinctive connection to the story. This made me wonder: what role does audio play in VR storytelling, and what audio storytelling techniques could be adapted for VR experiences? Both the audio and VR storytelling mediums are incredibly immersive on their own, but how can their techniques complement each other?

To find out, I sat down with Graelyn Brashear (a producer with JETTY, Al Jazeera’s audio network) and Viktorija Mickute (a producer with Contrast, Al Jazeera’s immersive studio), to learn about the techniques they use to make their mediums more immersive.

Here are the major takeaways from my conversation with Graelyn Brashear:

First, what’s so special about audio?

  1. It’s fairly low-tech, portable and inexpensive compared to other media: “You get a lot of bang for your buck. As a reporter, you can travel around the world easily with all of your equipment tucked into a carry-on and get field tape on your own. You can, from start to finish, be a one-person show and have total creative control over the entire process. It’s very difficult to do that with documentaries.”
  2. It’s unobtrusive and intimate: “A lot of people are freaked out by a camera, but they’re less so by a microphone. That makes it easier to catch moments of vulnerability through voice. But this also makes things difficult for audio producers, because sound is all you have to tell the story… you can’t use picture as a crutch. So, it forces you — when you’re in the field or in an interview — to listen to and get the most compelling, emotional and transportive tape you can.”
  3. It’s the oldest form of communication: “There’s a natural tendency to sit down and listen; the first story you ever heard was probably told to you by someone you love. For this reason, we tap into people’s instincts easily through audio storytelling.”
  4. It’s magic (but we’re probably a little biased): “Being able to use the human voice and the sounds we hear around us to tell a complete story feels like wizardry. When we read or watch a story, there’s an extra step there because we are processing it visually. But when we’re just listening to something, we’re creating the images in our own minds as we hear them. There’s so much power in going straight to someone’s own interpretation of what something looks or sounds like. You can probably create a greater emotional connection by feeding the story straight to the listener’s brain.”

Second, what does it take to create an immersive audio story?

  1. You need to capture sound that puts you in a specific place: “You can’t have an audio story that takes you somewhere else without really good sound from the field and from the place that you’re trying to share with someone. But it shouldn’t just consist of holding a mic up in the field or in a forest and getting the sound of it, but actually moving in and getting individual sounds of specific things that you can then layer into your multitrack session to make sure that what this person’s hearing matches the physical experience of being out there.”
  2. You need good ‘talkers’: “Finding people who can talk well is so important. Because this is an audio story, you’ll be leaning a lot on the voices you use in the piece, so find people who are able to describe their experiences and convey their emotions beautifully, in a way that connects with the audience and taps into human instincts.”
  3. You need clear, beautifully written narration to fill in gaps: “Even if you have good talkers, you need someone to set the scene and connect all of the dots for the audience. That’s where the narration comes in. Being able to write simply and visually for the ear, and having someone set the scene and fill in all of the information that you can’t get from just listening to characters and ambient sound, is incredibly important. Descriptive writing is crucial; you need to describe scenes in a way that’s specific, using language that is relatable and comparable (e.g. instead of saying ‘Gaza is 10 km at its widest point,’ you could say ‘you could walk across it in an hour and a half’). Human-sized comparisons, where listeners can actually imagine themselves taking an action, can go a really long way in taking you somewhere else.”

I then sat down with Contrast producer Viktorija Mickute, who directed The Curse of Palm Oil, to hear about the ways she used audio to create additional layers of immersion in virtual reality. Many of her techniques sounded familiar from podcasting.

Here are some of her takeaways:

1. Creating Soundscapes Can Enrich the Exploration of the Visual

“360/virtual reality experiences can still appear flat and lifeless if sound isn’t used to create a similarly three-dimensional environment to accompany the visuals. Many stories are just as much an aural experience as a visual one; in The Curse of Palm Oil, we decided to use spatial audio in order to create a richer experience and enhance the feeling that the viewers were present inside the scenes. In order to create effective spatial audio, which is a surround-sound audio technique designed to mimic how we hear sounds in real life, it was crucial to gather as many sounds from the forests as possible during the making of ‘The Curse of Palm Oil’. One of the producers on the team, Drew Ambrose, took on the task of recording various specific sounds, such as insects rustling at night and the cries of different species of birds, that would later be used to create realistic soundscapes.”

[Drew taking audio recordings of the monkeys.]

2. Audio Can Guide You Through the 360º Space

“It’s easy to get lost in a 360º/VR space, which isn’t necessarily a bad thing if the goal of the piece is to simply explore and experience an event or location. However, if we are crafting a story through the VR medium, we need guides. In a medium where the viewer has the ability to look around in all directions, sometimes we need cues to help nudge the viewer to look in the right places, to stay connected to the story. In The Curse of Palm Oil, Dendi (an Orang Asli tribe member) is not only the central character, but also the narrative guide to the story. With his voice, Dendi gently explains to you the location, the action appearing on screen, and the importance of the scene. Beyond voice, we also utilize other audio triggers to help guide the viewer’s attention. If you hear footsteps, you know to look around and search for the person walking. If you hear the sound of trees being slashed down, you know to seek the source of that action on screen.”

3. Audio Can Become a Central Character

“In a VR story, audio has the power to become another character in itself, creating narratives that run parallel to those of the visuals. For example, in The Curse of Palm Oil, many of the audio scenes followed a different storyline than that of the 360 shots, creating tension within the story. In the opening shot, Dendi says, “When I go to the forest, I find harmony. I can hear birds, monkeys and squirrels. There used to be elephants and other animals like tigers.” When he speaks of a distant past where there used to be different animals roaming the landscape, we hear the sounds of the elephants and roaring tigers, triggering a world that is at odds with the reality of modern day. The contrast between the storyline of the visuals and that of the audio can create a more nuanced and layered narrative.”


Through these conversations, it is clear that audio plays, or should play, an immensely important role in virtual reality documentaries. For virtual reality films to expand their immersive potential, they have to go beyond relying on powerful visuals and compelling stories and embrace audio not as a supplement to visual storytelling, but as a complement to it.

Most of the audio techniques used by podcasters were already being organically explored in Contrast’s virtual reality documentaries.

  • Both approach the creation of soundscapes similarly. Invoking a sense of presence in a different space doesn’t come simply from capturing ambient sound, but from paying attention to the details that combine to create the holistic sound. Both Graelyn and Viktorija recorded tiny components of sounds and then layered them in post-production to capture the full richness of the place.
  • Both also speak to the power of crafting audio narration as a beautiful and intimate way to guide the viewer’s experience through the story. While Graelyn discusses it as a mechanism to ‘fill in the gaps’ in podcasting, Viktorija views it as a way to connect the viewer to the visuals around them.

In many ways, audio can play an even more versatile role in VR documentaries. While using the same techniques necessary for immersive audio storytelling (detailed above by Graelyn), audio in VR storytelling goes further by working in relation to the visuals. For example, in The Curse of Palm Oil, running two storylines (one through audio, the other through the 360 shots) created another layer of tension in the film, as the audio was used as a plot device set in direct contrast to the story being told visually.

While we tend to instinctively tie the immersive quality of VR to strong visuals and the ability to transport viewers elsewhere, we can do so much more if we use audio powerfully, in a way that creates a more realistic experience, one that mimics what viewers would see and hear if they were, say, standing in the middle of a forest. By appreciating both audio storytelling and virtual reality for their strengths, we can figure out how to fuse those worlds together to tell not only a good story, but the best story.