How the Apple Vision Pro Will Shift the Audio Experience: Part 1

Amal-Sebastian Das
Published in Fansea
6 min read · Sep 15, 2023
A potential game-changer for work, fun and lifestyle

Sounds From Above

Let us dive into a fresh topic: the world of individualized spatial sound and the technology Apple brings into the game. This is not just a passing trend; it promises to redefine the next wave of virtual experiences, with Apple’s Vision Pro at the forefront.

As part of a team building spatial experiences with Apple Vision Pro, I have been deeply involved in understanding how spatial audio shapes and elevates our virtual interactions, making the experience feel more real. Watching 2D images with stereo sound is now supplemented by immersive 3D environments and a matching aural experience.

I am passionate about sharing my research, experiences, and perspectives on the audio aspect with you. Whether you are a fellow audio geek, a tech enthusiast, or someone intrigued by the future of sound in the 3D space, I invite you to join me in exploring this fascinating realm.

Why Spatial Sound Is Vital for the Immersive Experience

I have always found it amazing how sound shapes our world: from the distant hum of a city as it stirs and the seagulls screeching above my roof to the raw energy of a crowd going wild at a live gig. These sounds are not just an addition to what we see; they place us in the middle of the scenery and fill things with life while spotlighting their position around us.

When we step into virtual spaces, sound is just as essential. Without some of the tangible or full visual cues we are so used to in the real world, our ears are on the front lines, guiding us through digital spaces.

The Apple Vision Pro

Intuitive Design Can Be the End of a Long Process

But virtual spaces also bring an intriguing challenge for every audio engineer. While some sounds mirror the real world, many elements within these spaces, such as simulated objects, control panels, and specific functions, are exclusive to the virtual environment. These elements are not just replicas of the physical world; they are novel, unique, and sometimes artificial by intent.

Crafting sounds for these virtual-exclusive elements is an intricate job. These sounds often do not have a real-world counterpart to guide user expectations. They are designed from scratch, ensuring they fit seamlessly into the environment while effectively communicating their function and the impact of the user on them. It is all about ensuring that users intuitively understand their actions and consequences in a space where traditional audio cues might not apply.

Blending artificial sounds with the realistic audible world is a challenge: it means balancing creativity with intuitiveness and function.

Apple’s approach with the Vision Pro is to transfer the spatial placement of real-world object sounds to control elements, too. I not only see the button or the functional component at a specific room position, but I can also hear it there. So, the effect of visible augmented reality increases significantly through the acoustic orientation in the room.

How the Magic Is Done

It is truly fascinating how our ears and brain seamlessly team up to determine the exact location of a sound source. Picture this: you are in a room while your kid drops his dummy to your left. Almost instantly, you can pinpoint where the sound came from, all thanks to the impeccable design of your awesome auditory system.

When a noise occurs, the ear closer to the source perceives it slightly before the other, and slightly louder, guiding our brain in determining its origin. Beyond this time delay, the unique contours of our auricles, those detailed outer parts of our ears, add another layer of sound interpretation. They interact with incoming sound waves, causing nuanced frequency filtering and providing spatial data to our brains. This anatomical filtering, known as the head-related transfer function (HRTF), is essential for perceiving and understanding the real world.
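To put a rough number on that time delay, here is a minimal sketch using Woodworth's spherical-head approximation, a textbook simplification and not anything Apple has documented. The head radius and speed of sound are assumed average values, and the function name is made up for this illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 °C
HEAD_RADIUS_M = 0.0875      # assumption: a commonly used average head radius


def interaural_time_difference(azimuth_deg: float) -> float:
    """Return the delay in seconds between the near and the far ear.

    Uses Woodworth's spherical-head approximation:
        ITD = (r / c) * (theta + sin(theta))
    where theta is the azimuth in radians, 0 is straight ahead and
    90 is directly to one side.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))


if __name__ == "__main__":
    for angle in (0, 30, 60, 90):
        delay_us = interaural_time_difference(angle) * 1_000_000
        print(f"{angle:>2} degrees: {delay_us:.0f} microseconds")
```

With these assumed values, a sound directly to one side reaches the far ear roughly two thirds of a millisecond late. That is a tiny gap, yet it is enough for the brain to work with.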

Now, standard headphones present a challenge. In traditional stereo playback, a sound with the same signal on the left and right appears to come from a phantom center point between our ears, and on headphones it is typically perceived inside the head rather than out in the room. When the volume balance between the stereo channels changes, the sound appears to move horizontally. Simple enough.
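The left-right movement described above can be sketched with a constant-power pan law, one common way to trade volume between two channels. This is an illustration of the general idea, not a description of any particular product, and the function name is invented for this example.

```python
import math


def constant_power_pan(sample: float, pan: float) -> tuple[float, float]:
    """Split one mono sample into (left, right) using a constant-power pan.

    pan = -1.0 is fully left, 0.0 is the phantom center, +1.0 is fully right.
    The squared gains always sum to 1, so perceived loudness stays roughly
    constant while the sound moves.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    angle = (pan + 1.0) * math.pi / 4.0  # maps pan to 0 .. pi/2
    return sample * math.cos(angle), sample * math.sin(angle)


if __name__ == "__main__":
    for pan in (-1.0, -0.5, 0.0, 0.5, 1.0):
        left, right = constant_power_pan(1.0, pan)
        print(f"pan {pan:+.1f}: left {left:.3f}, right {right:.3f}")
```

Note that nothing here changes timing or frequency content. Only the level differs between the ears, which is why plain stereo panning moves a sound sideways but cannot place it above, behind, or at a distance.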

Now, spatial audio goes beyond this. By simulating the physical features of ears and filtering audio in response, it elevates the experience entirely. This works best when the filtering is matched to the listener’s unique ear shape, which is not interchangeable with anyone else’s. Audio no longer only shifts left and right; it surrounds, envelops, and reacts with realism, tailored to you.

Being in the digital and the real worlds simultaneously is a core idea of the Vision Pro
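In signal-processing terms, this kind of filtering is usually done by convolving a mono source with a pair of head-related impulse responses (HRIRs), one per ear, which capture how a given head and pair of ears delay and color sound from one direction. The sketch below shows only that general principle. The two toy impulse responses are invented placeholders, not measured data, and real renderers use far longer, per-direction responses with many refinements.

```python
def convolve(signal: list[float], impulse_response: list[float]) -> list[float]:
    """Plain discrete convolution, written out for clarity rather than speed."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(impulse_response):
            out[i + j] += sample * tap
    return out


def render_binaural(
    mono: list[float], hrir_left: list[float], hrir_right: list[float]
) -> tuple[list[float], list[float]]:
    """Filter one mono signal with a per-ear impulse response."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Hypothetical toy HRIRs for a source on the listener's left:
# the left ear hears it immediately and at full level, the right ear
# hears it two samples later and quieter. Not measured data.
TOY_HRIR_LEFT = [1.0, 0.3]
TOY_HRIR_RIGHT = [0.0, 0.0, 0.6, 0.2]

if __name__ == "__main__":
    click = [1.0]
    left, right = render_binaural(click, TOY_HRIR_LEFT, TOY_HRIR_RIGHT)
    print("left ear: ", left)
    print("right ear:", right)
```

Personalization, in this picture, means swapping the generic impulse responses for ones that fit your own ears.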

The Existing Ecosystem of Personalized Spatial Sound

Apple’s commitment to delivering a bespoke auditory experience is evident in their adoption and adaptation of individualized Spatial Audio. While this concept is not entirely new, Apple’s unique blend of hardware and software offers an experience unlike any other.

Several Apple products, such as the AirPods Pro, AirPods Max, the Beats Pro series, and more, already employ Spatial Audio. Their approach hinges on the synergy between the audio hardware and the TrueDepth camera on the iPhone. This camera scans the listener’s face and ears to assemble a personalized Spatial Audio profile, so the sound is optimized for the distinct physical features I described earlier. Now Vision Pro is making the scanning process even more of a no-brainer for the consumer.

Personalized Spatial Sound With the Apple Vision Pro

For the Apple Vision Pro, the landscape of spatial audio takes a slightly different shape. Instead of leaning on external devices like the iPhone for tasks like head tracking and audio customization, the Apple Vision Pro relies on its own built-in sensors. This on-device tracking likely paves the way for a more immersive and pinpoint-accurate spatial audio experience, especially considering how close the tracking sensors are to the ears and eyes of the user.

An Unusual Speaker Design

Finally, let us examine the intriguing design choice of Apple Vision Pro’s speaker setup. These speakers neither wrap around the ears nor come with bone conduction tech. The result is audio spill: people nearby can hear what you hear. So, while you are absorbed in a call, folks around you could throw ideas into your conversation or dance to your latest Apple Music find. While this might add a communal vibe to your music choices, it does come at the expense of some privacy. If you are in the mood for a more intimate audio session, you are practically pushed to pair with AirPods.

Whether you view it as a pro or con, it is reassuring that the Vision Pro’s distinctive features integrate seamlessly with the broader Apple ecosystem, complementing the family of devices and their spatial audio capabilities.

So watch for the follow-ups, where I will fill you in on audio raytracing, a technology to identify the topology of the surrounding room, and Spatial Sound, a 360-degree soundscape supported by a growing number of movies, TV shows, and music.

Amal-Sebastian Das
Editor for Fansea

UX Groupie | Audio Geek | Pushing to a digital analogue future 🚀