Understanding Audio Augmented Reality

Augmented reality is about adding virtual content to the physical world. So far the emphasis has been on augmenting what we see. With the growing popularity of wireless earbuds and voice assistants, let’s explore what happens when we shift the emphasis to augmenting what people hear, or how they experience sound in the first place.

Rather than offer a prescriptive definition, I propose that what we call augmented reality is a combination of form factor, user interface, and the experience itself. Let’s apply that framework and see how the components of the Audio AR ecosystem are shaping up.

Form Factors for Audio AR

The ideal form factor is one that lasts all day long and is so small and unobtrusive that the user can almost forget they’re wearing it.

Left to right: wireless earbuds (Apple AirPods Pro), over-ear headphones (Bose 700), and audio glasses (Amazon Echo Frames)

User Interface for Audio AR

Augmented reality is associated with a more natural UI layer, one that lets a user engage with digital content while keeping their hands free for other tasks. For Audio AR, the hands-free voice interface is well established, and there is also potential in a UI built around recognizing head gestures.

Experiences for Audio AR

There are two kinds of experiences I’ll explore: augmented hearing, and spatial audio content.


Change how we perceive the real world by adding a digital layer

Augmented reality adds to or subtracts from what naturally exists, so for audio we can consider the blocking or amplifying of real-world sounds.

Spectrum of modifying real world sounds. On the left, Earplugs, ANC, and Adaptive Sound look to block out the real world. On the right, Audio Filters, Transparency Mode, and Hearing Aids look to connect you more with the world around you. This is not meant to show their relative strengths, just to help visualize the difference between how these technologies either get you to engage or disengage with your surroundings, which is the very nature of augmented reality.

Adaptive equalizers feel like something consumers of all kinds would want access to, and they will start to blur the line between earbuds and hearing aids.

Another class of audio filters could be specific to voice. Imagine if your earbuds could autotune the voice of everyone around you. While live autotuning might not be feasible quite yet, TikTok has many voice filters you can apply to a video in post-processing, and their popularity shows that there is interest in this kind of audio manipulation. I would not be surprised to find out that some podcasters have been “photoshopping” their voices (please let me know if you are familiar with this). Now that Twitter is testing its new voice tweets, the sound of your voice will become as important to your identity as the way you look, so there’s no reason to think filters won’t get involved. For the purposes of this article, I would put the Google Pixel Buds translation feature into this bucket of “audio filter.”


New kinds of responsive content

Many spatial audio experiences exist and will continue to be developed. There are two kinds I’d like to explore: being able to create an incredibly immersive listening experience so a stationary user feels like the music is coming from all around them (see picture below), and creating a sonic landscape that a user moves through.

Sony360 experience of feeling like speakers are placed all around you, creating a deeply immersive sound.
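
To make the stationary case a little more concrete, here is a minimal Python sketch of constant-power stereo panning, one of the simplest ways to make a sound seem to come from a direction between two earbuds. It is only an illustration and not how the experience pictured above works; real spatial audio typically relies on richer techniques such as head-related transfer functions and head tracking. The function name `constant_power_pan` and the azimuth convention (-90 for hard left, +90 for hard right) are assumptions made for this example.

```python
import math
from typing import Tuple


def constant_power_pan(azimuth_deg: float) -> Tuple[float, float]:
    """Return (left_gain, right_gain) for a virtual source.

    Assumed convention for this sketch: -90 is hard left, 0 is straight
    ahead, +90 is hard right. Angles outside that range are clamped.
    """
    clamped = max(-90.0, min(90.0, azimuth_deg))
    # Map [-90, +90] degrees onto [0, pi/2] radians.
    angle = (clamped + 90.0) / 180.0 * (math.pi / 2.0)
    # cos/sin keep left^2 + right^2 equal to 1, so the perceived
    # loudness stays roughly constant as the source moves.
    return math.cos(angle), math.sin(angle)


def pan_sample(sample: float, azimuth_deg: float) -> Tuple[float, float]:
    """Apply the pan gains to one mono sample, giving a stereo pair."""
    left_gain, right_gain = constant_power_pan(azimuth_deg)
    return sample * left_gain, sample * right_gain
```

Calling `constant_power_pan(0)` gives two gains of roughly 0.707, so a centered source plays equally in both ears, while `constant_power_pan(-90)` sends essentially all of the signal to the left ear. Updating the azimuth as the listener turns their head is the first step from this stationary case toward a sonic landscape that a user moves through.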

What’s Next For Audio AR

Rather than try to define Audio AR, I have explored the various components of augmented reality that are appearing in the audio space with the framework:

Audio AR = Form Factor + UI + Experiences

We have passed a tipping point with regard to Form Factor (wireless earbuds) and UI (voice), and I believe we will see more emphasis on Experiences moving forward. Specifically, I believe the time is ripe for earbuds to act more and more like hearing aids, so I anticipate seeing more features that help you better hear the world around you. The earbud form factor can continue to improve, and it may need additional sensors to enable new experiences in the future, like biometric sensors or ultra-wideband chips for spatial awareness. Perhaps these sensors will also end up enabling new UIs.

AR & Emerging Tech Strategy Expert | Venture Partner @IndicatorVC | Founder @BostonARmeetup
