Why it’s hard to hear someone on the phone


It is easier to hold a conversation with someone we can see as our eyes help us separate what they are saying from the background noise.

In the din of a cocktail party, many sources of sound compete for our attention. Even so, we can easily tune out the noise and focus on a conversation, especially when we are talking to someone in front of us.

This is possible in part because our sensory system combines inputs from different senses. Scientists have proposed that perception is stronger when we can both hear and see something than when we can only hear it. For example, if we tried to talk to someone on the phone during a cocktail party, the background noise would probably drown out the conversation. When we can see the person we are talking to, however, it is much easier to follow what they are saying.

Ross Maddox and co-workers have now explored this phenomenon in experiments in which human subjects listened to an audio stream that was masked by background sound. While listening, the subjects also watched completely irrelevant videos that moved in sync with either the audio stream or the background sound. The subjects then had to push a button whenever they detected random changes (such as subtle shifts in tone or pitch) in the audio stream.

The experiment showed that the subjects performed well when the video they saw was in sync with the audio stream. However, their performance dropped when the video was in sync with the background sound. This suggests that when we hold a conversation at a noisy cocktail party, seeing the other person’s face move as they talk creates a combined audio–visual impression of that person, helping us separate what they are saying from the background noise. If we turn to look at other guests, however, we become distracted and may lose the thread of the conversation.

To find out more

Read the eLife research paper on which this story is based: Auditory selective attention is enhanced by a task-irrelevant temporally coherent visual stimulus in human listeners (February 5, 2015).

eLife is an open-access journal that publishes outstanding research in the life sciences and biomedicine.

The main text on this page was reused (with modification) under the terms of a Creative Commons Attribution 4.0 International License. The original “eLife digest” can be found in the linked eLife research paper.