What the heck is “Sound Recognition”?

Brandon Marin
4 min read · Sep 18, 2018


image credit: Jeremy Beck

Let’s start with a story …

I am Deaf, which means I do not hear any sounds… at all.

Because of this, I thought of something funny recently: Machines are deaf, just like me.

Hold on. I’m not dehumanizing myself and my identity as a Deaf person. Think about it in the sense of functionality.

Machines (i.e., AI, computers, and all things technology) are amazing in what they can achieve nowadays. IMO, they keep pushing past what we thought they were capable of in deep learning, machine learning, visual cognition, facial recognition, air recognition, speech recognition (this includes music and voice), and much more.

Holy cow. But machines can listen to and identify speech, voice, and music. Doesn't that establish that machines are capable of hearing?

Yes and no. When we look at the key elements of existing AI capabilities related to hearing, there is just speech recognition. Speech recognition and wake words are limited by the types of sounds that the human mouth can produce.

Sounds, on the other hand, are immensely diverse, unbounded, and far less structured than speech and music.

