OpenAI can hear you Whisper

In a step towards solving the challenge of automatic speech recognition, the company has released the open-source software Whisper.

Jasper Kense
ILLUMINATION
2 min read · Oct 2, 2022

Woman talking with artificial intelligence
Image generated with DALL·E 2 by the author

Speech recognition remains a challenge in artificial intelligence, but OpenAI’s latest move takes us one step closer to solving it. Whisper, the company’s newly released software, is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web.

Other organizations like Google, Meta and Amazon have all built ASR systems of their own, and these lie at the core of many of their products. Amazon’s Alexa, for example, is operated mainly through speech.

Whisper could now outperform every one of those ASR systems. What sets the new software apart is its robustness to background noise, accents and technical terminology. It can even transcribe speech in multiple languages.

As with any technology, there may be downsides. Some have raised questions about the broader implications of such an algorithm.

“OpenAI warns that it could be used to automate surveillance or identify individual speakers in a conversation” — Tasmia Ansari, Analytics India Magazine

Luckily, Whisper also comes with plenty of good intentions. OpenAI made the software open source to promote accessibility tools and to advance speech recognition. You can find the source code on GitHub.
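For the curious, here is a minimal sketch of what transcribing a recording might look like in Python. It assumes the open-source `openai-whisper` package and ffmpeg are installed. The helper name `transcribe_file`, the `"base"` model size and the file name `interview.mp3` are illustrative placeholders, not something taken from OpenAI's announcement.

```python
def transcribe_file(path, model_name="base", model=None):
    """Return (text, language) for the audio file at `path`.

    `model` is optional: pass an already loaded Whisper model to reuse it,
    or leave it as None to load `model_name` on demand.
    """
    if model is None:
        # Assumption: the `openai-whisper` package is installed.
        # Imported lazily so the helper itself has no hard dependency.
        import whisper
        model = whisper.load_model(model_name)

    # transcribe() returns a dict; "text" holds the full transcript and
    # "language" the language code Whisper detected.
    result = model.transcribe(path)
    return result["text"].strip(), result.get("language")


# Hypothetical usage (needs a real audio file on disk):
# text, language = transcribe_file("interview.mp3")
# print(language, text)
```

Larger model sizes are generally slower but more accurate, so the small default here is only a starting point for experimenting.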

Jasper Kense is a UX designer and 3D enthusiast talking about AI and its implications for creativity, and the creator of the AI UX research tool http://leapfrogapp.com/