OpenAI Whisper is a Breakthrough in Speech Recognition
Me: “Alexa, turn on the light.”
Alexa: Sorry, I can’t find the song “Turn on and fight” in your music library.
Don’t you hate it when this happens?
Good news, things are about to get a lot better.
Today, OpenAI published a new artificial intelligence model that, according to OpenAI, makes about 50% fewer transcription errors than previous models when tested across diverse datasets. The new model, called Whisper, is robust to accents, background noise, and technical language. It can transcribe speech in 99 different languages and translate from those languages into English. The system produces punctuated text and was trained on 680,000 hours of multilingual data collected from the Internet.
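To make "50% fewer errors" concrete, here is a minimal sketch of word error rate (WER), the metric usually used to score transcription. It is an assumption here that the headline figure refers to WER. The function name and the example sentences are illustrative and are not taken from OpenAI's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein dynamic programming).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # a reference word is missing from the hypothesis
            insertion = curr[j - 1] + 1  # the hypothesis contains an extra word
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Two wrong words out of four, then one wrong word out of four.
older = word_error_rate("turn on the light", "turn on and fight")
newer = word_error_rate("turn on the light", "turn on the fight")
print(older, newer)
```

In this toy case the second transcript has half the error rate of the first, which is the kind of gap a "50% fewer errors" claim describes.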
In contrast to OpenAI's previous work, such as DALL-E 2 and GPT-3, Whisper has been released as a completely free, open-source model, which means you can install and use it today!
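As a rough idea of what "install and use" looks like, here is a minimal sketch based on the open-source `whisper` Python package (installed with `pip install -U openai-whisper`, with `ffmpeg` available on the system). The helper names `decode_options` and `transcribe_file` and the file name `audio.mp3` are hypothetical and chosen only for illustration. The model name `"base"` is one of the published sizes.

```python
from typing import Any, Dict


def decode_options(translate_to_english: bool = False) -> Dict[str, Any]:
    """Build the keyword arguments that select transcription or translation."""
    # Whisper's "translate" task outputs English text; "transcribe" keeps the spoken language.
    return {"task": "translate" if translate_to_english else "transcribe"}


def transcribe_file(audio_path: str,
                    model_name: str = "base",
                    translate_to_english: bool = False) -> str:
    """Load a Whisper model and return the recognized text for one audio file."""
    # Imported lazily so the helper above works without the package installed.
    import whisper

    model = whisper.load_model(model_name)  # downloads the weights on first use
    result = model.transcribe(audio_path, **decode_options(translate_to_english))
    return result["text"]


# Example call (needs the package, ffmpeg, and a real audio file):
#   print(transcribe_file("audio.mp3"))
#   print(transcribe_file("audio.mp3", translate_to_english=True))
```

Larger model sizes are generally more accurate but slower, so a small model is a reasonable starting point for a first try.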
While transcription technologies have been around for a long time, they have typically been inaccurate and able to transcribe only a few languages. With Whisper, OpenAI has created a model that is much more accurate and can transcribe a wide variety of languages.
This breakthrough brings artificial intelligence one step closer to human-level speech recognition, and that could have far-reaching implications.
In healthcare, for example, doctors could use AI to transcribe patient notes, which would free up time for them to see more patients. In customer service, AI could be used to accurately transcribe customer calls, which would help businesses resolve issues more quickly. And in law, AI could be used to transcribe court proceedings, which could reduce the cost of litigation.
Due to its multilingual abilities, Whisper would be an incredible boon for international business, for education, and for diplomacy. Imagine being able to have a conversation with someone in another country without having to worry about the language barrier. Or being able to sit in on a lecture at a foreign university, even if you don’t speak the language.
An improved voice-to-text model is also good news for people who are deaf or hard of hearing and use voice-to-text technologies to communicate. The more accurate the transcription, the easier it will be for them to understand what is being said.
With this technology, we will be able to better communicate with each other, regardless of language barriers. I believe this has the potential to help us to better understand each other and work together to solve problems.
I can’t wait to see what else OpenAI has in store for us. They are truly changing the world for the better!