Microsoft’s speech recognition system can now recognize spoken words as well as humans do

Microsoft has reached another milestone in bringing artificial intelligence level with humans. Researchers at the company have created a speech recognition system that transcribes conversational speech as well as a human does.


In a paper published on Monday, the researchers reported that their speech recognition system reached a word error rate (WER) of 5.9 percent, beating the 6.3 percent record set just last month. That 5.9 percent error rate is about equal to that of a person asked to transcribe the same conversation. It is also the first time a computer has performed as well as humans on the industry-standard Switchboard speech recognition task.
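The article does not spell out how a word error rate is calculated. Under the common definition, WER is the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that definition only; the function name and the sample sentences are made up for this example and are not taken from Microsoft's paper or tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER: word-level edit distance divided by reference length.

    Illustrative helper (hypothetical name), using plain whitespace
    tokenization with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein, one row at a time).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# One dropped word out of six reference words.
print(f"{word_error_rate('the cat sat on the mat', 'the cat sat on mat'):.1%}")  # 16.7%
```

By this measure, a 5.9 percent WER corresponds to roughly six word errors for every hundred words in the reference transcript.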

“We’ve reached human parity,” said Xuedong Huang, the company’s chief speech scientist. “This is an historic achievement.”

Microsoft researchers from the Speech & Dialog research group


This achievement doesn’t mean that the computer can recognize every word. Microsoft’s speech recognition system is now just as good as a human, which still leaves it far from perfect.


This advancement has far-reaching implications. Microsoft’s own products, such as Cortana and Xbox, could benefit immediately from this milestone. It could also benefit accessibility software, such as instant transcription services.

The team’s next goals are to achieve higher accuracy and to ensure that the speech recognition system works in real-life situations, such as on crowded streets or while driving. They also hope the system will one day not only recognize speech but also understand it.