When ASR surpasses human capability

Leor Grebler
Published in Social Robots
Sep 16, 2016 · 2 min read

Yesterday my colleague Ahmed posted an article about a new record for ASR performance: Cortana had reached an ultra-low 6.3% error rate. Human error in voice dictation is in the 4–6% range, which means we’re just on the cusp.
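For readers who want to see what an "error rate" means in practice, here is a minimal sketch. It assumes the figures above are word error rates (WER), the usual ASR metric, though the post does not say so explicitly. The function name `word_error_rate` and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: "error rate" means WER, i.e. (substitutions + deletions
    + insertions) / words in the reference transcript.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substituted word out of six.
    rate = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"{rate:.1%}")
```

On this measure, a 6.3% error rate works out to roughly six wrong, missing, or extra words for every hundred words actually spoken.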

In May, I predicted at SpeechTek that within the next twelve months we’d surpass human error rates in ASR. Of course, the caveat is that it would be a well-trained and tuned ASR, and for a specific individual. We still have another eight months to go, but I think it’s going to happen. Think about that accomplishment…

This would mean that a computer would be able to make out what a person is saying better than another person.

This is huge. This is Singularity-type stuff. After we pass this threshold, we’re going to start to create all sorts of applications that help us understand each other better. “Siri, did you make out what John told me?”

We may start constantly recording each other just to better understand what each of us is saying. How many needless quarrels have been fought because we didn’t understand what someone was telling us, or because we mistook their words? Imagine if a device were always there, chiming in whenever we came to the wrong conclusion.

It might be a bit scary, but this could make our lives much more peaceful.


Independent daily thoughts on all things future, voice technologies and AI. More at http://linkedin.com/in/grebler