In the World of Voice-Recognition, Not All Accents Are Equal
But you can train your gadgets to understand what you’re saying
In a spoof advertisement on a humorous website, a woman asks her Echo, Amazon’s voice-controlled speaker system and assistant, to play “the country music station”. The device, mishearing her southern American accent, instead offers advice on “extreme constipation”. Soon she has acquired a southern model, which understands her accent better. But before long, the machine has gone rogue, chiding her like a southern mother-in-law for putting canned biscuits on the shopping list. (A proper southern lady makes the doughy southern delicacy herself.) On the bright side, it corrects her children’s manners.
Training a machine to recognise what people say requires a large body of recorded speech, along with human-made transcriptions of it. A speech-recognition system looks at the audio and text files and learns to match one to the other, so that it can make the best guess at a new stream of words it has never heard before.
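The matching idea can be sketched in miniature. Real systems train neural networks on millions of recorded utterances, but the core loop is the same: learn a mapping from audio features to transcribed words, then guess on unseen audio. Below is a toy, hypothetical illustration using a nearest-neighbour matcher over made-up "acoustic" feature vectors (the vectors, words and function names are invented for the example):

```python
import math

# Hypothetical training corpus: (audio feature vector, human transcription).
# In a real system the features would come from millions of recordings.
training = [
    ((0.9, 0.1), "country"),
    ((0.2, 0.8), "constipation"),
    ((0.85, 0.2), "country"),
]

def recognise(features):
    """Return the transcription of the closest-sounding training example."""
    _, word = min((math.dist(features, f), w) for f, w in training)
    return word

# A new utterance the system has never heard: its features sit closer
# to the "country" examples, so that is the machine's best guess.
print(recognise((0.8, 0.15)))  # -> country
```

An accent shifts where a speaker's feature vectors fall; if the training corpus contains only one accent, unfamiliar pronunciations land nearer the wrong words, which is exactly the failure the spoof advertisement plays on.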
America and Britain, to say nothing of the world’s other English-speaking countries, are home to a wide variety of dialects. But the speech-recognisers are largely trained on just one per country: “General American” and Britain’s “Received Pronunciation”. Speakers with other…