In the World of Voice-Recognition, Not All Accents Are Equal

But you can train your gadgets to understand what you’re saying

The Economist
3 min read · Feb 26, 2018


Tower of Babel, 1563, by Pieter Bruegel the Elder (c. 1525/30–1569), oil on canvas. Photo: DeAgostini/Getty Images

In a spoof advertisement on a humorous website, a woman asks her Echo, Amazon’s voice-controlled speaker system and assistant, to play “the country music station”. The device, mishearing her southern American accent, instead offers advice on “extreme constipation”. Soon she has acquired a southern model, which understands her accent better. But before long, the machine has gone rogue, chiding her like a southern mother-in-law for putting canned biscuits on the shopping list. (A proper southern lady makes the doughy southern delicacy herself.) On the bright side, it corrects her children’s manners.

To train a machine to recognise what people say requires a large body of recorded speech, and then human-made transcriptions of it. A speech-recognition system looks at the audio and text files, and learns to match one to the other, so that it can make the best guess at a new stream of words it has never heard before.
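The matching idea described above can be illustrated with a deliberately simplified sketch. Real speech recognisers use acoustic models and large neural networks; here, purely as an assumption-laden toy, audio clips are stood in for by small feature vectors, "training" just stores the paired examples, and a new clip is transcribed by finding the nearest recorded example:

```python
# Toy illustration only, not a real speech-recognition system:
# a "recogniser" trained on pairs of (audio features, human transcript),
# which guesses the transcript of unseen audio by nearest-neighbour match.
import math

def train(pairs):
    # pairs: list of (feature_vector, transcript) from the recorded corpus
    return list(pairs)

def recognise(model, features):
    # Best guess: the transcript of the closest training example.
    return min(model, key=lambda pair: math.dist(pair[0], features))[1]

# Hypothetical training corpus: feature vectors are invented stand-ins
# for recorded audio; transcripts echo the article's examples.
corpus = [
    ([1.0, 0.2], "play the country music station"),
    ([0.1, 0.9], "add canned biscuits to the shopping list"),
]
model = train(corpus)
print(recognise(model, [0.9, 0.3]))  # nearest to the first example
```

The sketch also shows why accents matter: a speaker whose features fall far from everything in the training corpus will still be matched to *something*, and the guess can be badly wrong, which is exactly the "country music" versus "extreme constipation" failure the spoof advertisement plays on.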

America and Britain, to say nothing of the world’s other English-speaking countries, are home to a wide variety of dialects. But the speech-recognisers are largely trained on just one per country: “General American” and Britain’s “Received Pronunciation”. Speakers with other…
