“OK Google,” What Are the Problems with Speech Recognition Technology?
How Amazon and Google’s big bets on the smart speaker market affect their customers’ privacy
Speech recognition technology and the voice user interfaces (VUIs) we use to engage with it have gotten so good that they now make errors only about 5.5 percent of the time. That’s about the same error rate as a human.
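For readers curious how a figure like 5.5 percent is computed: speech recognition accuracy is usually reported as a word error rate, the number of word substitutions, insertions, and deletions needed to turn the machine's transcript into the correct one, divided by the number of words actually spoken. The sketch below assumes the figure above is a word error rate, which the article does not state explicitly. The function name and the sample sentences are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # word dropped by the recognizer
                curr[j - 1] + 1,                  # extra word inserted
                prev[j - 1] + substitution_cost,  # word matched or misheard
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative example: one misheard word out of seven spoken.
spoken = "ok google what is the weather today"
heard = "ok google what is the whether today"
rate = word_error_rate(spoken, heard)
```

Here `rate` comes out to one error in seven words, or roughly 14 percent. A 5.5 percent rate would correspond to about one wrong word in every 18.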
This technology has advanced rapidly, particularly in the past 10 to 15 years, and is becoming commonplace in the bedroom, the kitchen, and the rest of the house. It is light-years beyond its first iterations in the 1990s, when one of the earliest commercial products was Dragon Dictate, dictation software released in 1990 that was error prone and cost $9,000. (Google alone has reduced its speech-recognition error rate by 30 percent since 2012.) Both Google and Amazon have jumped into the market with gusto, with Google Home and Amazon Echo now accounting for the two largest market shares in the industry. Microsoft also recently released its own smart speaker featuring an AI assistant named Cortana, and unsurprisingly, Apple supposedly has a similar device in the works. All in all, the value of the virtual assistant market, and the speech recognition it necessitates, is expected to exceed $3 billion by 2020. What once may have been seen as a neat gadget is no longer. Companies are investing in a future where VUIs are commonplace, and that has implications for both consumers and companies.
Amazon’s Echo (you might know her as “Alexa”) is perhaps the best example of how far VUIs can reach, and it helps illuminate what the proliferation of this tech means for everyone else.
Scott Galloway, a professor of marketing at the NYU Stern School of Business and founder of numerous companies, writes in his book The Four, “It’s clear that Amazon wants to drive commerce through Alexa, as they are offering a lower price, on many products, if ordered via voice v. click. In key categories like batteries, Alexa will suggest Amazon Basics, their private label, and play dumb about other choices (‘Sorry, that’s all I found!’) when there are several other brands on Amazon.com.”
As a result of this sort of product recognition and prioritization (or lack thereof), Amazon Basics batteries account for a third of all battery sales online.
In doing so, Amazon is able to undercut brands that have spent millions to build their images and brand recognition. Amazon is already working to undercut prices in a number of industries, decimating the retail industry in the process, moving into groceries, and becoming an inescapable platform for sellers and consumers. Many companies have to participate in the Amazon Market because of the number of people they’re able to reach through Amazon’s platform. Consumers participate because things are cheaper and all it takes is a click to order something or, going forward, merely a voice command through your Echo.
But that ease of access belies a darker side. According to a recent report from the Institute for Local Self-Reliance entitled “How Amazon’s Tightening Grip on the Economy Is Stifling Competition, Eroding Jobs, and Threatening Communities,” “Amazon is eliminating more jobs than it is creating, driving down wages and working conditions, and spreading its low-road labor model to other sectors.” The report goes on to contend that Amazon has eliminated about 150,000 more jobs than it has created and that its workers’ wages are lower than those for comparable positions.
Speech recognition is one of the more subtle advances in Amazon’s never-ending expansion, one that quietly prioritizes the company’s preferred products and supports Amazon’s business model regardless of the consequences. And the reach of Echo’s Alexa isn’t going to slow anytime soon. RBC Capital, an investment firm, has estimated that by 2020 there will be 128 million Alexa-enabled devices in use across the world, and that by then Alexa could bring in $10 billion of revenue for Amazon. We are only beginning to see what effect the resulting bias toward Amazon, or toward other products using the platform, could have on the retail industry.
“Voice is going to be your shepherd or butler to the consumer world,” said Galloway during a recent interview. “And whoever that butler is owned or controlled by is going to take billions or trillions of dollars away from other companies.”
It’s not just the economic implications that should concern consumers, but also privacy.
But not because these devices are constantly listening to you, as is commonly assumed. In a widely shared anecdote, people have a conversation with a friend about, for example, a boat, and are then served ads for boats on their social media pages. They become convinced that companies were listening through their phone or VUI and targeted them with ads because of what they said.
But these devices spend most of their time in passive listening mode, running only “device keyword spotting.” According to the companies that produce VUIs, the devices are only listening closely for their “wake-up words,” such as “OK Google” or “Alexa.”
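To make the idea of keyword spotting concrete, here is a deliberately simplified sketch of the gating logic the companies describe. It is not any vendor's actual code. Real devices match the wake word against raw audio on the device itself, whereas this example assumes, purely for illustration, that each chunk of speech has already been turned into text. The function name, the wake-word list, and the sample phrases are all hypothetical.

```python
from typing import Iterable, List, Tuple

# Hypothetical wake words, taken from the examples in the article.
WAKE_WORDS: Tuple[str, ...] = ("alexa", "ok google")


def commands_sent_to_cloud(
    chunks: Iterable[str], wake_words: Tuple[str, ...] = WAKE_WORDS
) -> List[str]:
    """Return only the speech that follows a wake word; drop everything else."""
    sent = []
    for chunk in chunks:
        text = chunk.lower().strip()
        for wake in wake_words:
            if not text.startswith(wake):
                continue
            rest = text[len(wake):]
            # Require a word boundary so "alexandria" does not count as "alexa".
            if rest and rest[0].isalnum():
                continue
            sent.append(rest.lstrip(" ,"))
            break
        # Chunks with no wake word fall through here and are never stored.
    return sent


overheard = [
    "we should buy a boat",
    "Alexa, order batteries",
    "OK Google what time is it",
    "the boat was expensive",
]
forwarded = commands_sent_to_cloud(overheard)
```

In this toy run, only "order batteries" and "what time is it" are forwarded. The two remarks about the boat are discarded on the device, which is the behavior the manufacturers say their products follow.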
What should concern you is that Amazon has expressed some degree of openness to granting third-party app developers access to transcripts of audio recordings saved by Alexa-powered devices, though the company says this would only be with users’ consent, according to The Verge. While this may seem shocking, Amazon is considering it because its biggest competitor in the VUI market, Google Home, is already giving developers access to this data. We have already seen Facebook and Google slice and dice data unknowingly given up by consumers and sell it to third parties with varying results, so the practice should give users pause. No free app or service is ever actually free.
There are also risks from a variety of other factors, though some are more well-founded than others.
Another concern is the reach of law enforcement. As has been well documented at this point, Amazon handed over Echo data to law enforcement in a homicide investigation, though this was in response to a subpoena. According to Gizmodo, the FBI, for one, would neither confirm nor deny that it wiretaps Amazon Echo devices when asked about such practices last year.
There’s also the risk from hackers, who aren’t governed by the same regulations as intelligence agencies or law enforcement and are often able to access anything connected to the internet — whether it’s your voice-controlled thermostat or your smart speaker. As Wired reported in August 2017, a hacker was able to install malware on an Amazon Echo that turned the mic into a wiretap that could constantly listen in on conversations.
As the ACLU wrote on its concerns around such products and their software: “Most companies do not make their code available for public inspection, and it can be hacked, or unscrupulous executives can lie about what it does (think Volkswagen), or government agencies might try to order companies to activate them as a surveillance device.”
For companies that see this technology as the future, there are still obstacles. The ad agency Huge conducted a study in early 2017 on the use of smart speakers and found that users shied away from shopping on these devices for a number of reasons. According to the research, “the number-one reason why the respondents didn’t shop with their device: the lack of a screen (10.03%). But an extremely close second place was that they didn’t know how to shop (10.02%) and 8.33% didn’t even realize that they could.” The third reason was privacy concerns.
But don’t expect these obstacles to dampen any company’s drive and appetite when it comes to pushing its products through VUIs. There is money to be made, and that will ensure companies keep pursuing more advances and build-outs. It’s up to consumers to look past their friendly helper Alexa and see the larger implications behind her technology.