Forget personal assistants. The real battle is for voice UIs

Which should I ask to tell me the weather? Google or Alexa?

Roughly 5.1 million Amazon Echo devices have been sold since the product’s launch in Dec. 2014, according to data from Consumer Intelligence Research Partners. This startling number for a product that most people couldn’t even describe at launch (I told people it was a cross between Siri and Sonos) has led to an intense focus on personal assistants.

Google Home is now on the market and Google CEO Sundar Pichai credited Amazon with the idea behind a speaker and voice-activated personal assistant for the home. There are also rumors that Apple may try something similar with Siri.

Now the tech media is obsessed with the concept of personal assistants. They will make small decisions easier. They will influence your buying choices. They will invade your car, your office and even hotel rooms.

If this were a fight to be the digital butler of choice, then I would bet on Google. In reality, I think the fight today is about something much simpler. It’s about who will control the voice user interface. Voice is the natural interface for connected devices that often have no screen to touch or keyboard to type on.

Voice comes in many flavors. If I’m in my car, I press a button and tell my Tesla to “navigate to,” “call [someone]” or “play [song].” I can also tell it to send a log to Tesla if I press the voice button and say, “bug report.”

When I’m in my kitchen, I tell Alexa to “set a timer,” “play [song]” or any of thousands of other commands, as long as I have activated the skill I want to use.


I can also talk to my Google Home by saying “Hey Google” or “Okay Google.” Its voice commands are similar to the Amazon Echo’s, although it doesn’t have the more than 5,000 skills that the Echo ecosystem has. Plus, most of us have a phone capable of taking voice commands that can find information or even (in the case of Siri) control our homes.

Currently, I have half a dozen different computers to talk to in my life. Remembering how each one functions and what it can do can be a challenge. When I’m in my car, I have to remember that it’s way more limited than my Echo and speak accordingly.

When I’m in my home, I have to remember that only Google can speak to my Chromecast so I have to use that for TV stuff. Unless I want to see the content on my Kindle tablet, which means I have to talk to Alexa. I’ve linked SmartThings to Google so I need to know what devices I can control with Google as opposed to the dozens of devices that the Echo can control.

I’m exhausted just typing this, much less trying to figure out what computer I need to tell to turn on my hall lights and how to phrase my request. Or which computer I should use to text a friend using voice.

This is why I feel like the battle for personal assistants will happen eventually, but for now, the battle is really about setting the voice interface. Much like Apple helped codify what a swipe did on a handset, Amazon is trying to define how we’ll talk to the internet of things.

With more than 5 million Echo devices sold, deals to put Alexa in Ford cars and in dozens of new devices like speakers and intercoms, and an open ecosystem that lets anyone bring voice control to their services, Amazon has taken a strong lead in becoming the standard bearer for voice.

In the future, a good voice-based platform will need several talents that go beyond good artificial intelligence. It will need the ability to discern different voices for authentication purposes.

It will need to go beyond having a decent knowledge graph about a person to having a strong knowledge graph about the millions of devices we’ll want to control with voice. Finally, it will also need a sophisticated model of how to derive our intent and then execute a task based on that intent, given the devices we own.

Google has a good knowledge graph about a person’s interests and movements, while Amazon has a good sense of what devices people have to work with. It’s still an open question who will combine those things and develop a strong model that lets you ask your digital assistant if everything is okay at home, and get a useful response.

Taking the long view, the race today is about controlling the voice UI. Everything else is still in development.

