Why voice command interfaces might not be the next big thing

Ashley Hogarth
4 min read · Mar 17, 2017

Alexa, Cortana, Siri: names becoming as omnipresent as Trump. But are they really going to be as life-changing?

If you own any of these hot gadgets of ’16/’17, you may have noticed your life hasn’t quite become like I, Robot, Ex Machina or Moon. In fact, it’s not even close, and I put this down to one thing: exploration.

Exploration is something you do every time you use a visual user interface, especially for the first time. It’s what you do when you know a tool is right for the job but you aren’t entirely sure of the sequence of steps to get it done: the learning curve. Sometimes you know the job upfront; other times you work it out as you pick up the tool. It’s something UX designers obsess over.

What the f**k are you talking about? Let’s look at a physical product. You know that your new TV remote is what you use to change the channel, adjust picture settings, set recordings, get subtitles and everything in between. You might not know right now how to do all those things, but you do know that the remote is the tool for the job.

You may not realise it, but the same is true of the digital experience. I know I want to listen to music and I know Spotify will do that job, but right this second I don’t know what I want to listen to; I want to explore that.

You even explore when you speak! Think about the conversations you’ve had today: did you think through each entire sentence and its structure before delivering it, or did you begin talking with a point in mind and wing the rest? I would guess the latter. As with the TV remote, I know the person I need to address and the point I want to make to them, but I don’t have an exact map of how I’m going to sculpt that point in conversation.

With voice interfaces there is no exploration. From the moment the interaction begins, you need to know exactly what you want to achieve and the syntax to achieve it.

“To err is human…”

Even then you face problems, because you have to be precise in your language. Try giving Siri a longish voice command and ‘err’ing halfway through it, or even just pausing: a perfectly natural thing to do in everyday speech, but an unforgivable mistake for a voice interface. Does this make voice control redundant for people with a stammer?

How do we fix it?

Until perfect natural language processing is achieved, hybrid solutions may provide a more effective experience for users. Coupled with visual interfaces, voice control could afford the user a chance to explore.

“Ok Google, show me songs on Spotify by David Bowie”

Giving the user the ability to pause for thought when deciding what they want to do would afford the user the chance to be human.

“Alexa show me some songs by Coldplay”
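To make the idea concrete, here is a minimal sketch of what a hybrid handler might look like. Everything in it is hypothetical: the `handle_command` function, the hard-coded catalogue standing in for a real music service, and the regex intent matching are all illustrative assumptions, not any assistant’s actual API. The point is the design choice: a “show me” command returns a list to put on screen for browsing, rather than forcing the user to name one exact track up front.

```python
import re

# Hypothetical catalogue standing in for a real music service API.
CATALOGUE = {
    "david bowie": ["Heroes", "Life on Mars?", "Starman"],
    "coldplay": ["Yellow", "Clocks", "Fix You"],
}

def handle_command(command):
    """Very rough intent parser for a hybrid voice + visual interface.

    A browse-style command ("show me songs ... by <artist>") returns a
    list of results to *display*, so the user can explore visually,
    instead of the assistant guessing a single track to auto-play.
    """
    match = re.search(r"songs .*?by (.+)$", command.lower())
    if not match:
        return None  # not a browse intent; fall back to one-shot handling
    artist = match.group(1).strip()
    # Return results for on-screen display rather than acting on one guess.
    return CATALOGUE.get(artist, [])

print(handle_command("Alexa, show me some songs by Coldplay"))
```

The user then taps or says a follow-up to pick from the displayed list, which is exactly the pause-and-explore moment that a pure voice flow denies them.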

Exploration also allows for feature discovery. Unless you already know every feature of a piece of software end to end, an exploration period (or an end-to-end tutorial) gives users the opportunity to find the full range of a product’s features and, additionally, their idiosyncrasies.

Final note: there’s been a lot of coverage of tech companies making these voices female, and of the sexist implications of casting a female voice in a subservient role. If NextBigTechCo are reading this, can I propose calling it ‘Lloyd’? Who doesn’t want to be able to yell “Lloyd!!!!!” at their voice assistant? You’re welcome.

Ari Gold of Entourage
