Voice User Interfaces

Ashna Shah-Grover
Jan 27 · 5 min read

This week the StackShare weekly digest listed Voiceflow as one of its “Recently Approved” tools of the week. I was intrigued by the short description of the tool: “Build voice apps in your browser without coding”. So I spent a few hours researching the tool and the domain it resides in, the world of “Voice User Interfaces” (a play on the term “Graphical User Interface”). It was an interesting way to spend the afternoon.

Voice User Interface (VUI)

The Voiceflow App

The Voiceflow app helps you create your own Alexa Skills or Google Actions “without coding”. I was out of the loop when I encountered these terms and had to look them up. On a personal note, I’ve never had any interest in using Alexa; it always struck me as a very “extra” product to have in the house. I never saw any point in giving a device a simple command (“Play X song” or “Turn on the lights”) for something I could do myself in under 10 seconds.

But it turns out there is a large marketplace for custom Alexa Skills, from custom orders that tell Alexa to skip a guided meditation routine you’re not feeling, to a Potterhead quiz where Alexa asks you various trivia questions and tells you whether you are right or wrong after you respond.

A glimpse of some of the Alexa Skills available on the marketplace

Normally creating and launching a custom Alexa skill would require coding. Voiceflow takes the coding out of it and allows you to create a simple flow chart of Alexa choices and user responses. According to this flow chart, your Alexa skill is generated and made available on the Voiceflow marketplace!

A Voiceflow flow chart of Alexa dialogue and user responses
A close up on the elements of a Voiceflow flow chart
Input for a “Speak” element in the flowchart
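To make the flow chart idea concrete, here is a minimal Python sketch of how a dialogue flow could be represented as data and walked through. This is not Voiceflow's actual format or API; the `SpeakNode` class, the `FLOW` dictionary and the `run_flow` function are all hypothetical names invented for illustration, and the quiz content is made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SpeakNode:
    """One box in the flow chart: a line the assistant speaks, plus the
    branches that follow depending on the user's reply (hypothetical model)."""
    text: str
    # Maps a lowercased user reply to the name of the next node.
    choices: Dict[str, str] = field(default_factory=dict)
    # Node to jump to when the reply matches none of the choices.
    fallback: Optional[str] = None


# A made-up quiz flow, keyed by node name.
FLOW = {
    "start": SpeakNode(
        text="Welcome to the quiz. Are you ready?",
        choices={"yes": "question", "no": "goodbye"},
        fallback="start",
    ),
    "question": SpeakNode(
        text="Which house is Harry Potter sorted into?",
        choices={"gryffindor": "correct"},
        fallback="wrong",
    ),
    "correct": SpeakNode(text="That is right!"),
    "wrong": SpeakNode(text="Sorry, that is wrong."),
    "goodbye": SpeakNode(text="Maybe next time."),
}


def run_flow(flow: Dict[str, SpeakNode], start: str, replies: List[str]) -> List[str]:
    """Walk the flow from `start`, using one user reply per branching node.
    Returns the lines the assistant would speak, in order."""
    spoken = []
    remaining = iter(replies)
    node_name = start
    while node_name is not None:
        node = flow[node_name]
        spoken.append(node.text)
        if not node.choices and node.fallback is None:
            break  # terminal node: nothing left to ask
        reply = next(remaining, None)
        if reply is None:
            break  # the user stopped responding
        node_name = node.choices.get(reply.strip().lower(), node.fallback)
    return spoken
```

Calling `run_flow(FLOW, "start", ["Yes", "Gryffindor"])` walks from the welcome line to the question and then to the "correct" branch. A visual tool presumably lets you draw the same structure instead of typing it out.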

In a similar vein, Google Actions can perform voice-activated tasks on Google Assistant, which can be installed on your phone, laptop or speakers. There is a similar marketplace for Google Actions, and Voiceflow allows users to create their own without needing to know how to code.

Ways to download and run Google Assistant

Voiceflow is based out of Toronto and appears to be seeking to grow their engineering team.



Voice Activated Games

Whilst reading about Voiceflow I encountered something very fascinating: the idea of voice-activated video games, and the idea of the “joystick” being replaced by verbal commands.

A snapshot of the Mayday app

One of the first voice-activated video games to be released was Mayday! Deep Space. Created by the developer Daniel Wilson, it was made for the iPhone and required speech recognition: "Using speech recognition as a medium forces the reader, ideally, to be alone and in a quiet place, which helps with the mood. The game is scary. The story is built around the experience of using speech recognition."

In an interview, Wilson said:

“The very act of speaking out loud to someone, even if you know that person is not real, creates an instant connection. Somewhere deep in the recesses of your brain, you believe this person is real. Whether you want to or not, speaking out loud causes you to build a relationship with whoever, or whatever, you’re talking to.”

Though this was a while back, it still resonates, and points towards a more immersive video game experience that will go beyond a purely visual means of immersing the user.

Mayday was praised for its unique and intuitive approach of making users speak to the character on screen, but criticized for being too short. The criticism points towards the fact that VUIs are still in their primitive stages, with only a handful of voice-activated games currently on the market. There are numerous difficulties in creating games like these:

It’ll also be extremely challenging for game developers who will now have to account for hundreds (if not thousands) of hours of voice data collection, speech technology integration, testing and coding in order to retain their international audience.

Developers have to take into account accents, dialects and whole languages on top of baseline video game localization for players in different cultures.

Not to mention gathering all the different phrases a user might say during the game or use to command their character.
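The phrase-gathering problem can be illustrated with a small Python sketch. It assumes the speech has already been transcribed to text, and it only handles exact phrase matches after light cleanup; the command names and phrases in `COMMAND_PHRASES` are invented for illustration and do not come from any real game.

```python
import string

# Hypothetical commands for a voice-controlled game, each with a few of
# the many ways a player might phrase it.
COMMAND_PHRASES = {
    "move_forward": ["go forward", "move ahead", "keep going", "walk forward"],
    "stop": ["stop", "hold on", "wait", "stay there"],
    "turn_left": ["turn left", "go left", "left"],
}


def normalize(utterance):
    """Lowercase, strip punctuation and collapse extra whitespace."""
    no_punct = utterance.lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    return " ".join(no_punct.split())


def build_lookup(command_phrases):
    """Invert the command table into a phrase -> command dictionary."""
    lookup = {}
    for command, phrases in command_phrases.items():
        for phrase in phrases:
            lookup[normalize(phrase)] = command
    return lookup


def match_command(utterance, lookup):
    """Return the command for a transcribed utterance, or None if unknown."""
    return lookup.get(normalize(utterance))


LOOKUP = build_lookup(COMMAND_PHRASES)
```

Here `match_command("Go forward!", LOOKUP)` resolves to `"move_forward"`, while anything outside the list, such as "do a backflip", returns `None`. Every phrasing, accent-driven transcription variant and language has to be added to a table like this (or learned by a model), which hints at why the data collection effort grows so quickly.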

Still, inevitably VUIs will be a crucial component of more sophisticated virtual reality and augmented reality experiences! And with that inevitability will come the necessary and revelatory breakthroughs in natural language processing and conversational AI.
