The 3 key factors in designing voice user interfaces

Thorsten Borek
Published in LET’S GO NEON
Nov 29, 2017

Simple, intuitive, multisensory, versatile interactions

As prototyping experts, we test user interfaces on a weekly basis. From one prototype test to the next, we see users’ attention spans shrinking. The notoriously ill-focused goldfish is said to have an average attention span of nine seconds, but according to a study from Microsoft Corp., people now generally lose concentration after eight seconds, which highlights the effects of an increasingly digitalized lifestyle on the brain.

Key UX/UI trend for 2018: Simple Navigation

For UX/UI design, this means navigation needs to become even simpler and more intuitive. Simple navigation means, or at least used to mean, sticky and linear navigation, so users could easily pick up the process.

What’s even simpler?! A dialogue that leads you through the navigation before you even know it.

That’s why a lot of the navigation solutions we worked on in 2017 were chat- or dialogue-based. They work extremely well for creating a natural, fast and intuitive user experience.

However, what all these navigation trends have in common is that they focus on a world in which navigation is primarily achieved by physical contact, whether that is a click, a tap or a swipe. The increasing prevalence of voice recognition technology, especially with the rise of Amazon’s Alexa, means that the fundamental way we think about digital navigation is changing. That’s something the visionaries who created Star Trek already knew in the 80s.

Amazon Echo Show: Voice AND Graphic Touch Interface

A week ago, Amazon started selling the new Echo Show device in Germany. It works like the familiar Echo or Echo Dot, but it also has a touchscreen, meaning that graphical elements and physical navigation components complement the voice interaction on the device itself. For example, a recipe skill can read out the ingredients and the preparation steps while displaying images of them, for the best hands-free support. A city guide skill can display pictures or videos of requested attractions, or take the user on a guided tour.

Although the display component may enhance the user experience considerably, voice continues to be the primary interaction method with Alexa.

Now, X-O was one of the first German companies to get one. The arrival of our new, shiny toy coincided with “TALK TO ME, HAMBURG!”, a hackathon for natural language interfaces. We took part to crack this baby.

A hackathon works like this: after a challenge is posed, work groups form around existing or newly found ideas in the room. We obviously wanted to create something for our start-up FinGym and our new device.

It turned out we were the only ones working with the Echo Show, and even the Amazon Echo evangelists weren’t really able to help us. Luckily, we are experienced innovators, so our structured workflow, teamwork and innovation methods helped us set up a skill anyway.

As an additional experience alongside the digital coach we’re working on for FinGym, the Echo Show skill was meant to create easy everyday “workouts” for your financial fitness. These include a video-based finance dictionary that explains finance terms short and simple, each with a “punny” sports analogy, and a daily financial inspiration ranging from historical facts to money songs. Wanna try? We’re working on our live version and will keep you updated.

Our 3 key takeaways:

  • Voice interfaces are far more focused. That’s why you’re looking for the “Happy Path”, the simplest and shortest way to a complete result. The more questions the user has to answer, the higher the risk that they will drop out. A user will come back for 3 seconds of help, not for 30 seconds of chitchat. A rule of thumb: a voice UI should be easier, faster or more fun than a physical form of interaction.
  • Eyes expect uniformity: repetitive patterns, colors and so on create a more pleasant user experience. With ears it’s exactly the opposite. You need variety, like in a natural conversation. That’s why Alexa places high emphasis on varied input and output utterances, for example accepting “yes”, “sure”, “okay” or “yeah” instead of just “yes”.
  • When designing for a conversational interface, it’s important to remember you’re still dealing with a user interface, just not a visual one, so UX principles should still be applied. Meaning: The user is king. So prototype, test, improve.
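
To make the second takeaway a bit more tangible, here is a minimal Python sketch of the idea: accept several variants of “yes” on the input side, and vary the spoken confirmation on the output side so it doesn’t sound like a tape loop. This is not the Alexa Skills Kit API and not the code from our hackathon skill; the word lists and the function names `is_affirmative` and `pick_confirmation` are made up purely for illustration.

```python
import random

# Hypothetical affirmative input utterances; a real skill would define its own.
AFFIRMATIVE_INPUTS = {"yes", "sure", "okay", "ok", "yeah"}

# Hypothetical spoken confirmations for the output side.
CONFIRMATIONS = ["Okay.", "Sure.", "Got it.", "All right."]


def is_affirmative(utterance: str) -> bool:
    """Return True if the normalized utterance is one of the accepted 'yes' variants."""
    normalized = utterance.strip().lower().rstrip(".!?")
    return normalized in AFFIRMATIVE_INPUTS


def pick_confirmation(last_used=None, rng=random):
    """Pick a confirmation phrase, avoiding the one used in the previous turn."""
    candidates = [c for c in CONFIRMATIONS if c != last_used]
    return rng.choice(candidates)


if __name__ == "__main__":
    last = None
    for user_says in ["Yes", "yeah!", "Sure."]:
        if is_affirmative(user_says):
            last = pick_confirmation(last)
            print(last)
```

Excluding the previously used phrase is a deliberately simple choice: it guarantees the user never hears the same confirmation twice in a row, which is often enough for short “Happy Path” dialogues.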

Need help designing a chatbot or voice UI for your users? Get in touch, we’d love to talk. ;)

Thorsten Borek
LET’S GO NEON

Co-Founder of NEON Sprints, Design Sprint Facilitator and Trainer, Post-it addict