Designing for Voice User Interface

Anil Nair
Published in Southern Design
6 min read · Aug 27, 2018

Let me start with a question: are voice commands and voice interaction something new? No, they aren’t! Automated voice calling and automated voice responses were in use long before the current hype around voice interaction began.

Voice commands and voice interaction have been trending for quite some years now, and big companies are investing heavily in the technology. With the emergence of Artificial Intelligence and Machine Learning, it has found its way into chatbots and personal assistants. What has changed along the way is the user's expectation and perception of voice interaction. Modern voice interaction has become more user-centric, removing the restriction of fixed commands and making the experience more personalized and human. These interfaces revolve around skills that humans have already acquired.

What do users expect from voice interfaces?

Speech has been a fundamental means of human communication since the Stone Age. Because of this, when users are given a voice interface, they tend to expect the behavior of a normal person even though they are aware that they are speaking to a device. So before we design interfaces for voice, we need to understand human psychology and the principles that govern spoken communication. The same message can be conveyed in different ways or in a different accent.

If the message conveyed and the device's understanding of it don’t match, things may go wrong. This reminds me of a section from Jeff Patton’s “User Story Mapping” where he mentions examples from “Cake Wrecks”.

Imagine the same scenario with devices, where the machine doesn’t understand what the user actually meant.

So before you start designing the system, work out how you would like the conversation to go. Sit with someone and make notes of all the possible ways a conversation can take place for a given use case.

This is what I experienced while trying to interact with the assistant. I was expecting Google to know my choice of music, as I always listen to music on my phone and all my devices are synced.

As a user, I felt frustrated when this happened to me.

A designer has to consider all the possible use cases, as each user may interact in any number of different ways.

For example, let's look at two users booking an Uber ride.

Case one:

User: Open Uber

Asst: Sure, where would you like to go?

User: Church Street.

Asst: Any specific location on Church Street?

User: Yeah! Near Church Street Socials

Asst: Ok! What kind of ride would you like to take?

User: An Uber Pool, maybe

Asst: Booking you an Uber Pool to Church Street Socials

Asst: Your ride has been booked and is 6 mins away

User: Thanks!

Asst: Is there something else I can help you with?

User: No thanks!

Case two:

User: Book me an Uber Pool to Church Street Socials

Asst: Ok, booking your ride to Church Street Socials

Asst: Your ride has been booked and is 6 mins away

User: Thanks! Exit assistant.

These are just two instances, but there can be cases where the credit available in the payment wallet is lower than the required amount, or where a ride is not available at that point in time.
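To make the two cases concrete, here is a minimal sketch of how one dialog handler could serve both the step-by-step user and the one-shot user, and also cover the low-balance and no-ride cases. Everything in it is an assumption for illustration: the ride types, the `RideRequest` slots, the naive keyword parsing, and the wallet and fare numbers are hypothetical, and none of it reflects Uber's or any assistant's real API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical ride types for illustration only; this is not Uber's API.
RIDE_TYPES = ("uber pool", "uber go")


@dataclass
class RideRequest:
    destination: Optional[str] = None
    ride_type: Optional[str] = None


def parse_utterance(text: str, request: RideRequest, expecting: Optional[str]) -> None:
    """Fill any slot the utterance mentions, using naive keyword matching."""
    lowered = text.lower()
    for ride in RIDE_TYPES:
        if ride in lowered:
            request.ride_type = ride
    marker = lowered.rfind(" to ")
    if marker != -1:
        # "Book me an Uber Pool to <place>": the place follows the last " to ".
        request.destination = text[marker + 4:].strip(" .!?")
    elif expecting == "destination":
        # The assistant just asked where to go, so treat the reply as the place.
        request.destination = text.strip(" .!?")


def respond(request: RideRequest, wallet_balance: float, fare: float,
            ride_available: bool) -> Tuple[str, Optional[str]]:
    """Return the assistant's reply and the slot it expects next (None when done)."""
    if request.destination is None:
        return "Sure, where would you like to go?", "destination"
    if request.ride_type is None:
        return "Ok! What kind of ride would you like to take?", "ride_type"
    ride_name = request.ride_type.title()
    if not ride_available:
        return f"Sorry, no {ride_name} is available right now.", None
    if wallet_balance < fare:
        return "Your wallet balance is too low for this ride.", None
    return f"Booking you an {ride_name} to {request.destination}.", None


def run_dialog(utterances: List[str], wallet_balance: float = 500.0,
               fare: float = 120.0, ride_available: bool = True) -> List[str]:
    """Feed user utterances through the handler and collect the replies."""
    request = RideRequest()
    expecting: Optional[str] = None
    replies = []
    for text in utterances:
        parse_utterance(text, request, expecting)
        reply, expecting = respond(request, wallet_balance, fare, ride_available)
        replies.append(reply)
    return replies


# Case one takes three turns, case two takes one.
print(run_dialog(["Open Uber", "Church Street.", "An Uber Pool, maybe"]))
print(run_dialog(["Book me an Uber Pool to Church Street Socials"]))
```

The point of the sketch is that the assistant only asks for what is still missing, so a user who says everything up front is never walked through questions they have already answered.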

Guiding the design for VUI

VUI, as mentioned above, is all about creating a natural conversation between the machine and humans or businesses. Though large corporations are investing heavily in the research and development of voice integration, voice interactions are still at an immature stage.

You need a persona for a VUI

Building a voice interface is about building a natural human conversation, and for this you should treat the machine as a human. Identify the characteristics of a human and how human communication takes place. Humanizing the voice and the characteristics of the interaction helps it connect with users. When they interact with a machine through speech, what they expect is to connect with another human.

Designing for accent & context

The words used in a conversation may have different meanings in different contexts. The system needs to be trained on these contexts, and the dialog exchange should be driven by them.

Coming to accents, there are many different accents of English in India alone. Now imagine the scale if you are building for the world. The application should be able to pick up these accents and learn from each instance. It's a learning process, and it also helps personalize the application for an individual. This does take some time, but with the available toolkits it's not impossible.

Leveraging user skills and being inclusive to design better

Every designer says not to start with the assumption that your user knows the process. But when it comes to a natural user interface (NUI), especially a voice interface, there are certain skills your users are already comfortable with, and patterns in how they use those skills. Speech is one of them.

From childhood, a person develops their own way of speaking. Making users adapt to the accent or phrasing that the interface understands won’t be easy for them, and it greatly increases user effort. With VUIs, the effort has to go into making the conversation personal and customized to each person.

User navigation

Voice interfaces are no different from other interfaces: users may find themselves lost during their journey, and providing them a way out is a difficult task with VUIs. Help them by suggesting the actions they can perform; this also reduces the chance of user errors.
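One way to picture this is a reply function that, whenever it fails to understand, tells the user what they can say next and always leaves an exit open. This is only a sketch under assumptions: the dialog states, the accepted phrases, the global commands (help, start over, cancel) and the two-miss threshold are all hypothetical choices, not taken from any particular assistant platform.

```python
# Hypothetical dialog states and the phrases each one accepts.
SUGGESTIONS = {
    "choose_ride": ["Uber Pool", "Uber Go"],
    "confirm_booking": ["yes", "no"],
}


def reply_for(state: str, utterance: str, misses: int = 0) -> str:
    """Answer one user turn, suggesting valid actions when the user seems lost.

    `misses` is how many times in a row the user was not understood before
    this turn.
    """
    said = utterance.strip().lower()
    options = SUGGESTIONS[state]
    listed = " or ".join(options)

    # Global commands work in every state, so there is always a way out.
    if said == "cancel":
        return "Okay, I've cancelled that."
    if said == "start over":
        return "Sure, let's start again."
    if said == "help":
        return f"You can say {listed}. You can also say start over or cancel."

    for option in options:
        if said == option.lower():
            return f"Got it: {option}."

    # After repeated misses, stop repeating the same prompt and offer the exits.
    if misses >= 2:
        return "I'm still having trouble. You can say start over or cancel."
    return f"Sorry, I didn't catch that. You can say {listed}."


print(reply_for("choose_ride", "the cheap one"))
print(reply_for("choose_ride", "uber pool"))
```

Naming the valid options in the reprompt does two jobs at once: it gets a lost user moving again, and it steers them toward phrases the system can actually handle.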

Limit the information you give

Another challenge with VUIs is giving the user precisely the information they are looking for. Limiting the information also helps bring down the user's cognitive load and makes the conversation more human.

Avoiding information overload also helps users take their next step easily, without confusion.
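As a small illustration of limiting what is spoken, the sketch below reads out only the first few results and offers the rest on request. The limit of three items and the exact wording are assumptions chosen for the example, not an established rule.

```python
from typing import List


def spoken_list(items: List[str], limit: int = 3) -> str:
    """Build a short spoken reply that reads out at most `limit` items."""
    if not items:
        return "I couldn't find anything."
    shown = items[:limit]
    if len(shown) == 1:
        sentence = shown[0]
    else:
        sentence = ", ".join(shown[:-1]) + " and " + shown[-1]
    if len(items) > len(shown):
        # Say how many there are, read a few, and let the user ask for more.
        return (f"I found {len(items)} options. "
                f"The first {len(shown)} are {sentence}. Want to hear more?")
    return f"I found {sentence}."


print(spoken_list(["Cafe A", "Cafe B", "Cafe C", "Cafe D", "Cafe E"]))
```

A screen can show twenty results at a glance; a voice has to read them one by one, so handing control back to the user after a few items keeps the exchange closer to how a person would answer.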

VUI doesn’t need to always listen to you

The fear of being overheard always bothers users, and it needs to be taken care of at the design level itself. The main concern among early adopters is data protection. Most assistants are programmed to start a conversation only when asked to, but they constantly keep an ear out for that call. While the application is listening for its call, it should not process or record whatever other conversation may be going on.
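The idea of listening for the call without keeping anything else can be sketched as a simple gate. This is a simplified illustration with loud assumptions: real assistants detect the wake word on audio, usually on the device, whereas here already-transcribed strings stand in for audio, and the wake phrase "hey assistant" is hypothetical.

```python
from typing import List

WAKE_WORD = "hey assistant"  # hypothetical wake phrase


def gate_utterances(utterances: List[str], wake_word: str = WAKE_WORD) -> List[str]:
    """Return only the requests spoken right after the wake word.

    Everything heard while idle is dropped on the spot: it is neither stored
    nor passed on for processing.
    """
    forwarded = []
    awake = False
    for utterance in utterances:
        text = utterance.strip()
        if awake:
            forwarded.append(text)  # the one request the user asked us to hear
            awake = False           # go back to idle straight after
        elif text.lower() == wake_word:
            awake = True
        # Anything else falls through here and is discarded.
    return forwarded


heard = [
    "my card number is on the table",
    "Hey Assistant",
    "book me a ride",
    "see you at eight",
]
print(gate_utterances(heard))
```

Only the request that follows the wake phrase leaves the gate; the private remarks before and after it never reach the rest of the system.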

Conclusion

From “Her”

The whole time I was writing this, I kept remembering the movie “Her”. VUI has been slowly changing the things around us, and it's redefining the way we interact with machines. The interaction that started with simple IVR is now emerging as one of the trending technologies, thanks to advances in AI and ML. I believe that no matter how powerful the technology becomes, the basic principles for connecting with the human mind will always remain the same.

Anil Nair
Southern Design

Product Designer | Game Development UE - 5 | Volunteer - @DesignUpConf