Demystifying the Voice Interface — Nandini Stocker
Notes from UX Salon 2017 — by Summurai
Can a natural conversation be created artificially?
Voice UI design is extremely intriguing, but not very well understood today.
Designing a verbal dialog between a human and a computer that accounts for all the variables of a real human-to-human conversation is extremely complex.
This presentation focuses on a new approach to voice and conversation design, in which the rules of human-to-human conversational exchanges inform the fundamental interface.
Nandini’s experience comes from working on IVRs (Interactive Voice Response platforms). IVRs are push-oriented interactions, and this unnatural interface stands in the way of the solution.
The problem is that we’re trying to teach the computer how to interact with humans based on input and output mechanisms such as speech recognition and speech synthesis.
Nandini offers a brief history of technical voice mechanisms.
1952 — a computer was able to pronounce numbers.
1970s — a computer was able to say as many as 100 words.
1980s — the appearance of a mathematical approach to sound waves, which is still the main approach in use today.
Most recent years — big-data platforms offer new ways of managing conversations. The voice interface market is led by Google, Amazon and IBM.
In spoken language, you do not have to teach the user anything. They already know how to speak. The UX designer’s role is to guard this idea.
Conversation is the first interface we learn after we are born, and the one we know best.
When designing voice interfaces, Nandini advises against taking a visual UI and transferring it directly into voice. Conversational UI requires a different approach. Building a character that leads the conversation and is interesting and fun to talk to will help you succeed and eliminate extra features.
Nandini described the six steps into which a conversation can be broken:
Step 1 — Opening a channel — this is where the conversation initiator sends a message to the other conversation partner.
Step 2 — Commit to engage — the conversation partner acknowledges the message and stops their other activities.
Step 3 — Construct meaning — the two sides share ideas, thoughts or actions based on context, culture and location.
Step 4 — Evolve — one or both sides gain added value from the conversation.
Step 5 — Converge on agreement — If things work well, the conversation partners reach an agreement.
Step 6 — Act or interact — After the conversation ends, one or both sides go on to do something or come away with some value, such as having an issue solved or feeling less lonely.
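For readers who think in code, the six steps above can be sketched as an ordered sequence of states. This is only an illustration and was not part of the talk: the names Step, next_step and walk are hypothetical, and the strictly linear order is a simplifying assumption, since a real conversation may loop back or end before agreement is reached.

```python
from enum import Enum
from typing import List, Optional


class Step(Enum):
    """The six conversation steps from the notes, in order."""
    OPEN_CHANNEL = 1
    COMMIT_TO_ENGAGE = 2
    CONSTRUCT_MEANING = 3
    EVOLVE = 4
    CONVERGE_ON_AGREEMENT = 5
    ACT_OR_INTERACT = 6


def next_step(current: Step) -> Optional[Step]:
    """Return the step that follows `current`, or None after the last one."""
    if current is Step.ACT_OR_INTERACT:
        return None
    return Step(current.value + 1)


def walk(start: Step = Step.OPEN_CHANNEL) -> List[Step]:
    """List every step from `start` to the end (assumes a linear conversation)."""
    steps = []
    current: Optional[Step] = start
    while current is not None:
        steps.append(current)
        current = next_step(current)
    return steps
```

Calling walk() returns all six steps in order, while walk(Step.EVOLVE) returns only the last three. A voice interface that tracked which step it was in could, for example, avoid asking for a decision before any meaning has been constructed.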
Voice interfaces allow you to be more passive.
The best use case for voice is one where speaking is more convenient than an active, hands-on interaction: it requires little effort and delivers big value.
Imagine a situation where your hands are busy with something more important, such as driving. Voice can serve you there if the interface is designed to be effortless and to remove barriers.
Nandini is a part of Google’s Conversation Design Standards team, and has created voice experiences for various Google business units in over 60 languages and 120 different countries. She is passionate about making voice interfaces easy and more accessible.