Voice user interface documentation: Navigating the world of conversational interfaces
“Alexa, play Fix You by Coldplay. Um, actually, no, resume my latest Spotify podcast!” At this point, you expect Alexa, a virtual voice assistant, to resume your unfinished episode. This scenario is one way of using a voice user interface. So, what is a voice user interface in a broader sense?
A voice user interface (VUI) lets users interact with digital systems through voice commands. The trend toward voice user interfaces is gaining momentum as screen fatigue grows and generative AI evolves. The connection between people and machines is also tightening, extending beyond keyboards and touchpads to various means of communication. Ease of use and accessibility are pivotal for everyone, especially people with low vision or those who need a hands-free experience. For accessibility writing tips, refer to Accessibility writing red flags: guide to becoming a responsible Technical Writer.
So who deserves the credit for effective communication: the people in charge of the masterful techniques under the hood, or the users who speak clearly enough for the system to recognize them?
In this article, we will review the writer’s perspective on building conversational interfaces:
- VUI in numbers.
- Introduction to specifics of voice user interface apps and devices.
- Best practices for immersive voice-activated responses: user guides (documentation) and voice user interface responses.
As a Technical Communicator or UI/UX Designer on a VUI project, you will likely describe app prompts, commands for users to learn, and voice assistant responses.
VUI in numbers
Why invest Technical Writers’ and UI/UX Designers’ efforts in voice user interfaces? In which situations do people rely on this functionality?
According to Statista, the global voice recognition tech market reached close to 12 billion U.S. dollars in 2022. The industry is projected to amount to almost 50 billion U.S. dollars in 2029. Businesses are likely to see an increased use of voice technology, particularly in banking, financial services, and healthcare.
Google’s metrics on how voice assistance is reshaping consumer behavior state that 62% of those who regularly use a voice-activated speaker will likely buy something through it.
The integration of the Internet of Things (IoT) with VUI has introduced use cases that enable seamless voice control of and interaction with a range of connected devices. Users can monitor their smart appliances, adjust the temperature, turn on lights, and manage home devices hands-free. IoT and VUI integrations will continue to expand, shaping the future of smart homes, cities, and industries.
People report using voice technology in their daily routines, for example to find information about local businesses and to shop. Whether you check your bank balance, chat with friends, order food, or book a medical appointment, the voice user interface does all the magic by listening to your verbal instructions.
Introduction to specifics of voice user interface apps and devices
Step 1. Learn about leading representatives and specifics of the voice user interface, at least at the basic level, to write about it.
[Figure: VUI concepts]
[Figure: VUI flow]
In terms of voice user interface types, the market offers the following options:
- Smart speakers powered by AI assistants, for example, Amazon Echo powered by Alexa.
- Virtual assistants and voice searches for apps, platforms, and other devices, for example, Alexa on a Lenovo laptop.
- Voice picking and inspection systems (see the Voice dictation section in this article).
- Smart home and IoT networks activated by voice, including environmental data tracking, such as temperature and humidity, through sensors and tablets.
Thus, familiarizing yourself with the content of Apple Siri, Google Assistant, and Amazon Alexa will come in handy, particularly when you start a competitor analysis. Furthermore, review other voice-enabled tools that simplify routine tasks and let users complete them faster than by typing lines of text:
- Amazon Echo (including Echo Dot, Echo Plus, and Echo Show) and Amazon Alexa
- Google Home (including Google Home Mini and Google Home Max) and Google Assistant
- Apple HomePod and Apple Siri
- Samsung Galaxy Home
- Harman Kardon Invoke (with Microsoft Cortana)
- Sonos One
- Lenovo Smart Display
- LG ThinQ Speaker
- Sony LF-S50G
Now that you know about types of VUI, let’s proceed with the analysis and the writing stage to make human-machine voice communication smart and not hard.
Best practices for immersive voice-activated responses
Step 2. After you delve into industry insights, the research phase comes into play:
- User preferences to understand your audience’s needs and to define the tone of voice.
- Competitor analysis to overview the market landscape.
- Probable user paths, navigation techniques, and contexts the conversation could lead to.
Note: Consider any convenient instruments — flowcharts with journeys, storyboards, or mind maps with personas. Outlining potential scenarios is vital before transforming them into a dialogue.
- User-oriented terminology and natural conversation patterns.
Note: For this purpose, you can check the available guides, like the VUI guide.
- FAQs for effective voice search.
- Troubleshooting paths for errors or misunderstandings.
User guides
Voice assistants
Step 3. Educate your users through the documentation on commands and integrations. Let’s explore how Amazon communicates these to developers and non-technical customers, since Amazon Alexa supports a wide range of voice commands and can be extended with additional functionality or gadgets:
- Overviewing Alexa personality
- Interacting with Alexa
- Building new skills and integrations that connect to Alexa
Essential voice commands
To initiate a request, you say the default wake word Alexa, which activates the voice assistant.
Note: You can customize the wake word.
- Alexa, help
- Alexa, stop
- Alexa, mute/unmute
- Alexa, add an appointment with Jane to my calendar for June 10 at 10 am
- Alexa, set a timer for 10 minutes
- Alexa, decrease/increase the volume to 5
- Alexa, call sister
Amazon communicates these commands through product documentation, interactive tutorials, online resources, device packaging, social media, and community forums.
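When you document commands like these, it can help to see how a wake word and a command fit together. The following is a minimal sketch in Python, not Alexa's actual implementation: it assumes the speech has already been transcribed to text, and the names `INTENT_PATTERNS` and `parse_utterance` are hypothetical, introduced only for illustration.

```python
import re

WAKE_WORD = "alexa"  # default wake word from the article; it can be customized

# Hypothetical intent patterns, loosely modeled on the commands listed above.
INTENT_PATTERNS = [
    ("help", re.compile(r"^help$")),
    ("stop", re.compile(r"^stop$")),
    ("set_timer", re.compile(r"^set a timer for (?P<minutes>\d+) minutes?$")),
    ("set_volume", re.compile(r"^(?:decrease|increase) the volume to (?P<level>\d+)$")),
    ("call", re.compile(r"^call (?P<contact>.+)$")),
]


def parse_utterance(text, wake_word=WAKE_WORD):
    """Return (intent, slots) for a transcribed utterance, or None without the wake word."""
    normalized = text.strip().lower().rstrip(".!?")
    prefix = wake_word.lower() + ","
    if not normalized.startswith(prefix):
        return None  # the assistant stays idle when the wake word is missing
    command = normalized[len(prefix):].strip()
    for intent, pattern in INTENT_PATTERNS:
        match = pattern.match(command)
        if match:
            return intent, match.groupdict()
    return "unknown", {}  # wake word heard, but the command is not recognized


print(parse_utterance("Alexa, set a timer for 10 minutes"))
print(parse_utterance("Set a timer for 10 minutes"))
```

A table like `INTENT_PATTERNS` can double as a checklist for the user guide: every pattern is a phrase users have to learn, and the `unknown` branch is the case where the documentation and the assistant's error response have to step in.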
Main concerns
In addition to the learning curve, ambiguity of speech and privacy issues are of concern.
A worthwhile example from Amazon of how to address privacy with customers:
“Don’t joke about privacy. Don’t use the term “always listening” in reference to Amazon Echo or Alexa Built-in devices. Don’t imply that Amazon or Alexa know everything about you. Don’t feature Alexa in anything suggesting spying, privacy violations, or surveillance. Visit amazon.com/alexaprivacy for examples of how to accurately describe wake word detection and other privacy features.”
To figure out how to deal with misunderstandings, let’s briefly switch to another product, Apple Siri. The Apple Support article Use Siri on iPhone advises: “While making back-to-back requests: Repeat your request in a different way”. On the voice assistant side, add explicit error messages that explain why the assistant cannot handle the request. Document any limitations and constraints.
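One way to keep error messages explicit is to pair every failure reason with a suggested next step. The sketch below is only an illustration in Python: the error codes, the wording, and the `build_error_response` helper are all assumptions, not taken from Siri, Alexa, or any real SDK.

```python
from dataclasses import dataclass

# Hypothetical error codes and wording; a real product would define its own.
ERROR_MESSAGES = {
    "not_understood": "Sorry, I didn't catch that.",
    "unsupported": "Sorry, I can't do that yet.",
    "no_connection": "Sorry, I can't reach the internet right now.",
}

# What the user can do next, keyed by the same hypothetical codes.
REPROMPTS = {
    "not_understood": "Try saying it in a different way.",
    "unsupported": "Say 'help' to hear what I can do.",
    "no_connection": "Check your Wi-Fi and try again.",
}


@dataclass
class VoiceError:
    code: str
    speech: str  # the full sentence the assistant would speak


def build_error_response(code):
    """Combine the reason for the failure with a next step for the user."""
    reason = ERROR_MESSAGES.get(code, "Sorry, something went wrong.")
    next_step = REPROMPTS.get(code, "Please try again.")
    return VoiceError(code=code, speech=f"{reason} {next_step}")


print(build_error_response("not_understood").speech)
```

Keeping the reasons and the next steps in separate tables makes it easy to review both with the writing team and to mirror the same limitations in the user guide.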
Voice dictation
Another type of VUI that eliminates visual dependency is voice dictation. Widespread in warehouses and logistics, these systems help frontline workers locate and verify items using spoken commands. A device vocalizes instructions to the worker, and the worker communicates information back into the system hands-free. Voice-based directives reduce the time spent on reading instructions or handling gadgets.
Typical voice dictation commands that might help you in building user manuals:
- Label the consolidated box with the correct shipping information.
- Go to aisle 1, shelf D, and pick three units of product A.
- Return the damaged item F to the designated returns area.
- Load pallets of product H onto truck K for delivery.
If you are interested in logistics and VUI integration, see What Is Voice Picking? How It Works, Benefits & FAQs.
Speech-to-text
An extra feature available in products like Microsoft Word and Google Docs is speech-to-text, where users see instant feedback on their screens. This way, production workers can dictate data while staying focused on operational tasks. As a Technical Writer, you can benefit from such a tool while drafting articles and blog posts.
Built-in voice search
Product voice searches commonly rely on answers from knowledge bases. Your task is to follow SEO practices in your articles and include keyword synonyms because users might phrase the same search in various ways. For example, “Play a video” vs. “I want to watch the video”.
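To illustrate why synonyms matter, here is a minimal sketch in Python of a keyword matcher that treats “Play a video” and “I want to watch the video” as the same request. The synonym table, the stop-word list, and the function names are hypothetical; real voice search engines use far more sophisticated language models.

```python
import re

# Hypothetical synonym table: each spoken variant maps to one canonical keyword.
SYNONYMS = {
    "watch": "play",
    "start": "play",
    "clip": "video",
    "movie": "video",
}

# Filler words that carry no search meaning in this simplified example.
STOP_WORDS = {"i", "want", "to", "a", "the", "please"}


def keywords(query):
    """Lowercase the query, drop filler words, and map synonyms to canonical keywords."""
    words = re.findall(r"[a-z']+", query.lower())
    return {SYNONYMS.get(word, word) for word in words if word not in STOP_WORDS}


def best_article(query, articles):
    """Return the title whose keywords overlap most with the query, or None if none overlap."""
    query_terms = keywords(query)
    best_title, best_score = None, 0
    for title, text in articles.items():
        score = len(query_terms & keywords(text))
        if score > best_score:
            best_title, best_score = title, score
    return best_title


articles = {
    "Play a video": "play video playback",
    "Reset your password": "reset password account",
}
print(best_article("I want to watch the video", articles))
```

In this sketch, both phrasings reduce to the same two keywords, `play` and `video`, so an article that uses either wording can still be found.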
Voice user interface responses
Step 4. Take care of the voice user interface responses.
- Add conversational feedback and confirmation phrases (“All right, I got it” or “OK, I won’t”) in voice outputs to show that the user’s input was interpreted and executed correctly. Greetings, questions, repetition, and paraphrasing are perceived well. However, keep the amount of feedback proportionate.
- Provide context, but don’t overload users with long monologues.
- Don’t provide all the details at once. The conversation should contain only the information the user asks for. If there is a visual representation, keep the lines short and separated.
Step 5. Test and maintain the quality by gathering user feedback and continuously improving your app’s spoken experience. Don’t be afraid to adapt your documentation after the rollout.
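The practices above can be sketched as a small response builder that opens with a confirmation phrase and holds back extra details until the user asks for them. This is a hedged illustration in Python; `build_response` and its `max_details` parameter are hypothetical names, not part of any voice platform.

```python
def build_response(confirmation, details, max_details=2):
    """Compose a short spoken reply: a confirmation, then at most max_details details."""
    spoken = details[:max_details]
    parts = [confirmation] + spoken
    remaining = len(details) - len(spoken)
    if remaining > 0:
        # Offer the rest instead of reading everything in one long monologue.
        parts.append(f"Want to hear {remaining} more?")
    return " ".join(parts)


events = [
    "Team stand-up at 9 am.",
    "Lunch with Jane at noon.",
    "Dentist at 4 pm.",
]
print(build_response("OK, here is your day.", events))
```

The cap of two details is an arbitrary choice for the example; the right limit depends on what user testing shows for your product.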
Conclusion
A VUI team must act as conductors, orchestrating a symphony of voice user interface capabilities and the accompanying content and paying equal attention to both.
From the technical standpoint, VUIs that recognize and respond to emotions are in demand. While this trend is still being established, you can always apply an empathic and friendly tone to your conversations.
What do you think are the most significant challenges of voice user interfaces? Feel free to share your views in the comments. The following article might give you hints: What are the current challenges and limitations of voice user interfaces? 😊