Usability testing for Voice AI products

Qian Yu
May 12, 2019 · 9 min read

Before diving into the details, a little background about my work and experience. I work on a workplace Voice AI product called Webex Assistant, which is built on top of our collaboration hardware. It's not a voice-only AI: it has a GUI as part of the interaction.

Just like many others, I don't have answers for everything, and I'm mostly learning on the spot. But here's what I've learned from this project, and I hope you find it helpful.

1. Define your objectives

Why are you about to run this usability testing? What state is your product or feature in right now? Do you have a working product? Or are you still at an early stage and want to test your Minimum Lovable Product (a better alternative I learned recently to the Minimum Viable Product)?

A well-defined list of testing objectives should also be discussed with your PM and engineering team. If you have an in-house NLP team, bring them into the planning phase too. These early discussions will not only help you get the most out of the time and energy you're about to put into this testing, but also make your findings more digestible for the larger team.

We started defining objectives from the following areas:

Choosing which areas you want to focus on and what your hypotheses are can make the rest of the testing a lot easier. It's also important to set the team's expectations that qualitative interview methods can't answer every question.

2. Define the target audience and recruit participants


You should already have some high-level user personas defined. But if you're testing a specific feature that mostly applies to a subgroup of your primary users, think about which kind of testing participants can give you the most helpful or critical feedback.

For example, if you're trying to understand at a high level what people expect a voice AI to do in your product category, you might want a good mix of frequent voice AI users and people who rarely use one. In our case, we often test meeting room features for collaboration in a workplace environment, so we usually recruit participants who primarily work in an office, since they will be the primary users of the feature.

We usually recruit two types of participants: internal and external.

Internal participants usually fit our target users better. In our company, employees use our own products heavily; their current workflow lives within the ecosystem of our own portfolio. These participants usually give more direct and relevant feedback on the product. The disadvantage is that they are used to how the ecosystem works, so they tend to think more "in the box."

External participants can give you fresh perspectives that internal participants usually can't. In our case, we design collaboration products for enterprise users, and external participants give us a better understanding of how other companies and industries work, how people interact with different kinds of tools, and how our product can make their work easier, or maybe not 🤷‍♂️. They also tend to break the interaction flow in the most unexpected and illuminating ways.

3. Define your testing flow

Now it's time to be creative. A lot of research methodologies for GUI are still applicable to VUI products. For example, we recently used card sorting to test our help menu and find the most natural and logical way to present our command list.

You may find many other articles on how to define a testing flow, so I won't go into details here. One thing worth pointing out, though, is the warm-up session.

You probably know why you need to warm up participants before getting into the meat of the testing. For VUI products especially, I find it very important to make clear that we're testing the product, not them. Because the technology is still immature, participants will run into errors far more often than with a GUI product. Most of the time people feel embarrassed and self-conscious and blame themselves for the errors, which makes it hard to get genuine feedback.

I always ask whether they currently use any voice AI products, how often, and why. That is also useful for uncovering potential patterns. Another thing I find useful is asking for their expectations before they actually interact with the product. This is the sky-is-the-limit moment, and I usually get some really wild, ambitious, yet inspirational ideas out of it.

4. Prototype and setup

Based on the objectives and test flow you defined, you now need to decide whether to use a prototype or a working product.

Depending on where you are with the product or feature you're developing, sometimes you don't really have a choice. Still, each option has its own pros and cons.

Fake prototype:

First of all, when I say fake prototype, I mean a Keynote prototype. I know you might be shaking your head or rolling your eyes, since there are many tools for prototyping voice. However, I personally don't find those tools responsive or rich enough yet. Especially for our product, which is VUI + GUI, most of the time I need the corresponding comps presented at the same time.

Screenshot of Amazon Polly

Here's how I do it: I define the main screens with GUI and voice prompts, use Amazon Polly 🦜 to generate and download the voice prompt audio files, and attach them to each slide. During testing, I project my screen onto the bigger screen in the testing room, keep the presenter's view on my laptop with the navigation menu open, and manually navigate to the right screen based on the participant's behavior.
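If you'd rather script the audio generation than download each file from the Polly console by hand, here's a minimal sketch using the AWS SDK for Python (boto3). The prompt text, voice, region, and file names are placeholders for illustration, not part of our actual setup.

```python
import boto3

# Hypothetical prompts, one per Keynote slide (placeholder text).
prompts = {
    "join_meeting": "I found one meeting scheduled in this room. Would you like to join it?",
    "call_person": "Sure, calling now.",
}

polly = boto3.client("polly", region_name="us-east-1")

for name, text in prompts.items():
    # Synthesize each prompt to an MP3 that can be attached to its slide.
    response = polly.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId="Joanna",  # any Polly voice works; pick one that matches your product's persona
    )
    with open(f"{name}.mp3", "wb") as f:
        f.write(response["AudioStream"].read())
```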

Working product:

Just as it says, you can also use a working product for testing. This means you have already designed the interaction, NLRs, and UIs, and the engineers have implemented them. This should mainly happen after some earlier validation, since developing a voice feature is a ton of work.

When you use a working product, I also recommend doing dry runs before the real testing. Due to the complexity of voice AI products, you may run into unexpected tech issues that make the real testing much less effective.

5. Testing


You've already planned out the process in the previous stages, and the basic discipline is very similar to how you test a GUI product, so I won't explain much here.

One thing to call out, though: when you give participants a task, avoid any words or phrases they might echo directly in a voice command, which would bias the results.

For example, one of our use cases is joining a scheduled meeting in a meeting room. When we gave participants the task and asked them to initiate the interaction, if we said, "So how would you join the meeting you just scheduled in this room?", people would borrow those words directly. That doesn't help us discover the command variations or validate whether our AI understands them all.
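As a rough illustration of what can be done with the variations you collect (not our actual tooling), a quick tally like the sketch below shows how many participant phrasings fall outside the set of utterances the NLP model already handles. Both lists are made-up placeholders.

```python
# Phrasings participants used to start the "join meeting" task (made-up examples).
collected = [
    "join the meeting",
    "start my two o'clock",
    "can you connect me to my meeting",
    "join my scheduled meeting",
]

# Placeholder for the utterance variations the model currently supports.
supported = {"join the meeting", "join my scheduled meeting", "start the meeting"}

uncovered = [u for u in collected if u.lower() not in supported]
print(f"{len(uncovered)} of {len(collected)} collected variations are not covered yet:")
for utterance in uncovered:
    print(" -", utterance)
```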

I also always ask participants if they're okay with being recorded. I personally find that taking notes during the testing makes the participant less engaged. Plus, showing video clips to your team during the readout builds empathy more effectively and helps them understand users' real pain.

6. Consolidate insights

If you had a spreadsheet prepared from step 1, now it's time to take it out. If not, that's still totally okay. You might have already spotted some patterns while running the tests, and you can also create one while re-watching the recordings.

In addition to the questions you had, it’s also good to keep track of things like

Video is a powerful storytelling tool. While you're re-watching the recordings, don't forget to cut and save clips that demonstrate the issues.

Depending on the scope of the insights you've collected, you should also create some actionable items so the team understands what comes next.
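If your tracking spreadsheet exports to a CSV with one row per observation, a tiny script like the one below (the column names are assumptions, not a real template) can surface which task and issue combinations came up most often, which helps when deciding what to turn into action items.

```python
import csv
from collections import Counter

# Assumed columns: participant, task, issue (e.g. "misrecognized", "no response", "wrong screen").
issue_counts = Counter()
with open("usability_notes.csv", newline="") as f:
    for row in csv.DictReader(f):
        issue_counts[(row["task"], row["issue"])] += 1

# Print the most frequent task/issue pairs first so the biggest pain points lead the readout.
for (task, issue), count in issue_counts.most_common():
    print(f"{task}: {issue} x{count}")
```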

7. Readout presentations


In the beginning, I suggested discussing and defining the objectives with cross-functional team members. Now it's time to invite them back for a testing readout, along with more people from the product team. People are usually interested in how users interact with the product they're creating.

Again, don't forget to add some video clips, or even quotes, to make the whole readout more convincing.

8. Follow-ups


The readout presentation isn't the last step of a testing cycle. I usually find people very engaged during the presentation, and they start discussing potential solutions. However, time is limited and we usually won't reach any conclusions within the presentation meeting, so I have to interrupt the discussion and move on to finish the readout. That means follow-up discussions, and even dedicated meetings, are needed.

We love hosting bug-filing parties with a few members of the team, which keeps the most obvious bugs tracked. For any new feature or flow improvement, however, you might need to go back to the double diamond process.
