Beginner's Guide to Automated Voice App Testing

Florian Treml
Feb 3 · 5 min read

This guide suggests best practices, infrastructure and tools to ensure your voice app continues to deliver an outstanding user experience.


Questions When Testing Voice Apps

Applying the suggested practices helps answer the following questions:

  • Is my voice app following the designed conversation flow? Is the conversation flow working as expected?

The Art of Challenging Chatbots

Testing chatbots, especially voice-enabled ones, poses different challenges than testing apps with a graphical user interface. A graphical user interface restricts the possible user interactions to the controls it offers, whereas with natural language the number of possible user inputs is limitless. Voice input adds further variables: the individual nuances of each voice, the quality of the microphone, the background noise surrounding the speaker, and more. By contrast, a button click in a graphical user interface is always perceived the same way by the application, regardless of who clicked it.

The platforms behind powerful voice applications are still evolving and subject to constant improvement, which means that developers have to rely on components that they do not own and can influence only to a limited extent.

Testing the Voice Conversation Flow

The open source product Botium provides you with all the tools required for implementing a comprehensive, holistic test strategy for your voice apps. You can read about Botium and the background on testing conversation flow in the official Botium documentation.

We will use Bring! Shopping List as an example of a voice app to test. It is published as an Alexa Skill, and we can use the Botium Connector for Amazon Alexa with AVS to simulate voice input and output with Botium.

For details about the presented steps and tools please take a look at the Botium Wiki and our Blog!

Record Test Cases

The quickest way to get started is to use the Live Chat in Botium Box to record your own voice with your microphone. You can immediately see and listen to the response of your voice app.

Depending on the technology of your voice app, the text response, the audio response, or both are shown.

Botium Box Live Chat — Recorder

You can save the conversation as a test case and make some changes afterwards.

  • Refine the input and output text and audio

Botium Box Voice Test Case
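
To make the idea of a saved conversation more concrete, here is a minimal Python sketch of a test case as a list of turns that is replayed against a bot. It is not Botium's file format or API. The names `Turn`, `run_conversation` and `fake_shopping_bot` are hypothetical, and the bot is a stand-in function rather than the real Bring! skill.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    user_says: str     # what the tester says (text, or a transcript of the audio)
    bot_expected: str  # the response the conversation design calls for


def run_conversation(turns: List[Turn], bot: Callable[[str], str]) -> List[str]:
    """Replay each turn against the bot and collect human-readable failures."""
    failures = []
    for index, turn in enumerate(turns, start=1):
        actual = bot(turn.user_says)
        # Compare case-insensitively and ignore surrounding whitespace.
        if actual.strip().lower() != turn.bot_expected.strip().lower():
            failures.append(
                f"turn {index}: expected {turn.bot_expected!r}, got {actual!r}"
            )
    return failures


# Hypothetical stand-in for the voice app; a real test would call the app itself.
def fake_shopping_bot(utterance: str) -> str:
    if "milk" in utterance.lower():
        return "Okay, milk is on your list"
    return "Sorry, I did not get that"


recorded = [Turn("add milk to my list", "okay, milk is on your list")]
failures = run_conversation(recorded, fake_shopping_bot)
print(failures or "conversation passed")
```

Keeping the recorded turns as plain data is what makes the later refinement step cheap: changing an expected response means editing one field, not re-recording the conversation.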

Synthesize Test Cases with Text-To-Speech

Instead of recording your own voice for the test cases, you may decide to use synthesized voice samples in their place (or in addition). Botium has its own Text-To-Speech and Speech-To-Text platform, Botium Speech Processing, based on the best open source and cloud engines available.

Test cases now show plain text instead of audio input:

Botium Box Live Chat — Text Input

Eliminating Flakiness — Homophone Mappings

A typical problem when testing voice apps is that audio transcriptions, especially of low-quality audio, can be rather unstable. Test automation usually relies on hard facts (fixed text assertions), so unstable transcriptions make the test results flaky.

In this example, instead of "okay milch ist auf deiner liste" (German for "okay, milk is on your list") the transcription reads "okay milch is auf seiner liste" ("okay, milk is on his list"). A difference this small, a character here and there, is enough to make the test case fail:

Transcription problem

Botium provides the option to specify homophone mappings to deal with audio snippets that are often misinterpreted by the Speech-To-Text engine.

Specifying Homophone Mappings

Test cases use these mappings to decide whether a transcription result counts as a success or a failure.

Transcription Problem — Homophone Mapping applied
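
The idea behind homophone mappings can be sketched in a few lines of Python. This is an illustration only, not how Botium implements or stores its mappings. The dictionary layout (canonical word mapped to a list of mis-heard variants) and the function names `normalize` and `transcription_matches` are assumptions made for this example.

```python
import re
from typing import Dict, List


def normalize(text: str, homophones: Dict[str, List[str]]) -> str:
    """Lower-case the text and replace each known mis-hearing with its canonical word."""
    # Invert the mapping: mis-heard variant -> canonical word.
    variant_to_canonical = {
        variant.lower(): canonical.lower()
        for canonical, variants in homophones.items()
        for variant in variants
    }
    words = re.findall(r"\w+", text.lower())
    return " ".join(variant_to_canonical.get(word, word) for word in words)


def transcription_matches(expected: str, actual: str,
                          homophones: Dict[str, List[str]]) -> bool:
    """Compare two transcriptions after applying the homophone mappings to both."""
    return normalize(expected, homophones) == normalize(actual, homophones)


# Assumed mapping format: canonical word -> variants the engine tends to produce.
HOMOPHONES = {"ist": ["is"], "deiner": ["seiner"]}
expected = "okay milch ist auf deiner liste"
transcribed = "okay milch is auf seiner liste"

print(expected == transcribed)                                   # False
print(transcription_matches(expected, transcribed, HOMOPHONES))  # True
```

A mapping like this trades strictness for stability: every variant you add is one more real mistake the assertion can no longer catch, so it is worth keeping the list short and specific to words the engine demonstrably confuses.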

Testing Real-Life Scenarios

Using your own microphone in front of your laptop might be a good starting point, but in real life voice apps are used differently: on a smartphone, on a home automation or entertainment device such as Alexa or Google Home, or in a car. To come up with meaningful end-to-end test cases for these scenarios, you have to make your test data resemble them.

  • Add background noise at various levels

In Botium Box you can apply various effects for simulating real-life usage scenarios to your own clean recordings or synthesized audio samples.

Botium Box Voice Effects
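
To illustrate what "background noise at various levels" means in practice, here is a minimal sketch that mixes noise into a clean signal at a chosen signal-to-noise ratio (SNR). It says nothing about how Botium Box implements its effects. White Gaussian noise, the synthetic 440 Hz tone used as the "recording", and the function names `rms` and `add_noise` are all assumptions chosen to keep the example self-contained.

```python
import math
import random
from typing import List


def rms(samples: List[float]) -> float:
    """Root mean square amplitude of a signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def add_noise(clean: List[float], snr_db: float, seed: int = 0) -> List[float]:
    """Mix white Gaussian noise into `clean` at the requested signal-to-noise ratio."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    # Scale the noise so that 20 * log10(rms(clean) / rms(noise)) == snr_db.
    target_noise_rms = rms(clean) / (10 ** (snr_db / 20))
    scale = target_noise_rms / rms(noise)
    return [c + n * scale for c, n in zip(clean, noise)]


# One second of a 440 Hz tone at 16 kHz stands in for a clean recording.
sample_rate = 16000
clean = [0.5 * math.sin(2 * math.pi * 440 * t / sample_rate)
         for t in range(sample_rate)]

# Lower SNR means louder noise relative to the speech.
for snr_db in (30, 20, 10, 0):
    noisy = add_noise(clean, snr_db)
    noise_only = [n - c for n, c in zip(noisy, clean)]
    measured = 20 * math.log10(rms(clean) / rms(noise_only))
    print(f"requested {snr_db} dB SNR, measured {measured:.1f} dB")
```

Running the same test case over several noise levels shows at which point the transcription, and with it the test, starts to break down. A real pipeline would use recorded ambient noise (street, car, kitchen) and clip the result to the valid sample range before writing an audio file.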

Continuous Monitoring

The recipe for ensuring availability of your voice app is actually rather simple — all you need is:

  • a smoke test for checking basic behaviour (for instance, just sending a simple hello to the voice app and listening for a response)

With Botium Box, everything you need comes out of the box.
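
For readers who want to see the moving parts, a smoke test of the kind described above could look roughly like the following sketch. It is not Botium code. The `send` callable stands for whatever delivers an utterance to your voice app, and the repeat loop and `alert` callback are assumptions added for illustration.

```python
import time
from typing import Callable, List, Optional


def smoke_test(send: Callable[[str], Optional[str]], greeting: str = "hello") -> bool:
    """Return True if the voice app answers the greeting with any non-empty response."""
    try:
        response = send(greeting)
    except Exception:
        # A crash or timeout in the transport counts as "not available".
        return False
    return bool(response and response.strip())


def monitor(send: Callable[[str], Optional[str]], alert: Callable[[str], None],
            runs: int, interval_seconds: float) -> int:
    """Run the smoke test `runs` times and call `alert` on every failure."""
    failures = 0
    for run in range(1, runs + 1):
        if not smoke_test(send):
            failures += 1
            alert(f"smoke test failed on run {run}")
        if run < runs:
            time.sleep(interval_seconds)
    return failures


# Hypothetical stand-in for the voice app: it stays silent on the second call.
def flaky_app(utterance: str) -> Optional[str]:
    flaky_app.calls += 1
    return None if flaky_app.calls == 2 else "Hi, what can I do for you?"


flaky_app.calls = 0

alerts: List[str] = []
failed = monitor(flaky_app, alerts.append, runs=3, interval_seconds=0.0)
print(failed, alerts)  # 1 ['smoke test failed on run 2']
```

The smoke test deliberately accepts any non-empty answer. It checks availability, not conversation quality, which keeps it from failing on harmless wording changes.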


Summary

Now that you know what is needed for automated testing of your voice app, you can give Botium Box a try, or stick with the free and open source option, Botium Core.

  • Record your own voice or use synthesized voice

The Startup
