Best use cases for voice

Jaakko Maalainen
5 min read · Oct 7, 2020


Most of us are pretty confident using all kinds of mobile and web applications. This is because most of them use familiar UI patterns.

We know to look for menu items under a hamburger icon, to find contact information in the footer, and to click an arrow pointing left when we want to go back to the previous screen, for example.

It’s easy to use new applications when they use the same familiar icons and patterns for the same functionality we are used to in other applications. But there’s not necessarily anything intrinsically good about these icons and patterns. We are just used to them.

The hamburger icon doesn’t really look like anything, but we are accustomed to clicking it when we are looking for menu items or rarely used features.

That doesn’t mean we couldn’t find better patterns. One option is voice.

Navigating deep menus

On most social media platforms, Medium included, you can change your profile picture. It’s a feature a lot of users know exists, but exactly how it is done varies a bit.

Here on Medium, I would look for it in the profile menu at the top right of the screen.

Clicking the icon at the top right did indeed open a context menu. But there are at least two possible places that I could try: Profile and Settings.

The correct place for changing the profile picture is Settings, but I actually tried Profile first. It wasn’t too hard, but it wasn’t as intuitive and easy as it could be.

I knew I could change my profile picture, but I had to look for it. This is actually a great use case for voice. If I could just click a microphone button and state my problem, “Where can I change my profile picture?”, I could skip all this clicking around and find the correct setting on the first try.

One great use case for voice is navigating deep menus. No matter how deep in your navigation a feature lies, voice enables your users to jump right to it.
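
To make the idea a bit more concrete, here is a minimal sketch of how a transcribed question could be mapped straight to a deep link. Everything in it is an assumption for illustration: the `ROUTES` table, the keyword sets and the paths are made up, and a real product would more likely use a proper natural language understanding service than keyword matching.

```python
import re
from typing import Optional

# Hypothetical deep links; the keyword sets and paths are illustrative only.
ROUTES = [
    ({"profile", "picture"}, "/settings#profile-picture"),
    ({"change", "password"}, "/settings/security#password"),
    ({"email", "notifications"}, "/settings/notifications#email"),
]


def resolve_route(transcript: str) -> Optional[str]:
    """Return the deep link whose keywords all occur in the transcript."""
    words = set(re.findall(r"[a-z]+", transcript.lower()))
    best_path, best_score = None, 0
    for keywords, path in ROUTES:
        # A route matches only if every one of its keywords was spoken;
        # among matching routes, prefer the one with the most keywords.
        if keywords <= words and len(keywords) > best_score:
            best_path, best_score = path, len(keywords)
    return best_path


print(resolve_route("Where can I change my profile picture"))
```

With this table, the question from the example above resolves to the hypothetical `/settings#profile-picture` link, and an utterance that matches nothing returns `None`, so the application can fall back to the regular menu.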

Search filters

ASOS.com with traditional search filters

Searching is a very common use case for most applications and services. Practically every e-commerce shop has a search, and often the user can filter the results by selecting criteria from a variety of categories.

As a simple example, here’s the ASOS web store that allows users to filter their search by product type, brand, color, size, sale status and price range. On top of this, they can sort the results in different orders, most commonly by price (ascending or descending), relevance to the search terms and possibly date.

In traditional touch- or mouse-based user interfaces, this requires quite a bit of clicking and scrolling through menus. What if this were done using voice?

Search filtering by using voice input

Speechly has actually created a simple demo to show exactly that. Instead of selecting the criteria, the user can just say what they want and the search is updated in real time. And the best thing is that, if they want, they can continue using the more typical graphical user interface, too.

Voice can improve the search experience.
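
As a rough sketch of what happens behind such a search, the spoken request has to be turned into the same filter values the user would otherwise click. The example below is not how Speechly or ASOS do it; the vocabularies, the `parse_filters` function and the filter names are all assumptions made up for illustration.

```python
import re
from typing import Dict, Union

# Hypothetical filter vocabularies; a real shop would load these from its catalog.
COLORS = {"black", "white", "red", "blue", "green"}
PRODUCT_TYPES = {"jeans", "sneakers", "jacket", "dress"}


def parse_filters(transcript: str) -> Dict[str, Union[str, int, bool]]:
    """Extract search filters from a transcribed voice request."""
    text = transcript.lower()
    words = re.findall(r"[a-z]+", text)
    filters: Dict[str, Union[str, int, bool]] = {}
    for word in words:
        if word in COLORS:
            filters["color"] = word
        elif word in PRODUCT_TYPES:
            filters["product_type"] = word
    # Phrases such as "under 80" or "less than 80" set an upper price limit.
    price = re.search(r"(?:under|below|less than)\s+(\d+)", text)
    if price:
        filters["max_price"] = int(price.group(1))
    if "sale" in words:
        filters["on_sale"] = True
    return filters


filters = parse_filters("Show me black sneakers under 80 euros on sale")
# A follow-up utterance only overrides the filters it mentions.
filters.update(parse_filters("make that blue"))
print(filters)
```

Because the result is a plain dictionary of filters, the same values can also be set from the regular filter menus, which is what keeps the interface multi-modal.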

Structured data input

Speech recognition is already close to human parity, but that doesn’t mean that it always works. Humans are not perfect either: we often mishear each other and, even more importantly, we misunderstand each other.

There are a lot of companies, such as Nuance with their Dragon Dictation, offering ASR-based dictation services, but the more important the accuracy of the transcription is, the more probable it is that it will require a human to validate the transcript.

However, if the input is structured rather than free text, voice works great. Most forms that people fill in online and in mobile apps are mostly structured. They ask for gender, age, dates and locations. This gives the ASR service important cues that help it perform better.

Web form turned multi modal with voice input

For structured input such as any web form, voice is a great solution to improve user experience.
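
Here is a small sketch of why structure helps: each field only accepts values from a known set or pattern, so anything that doesn’t fit can simply be ignored. The field names, the `GENDERS` list and the `fill_form` function are assumptions invented for this example, not part of any real form library or ASR service.

```python
import re
from typing import Dict, Optional, Union

# Hypothetical field vocabularies for a simple sign-up form.
GENDERS = {"female", "male", "other"}
MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}


def fill_form(transcript: str) -> Dict[str, Optional[Union[str, int]]]:
    """Fill known form fields from a transcript; unknown fields stay None."""
    text = transcript.lower()
    words = re.findall(r"[a-z]+", text)
    form: Dict[str, Optional[Union[str, int]]] = {
        "gender": None, "age": None, "birth_month": None,
    }
    for word in words:
        if word in GENDERS:
            form["gender"] = word
        elif word in MONTHS:
            form["birth_month"] = MONTHS[word]
    # Only accept an age that is phrased as "<number> years old" and plausible.
    age = re.search(r"(\d{1,3})\s+years?\s+old", text)
    if age and 0 < int(age.group(1)) < 130:
        form["age"] = int(age.group(1))
    return form


print(fill_form("I am a 34 years old female born in March"))
```

Fields that stay `None` can be highlighted in the form so the user can fill them by typing or by speaking again.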

Media players

Voice works great for mobile media players as the user is most probably already using headphones with a microphone.

Spotify has recently added voice search to complement their typical keyboard-based search. This is a great example of a multi-modal user interface, and as the user already knows what they want, the experience is intuitive and natural, too.

If you haven’t already tried Spotify voice search, I really recommend it!

Voice works great in the media industry.

Mechanical tasks in professional services

There’s a wide variety of B2B SaaS products in the wild that can improve productivity, ranging from ERPs and CRMs to more specialized tools, such as warehousing or maintenance applications.

One thing that is common to pretty much all of them is that they are only as good as the data that they are filled with.

If the salespeople don’t fill the CRM with relevant customer data or the maintenance workers don’t report the issues they fix, the system doesn’t offer much insight to management.

Another common feature of most of them is that they are seldom very fun to use, and hence the data quality suffers. Often the workers are also in a hurry or not in the mood for reporting when they are actually doing the work they should report to the system. When they are back at the office, they don’t remember all the details.

Voice (and natural language understanding) could work well here, too. The things that the employees have to fill in don’t have to be that complicated, but the data should be entered as soon as possible after the actual work has been completed.

If the salesperson could fill in the details from the car and the maintenance worker could use their mobile phone with their gloves on, it would most certainly improve the data quality.

Voice improves data quality in professional services.

What are your favorite use cases for voice user interfaces?
