Is Voice Inside Apps The Next Big Thing in UI/UX?

Kumar Rangarajan · Published in Geek Culture · 7 min read · Jun 3, 2021

Why are so many apps suddenly adding in-app voice assistants? Will it help their voice commerce ambitions?

Traveling down memory lane with Voice Assistants

The journey of voice assistants can be broken down into two phases:

Phase 1 was all about introducing consumers to the idea of using Voice to perform tasks. This phase witnessed the rise of popular voice-enabled devices and technologies like Alexa and Google Home, which created awareness of Voice as something to reach for when performing daily actions and tasks.

Phase 2 is about Voice becoming a pervasive interaction mode, with more capabilities and in more contexts. It marks the current transition taking place in voice technology: from smart voice-enabled devices to Voice inside apps. In-app voice assistants are the next big thing and are quickly gaining popularity.

Understanding the Voice Assistant Journey: Phase 1 and Phase 2. Courtesy: Slang Labs

Why are In-App Voice Assistants becoming popular?

With the Covid pandemic restricting people to their homes, smartphone usage has increased. According to Statista, global retail e-commerce traffic recorded 22 billion monthly visits in June 2020, with demand for everyday items such as groceries and clothing being the highest.

According to a report by BigCommerce, mobile e-commerce sales are expected to account for 54% of total e-commerce sales by 2021. Voice shopping is shaping the future of mobile commerce. However, smart devices like Amazon Echo and Google Home are used mostly for entertainment and home automation, not shopping. A 2020 report by TechCrunch stated that 81% of users use such devices to listen to music, while around 70% use them for inquiries such as asking about the weather, amongst other use cases.

Similarly, a report by PwC states that general voice assistants like Google Assistant and Siri are still used mainly for general-purpose tasks like setting alarms, scheduling events, and making calls. Only 10% of the users studied used them to buy or order something.

Major businesses with apps have recognized this gap and the need to drive e-commerce traffic online, and are therefore focusing their voice commerce strategy on smartphones with dedicated in-app voice assistants. Even Google Assistant and Alexa are moving toward this trend of in-app voice assistants and shifting their strategies.

Why should brands add an in-app Voice Assistant?

Leading brands who are winning customers with custom in-App Voice Assistants. Courtesy: Slang Labs

Reduced Thought-to-Action Latency

You think of something, and you can get it done almost immediately. It's magical when you can just speak out what you want, without worrying about how to convey it through a series of individual taps. Saying "Add 2 kg tomatoes to cart" is so much faster than "find the search icon, tap it, type t..o..m..a..t..o..e..s (hopefully auto-complete kicks in), scroll, tap to set the quantity, and then go to the cart."
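To make the contrast concrete, here is a minimal Kotlin sketch of the idea: a single utterance carries everything the app needs, whereas the GUI flow spreads the same information across several steps. The types and the regex-based parser below are hypothetical illustrations, not any particular SDK's API; real in-app assistants use trained speech and NLU models rather than patterns.

```kotlin
// Hypothetical intent type: everything the app needs to act, captured from one sentence.
data class CartIntent(val action: String, val item: String, val quantity: Double, val unit: String)

// Naive pattern-based parser for utterances like "Add 2 kg tomatoes to cart".
// Illustration only; production assistants rely on trained NLU models, not regexes.
fun parseAddToCart(utterance: String): CartIntent? {
    val pattern = Regex("""add\s+(\d+(?:\.\d+)?)\s*(kg|g|l|ml)?\s+(.+?)\s+to\s+cart""", RegexOption.IGNORE_CASE)
    val match = pattern.find(utterance.trim()) ?: return null
    val (qty, unit, item) = match.destructured
    return CartIntent(action = "ADD", item = item.lowercase(), quantity = qty.toDouble(), unit = unit.ifEmpty { "unit" })
}

fun main() {
    // One spoken sentence replaces: open search -> type "tomatoes" -> scroll -> set quantity -> add to cart.
    println(parseAddToCart("Add 2 kg tomatoes to cart"))
    // CartIntent(action=ADD, item=tomatoes, quantity=2.0, unit=kg)
}
```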

Breaks barriers to entry

While most of us (the folks reading this blog) might feel comfortable with the modern UI paradigm of menus, buttons, text boxes, and so on, for many others (folks who are mobile-first, not so tech-savvy, elderly, etc.), it can be pretty intimidating. A button does not naturally mean "click on it." A menu icon does not inherently imply "discover or navigate to other capabilities." Visuals sometimes don't reflect local aesthetics, and understanding the app becomes harder.

“Thought experiment — how many of our parents can use the apps that we build?”

Voice, on the other hand, does not intimidate. It does not need any additional training. You don't need to teach your parents how to use Voice. They are already pros, just as you are.

Language of choice

Serving the World with Multilingual Voice Assistants. Courtesy: ITITranslate

The language of the user is a crucial constraint that Voice handles better. Even if users are familiar with the UI elements and can navigate the app, they tend to be intimidated by the language on the screen.

Here is another thought experiment — Change the language setting of your phone to a language you don’t know. Reboot your phone. Try to change the setting back to English. Do you think you will be able to do this easily? If that phone had a Voice interface that would have understood English (“Change language to English”), would you have used that or still tried to use touch to carry out this action?

It’s Natural

We (well, most of us) are all born with the ability to communicate using our Voice. Imagine doing something as simple as buying a packet of sugar. If you are like me, I assume this would be your journey — “hmm.. which menu item should I choose?…Go to groceries…… click.. click.. click.. click.. one done.. Add to cart.. damn.. forgot to change the quantity….uh, let’s remove the item from cart and again go back…..this is so tedious”.

Contrast this with being able to do the same via “Add 2 kg sugar to cart” or “Change quantity of sugar from 2 to 3 kg”.

Doesn’t it feel more natural and comfortable to use?

Accessibility

According to WHO, about 285 million people are visually impaired worldwide: 39 million are blind, and 246 million have low vision (severe or moderate visual impairment). An in-app voice assistant provides a solution for visually impaired people to live independently and buy essentials without seeking help from others to operate their smartphones.

Furthermore, accessibility is not restricted to vision-impaired people; it also extends to those who don't know how to operate a smartphone (typically the 55+ age group). Speaking to their apps in their native language is far easier for them than learning something that is fundamentally new to them. Voice makes it easier for them to ask for something to get done, with visual cues to support them. The next time your grandparents order something online, they won't need to know how to type it. They will only need to know how to say it.

So does Voice-based UX mean touch-based UX will become obsolete?

Not at all. Touch/GUI-based apps still have their strengths, for the following reasons:

Discoverability

Discovering and adding capabilities (skills or actions) to assistants is currently a challenge. There are more than 30,000 voice apps, but most people are oblivious to their existence.

Contextual Awareness

When using a mobile app, the context and app capabilities are typically well understood by the user. The visual structure (buttons, menus, etc.) also gives enough clues about what is possible. Contrast that with the experience of talking to a specific Alexa skill and then struggling to know what it’s capable of.

ComScore Mobile Report — 2017

UI-only functionality

There are lots of use cases where visuals beat Voice outright. One such example is anything that involves a list.

You: “I want to order a pizza.”

Assistant: “Sure. What type of Pizza would you like?”

You: “What are my vegetarian options?”

Assistant: “You can order a garden veggie, a farm-fresh, a Mediterrane….”

You have probably zoned out and can't remember the first option by the time the third one is being spoken.

This is where UI lists make perfect sense. Imagine if the response to the request above was instead something like this:

Courtesy: Upmenu.com

Privacy

When a customer interacts with an app, they are placing their trust in that app. Both their input (their intent) and the brand's response rest on a trusted relationship between the two. But if there is an assistant in the middle that processes both the input and the output and knows who you are (because you are logged into the assistant), it raises privacy concerns: for the user, because they are entrusting their data to someone who knows a lot of details about them (across brands), and for the brand, because they are sharing details about their customers with someone else.

The best way is to take the middle road: build on the best of both worlds, touch and Voice. In-app voice assistants must enhance the user experience so that they feel assistive rather than assertive. When Voice is not the best option for a particular task inside the app, for example while listing items, let the user fall back to the touch-based GUI, and vice versa. This harmony ensures that app users don't feel burdened by a single mode of interaction, something that smart devices like Amazon Echo and general voice assistants like Google Assistant lack.
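As an illustration of that middle road, here is a minimal Kotlin sketch, with hypothetical types and no real SDK involved, of how an app might route an assistant's responses: short confirmations stay on voice, list-style answers get rendered in the GUI, and anything the assistant can't handle quietly falls back to touch.

```kotlin
// Hypothetical response types for a multimodal in-app assistant (illustration only).
sealed class AssistantResponse {
    data class Speak(val text: String) : AssistantResponse()                                 // short confirmations stay on voice
    data class ShowList(val title: String, val options: List<String>) : AssistantResponse()  // lists are better shown on screen
    object FallBackToTouch : AssistantResponse()                                             // unrecognised requests hand control back to the GUI
}

fun handleUtterance(utterance: String): AssistantResponse = when {
    utterance.contains("vegetarian options", ignoreCase = true) ->
        // Speaking out a long list is hard to follow; render it visually instead.
        AssistantResponse.ShowList(
            title = "Vegetarian pizzas",
            options = listOf("Garden Veggie", "Farm Fresh", "Mediterranean")
        )
    utterance.startsWith("add", ignoreCase = true) ->
        AssistantResponse.Speak("Added to your cart.")
    else ->
        AssistantResponse.FallBackToTouch
}
```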

Multimodal user experience is what makes In-App Voice Assistants stand out from other voice solutions.

The revolution begins now…

“The best way to predict the future is to invent it.” Alan Kay’s resounding words speak volumes. Slang Labs has taken the initial steps towards that future by building Slang CONVA, the world’s first Voice Assistant-as-a-Service platform. Slang CONVA enables businesses to quickly and easily integrate an accurate, multilingual In-App Voice Assistant and delight their customers in a matter of days.

Leading e-commerce giants like Big Basket, Udaan, and Procter & Gamble have already begun the process of adding this experience inside their e-commerce apps, using this platform because they know that Voice-aided UX is the Next Big Thing in Technology and Interaction Design.

Are you ready to use it too?

Kumar Rangarajan

Obsessive Dictator @ Slang Labs. Earlier co-founded Little Eye Labs, which was acquired by Facebook. A die-hard Illaiyaraja fan who still believes in AAP.