AMA on Voice Technology — A Summary from the Twitter Session

Isha Mahajan
Agrahyah Technologies
Feb 12, 2021

VOICE Talks India, a series of online events powered by Google to help developers, brand custodians, media agencies, and content creators leverage the power of voice, organised an Ask-Me-Anything (AMA) session on Twitter in December 2020. Following two successful events, “Voice Technology, the New UI” and “Voice for Bharat”, this AMA aimed to address queries around voice technology and its presence in India. Our panel of experts included Kumar Rangarajan, Co-Founder & CEO, Slang Labs, and Vishal Golia, CEO, Boltd. Audiences could tweet their questions to us using #VOICETalksIndia, and one of the panellists would address them.

This blog is a summary of the most pertinent questions associated with voice technology asked during the session.


Question 1: How has voice technology connected and assisted its users during the lockdown?

During the pandemic, voice solutions have seen increased visibility, exploration, and usage, from stories, entertainment, and music to information and education.

Vishal Golia: Think of voice technology as giving commands or asking questions to people in your house or your house help. The questions could be: What is the day today? Who’s at the door? What time is India’s match? Now substitute those people with Alexa or Google Assistant. It’s as simple as that.

Question 2: What are different forms of consumption of Voice apart from the commonly known Voice as output, input, and/or entertainment?

Vishal Golia: Music is by far the biggest contributor to voice usage today. Gaming on voice is going to be huge in the near future. Another use of voice technology that we should see soon is performing daily tasks in sectors such as banking, automobiles, and bookings. Having said that, a lot of activities can already be performed by voice today. These include booking a doctor’s appointment, unlocking the car (maybe for the cleaners), paying utility bills, and setting reminders, such as when to take medicine or tune into a specific channel or podcast.

Question 3: Voice is still a design problem rather than a tech problem at the moment. Do you think the technology will reach a stage where designers become good-to-have rather than must-have in a voice project?

Before developing any use case or solution, a comprehensive Voice User Interface (VUI) is designed to map the content, flow of conversation, fall-backs, and more, so that the interaction is conclusive.

Vishal Golia: We can call it VUI design or, in simple terms, refer to it as pre-production. It is always the bedrock of any creative project, and voice is no different. The current end product may not have all the tech-savvy bells & whistles, but it is well thought out, keeping the end user in mind.
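
To make the idea of mapping content, conversation flow, and fall-backs a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the intent names, keywords, and placeholder replies are hypothetical (loosely based on the example questions earlier in this article) and do not reflect how Alexa, Google Assistant, or any panellist's product actually works.

```python
import re

# Hypothetical VUI map: each intent lists trigger keywords and a reply.
# Intent names, keywords, and replies are illustrative placeholders only.
FLOW = {
    "ask_day": {"keywords": {"day", "date"}, "reply": "<tell the user today's day>"},
    "ask_door": {"keywords": {"door"}, "reply": "<say who is at the door>"},
    "ask_match": {"keywords": {"match"}, "reply": "<give the match time>"},
}

# Fall-back used when no intent matches; it also hints at what the user can ask.
FALLBACK = "Sorry, I did not get that. You can ask about the day, the door, or the match."


def match_intent(utterance):
    """Return the first intent whose keywords appear in the utterance, else None."""
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    for intent, spec in FLOW.items():
        if words & spec["keywords"]:
            return intent
    return None


def respond(utterance):
    """Return the mapped reply, or the fall-back when nothing matches."""
    intent = match_intent(utterance)
    if intent is None:
        return FALLBACK
    return FLOW[intent]["reply"]
```

Calling `respond("Who's at the door?")` returns the placeholder reply for the hypothetical `ask_door` intent, while an unmapped request such as `respond("Play some jazz")` returns the fall-back. A real VUI design would cover far more than simple keyword matching, but the same three pieces (content, flow, and fall-backs) are what the design stage has to pin down.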

Question 4: Is there a chance we will see a made-in-India independent personal assistant in the near future: not just the addition of Indian languages to Google and Alexa, but an independent VA?

Kumar Rangarajan: VAs come in different forms and shapes. This is what we are doing at Slang Labs: allowing developers to add our multi-lingual In-App Voice Assistants inside their apps. We are not dependent on the Alexa or Google infrastructure and have a native, home-grown, India-first implementation.

Question 5: What is the MVP that brands can do to experiment with voice?

Vishal Golia: MVPs, or Minimum Viable Products, are a great starting point for testing new technology and features. Releasing voice apps that answer customers’ common queries is the lowest-hanging fruit and can be used to test the waters. But this must be followed up with additional utility that is more habit-forming and prods the user to use it more and more.

Question 6: How do you think voice-first apps need to handle privacy concerns? Do you feel voice input should be front and center or just an add-on element for now?

Vishal Golia: Privacy concerns should not be overlooked. Users have already “voiced” their concerns and Alexa and Google Assistant have addressed this by giving them the choice to control their data. More should be done and will be done, I believe.

Kumar Rangarajan: Privacy as an issue needs to be understood first. There are two types of privacy IMO:
- When I am speaking, who is listening?
- Will the system listen to things that I did not want it to hear?
The first is a challenge that a user needs to be vigilant about. It’s like making a phone call in public. We need to be aware of our surroundings before we speak. The second is a potential concern, but it can be addressed by two things:
- Having explicit control over when a Voice Assistant listens to you (hot words are notoriously buggy but are getting better; ultimately, though, detection is always pattern-based and could fail)
- Trust in the provider of the VA, who guarantees that you won’t be heard or recorded when you should not be.

Question 7: Does voice have a future in India especially when it comes to shopping? Even while using Flipkart & Amazon, people check for reviews for a product and compare it with others so is it possible that people will start buying things using voice products?

Kumar Rangarajan: We are absolutely bullish when it comes to the amalgamation of e-commerce and Voice. Grocery and Pharma (which are among the fastest-growing segments) will lead the way in voice adoption because this is where the need is greatest (repeated searches in a single session and multiple sessions in a month). From multi-lingual voice search to cart management to navigating around the app and getting things done, Voice as an interface makes it much easier to perform these activities. Imagine if our moms could be comfortable enough with shopping online that they wouldn’t need our help! Voice would be the enabler for that. All revolutions need to start small, so the journey to completely natural voice transactions will be preceded by relatively “modest” gains such as voice search and simple voice navigation.

The tech is here. But people first need to get used to the idea of talking to tech. The core use cases are becoming clear. Larger use cases will come when there is demand, which will come with gradual uptake.

Question 8: What is the future of AI and Voice Systems in the coming decade?

Kumar Rangarajan: The initial days of Voice Systems (and the AI powering them) are about getting people used to the notion of talking to tech and understanding the value, convenience, and efficiency improvements it brings. The technology is not yet truly natural; it is also relatively hard to integrate and limited in language scope.

The next few years of this tech should bring:
- The ability to understand and respond to more natural, non-scripted conversations
- A better hand-off protocol between humans and machines (improved ways to invoke the machine than the currently broken hot word model)
- More common non-English conversations.

(The structure of the answers has been modified slightly to suit the flow of the article. To read more about the raw tweets, click here)
