Speak up, be heard — the rise of voice tech

Did you know that Microsoft Word can convert speech into written text? Awkward as it felt, I decided to try it and drafted this article aloud, letting my computer do the heavy lifting!

30% of web-based searches will be screenless by 2020 — Gartner

Voice as a medium of digital interaction is growing rapidly, especially in China and the US. In China, Baidu claims that saying a word in Mandarin is 2.8 times faster than typing it. With a reported error rate of less than 5%, China is poised to see very strong adoption.

Voice today

What is Voice used for today, and why has it not become completely entrenched in the lives of people globally and locally?

Today, voice tech is primarily used via smartphones for basic activities (playing music, opening apps, setting alarms, etc.). It is often used for personal entertainment and for browsing the web more easily, and is generally seen as a nice-to-have.

In Asia, China is clearly at the forefront of this space, with the three tech behemoths making moves (CB Insights). But where does South-East Asia (SEA) stand in comparison?

Voice in Indonesia is picking up. 62% of surveyed smartphone users said they use Voice, although only 25% indicated they use it daily (300 surveyed by iProspect). Non-users cited public embarrassment and concern that voice would make them lazy, or make them seem lazy, as the top two barriers.

Interestingly, in tech-savvy Singapore just 14% of respondents indicated that they use voice daily, while 50% said they engage weekly. Aside from the barriers above, inaccurate comprehension was cited as a deterrent.

My guess, and I’d love to hear diverging viewpoints, is that people in SEA are generally less outspoken. Speaking aloud or often is considered somewhat impolite, and speaking to a device takes a person even further out of their comfort zone. But, as with most technological marvels, adoption skyrockets once functionality reaches a tipping point (e.g. the Internet is no longer some fictional realm for geeks).

Most people believe Voice is going to be big, and I hope to see more moves from the region. At present, voice tech innovation doesn’t appear to be originating from SEA. There are few or no entrants in the space, and none of the regional unicorns or digitally savvy conglomerates are making any visible moves. Why is this?

The a-ha! moment for Voice

We are at a crossroads. The turning point for voice, the point at which we come to depend on it daily, will arrive when voice devices handle tasks that are relatively inefficient to do through visual interfaces alone.

Visual interfaces today, primarily keyboard-based ones, don’t accept input the way our brains naturally process it. Whether we are searching or filtering to get results, there is effort involved and the experience isn’t seamless. Each digital platform has a basic learning curve, requiring one to understand its visual user flow, and interactions can become time-consuming, especially for complex requests.

With Voice, a future shopper could say something like, “Can you show me red, 3-inch heels, in a size 8, for no more than $50?” and receive instant results. This feels more natural. It is akin to speaking to an assistant in a physical store.

Voice users are also excited about the personalization Voice could bring to their lives. Google’s research below shows that curated deals and tips are desired, which I’d assume would be aggregated from multiple sources.

Yes, everyone wants a digital butler, one that listens, adapts, and knows you. Such a butler would not only make life easier but also provide curated, relevant information that allows you to be a better you. Amazon (Link) is banking on such a future. Once Voice can be used accurately for game-changing use cases, we will see a rapid shift in consumer behavior and adoption, creating the a-ha! moment.

Therefore, although the tech might come out of China or the US, it is paramount that companies in SEA are ready and nimble enough with their strategic vision and data architecture to adapt and stay on par.

Look, speak, wave — the interface beyond voice

You might be curious to know that a device invented close to 150 years ago remains a primary input method today. The QWERTY key format (the keyboard on our computers), patented in 1878, has not been toppled from its reign, despite several attempts over the past century.

But as AR & VR begin to pick up and hologram capability starts to materialize, it’s only logical to speculate about the future of physical keyboards.

I can picture a world, immersed in various forms of 3D visualized content, using voice as a medium to navigate.

But why stop there? The future manifestation, as I see it in every home and office, will be voice + visual + gesture controlled interfaces (Google’s Project Soli). Think of the interface in the movie Iron Man. For those less familiar with the reference, it is a visual or holographic interface manipulated by both voice commands and physical gestures.

I reckon such an interface, beyond Voice, would probably start in the e-commerce or entertainment space. The amount of resources directed at attracting and delighting customers in these sectors is mind-blowing, and will only increase as workplace automation frees up people’s time.

If you’re a Star Wars or Star Trek fanatic, though, you may disagree with me, as both franchises depict physical key-based devices used to command their spacecraft and other technological marvels.

It would be great to hear related thoughts or conflicting views.

Q. What is the tipping point for Voice in SEA?

Q. How can brands & marketers adapt? What should we expect?

Q. Will Voice become a dominant force over the next 10 years, or will other tech overshadow it? (There is some movement around thought-based input devices.)

Other fascinating voice trends

  • Children can’t read, write, or type — their growth with Voice (Link)
  • Voice’s impact on the elderly (Link)

Thanks for listening.