The Ugly Truth of Virtual Assistant Technology

"I feel like I can be anything with you." In the movie "Her", the character Samantha portrays a perfect artificial intelligence that truly understands and adapts to users' emotional needs. Her accent and intonation are natural enough to be mistaken for an actual human voice, and she even expresses her own feelings. The only difference we can tell between Samantha and us is that she isn't physically embodied.

In the real world, the most Samantha-ish AI voice I've ever heard was Google's newly announced feature, Duplex. At the annual Google I/O conference, it was definitely the highlight of the event and surprised the world with its flawless humanlike voice. It even inserts conversational fillers like "hmm" and "um," which make it sound more natural.
Duplex showed that virtual assistant technology has advanced astonishingly and now seems close to catching up with human-to-human communication. Yet here's a quick question: how long do we have to wait until this humanlike assistant comes into our daily lives?
The Ugly Truth of Current Virtual Assistant Technology

OK, Google. How’s the weather?
According to a recent smart-speaker trend report, the most frequently used features were music and weather. Yeah, neither of which is a particularly interesting or sophisticated task. The fact that current virtual agents are not intelligent enough will come as no surprise to anyone who has used these devices. Since they often misunderstand users' commands, people don't expect them to handle complex tasks. Even though the list of commands assistants can process is growing, it still falls far below our expectations of an AI assistant.
Additionally, Duplex was a prototype of the newest technology from Google, the best tech company in the world. That means it will likely take a few more years before it is commercialized for ordinary people across various devices.
Mind the Gap! A Gap Between English and Non-English Speaking Countries

The problem becomes way worse once you step away from English-speaking countries. The latest technologies related to Natural Language Processing, a particularly crucial part of voice interfaces, have advanced mostly on the basis of English words and phrases.
So AI companies outside the English-speaking world have to develop their own natural language processing systems. And since Korean, Japanese, and Chinese each have their own writing systems and grammar, even Asian countries can't help each other.
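To see why English-centric techniques don't transfer, consider the most basic NLP step: splitting a sentence into words. The snippet below is a minimal illustrative sketch (the sentences and the keyword "날씨", Korean for "weather", are my own examples, not from any real assistant) showing how whitespace tokenization, which works reasonably for English, breaks down for Korean and Japanese.

```python
# Illustrative sketch: whitespace tokenization across languages.

english = "how is the weather today"
korean = "오늘 날씨가 어때"     # roughly "how is the weather today"
japanese = "今日の天気はどう"    # same meaning; written with no spaces

# English: splitting on whitespace yields clean word tokens.
print(english.split())  # ['how', 'is', 'the', 'weather', 'today']

# Korean: grammatical particles attach directly to nouns, so
# "날씨가" ("weather" + subject particle) is a single whitespace
# token. A naive lookup for the dictionary form "날씨" fails.
tokens = korean.split()
print("날씨" in tokens)                   # False
print(any("날씨" in t for t in tokens))   # True, but only via substring search

# Japanese: there is no whitespace at all, so split() returns the
# entire sentence as one "token".
print(len(japanese.split()))  # 1
```

Real systems for these languages therefore need dedicated morphological analyzers rather than the space-delimited tokenizers that English pipelines take for granted, which is exactly the extra work each country ends up doing on its own.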
I've noticed that this gap creates an issue when it comes to user satisfaction. The growth of Amazon Echo and Google Home shows that people in Western countries are actually adopting these new voice interfaces into their daily lives.
On the other hand, research shows that fewer than half of Korean smart-speaker users are satisfied with their voice agents, mainly because the agents can't understand users' commands, a shortcoming rooted in these technological gaps. It shows that non-English-speaking countries are still struggling to build their own language processing systems.
We have to admit that there's no crystal clear answer for this. The tech-industry inequality between the first world and the rest has existed, like, forever. What I'm trying to say is that a fancy dream of a future AI like Samantha will remain a fantasy for a few more years, especially for the rest of the world that doesn't speak English.

