Pryon founder Igor Jablokov at the Future Labs stage, The AI Summit. Images: Caroline Sinno Photography LLC

Voice AI for business opportunity

Pioneering companies are putting the best in voice tech to work, reimagining customer interactions, supporting human intelligence, and scaling products globally. Get insights on voice artificial intelligence from Igor Jablokov of Pryon, Accenture’s Laetitia Cailleteau, and Eugene Weinstein of Google.

Published Dec 20, 2018 · 5 min read

Voice-enabled technology has a growing presence in people’s everyday lives, with digital assistants like Alexa bringing artificial intelligence into homes and offices the world over.

These tools are continually getting better at turning out the right responses to users’ commands and questions. Making them into better conversationalists, however, is the next challenge for voice AI researchers — and as with its human counterpart, it’s not an easy one to resolve.

In part 3 of our series covering the Future Labs stage at The AI Summit, New York on December 5, we show how voice AI, led by some of the top minds in the field, is advancing to create new opportunities for companies.

The trouble with talking

Future Labs managing director Steven Kuyan kicked off the discussion with a look at the complex problem of AI-human conversation. Talking is hard, after all, because our natural interactions with others involve layers and levels of understanding, spanning language, conversation context, and word meaning.

Even with these variables factored in, most human-to-human conversations have endless possible outcomes, because our intents and topics change along the way.

“We take it for granted, but a number of steps have to happen for us to be able to have a normal conversation with somebody else,” Steven said. “So, the unpopular reality is that state-of-the-art conversational interfaces are still very far from having natural free-form conversation.”

That said, technologists are working to close the gap as fast as possible. Leading companies in the space are already putting the best in voice tech to work, applying it to reimagine customer interactions, support human intelligence, and scale products internationally.

Voice bots on the rise

Laetitia Cailleteau’s team at Accenture has seen 300+ conversational AI projects, ranging from small proofs of concept to large transformation programs for global enterprises.

That number includes text- or messaging-based projects like chatbots. But voice is on the uptick for call center interactions and other customer-care scenarios (with human assistance at the friction points). And as voice AI’s effectiveness improves, companies will start building their products around it.

Laetitia Cailleteau, managing director, UKI emerging tech & AI lead, and global virtual agent lead, Accenture

“Years ago, we went from designing for web first to designing for mobile first. Now you’ll need to design for voice first, and around context of use,” said Laetitia, managing director, UKI emerging tech & AI lead, and global virtual agent lead for Accenture.

To succeed with voice AI, Laetitia says companies shouldn’t expect to shift a text-based bot experience into spoken conversation. Voice reinvents the relationship between companies and customers — it demands new frameworks for interactions.

When those frameworks work, they make services more accessible.

One example is providing banking services by voice to elderly people or those without internet at home. As bank branches close, voice can replace the physical support these customers were accustomed to, without having to push for any behavior changes to adopt digital tools.

Business value with voice interfaces

Voice is also four times faster than any other input to a machine interface. But most of today’s seamless, attractive digital interfaces still aren’t ready for it, despite the many years it’s been on the horizon.

Igor Jablokov was one of the earliest AI leaders to put it there. Now the CEO of AI startup Pryon (currently being incubated at the Future Labs), he previously ran Yap — a speech recognition company that was later Amazon’s first AI acquisition and the nucleus for Alexa and other products. Before founding Yap in 2006, he worked at IBM, where his teams designed the precursor to Watson and developed the world’s first speech-enabled Web browser.

Igor Jablokov, CEO and founder, Pryon

Those early innovations were largely ignored because the infrastructure for voice tech didn’t exist back then. It’s still under development to this day, really; Pryon, for one, works with companies to make smarter use of voice technologies.

At The AI Summit, Igor shared how voice-enabled interfaces will help companies capture more information, apply it to AI, and use it to augment human intelligence. He predicts companies will ultimately use AI to incentivize actions that improve or extend customers’ lives — recommending healthier foods at the drive-through rather than just taking down a junk food order, for example.

Why? “Because if you’re going to be alive a few more decades longer, guess what?” he said. “Your lifetime value of a customer actually improves.”

Voice at an international scale

Since launching its first voice search tool in 2008, Google has scaled its Google Speech product line to a staggering 120 languages and locales.

Eugene Weinstein, senior staff software engineer at Google, gave the Future Labs audience a look at how the tech giant uses advanced data science and language-specific R&D to advance its speech recognition products at such a massive global scale.

Eugene Weinstein, senior software engineer, Google

Google uses grapheme modeling in its voice AI, for example, which learns the correspondence between the speech sounds and spelling patterns (or ‘orthography’) of a dialect. This lessens the manual work that would otherwise be involved in replicating human language-learning processes — which require understanding rule-based pronunciation standards — with every new language.

And grapheme models are just one piece of Google’s advanced ‘end-to-end modeling’ approach.

End-to-end models train a single neural network to transcribe speech directly into word sequences, drawing on a broader mix of language elements, including both acoustics and pronunciation. Google is leading innovation in this newly trending AI area.

“If you want to internationalize speech or voice products at scale, you have to invest in automation in research,” said Eugene. “We’re trying to break down traditional boundaries between the various types of speech models that we work with. End-to-end models are really promising in that direction.”

Read more about what to expect in AI over the next 12-24 months in our 6-part series on the Future Labs stage at The AI Summit, New York:

Part 1: 2019 outlook: AI research to real-world application
Part 2: Vision AI and ‘senseable’ autonomy
Part 4: Robotics AI is augmenting human intelligence

The Future Labs at NYU Tandon offer the businesses of tomorrow a network of innovation spaces and programs that support early stage startups in New York City.