Duplex — Google’s latest AI system

Originally published at www.brandminds.ro

Duplex is Google’s latest AI system. Its developers, Yaniv Leviathan (Google Duplex lead) and Matan Kalman (engineering manager on the project), say it can “help you get things done in the real world over the phone”.

At Google’s 2018 I/O developer conference, Sundar Pichai showed Google Duplex in action.

In one recording, Google Duplex calls a hair salon and books an appointment.

What is Google Duplex?

Google Duplex is an AI-powered technology within the Google Assistant.

The technology is directed towards completing specific tasks, such as scheduling certain types of appointments.

It can book reservations over the phone with various businesses: hair salons, restaurants, etc.

The Google engineers designed Duplex to carry out conversations in a human-like voice. The system makes the conversational experience as natural as possible, allowing people to speak normally, as they would to another person, without having to adapt to a machine.

How does Google Duplex work: the technology behind it

Google Duplex’s conversations sound natural thanks to advances in understanding, interacting, timing, and speaking.

The engineering team built a recurrent neural network (RNN) using TensorFlow Extended (TFX) to cope with the challenges of human speech.

The network uses the output of Google’s automatic speech recognition (ASR) technology, as well as features from the audio, the history of the conversation, the parameters of the conversation (e.g. the desired service for an appointment, or the current time of day) and more. The Google team trained Duplex for each task separately and used hyperparameter optimisation from TFX to further improve the model.
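To make the list of inputs above more concrete, here is a minimal Python sketch of how one conversational turn could be bundled into a single record. It is an illustration only, not Google's implementation: the names TurnInput and to_model_record, the field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TurnInput:
    """Hypothetical bundle of the inputs the article lists for one turn."""
    asr_text: str                       # text produced by speech recognition
    audio_features: List[float]         # illustrative values, e.g. pitch or energy
    history: List[str] = field(default_factory=list)          # earlier turns
    parameters: Dict[str, str] = field(default_factory=dict)  # task parameters


def to_model_record(turn: TurnInput, max_history: int = 3) -> Dict[str, object]:
    """Flatten one turn into a plain dict that a model-input pipeline could consume."""
    return {
        "text": turn.asr_text.lower().strip(),
        "audio": list(turn.audio_features),
        "context": turn.history[-max_history:],  # keep only the most recent turns
        "params": dict(sorted(turn.parameters.items())),
    }


turn = TurnInput(
    asr_text="  What time would you like? ",
    audio_features=[0.12, 0.55],
    history=["Hi, I'd like to book a haircut.", "Sure, for what day?", "Tuesday.", "OK."],
    parameters={"service": "haircut", "time_of_day": "morning"},
)
record = to_model_record(turn)
print(record["text"])
print(record["context"])
```

The point of the sketch is that recognised text is only one signal among several: the same sentence can call for a different reply depending on the earlier turns and on the service being booked.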

How does Duplex manage to sound so natural?

Duplex sounds natural thanks to a combination of a concatenative text-to-speech (TTS) engine and a synthesis TTS engine (using Tacotron and WaveNet), which together control intonation depending on the circumstance.

Duplex’s voice also sounds human-like because it incorporates speech disfluencies such as “hmm” and “uh”. In its user studies, Google found that conversations using these disfluencies sound more familiar and natural.
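As a rough illustration of the idea, the Python sketch below prefixes a filler word to a reply when the speaker is meant to sound as if it is thinking. The function add_disfluency and its parameters are hypothetical, and the source does not describe how Duplex actually decides where to place disfluencies.

```python
def add_disfluency(response: str, hesitant: bool, filler: str = "Hmm") -> str:
    """Prefix a filler word when the reply should sound like the speaker is thinking."""
    if not hesitant or not response:
        return response
    return f"{filler}... {response}"


print(add_disfluency("Let me check Tuesday.", hesitant=True))
print(add_disfluency("Hello!", hesitant=False))
```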

Response latency was also taken into consideration. Depending on the context, people expect a faster response (for example, after a simple “hello”) or a slower one (after a complex sentence).
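The latency idea can be sketched as a simple rule that picks a shorter delay for a brief greeting and a longer one for a longer sentence. This is an assumption-laden toy: the function target_latency_ms, the word-count heuristic, and the millisecond values are placeholders, not figures from Google.

```python
def target_latency_ms(utterance: str, fast_ms: int = 200, slow_ms: int = 900) -> int:
    """Pick a reply delay: quick for very short utterances, slower for longer ones.

    The two-word threshold and the default delays are illustrative placeholders.
    """
    word_count = len(utterance.split())
    return fast_ms if word_count <= 2 else slow_ms


print(target_latency_ms("Hello?"))
print(target_latency_ms("Do you have anything between ten and noon on Tuesday?"))
```

A real system would likely use richer signals than word count, but the principle is the same: the pause before answering is part of what makes a reply feel natural.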

The Google Duplex system is capable of carrying out sophisticated conversations and it completes the majority of its tasks fully autonomously, without human involvement. To train the system in a new domain, Google engineers used real-time supervised training. Much like a professor teaching a student, Duplex was supervised by instructors who monitored the conversations and were able to change the system’s behaviour in real time. This process allowed the system to learn and evolve until it met the desired quality level and it became fully autonomous.
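The supervision loop described above can be pictured with a short Python sketch: the system proposes a reply, an instructor may override it in real time, and each override is kept as a correction to learn from. The class SupervisedSession and its methods are hypothetical names for illustration, and the autonomy rate is an assumed stand-in for the "desired quality level".

```python
from typing import List, Optional, Tuple


class SupervisedSession:
    """Toy model of a call in which a human instructor can override the system."""

    def __init__(self) -> None:
        self.corrections: List[Tuple[str, str]] = []  # (what was heard, corrected reply)
        self.turns = 0

    def respond(self, heard: str, proposed: str, override: Optional[str] = None) -> str:
        """Return the instructor's reply if one is given, otherwise the system's own."""
        self.turns += 1
        if override is not None and override != proposed:
            self.corrections.append((heard, override))
            return override
        return proposed

    def autonomy_rate(self) -> float:
        """Share of turns handled without an instructor correction."""
        if self.turns == 0:
            return 0.0
        return 1.0 - len(self.corrections) / self.turns


session = SupervisedSession()
session.respond("Hello, how can I help?", "Hi, I'd like to book a haircut.")
session.respond("For what day?", "Tuesday, please.")
session.respond("We're closed Tuesday.", "Tuesday, please.", override="How about Wednesday?")
session.respond("Wednesday at ten works.", "Great, thank you.")
print(session.autonomy_rate())
```

In this picture, the system would be considered ready to run on its own once the share of uncorrected turns stays above a chosen threshold.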


Google Duplex usability

Duplex for businesses

  1. Assists customers with appointment booking
  2. Sends customer reminders
  3. Answers phone calls from customers asking for various information

Businesses can benefit from Duplex by allowing customers to book through the Google Assistant without having to change any day-to-day practices or train employees.

Using Duplex could also reduce no-shows to appointments by reminding customers about their upcoming appointments in a way that allows easy cancellation or rescheduling.

Duplex for users

  1. Saves users time by calling businesses and booking appointments on their behalf
  2. Assists hearing-impaired users
  3. Assists users who don’t speak the local language
  4. Assists users with conditions that impair their ability to communicate (autism, Lou Gehrig’s disease, social anxiety, deafness, etc.)

Conclusions

The future is upon us.

Long gone are the days when computers talked back to their human operators in a metallic cold voice.

With Duplex, Google has achieved the long-standing goal of human-computer interaction where the computer talks to the human as if it were another human being.

Soon users will be able to choose from six different voices, and later this year Google will also let users change the Assistant’s voice to none other than John Legend’s.

New features such as continued conversation, multiple actions, custom routines and “pretty please” are in the works and will be coming out later this year.

As with any new technology, people’s opinions about Duplex range from enthusiastic and supportive to skeptical and concerned.

Is Google Duplex ethical?

Some raise questions about ethics: we may be talking to a machine without knowing that the caller on the other end is not a person.

Is Google Duplex deceitful?

Duplex is so good at carrying on a basic conversation in a natural voice that our ears cannot tell us the caller is a machine rather than a real person. In effect, Duplex is impersonating a human being.

These are some of the questions about Duplex that Google may face in the near future.

Google Duplex is a digital tool.

It is our responsibility to use it as intended.

All we can do is take the good while limiting the bad.
