You Can Say That Again: The Role of Repetition in Conversation Design

Why repetition is at the heart of confirmation strategies in conversation design

James Giangola
Google Design

This post is part of “Words go here,” a series about the role of language design in product design.

New to the gym, I’ve got repetition on the brain lately — as in “reps,” the no-pain-no-gain kind. But being a conversation designer and linguist, I more often find myself thinking about repetition in language than at the lat pulldown machine.

Here’s an example of repetition in a mini-dialog:

A: Joe’s a waiter, but he paints on the side.

B: Oh, what kind of pictures does he paint?

A: Pictures?! He paints houses.

The verb “paint” is repeated twice, and the word “pictures” is repeated once. Repetition explicitly connects one sentence to another and helps the dialog “hang together” in a meaningful way. Discourse linguists would say that this is a good example of what’s called “cohesion.”

Cohesion happens when the interpretation of one item in a stretch of language makes reference to some other item in the same stretch. Cohesion devices — such as repetition, pronouns, and discourse markers — are words and phrases that connect parts of a spoken or written text so that ideas and details fit together more clearly. Because these devices connect parts of the text in a meaningful way, they help ensure the reliability of our communication and make comprehension easier.

Just as repetition plays a useful role in natural, everyday conversations, it’s also a valuable but often overlooked technique in conversation design. To get to the heart of why repetition is so important in conversation design, we first have to understand the problem of “false accepts” in speech recognition technology.

To begin with, a “correct accept” is when the user says something and the recognizer gets it right. In contrast, a false accept happens when the user says something, and the recognizer gets it wrong. For instance, the user has said “Round-trip ticket to Austin,” but the recognizer wrongly parsed the utterance as “Round-trip ticket to Boston.” Other examples of false accepts would be “June thirteenth” misrecognized as “June thirtieth” …or “forty dollars” misrecognized as “fourteen dollars.”

As speech recognition systems are becoming more and more accurate, false accepts are becoming less common. (Even so, it’s critical to design for them, for reasons we shall soon see.) Perhaps someday they’ll become a thing of the past, but in the meantime, since we don’t yet have “perfect” (well, as-good-as-human) speech recognition, how can we at least minimize the problem? Enter repetition!

Here’s an example of a mini dialog without the recognizer repeating what it “thinks” it heard:

User: What’s 11 times 3?

Assistant: The answer is 21.

This response is confusing, because most of us know that the answer isn’t 21. The design of this message makes it look like the Assistant can’t do basic arithmetic. And, if this were a more complicated math problem, we might just accept the answer as correct and be done with it.

Now here’s how the dialog would proceed if the recognizer were to repeat to the user what it “thought” it heard:

User: What’s 11 times 3?

Assistant: 7 times 3 is 21.

User: No, what’s 11 times 3?

Assistant: 11 times 3 is 33.

When the assistant repeats what it thinks the user said, it’s clear right off the bat that there was actually a speech recognition error (a false accept). Because the system’s “recognition hypothesis” is now transparent to the user, the assistant no longer appears to be bad at arithmetic (maybe just a little hard of hearing on occasion). And then the user knows to ask again.
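To make the difference concrete, here is a minimal Python sketch of the two response styles. It assumes the recognizer hands over its hypothesis as plain text; the function names and the tiny regular-expression parser are hypothetical, chosen only for illustration.

```python
import re

def parse_multiplication(hypothesis: str):
    """Pull the two operands out of text such as "What's 7 times 3?"."""
    match = re.search(r"(\d+)\s+times\s+(\d+)", hypothesis)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def bare_answer(hypothesis: str) -> str:
    """Answer without revealing what the recognizer heard."""
    a, b = parse_multiplication(hypothesis)
    return f"The answer is {a * b}."

def answer_with_echo(hypothesis: str) -> str:
    """Repeat the recognized operands so a false accept is visible."""
    a, b = parse_multiplication(hypothesis)
    return f"{a} times {b} is {a * b}."

# The user said "What's 11 times 3?", but the recognizer heard 7.
misheard = "What's 7 times 3?"
print(bare_answer(misheard))       # hides the recognition error
print(answer_with_echo(misheard))  # exposes the recognition error
```

Both functions compute the same product from the same hypothesis. Only the second one gives the user enough information to notice that the error was in the hearing, not in the arithmetic.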

When we repeat a key piece of information from the user’s request, that’s called “implicit confirmation.” This sort of repetition is useful in conversation design so that users know whether or not they’ve been understood correctly by the system. On one hand, when the recognizer gets it right, implicit confirmations offer reassurance. This reassurance gives users confidence in the recognition technology, especially if they’ve been burned by bad experiences with other, less accurate systems in the past. On the other hand, when the recognizer gets it wrong, implicit confirmations let the user know that they need to try again or to refine their request. That makes the design “forgiving.” That is, when mistakes happen, the design minimizes the consequences of the error.

Here are some more examples of implicit confirmations:

User: What’s the state motto of Georgia?

Assistant: Georgia’s motto is “Wisdom, Justice, Moderation.”

User: Who is the Prime Minister of India?

Assistant: The Prime Minister of India is Narendra Modi.

User: Call Amber Gómez.

Assistant: Calling Amber Gómez.

Especially if the user has no access to a screen — let’s say they’re driving — then hearing “Amber Gómez” lets them know they’re about to be connected to the right person. After all, think of the tongue-tying, reddening chagrin we feel when we realize we’re now conversing with the wrong contact. This reminds me of the store owner who inadvertently sent me a heartfelt text message apologizing for “the fight last night” — the one we never had — and hoping that I had calmed down. Oops, wrong contact! (And yes, I did respond.)

Of course, implicit confirmations don’t necessarily depend on verbatim word repetition. It’s also possible to substitute with a synonym or paraphrase:

User: Set an alarm for tomorrow morning at 5:30.

Assistant: Okay, 5:30 AM. Your alarm’s set.

…where “AM” is synonymous with “in the morning.”

There are some assistive situations in which implicit confirmation is unnecessary. For example, if you ask your digital assistant to turn on the flashlight, the light should simply go on; no confirmation is needed. Seeing the light turn on is a confirmation in itself, so hearing “Your flashlight is on” (an implicit confirmation) doesn’t add any value. (Perhaps the assistant can simply acknowledge the request with a brief “Alright” or “Okay.”) On the other hand, if the user is sitting in their living room and asks their assistant to turn the lights off in the bedroom, then an implicit confirmation would indeed be necessary so that the user knows what happened and where (“Okay, turning the lights off in the bedroom”).

Another consideration is whether or not the user will need to look at a screen. For example, if the user asks to see a map of San Francisco, an implicit confirmation of the location isn’t critical because the user is going to look at the screen anyway and then see what the system understood. On the other hand, if the user asks for the weather in San Francisco, we can’t assume that they’ll refer to the screen, so implicitly confirming their request is a good idea.
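These two considerations can be sketched as a small decision function. This is a simplification under stated assumptions: the function name, its two boolean parameters, and the returned labels are all hypothetical, and a real design would weigh more context than two flags.

```python
def spoken_response_style(outcome_is_perceivable: bool,
                          user_will_see_screen: bool) -> str:
    """Pick a response style for a request the assistant has just handled.

    outcome_is_perceivable: the user can directly see or hear the result
        (for example, a flashlight turning on in their hand).
    user_will_see_screen: the result is something the user will look at
        anyway (for example, a map they asked to see).
    """
    if outcome_is_perceivable or user_will_see_screen:
        # The result itself tells the user what was understood.
        return "acknowledge"
    # Otherwise, repeat the key information back to the user.
    return "implicit_confirmation"

# Flashlight in hand: a short "Okay" is enough.
print(spoken_response_style(True, False))
# Bedroom lights, asked from the living room: say what happened and where.
print(spoken_response_style(False, False))
```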

Repetition of what the user said is essential to another flavor of confirmation — explicit confirmation. Unlike implicit confirmations, explicit confirmations turn the floor over to the user, asking them to confirm with a “yes” or “no” response. Here’s an example:

User: I need a flight from L.A. to Miami on June 7th.

Assistant: Okay, that’s from Los Angeles to Miami on June 7th. Did I get that right?

User: Yes, you did.

…where the Assistant repeats back “Miami” and “June 7th” and replaces the user’s “L.A.” with “Los Angeles.”
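Here is one way the exchange above might be sketched in Python. Everything in it is an assumption made for illustration: the `CITY_NAMES` lookup table, the function names, and the short lists of “yes” and “no” words, which a real system would replace with a proper recognition grammar.

```python
# Hypothetical table for expanding abbreviations before repeating them.
CITY_NAMES = {"L.A.": "Los Angeles"}

YES_WORDS = {"yes", "yeah", "yep", "correct", "right"}
NO_WORDS = {"no", "nope", "wrong"}

def explicit_confirmation(origin: str, destination: str, date: str) -> str:
    """Repeat the recognized details, then hand the floor to the user."""
    origin = CITY_NAMES.get(origin, origin)
    destination = CITY_NAMES.get(destination, destination)
    return (f"Okay, that's from {origin} to {destination} on {date}. "
            "Did I get that right?")

def interpret_reply(reply: str):
    """Return True for yes, False for no, None if the reply is unclear."""
    words = reply.lower().replace(",", " ").replace(".", " ").split()
    if not words:
        return None
    if words[0] in YES_WORDS:
        return True
    if words[0] in NO_WORDS:
        return False
    return None

print(explicit_confirmation("L.A.", "Miami", "June 7th"))
print(interpret_reply("Yes, you did."))
```

Returning `None` for an unclear reply leaves room for the design to re-prompt rather than guess.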

Explicit confirmations are appropriate when:

  • The literal cost of an error would be high, e.g. selling 100 shares of Google stock
  • The application is about to take some action that is difficult or impossible to undo, e.g. canceling an order
  • The user has already invested time in providing the system with numerous pieces of information, e.g. making an airline reservation
  • There is a business or legal requirement to get an explicit verbal agreement from the user, e.g. before finalizing a money transfer
  • Confidence about the accuracy of a particular recognition result is low

The advantage of explicit confirmation is that it’s clear to users how to respond. Since users are asked a yes/no question in order to confirm, they’ll know how to get the dialog back on track in the event of a false accept.

However, the disadvantage of explicit confirmation is that the dialog feels slow-moving, especially when confirmations are numerous.

In contrast, implicit confirmations are appropriate when:

  • An undesirable outcome due to misrecognition is unlikely
  • Recognition confidence is high

The advantage of implicit confirmation is speed. When the system has gotten it right, the experience feels relatively fast. When the system has gotten it wrong, the user has only lost a few seconds and can simply repeat or restate their request.

One potential disadvantage of implicit confirmation is that users might not know what to say when they need to repair a false accept. In this case, the recognition grammar should be constructed to accommodate users’ attempts to recover from the error. This requires data on users’ verbal behavior, as well as iterative refinements (“tuning”) of the recognition grammar.
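As a rough illustration of accommodating repair attempts, the sketch below strips common correction openers (“No,” “I said,” and so on) from the recognized text so that the remainder can be handled as a fresh request. This is text post-processing rather than an actual recognition grammar, and the list of openers is an assumption; in practice it would come from data on what users really say.

```python
import re

# Hypothetical correction openers; a tuned system would derive these
# from observed user behavior.
CORRECTION_PREFIX = re.compile(
    r"^\s*(?:no|nope|actually|i said|i meant)\b[\s,.!]*",
    re.IGNORECASE,
)

def split_repair(utterance: str):
    """Return (is_repair, remainder) for a recognized utterance.

    Leading correction openers are removed one at a time, so
    "No, I said 11 times 3" is reduced to "11 times 3".
    """
    remainder = utterance
    is_repair = False
    while (match := CORRECTION_PREFIX.match(remainder)):
        is_repair = True
        remainder = remainder[match.end():]
    return is_repair, remainder

print(split_repair("No, what's 11 times 3?"))
print(split_repair("Nobody called today"))
```

The word-boundary check keeps ordinary words that merely begin with “no,” such as “nobody,” from being mistaken for a correction.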

When considering which type of confirmation to use, it’s important to take into account both recognition accuracy and the cost of the recognition error to your user.
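The criteria listed above for each type of confirmation can be sketched as a single decision function. This is a minimal sketch, not a definitive policy: the `RequestContext` fields, the function name, and especially the `LOW_CONFIDENCE` threshold of 0.6 are assumptions, and a real system would tune them against its own recognizer and its own users.

```python
from dataclasses import dataclass

# Assumed cutoff below which a recognition result counts as low confidence.
LOW_CONFIDENCE = 0.6

@dataclass
class RequestContext:
    confidence: float                 # recognizer confidence, 0.0 to 1.0
    high_cost: bool = False           # e.g. selling shares of stock
    hard_to_undo: bool = False        # e.g. canceling an order
    many_inputs: bool = False         # e.g. an airline reservation
    agreement_required: bool = False  # e.g. finalizing a money transfer

def choose_confirmation(ctx: RequestContext) -> str:
    """Return "explicit" when any criterion for explicit confirmation holds."""
    if (ctx.high_cost
            or ctx.hard_to_undo
            or ctx.many_inputs
            or ctx.agreement_required
            or ctx.confidence < LOW_CONFIDENCE):
        return "explicit"
    # Errors are cheap to repair and confidence is high: favor speed.
    return "implicit"

print(choose_confirmation(RequestContext(confidence=0.95)))
print(choose_confirmation(RequestContext(confidence=0.95, hard_to_undo=True)))
```

Note that a high confidence score alone does not select implicit confirmation here; a costly or irreversible action still gets a yes/no question.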

To sum up, repetition of what the user has just said is at the heart of confirmation strategies in conversation design. These strategies, in turn, are fundamental in making the design forgiving. That is, confirmations can minimize the consequences of false-accept errors.

Confirmation strategies also build user trust. If the user knows the assistant will never act on a recognition result without somehow double-checking first, they’ll be more likely to trust the technology and feel in control of what’s happening.

For more tips, check out Google’s Conversation Design best practices.

Special thanks to Eunice Liu for assisting with content.
