Best practices: conversation design

An introduction to IBM Research’s Natural Conversation Framework

James Walsh
IBM watsonx Assistant
Apr 3, 2023


All images copyright IBM Corp. 2023

Why we need a design framework for conversation

The first generation of websites didn’t provide a user experience that was either pleasant or efficient. It took decades for graphic designers to fully codify the design principles that provide the elegant user experiences that we take for granted today.

Chatbot and virtual agent authors find themselves in a similar situation today. They understand their business use cases and technical requirements, but they haven’t studied how natural conversation works, and that knowledge gap manifests in unnatural experiences for users.

That last point is crucial: natural language does not equal natural conversation, and the recent, admittedly impressive advances in automated text generation can’t close the knowledge gap about the principles of conversation.

Bringing the field of conversation design forward will require a new vocabulary, new principles, and new patterns. Ultimately, it will require the development of a new discipline: Conversational User Experience Design.

To take the lead on design, the practitioners in this new field, conversation designers, must be the experts on how natural conversation works. They will apply formal knowledge of conversation to the creation of virtual agents instead of relying on intuition and subjective experience. And they will master the application of the interaction patterns that are essential to natural conversation.

This field will represent the marriage of enterprise design principles with Conversation Analysis (CA), a mature subdiscipline of Sociology. Dr. Robert Moore of IBM Research developed the Natural Conversation Framework (NCF) to help codify the principles that underlie this new discipline.

Moore, Robert J. and Raphael Arar. 2019. Conversational UX Design: A Practitioner’s Guide to the Natural Conversation Framework. Association for Computing Machinery, New York. DOI: 10.1145/3304087.

The NCF provides reusable interaction patterns rooted in observational science. In addition to patterns and reusable code, the NCF provides designers with the necessary knowledge of the anatomy of conversations to speak with authority on the subject to the rest of their development team.

Elements of conversation

What we call conversations are the top level of a four-part structure:

  1. Utterances
  2. Sequences
  3. Activities
  4. Conversations

Utterances are the basic units of a conversation:

  • Complete sentences: “Can you recommend a good Indian restaurant?”
  • Phrases: “How about coffee shops?”
  • Single words: “Maybe.”

Speakers take turns exchanging utterances with each other. Organizing the conversation by turns helps ensure that the parties are able to receive and understand each other.

Sequences are two or more utterances between speakers that are exchanged in order to accomplish an action:

User: Can you recommend a good Indian restaurant?
Agent: Madasrahaba on East 6th has a 4.9-star customer rating!

Activities are two or more sequences that share a common goal:

User: Can you recommend a good Indian restaurant?
Agent: Madasrahaba on East 6th has a 4.9-star customer rating!
User: Do they deliver?
Agent: No, but you can order ahead at togo.com/Madasrahaba

Conversations in this context are defined as two or more activities, usually bracketed by an opening and closing with at least one activity in between:

Agent: Hello! How can I help you today?
User: Can you recommend a good Indian restaurant?
Agent: Madasrahaba on East 6th has a 4.9-star customer rating!
User: Do they deliver?
Agent: No, but you can order ahead at togo.com/Madasrahaba
Agent: (Pause)
Agent: Is there anything else I can help you with?
User: No thanks.
Agent: Alright, have a great rest of your day.

Conversations, viewed through the lens of CA, are not a series of disconnected, back-and-forth utterances. They have a structure that can be abstracted into design principles and working code.
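To make the four-part structure concrete, here is a minimal Python sketch of how the levels nest. The class names, the `goal` field, and the `transcript` helper are illustrative assumptions, not part of the NCF or its published code. Only the "two or more utterances" rule for sequences is enforced; the minimum counts for activities and conversations are left unchecked so that a short closing can be represented.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str  # "User" or "Agent"
    text: str


@dataclass
class Sequence:
    # Two or more utterances exchanged to accomplish an action.
    utterances: list[Utterance]

    def __post_init__(self) -> None:
        if len(self.utterances) < 2:
            raise ValueError("a sequence needs at least two utterances")


@dataclass
class Activity:
    # Sequences that share a common goal (the `goal` label is hypothetical).
    goal: str
    sequences: list[Sequence]


@dataclass
class Conversation:
    activities: list[Activity]

    def transcript(self) -> list[str]:
        # Flatten the nested structure back into turn-by-turn lines.
        return [
            f"{u.speaker}: {u.text}"
            for activity in self.activities
            for sequence in activity.sequences
            for u in sequence.utterances
        ]


recommend = Sequence([
    Utterance("User", "Can you recommend a good Indian restaurant?"),
    Utterance("Agent", "Madasrahaba on East 6th has a 4.9-star customer rating!"),
])
delivery = Sequence([
    Utterance("User", "Do they deliver?"),
    Utterance("Agent", "No, but you can order ahead at togo.com/Madasrahaba"),
])
closing = Sequence([
    Utterance("Agent", "Is there anything else I can help you with?"),
    Utterance("User", "No thanks."),
    Utterance("Agent", "Alright, have a great rest of your day."),
])

conversation = Conversation([
    Activity("find a restaurant", [recommend, delivery]),
    Activity("close the conversation", [closing]),
])

for line in conversation.transcript():
    print(line)
```

The point of the nesting is that each level can be designed and reused on its own: a sequence can be tested in isolation, and an activity can be dropped into a different conversation.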

How utterances are designed

The first principle of effective utterance design is efficiency. Efficient utterance design is governed by three rules:

  1. Utterances can be understood by the recipient
  2. The minimum number of words needed to satisfy rule 1 is used
  3. If the recipient cannot understand the utterance, rule 2 is suspended until understanding is repaired through rephrasing the problematic utterance

Here’s an example:

1 User: I want to watch a movie related to current events.
2 Agent: You should watch an AI movie!
3 User: What does that mean?
4 Agent: You should watch a movie where Artificial Intelligence plays a key role.
5 User: Ah! What are some titles you can recommend?
6 Agent: Have you seen Interstellar?
7 User: I didn't know that movie featured AI!
8 Agent: It does! TARS is the ship's AI assistant, and he's real funny.

This example shows the efficiency gained through utterance design. The vast majority of users would understand the short version of the movie recommendation (line 2), and only the few who had trouble understanding it would get the longer version (line 4).

Utterances, in the context of conversational UX, are designed so the agent can speed the user toward his or her goal, expanding in length and complexity only when the user needs it. This efficiency at the utterance level contributes to efficiency at the conversation level.
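The three rules can be sketched in a few lines of Python. This is a rough illustration, not the NCF's implementation: the `DesignedUtterance` class, the `agent_turn` function, and the list of repair phrases are all assumptions made for the example, and a real agent would detect repair requests with something more robust than exact string matching.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: phrases that signal the user did not understand the last turn.
REPAIR_REQUESTS = {"what does that mean?", "what do you mean?", "huh?"}


@dataclass(frozen=True)
class DesignedUtterance:
    concise: str   # rule 2: the shortest wording most users will understand
    expanded: str  # rule 3: the rephrasing offered only when repair is needed


def is_repair_request(user_text: str) -> bool:
    return user_text.strip().lower() in REPAIR_REQUESTS


def agent_turn(
    user_text: str,
    previous: Optional[DesignedUtterance],
    planned: DesignedUtterance,
) -> str:
    """Pick the agent's next utterance.

    If the user asks for repair, rephrase the previous utterance in its
    expanded form. Otherwise deliver the planned utterance concisely.
    """
    if previous is not None and is_repair_request(user_text):
        return previous.expanded
    return planned.concise


recommendation = DesignedUtterance(
    concise="You should watch an AI movie!",
    expanded="You should watch a movie where Artificial Intelligence plays a key role.",
)
title = DesignedUtterance(
    concise="Have you seen Interstellar?",
    expanded="Have you seen the movie Interstellar? It features an AI assistant.",
)

first = agent_turn(
    "I want to watch a movie related to current events.", None, recommendation
)
second = agent_turn("What does that mean?", recommendation, title)
print(first)
print(second)
```

Every user gets the short wording first, and only a user who signals trouble pays the cost of the longer one.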

How conversations are designed

For conversational UX designers to create an experience that’s as effective as a natural conversation using words only, they need to master the shape of conversations. To shape your conversation, you combine different activities.

Think of it like a hamburger:

  • The meat is your main activities, such as answering or asking questions, fulfilling complex requests, or giving instructions
  • The condiments and fixings are supporting sequences, like asking for clarification and saying ‘okay’ or ‘thank you’
  • The buns are the opening and closing activities that hold the whole thing together, like saying ‘hello’, ‘how can I help you’, ‘anything else’, and ‘goodbye’
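One way to picture the hamburger is as a check on the order of a conversation's parts. The Python sketch below is only an illustration: the `Role` labels and the `has_hamburger_shape` function are hypothetical names, and the rule it encodes (an opening first, a closing last, at least one main activity in between) describes the usual shape rather than a strict requirement.

```python
from enum import Enum


class Role(Enum):
    OPENING = "opening"        # top bun
    MAIN = "main"              # the meat
    SUPPORTING = "supporting"  # condiments and fixings
    CLOSING = "closing"        # bottom bun


def has_hamburger_shape(roles: list[Role]) -> bool:
    """Return True if the parts form opening, filling, closing.

    The filling must contain at least one main activity and may also
    contain supporting sequences, but no extra openings or closings.
    """
    if len(roles) < 3:
        return False
    if roles[0] is not Role.OPENING or roles[-1] is not Role.CLOSING:
        return False
    filling = roles[1:-1]
    only_filling = all(r in (Role.MAIN, Role.SUPPORTING) for r in filling)
    return only_filling and Role.MAIN in filling


restaurant_chat = [Role.OPENING, Role.MAIN, Role.SUPPORTING, Role.CLOSING]
no_goodbye = [Role.OPENING, Role.MAIN]

print(has_hamburger_shape(restaurant_chat))  # True
print(has_hamburger_shape(no_goodbye))       # False
```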

The following examples illustrate some of the patterns and combinations of activities employed in the NCF.

Open request

The open request module enables you to handle complex requests and sets of related requests.

Extended telling

Extended tellings consist of multiple parts, such as stories, instructions, or narratives.

Open request with extended telling

The agent transitions from fielding a request to walking the user through a set of instructions.
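As a rough sketch of how the two patterns might combine, the Python below fields a request and then hands off to a multi-part telling. Everything here is an assumption made for illustration: the `ExtendedTelling` class, the `field_request` function, the acknowledgment phrases, the rule that the agent advances only after an acknowledgment, and the ordering steps themselves are invented and are not taken from the NCF.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: replies that count as the user acknowledging a part.
ACKNOWLEDGMENTS = {"ok", "okay", "got it"}

# Hypothetical instructions, invented for this example.
ORDER_AHEAD_STEPS = [
    "First, go to togo.com/Madasrahaba.",
    "Next, pick your dishes.",
    "Finally, choose a pickup time and pay.",
]


@dataclass
class ExtendedTelling:
    parts: list[str]   # assumed non-empty
    position: int = 0  # index of the part currently being delivered

    def start(self) -> str:
        self.position = 0
        return self.parts[0]

    def continue_with(self, user_text: str) -> Optional[str]:
        """Advance on an acknowledgment, otherwise repeat the current part.

        Returns None once every part has been delivered and acknowledged.
        """
        if user_text.strip().lower() in ACKNOWLEDGMENTS:
            self.position += 1
        if self.position >= len(self.parts):
            return None
        return self.parts[self.position]


def field_request(user_text: str) -> Optional[ExtendedTelling]:
    # Open request: map what the user asked for to a telling, if one exists.
    if "order ahead" in user_text.lower():
        return ExtendedTelling(ORDER_AHEAD_STEPS)
    return None


telling = field_request("How do I order ahead?")
if telling is not None:
    print("Agent:", telling.start())
    for reply in ["okay", "what?", "ok", "got it"]:
        print("User:", reply)
        part = telling.continue_with(reply)
        if part is None:
            break
        print("Agent:", part)
```

Delivering one part per turn and waiting for the user keeps each utterance short, which is the same efficiency principle applied to a longer stretch of talk.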

The patterns can and should interact with each other to form an effective, efficient experience for the participants. The pattern modules used in the NCF aren’t a mere catalogue of patterns but rather form a pattern language that can be applied to a wide variety of use cases.

What’s coming

As the use cases outlined above illustrate, the NCF has clear applicability to business use cases and is an ideal design template for building conversational flows for virtual agents.

The development of that template is underway: Dr. Moore and his team from IBM Research are currently developing a Conversation Design System in collaboration with experts at IBM Customer Care and the IBM Design Guild. The system will include reusable interaction components and working code that conversation designers will be able to upload and use in development environments.

That’s coming later this year. In the meantime, IBM Research is experimenting with applying the NCF via prompt engineering with Large Language Models, and the IBM Watson Assistant design team is developing visualization tooling that will help authors put the NCF into practice when building actions.

Learn more

Many thanks to Dr. Robert Moore for his guidance in producing this article.
