Prototyping for Conversational Interfaces

An Approach to Designing and Prototyping Conversational Interfaces

Steffen Blümm
Published in adorsys · Mar 18, 2019


In my last article I documented how we developed our approach to prototyping Voice-UI applications at adorsys. Prototyping is only helpful if it is embedded in a process that takes your idea and deepens your understanding of its potential for actual users. Hence, we thought about the prototyping process as a whole.

How did we arrive at this point?

Of course, you start by identifying potential users and gathering information about them. Let us assume we have already done a first iteration of user requirements engineering and worked out personas, storyboards and a couple of user stories. How do we arrive at the point at which we can invite potential users to participate in usability tests?

When doing a paper prototype to evaluate a GUI design, we want to know whether the user can grasp the structure of the interface and the navigation, and whether she can find central UI elements, such as the search functionality. In theory, this can be done with hardly any data at all. The basic operation mode of a GUI is point-and-click or tap, and we are perfectly able to point-and-click or tap on UI elements surrounded by ‘lorem ipsum’.

With a conversational interface this is not sufficient as a starting point, because you cannot build a conversation on such a sparse data set. We need data as the foundation for the scenarios and as the centre of the conversation. Additionally, the data constrains the conversation and thus the testing scenarios.

The Prototyping Cycle

We start from the facts and impressions gained during user requirements engineering, and from the first completed design tasks (e.g. personas).

Prototyping Cycle for Conversational Interfaces

(1) The Prototyping Cycle starts with formulating assumptions based on the data collected during user requirements engineering. We also need to prepare plausible data to support the user stories.
For a banking scenario, for example, we should know what our personas earn, what they spend their money on, whether they are saving towards specific targets, how much money they have at their disposal every month, etc. We need to design their financial lives plausibly and sketch their spending in coarse terms (an example of such data follows below).
Based on this data we think about the users’ concerns and behaviours and how they might interact with our conversational application. What are the questions they want our service to answer? How would they like the data to be segmented for presentation?
In some of our projects we found it difficult to start writing dialogue scripts (or sample dialogues) from the classical personas alone. We needed to specify the personas more concisely and condense the information gathered during research even further. To achieve this, we built semantic word clouds and categorized the personas within ranges of opposing aims, needs, or characterizing statements. Where necessary, we worked on a tailored ontology and information architecture. The result of this step is a handful of drafted scenarios.
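
To make this concrete, here is a minimal sketch of how such plausible persona data could be captured. The structure, field names and all values are illustrative assumptions for this article, not data from our actual projects.

```python
import json

# Illustrative (made-up) financial life of one banking persona.
persona = {
    "name": "Anna",
    "age": 29,
    "net_income_per_month": 2400,          # EUR
    "fixed_spending": {                    # coarse categories
        "rent": 780,
        "insurance": 120,
        "transport": 90,
    },
    "variable_spending": {
        "groceries": 320,
        "leisure": 180,
    },
    "savings_targets": [
        {"label": "trip to Japan", "target": 3500, "saved": 1200},
    ],
}

# Money left at her disposal each month, derived from the data above.
disposable = (
    persona["net_income_per_month"]
    - sum(persona["fixed_spending"].values())
    - sum(persona["variable_spending"].values())
)
print(f"{persona['name']} has about {disposable} EUR at her disposal per month.")
print(json.dumps(persona, indent=2))
```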

(2) In the next step we design the conversations for the scenarios. This is all about externalisation. We write the dialogues. We read them aloud, discuss them, perform them. We draft conversations which match the persona and structure the scenario. We model the flow of the conversation, the utterances we expect to hear from the user and the answers from the virtual assistant we are designing.
Furthermore, we also take back-channeling and discourse markers (1) into consideration, for example a brief “mm-hm” while the user is still speaking, or openers such as “Alright, …”. When writing a dialogue this might feel a bit strange at first, but we really need to consider it and use it as a design tool for the assistant’s character.
To support these tasks there are some exercises we perform to check the conversations we draft. We can easily and quickly check whether the scenarios and the data are plausible by initiating short conversations between colleagues: just take a ball and try some dialogue moves (a conversation turn-taking ball game).

laying out the conversational moves and stepping back to explore different paths

Another exercise which is also very insightful — and it involves literally stepping back and looking at the dialogue — is to lay out conversational moves on a whiteboard (or wall). At our office we have two whiteboard walls which are ideal for this exercise.
We think about the situation (or dialogue strategy) (yellow sticky-notes), what the user would or could say (written above the sticky-notes, in green) and how the assistant responds in return (orange sticky-notes for visualisation / UI elements; the spoken message written in red). [This is a technique we borrowed from an Amazon webinar on Situational Design and adapted to our needs.]
We have used this technique in-house as well as in a couple of workshops. It is a great starting point for discussing conversational moves and dialogue turns, and since we always apply it in groups, it also acts as a communication facilitator. The result of this step is a script of the conversation for the scenario.
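
As a purely illustrative way of capturing such a whiteboard layout digitally after the session, each move could be noted with its situation, the possible user utterances, and the assistant’s response (spoken message plus any visual element). The structure below is our own sketch for this article, not a prescribed format.

```python
# Illustrative transcription of one "sticky-note" move from the whiteboard.
# Keys mirror the colour coding described above; the format is an assumption.
move = {
    "situation": "user wants an overview of this month's spending",   # yellow
    "user_says": [                                                    # green
        "What did I spend this month?",
        "Show me my spending for March.",
    ],
    "assistant": {
        "visual": "bar chart of spending per category",               # orange
        "message": "Alright, here is your spending for March, grouped by category.",  # red
    },
}
```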

(3) As our low-fidelity usability testing strategy is based on the Wizard of Oz technique, in the subsequent step we need to render the sentences the assistant will say into audio files for our testing sessions. At the end of this step we have the scripted conversation and a large number of audio files.
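
On macOS this rendering can be done with the built-in text-to-speech. The following sketch shows the general idea only; the voice name, file names and phrases are illustrative, and this is not the actual code of our prt.

```python
import subprocess
from pathlib import Path

# Phrases the assistant will speak during the test (illustrative examples).
phrases = {
    "greeting": "Hi Anna, nice to see you again.",
    "balance": "You have 910 euros left at your disposal this month.",
}

out_dir = Path("audio/anna/scenario-01")
out_dir.mkdir(parents=True, exist_ok=True)

for key, text in phrases.items():
    # macOS `say` renders text to an audio file; `-v` selects the voice.
    subprocess.run(
        ["say", "-v", "Samantha", "-o", str(out_dir / f"{key}.aiff"), text],
        check=True,
    )
```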

(4) Finally, in the usability test, we validate our assumptions based on the conversation script and the audio files (see our last article). After the test, we have the user provide feedback (discussion, questionnaire).

From the user tests we gain feedback that helps us rethink and refine the data and our scenarios. With each iteration of this cycle we can refine the scenarios (and the data) further, up to the point where we feel confident enough to start implementing them.

Yes, and … the Prototyping Tool-Chain

In the last article we emphasized how important it is to be able to make changes to the scenarios and conversations quickly and effortlessly. Having to render the sentences from step (3) into audio files manually would be quite tedious and time-consuming. Furthermore, especially for complex scenarios, a tool can definitely support the testing team. Therefore, we thought about how to support this workflow with tools.

draft of our tool-chain to support the prototyping cycle

As we needed this tool-chain for an ongoing project, we started to implement the tools in reverse order. At the moment we have prototypes of the cpt and the prt. Instead of using a dedicated tool for conversation design, we write the unfolding dialogue as a JSON file. This is not the most convenient solution, but as digital creators it is not too challenging for us either.
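
We have not published the actual file format, so purely as an illustration, such a dialogue file could capture the scenario as a list of turns along the following lines. All field names here are invented for this sketch.

```python
import json

# Invented structure for illustration: a scenario as a list of dialogue turns,
# each with the expected user utterance(s) and the assistant's reply.
dialogue = {
    "persona": "anna",
    "scenario": "monthly-budget",
    "turns": [
        {
            "id": "greeting",
            "user": ["Hi", "Hello"],
            "assistant": "Hi Anna, nice to see you again.",
        },
        {
            "id": "balance",
            "user": ["How much money do I have left this month?"],
            "assistant": "You have 910 euros left at your disposal this month.",
        },
    ],
}

with open("monthly-budget.json", "w") as f:
    json.dump(dialogue, f, indent=2, ensure_ascii=False)
```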

prt

This file is read by the phrase render tool (cui-prt). The prt uses the text-to-speech technology of macOS to render the assistant’s phrases into audio files. As personalisation and non-repetition are important for natural dialogue, we conceived a notation that allows us to inject the user’s name (or sub-sentences) into the phrases with a certain probability.
The prt creates an updated script (JSON) and the audio files. These are stored in a folder structure organised by project, persona, test person and scenario, which makes it easy to work with them and to archive them.
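
The notation itself is specific to our tool, so purely to illustrate the idea: a placeholder such as `{name|0.6}` could be expanded before rendering, inserting the test person’s name in roughly 60% of the generated variants. The syntax and code below are assumptions for this sketch, not the prt’s actual implementation.

```python
import random
import re

def expand(phrase: str, name: str) -> str:
    """Replace `{name|p}` placeholders with the user's name with probability p.

    The placeholder syntax is invented for this sketch; the prt's real
    notation may look different.
    """
    def repl(match: re.Match) -> str:
        probability = float(match.group(1))
        return f" {name}" if random.random() < probability else ""

    return re.sub(r"\s*\{name\|([0-9.]+)\}", repl, phrase)

# The same scripted phrase yields slightly different renderings per test run.
for _ in range(3):
    print(expand("Hi{name|0.6}, nice to see you again.", "Anna"))
```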

For convenience, the prt also offers functionality to pre-listen to sentences. This allows us to check the pronunciation of phrases while writing the script of the conversation.

cpt

The conversation prototyping tool (cui-cpt) allows us to carry out Wizard of Oz-based usability tests. The UI for the operator is automatically constructed based on the scripts (JSON) and allows the operator to trigger playback of the audio files. We will discuss the cpt in a separate blog-post.
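
To give a rough idea of the principle (and not the cpt’s actual implementation), a minimal Wizard of Oz operator console could load the script and play the pre-rendered audio on demand, for example via macOS’s afplay. Paths and the file format follow the earlier sketches.

```python
import json
import subprocess
from pathlib import Path

AUDIO_DIR = Path("audio/anna/scenario-01")  # illustrative path

# Load the scripted dialogue (same invented format as above).
with open("monthly-budget.json") as f:
    turns = json.load(f)["turns"]

# Very small operator loop: the wizard picks the assistant's next reply
# and the corresponding audio file is played back to the test person.
while True:
    for i, turn in enumerate(turns):
        print(f"[{i}] {turn['assistant']}")
    choice = input("Play turn number (or 'q' to quit): ").strip()
    if choice == "q":
        break
    audio_file = AUDIO_DIR / f"{turns[int(choice)]['id']}.aiff"
    subprocess.run(["afplay", str(audio_file)], check=True)  # macOS audio playback
```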

This is the concept for our Prototyping Tool-Chain — for low-fidelity to mid-fidelity prototyping, we can use the cpt. For mid-fidelity to high-fidelity prototyping, we can replace the cpt with actual artefacts (e.g. implementations of the virtual assistant running on a target platform).

Changes to the scenarios and the data can be made quickly and inexpensively: we only need to update our dialogue model (the JSON file), load it into the prt again, re-generate the audio files, and load the new script together with the new audio files into the cpt.
Especially for the first tests in a round of usability testing, it quite often happens that we adjust some of the assistant’s phrases from one test to the next, to make the conversation more precise or more natural. Or we add a phrase for a related topic we did not have on the radar during conversation design. These are small changes which pop up during tests and which are quick to make, and this is where the tool-chain really pays off.

At adorsys we have already run quite a few tests with our tools, some of them at short notice. We think it is also thanks to these tools that, despite tight schedules, we still had insightful usability tests.

(1) Related to the topic of back-channeling: see also the book ‘How We Talk’ by N. J. Enfield.
