What do Voice Interface Experts Think About Design Guidelines?

Krishika H. Khemani
Voice Interfaces and HCI
Feb 2, 2021

This article is in collaboration with Stuart Reeves, Assistant Professor in Computer Science at the University of Nottingham.

Photo by Thomas Kolnowski on Unsplash

Recently we’ve seen a big growth in interest in Voice User Interfaces (VUI), conversational AI and conversational design across the board. Academia, the tech industry, and many other organisations seem excited by their promise. Major corporations like Amazon, Google, Microsoft, Apple and IBM have established significant platforms to support and deliver VUI services.

With this has come a proliferation of VUI design guidelines, both from major platform vendors (Amazon, Google, etc.) and from wider industry (Cathy Pearl’s book, VUI adaptation of Nielsen’s heuristics), which aim to help voice interface designers build voice applications both in general and for specific platforms. These are clearly a similar endeavour to existing interaction design approaches, from Nielsen’s usability heuristics to Apple’s Human Interface Guidelines. While these classic GUI interaction design principles, guidelines and heuristics have a mature place in practice, we wondered if VUI guidelines were likely to follow this route.

So, we set out to investigate what designers actually thought about them, how they used them, and what future they saw for VUI guidelines. To this end we interviewed nine VUI industry professionals with a range of experience, from VUI conversational design through to VUI software and architecture development. Our participants had between 1 and 10 years of experience in the industry, with some also having a strong understanding of Interactive Voice Response (IVR) systems.

What are guidelines anyway?

Although we used the term “design guidelines” in interviews, several connected terms naturally came up, including “best practices” and “heuristics”. Some interviewees made a distinction between these, while others used them interchangeably.

Setting that to one side, we noticed a clear debate going on between participants about two ways of making sense of VUI guidelines. One way was seeing them more like instructions or even checklists, as recommendations to designers about specific ways to go about design, e.g., “Keep lists under four items.” Another way was seeing VUI guidelines as embodying and communicating broad design ‘values’, e.g., “keep it simple”, offering grounding in the general ‘characteristics’ of good VUI design.

Two takeaways stand out. Most obviously, there was a clear lack of agreement. Secondly, it seems likely that there’s a lot of spillover from existing debates in UX and interaction design, and that this shapes discussions of VUI guidelines: VUI-specific guidelines and platforms are simply not starting with a clean slate when presented to VUI designers.

How do VUI design guidelines get used?

We noted a clear correlation between how VUI professionals actually used VUI guidelines in practice, and how much experience they had. One noted (perhaps wistfully?):

“In those early days, I read a lot of those design guides and kind of treated them as a set of rules.”

Broadly we identified different stages of professional development for VUI designers in their relationship with VUI guidelines.

Participants who had only a year or so of experience tended to rely more closely on design guidelines, which in turn sensitised them to important features of conversation that may play a role in design. As one reflected:

“Google’s conversations and Alexa’s conversation design have given me a foundation for understanding how good conversation is.”

In that way design guidelines do play a role as a supportive tool to help develop an initial pathway into VUI design.

But for more experienced VUI professionals, guidelines simply seemed less important, and some reported essentially ‘internalising’ them. Instead we found a repeated emphasis on the process of VUI design, within which design guidelines might feature. We also found that some more experienced interviewees described the importance of developing their own VUI guidelines, which are then used in combination with platform-specific ones.

Overall, the picture looked like a pick-n-mix approach to the use of VUI guidelines alongside some custom tooling, although guidelines do have an influence in ‘path setting’. So, while we can find fault with design guidelines for many reasons, they do help in orienting new designers, regardless of their background. Further, the role they play for the more experienced is more complex than those developing design guidelines may realise, and VUI guideline creators should bear these different forms of use in mind when building them.

What problems are there with VUI design guidelines?

Various shortcomings of existing guidelines were also raised. Most interviewees argued current guidelines lacked focus on linguistics (e.g., psycholinguistics, sociolinguistics), while other key areas that needed development were guidelines for better management of turn taking, better understanding of how to maintain context, how to deal with repair, and changes of topics. It was also felt current guidelines do not offer as much help as they could regarding matters of inclusivity.

Some participants also claimed that design guides are “oversimplified” (especially when presented as a set of “dos and don’ts”) and that they often lack sufficient situation-based examples. Consequently, one might know what the principle is but struggle to apply it in the right situation, potentially making guidelines feel restrictive.

Design guidelines were of course not the be-all and end-all. Some interviewees attached importance to developing a technical understanding of the way voice services work (e.g., automatic speech recognition (ASR), natural language processing (NLP), etc.). We felt this mirrored long-standing and pretty vigorous debates in UX and design about whether designers should know how to code or not. At the same time, some interviewees felt the lack of control over the underlying ASR and NLP technologies within major voice platforms rendered this moot.

Should VUI design guidelines be standardised?

On the surface, interviewees tended to suggest that a lot of VUI guidelines, principles, and heuristics were broadly similar. It would thus stand to reason that there is a possibility such guidelines could be harmonised, unified in some way and in a sense ‘standardised’. We note there has been active discussion about this within conversational user interface research communities. Further, Amazon recently (2020) published the first version of its Multi-Agent Design Guide under the Voice Interoperability Initiative (VII). The purpose of this is to deliver multiple voice services (termed ‘agents’) offered by different companies (e.g., Accenture, Microsoft, Facebook, and many more) on the same device — some attempt at platform standardisation (notably, Google and Apple are not a part of this initiative).

Yet interviewees presented a number of different responses to this question about the significance of bringing together VUI guidelines.

Some felt standardisation, while potentially desirable, was simply premature and dependent on future technical developments. Or, in other words, ‘we aren’t there yet’.

But other interviewees were actually hesitant about the idea of standardisation, as they felt the nature of human language resisted standardisation processes, putting it slightly at odds with ideas about the importance of consistency and convention in broader interaction design. One participant suggested that standardising could be helpful as it would support a common design approach across platforms, but at the same time felt that reducing variation to a fixed set of concepts and terminology might in fact lead to problematic restrictions for VUI design. Building on this, other interviewees pointed out that although a lack of standardisation makes it hard for new designers to understand and map meanings across different VUI guidelines, the differences in terminology might themselves be doing something useful.

A key source of friction against standardisation was related to platforms. Interviewees felt platforms were somewhat destined to get in the way of standardisation attempts, and with good reason: different platforms have different capabilities. This means that VUI guidelines for specific platforms aren’t necessarily just about pushing a particular brand or product identity, but more substantively are also deeply wedded to the underlying technical capabilities and offering of that platform.

One interviewee suggested a quite different approach to thinking about standardisation, rather than the ‘all or nothing’ view. In this view a limited set of specific aspects of VUI design could benefit from standardisation such as audio, volume, clarity, or comprehension (analogous with visual aspects of colour, contrast and so on). Furthermore, these features of VUIs are more feasible to normalise as they are not platform-dependent.

Takeaways

Our interviews indicated wide disagreement over the meanings of “VUI design guidelines”. This might be inherited from inconsistencies in language stemming from prior interaction design “guidelines”, “rules of thumb”, “principles”, “frameworks”, and so on. This language needs clearing up because it is potentially creating confusion, hindering VUI design and community cohesion.

Our interviews indicated that VUI professionals have quite a complex relationship to the emergence of VUI design guidelines. While useful initially, guidelines may end up simply being superseded by practical experience. This is not necessarily a bad thing and may be the ultimate purpose of VUI guidelines. Taking an explicit position on this issue up front may better support emerging VUI design career pipelines.

Finally, it was clear that even though participants seemed to indicate similarity between guidelines, there were mixed feelings about a process of standardisation. This connects back with the points about the complexity of language, which may be such a fundamentally different interaction modality to existing ones (touch, keyboard, mobility, etc.) that expectations need a complete reset about the eventual role of VUI guidelines. This links back to the terminology point above — VUI designers should be wary of the highly overloaded use of terms like ‘guidelines’ and the potential communicative cost (e.g., confusion) of using such heavily laden language.

Krishika H. Khemani is currently an intern with the Horizon Digital Economy Research Institute (EPSRC grant nos. EP/T022493/1 & EP/M02315X/1).

Research intern at Horizon Digital Economy Research Institute (University of Nottingham) | UX Research