A Little Less Conversation, A Little More Action

Pros and Cons of Conversational Interfaces

Published in Thinking Design, Jul 11, 2018

“Chatbots are going to change digital experiences as we know them!” was an all-too-familiar rallying cry little more than two years ago. But after a flood of half-baked virtual assistants in the early adopter period, very few of them, if any, have taken hold in our day-to-day lives. In reality, however, the chatbot buzz wasn’t just about chatbots. Chatbots were representative of a new paradigm made available through the advancement of NLP, or Natural Language Processing. That new paradigm was that of CUIs, or Conversational User Interfaces.

CUIs aren’t especially new or novel as it turns out. Have you ever pressed “1” for customer service? Have you ever repeatedly screamed “OPERATOR” while trying to navigate a decision tree? Did you ever navigate MS-DOS using different commands to launch Wolfenstein 3D?

All CUIs.

Danger: Linguistic Nerdiness to Follow

What it all comes down to is translation, or, to put it in instructional design terms, the communication gap. When we as humans interact with computers, there is a constant translation process happening: We have an idea in our mind of what we would like to accomplish. The technology (in this context, a computer) has functionality that helps us realize that task. However, we have to be able to translate our thought into a language that the computer can understand.

That is why the first interfaces were all text-based. It was all about translation.

This idea of translation isn’t restricted just to human/computer interactions either. If you are an English speaker, you might be surprised to learn that you don’t think in English. Don’t worry, you’re not some savant thinking in French without knowing it.

As humans, we think in terms of concepts, ideas, and principles. Language is just a construct that exists to help us communicate those things to someone who isn’t inside our brain, or in other words, everyone besides ourselves. Language is a construct that helps us bridge the communication gap.

Another way to think of it is this: If you’re trying to learn a new language, you should never be attaching English words to their counterpart words in the target language. That will result in an additional layer of translation. You’d have to:

Think of the concept,
Think of the English representation of that concept, and then
Translate the English into the target language.

Instead, you should be attaching the target language words directly to images and concepts. If you’re truly fluent, you will stop thinking in English altogether and you will cease all English-to-target-language translation.

However, language isn’t the only way to communicate. Some more complex ideas and principles need to leverage other methods. Hence the existence of other media, such as music, art, and cinema, that enhance verbal communication. We can also communicate using different senses such as touch, taste, and smell.

There are other restrictions to bear in mind as well: In a military operation, when a team is preparing to breach a hostile environment, it may be much more effective to communicate with a simple hand gesture as opposed to whispering “I’ll hold here and you go around the left flank.”

The optimization of the tools we use to bridge the communication gap is totally dependent on context.

Back to Chatbots and CUIs

So what does all of this have to do with the failure of chatbots/CUIs? This lack of context was a large aspect of CUIs’ false start within the past few years. Companies weren’t considering CUIs as one tool in the experiential tool belt, but rather looking at them as a wholly new platform upon which everything would be built. It was a classic Jurassic Park moment that we see happen often in emerging technologies.

So am I therefore anti-chatbot or anti-CUI? Of course not. And I think I may have recently stumbled on a new frame of reference that could help us decide the right times and places for these types of experiences.

Pizza Is the Answer to Everything

Recently I was moved to a new team within Adobe called the Emerging Initiatives group. Our goal is to investigate and pursue new trends, technologies, problems, and solutions within the context of Adobe. As part of this new assignment, I have been thinking a lot about CUIs and chatbots. Being the narrative-centric designer I am, I decided to put CUIs to the ultimate test: ordering pizza. Using the Domino’s pizza online ordering system as my core set of functionality, I started building out three different narratives.

The first narrative described the actions a user would take in ordering a pizza by using, more or less, their existing Graphical User Interface (GUI) on Dominos.com. The result looked something like this:

The second narrative imagined a user who had never used Domino’s to order a pizza and wasn’t aware of the different options available, but was forced to order the pizza via chatbot/CUI. The result was less than elegant:

The third and final narrative continued in the CUI direction, but this time the user knew exactly what they wanted to order and what options were available. This was the use case where CUIs really started to sing and bring delight and ease to the user:

My Pizza “Aha!” Moment

It was in the third narrative that I started to identify a sweet spot for CUIs. CUIs have a definite weakness in that they are rigidly linear in how they have to interact with the user. In an exploratory use case, they become incredibly tedious very quickly. However, CUIs are incredibly effective at interpreting a user’s intent to summon existing functionality. Or to put it another way, CUIs are robust and intelligent hotkeys. They are a way that we can quickly access functionality that we know exists, without having to go through all of the interactions to select what we want. More or less, they’re dynamic ad-hoc functions that the user can invoke with a simple phrase.
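
For the more code-minded, here is a rough sketch of what I mean by "intelligent hotkeys." It is purely illustrative: the function names, phrases, and patterns are all made up for this example, and plain regular expressions stand in for real natural language processing. The point is only that each phrase jumps straight to functionality that already exists.

```python
import re
from typing import Optional

# Hypothetical stand-ins for functionality that already exists in a product.
def reorder_last_pizza() -> str:
    return "Reordering your last pizza."

def track_order() -> str:
    return "Your order is on its way."

def order_pizza(size: str, topping: str) -> str:
    return f"Ordering a {size} {topping} pizza."

# Each entry pairs a phrase pattern with the existing function it summons.
# Named groups in a pattern become keyword arguments for that function.
COMMANDS = [
    (re.compile(r"\b(reorder|same as last time|the usual)\b"), reorder_last_pizza),
    (re.compile(r"\b(track|where is) my (order|pizza)\b"), track_order),
    (re.compile(r"\border an? (?P<size>small|medium|large) (?P<topping>\w+) pizza\b"), order_pizza),
]

def dispatch(utterance: str) -> Optional[str]:
    """Run the first command whose pattern matches, or return None."""
    text = utterance.lower().strip()
    for pattern, action in COMMANDS:
        match = pattern.search(text)
        if match:
            return action(**match.groupdict())
    return None  # No known intent: the user is exploring, not invoking.

print(dispatch("Order a large pepperoni pizza"))
print(dispatch("What toppings do you have?"))
```

Notice what happens with the second phrase: nothing matches, so the sketch returns None. That is the exploratory case from the second narrative, where a visual menu would likely serve the user better than a back-and-forth conversation.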

You would probably never want your entire Facebook feed delivered via CUI:

Sounds awful.

However, there is something elegant about walking around your house and being able to talk to Alexa to find out which of your friends have birthdays today, deciding which of those friends you’d like to send $5 to, and completing that action right then and there.

I think there are more use cases on the horizon as well. As machine learning and artificial intelligence technologies continue to advance, they’ll be able to detect and predict our intent before we have to explicitly ask for any given functionality. There are also opportunities on the horizon for natural language voice as a controller for digital interfaces. But as all of these new patterns begin to be explored and developed, we have to remember the one most important guiding principle: context.

Because at the end of the day, the goal for most of our interactions isn’t to have a conversation, it’s to accomplish an action. So when it comes to CUIs, I’m a big believer in Elvis.

A little less conversation, a little more action.