Bots: Conversation is More Than Text

What goes up must come down. A few months ago, chatbots were The Future of Everything. Forget the web, forget app stores. The Time of the Bot was at hand.

Now, it’s allegedly all over — with many in the media suggesting bots are just the latest round of empty Silicon Valley hype.

Thankfully, some of the dialogue is less extreme. After all, nothing is the future of everything; perhaps what we need is a little nuance. Bots captured our imagination with good reason. Before we write them off, let’s explore these issues and their implications.

Does the problem lie in our insistence that bots be conversational? Kik’s Ted Livingston writes, “natural language processing and artificial intelligence are not yet accomplished at managing human-like conversations,” but, “bots are useful even without the conversational element.” That is, other aspects of a messaging platform—such as the permanence and familiarity of a thread, the lack of download, the simplicity of interaction, or the built-in identity and social context—can be compelling on their own. Dan Grover (formerly WeChat, now Facebook) goes further:

“Designing the UI for a given task around a purely conversational metaphor makes us surrender the full gamut of choices we’d otherwise have in representing each facet of the task in the UI and how they are arranged spatially and temporally.”

And they’re right, to some extent. We’ve spent decades evolving graphical user interfaces to be far more discoverable, usable, and efficient than their command-line predecessors; why would we throw that out and go back to 1970?

But conversation is more than just text. A face-to-face conversation layers subtle facial expressions, gestures, and tone of voice over the textual content — indeed, we can converse without uttering a single word.

Similarly, every digital interaction is a dialogue, whether it’s a simple text chat, an exchange of video and voice clips, a series of button presses, or manipulation of a chart. We can build it to be more or less explicitly conversational, but it doesn’t suddenly become unconversational when we introduce a GUI.

By way of analogy: Say you go out to that new restaurant around the corner. As you sit down, the waiter silently hands you a pencil and a three-page form, on which you can peruse the menu and mark off your choices. You make your decisions, hand it back, and a few minutes later your food arrives. You’re intrigued: it’s efficient and error-free, but also eerie and antisocial. And the waiter won’t answer any of your questions, which is problematic given your peanut allergy. You walk out satisfied with your meal, but lacking a connection to the business. You don’t come back.

That’s how an app works, and it’s perfect for a variety of use cases: writing a document, perusing a map, composing an email. But in many cases, it’s not ideal. And apps can be a bit much: there’s overhead to build them, overhead to download them, overhead to manage them on a home screen.

You’re not surprised when that restaurant closes, and decide to check out its replacement. As you sit, a much friendlier waiter appears:

  • WAITER: Hi, welcome to Café Bot! I’m Andrew, and I’ll be taking care of you. Would you like to hear the menu?
  • YOU: Er, OK.
  • WAITER: Great! We have seven appetizers: a field greens salad with candied hazelnuts and a soy-miso dressing; a kale salad with tangerines, feta, and…
     [time passes]
     …then there are twelve entrées: a panko-encrusted sea bass with seasonal root vegetables; a smoked pork shoulder with a cherry-blackberry reduction, seared pea shoots, and basil-sage orzo; a rack of…
     [more time passes]
     …and the bourbon bread pudding with a scoop of vanilla bean ice cream and a tiny oatmeal cookie.
  • YOU: What was the second entrée again? The panko-encrusted pork shoulder?

This is how a chatbot works. In many ways it’s better: it feels far more human, and you can ask questions at any point. But it’s also tedious and difficult, with plenty of opportunity for confusion and a lot of thinking required.

Of course, a real restaurant uses neither of these extremes. The menu is a GUI: laid out for easy perusal, with the ability to point to stuff if you can’t pronounce it. But even with that GUI in play, the interaction is conversational. It’s an undeniably human exchange between the waiter and you, centered on the GUI, with the opportunity to ask questions as needed.

People are social creatures, with conversation baked into our brains. Let’s take advantage of that—but also recognize that we can be conversational while employing the full power of the GUI. Adding buttons (as Facebook’s Messenger platform did in July) doesn’t have to nullify the conversation; it can enhance it.

Tomorrow’s messaging experiences aren’t simple chatbots, nor are they apps embedded in a messaging shell. They’re something in between: “conversational apps” with the permanence and humanity of a thread and the interactive flexibility of an app. The GUI makes them powerful, easy to use, and efficient. Conversation makes them approachable, predictable, simple, and human.

If this seems a little abstract, that’s because it is. We’ve barely begun building these platforms, let alone the experiences that sit on top of them. How do we weave GUI and thread-based conversation together? How do these next-generation bots interact with all the things that already happen in chat? We don’t know all the answers, but some are taking shape. My own startup, Emu, took a stab at this back in 2013. Cola, another startup, is approaching it as a platform, as is Apple with iMessage app extensions.

At Facebook, we’ve already taken our first few steps in that direction: richer message templates to show structured data like receipts; quick replies and a persistent menu so developers can enable experiences with less typing; and interactive integrations with Uber and Lyft that are helping us learn how best to integrate GUIs into the thread. We’re figuring out what next-generation bots will be, and what tools to give developers so they can quickly and easily build great ones.

And we’re not doing it in isolation. If you’re excited about this, head over to the Messenger Platform site and start building now. Then join Messenger’s developer community and let us know what you need in order to invent the next big thing.