The future of messaging bots and chat for business

Post-webinar thoughts and answers to attendee questions

Earlier this week I gave a webinar in collaboration with Infobip on messaging bots and chat for business. You can watch my slides with verbal commentary or browse the visual here:

To be honest, there was so much content to cover that I really struggled with how to distill it into a 30-minute talk. In the end, I decided to go with this outline:

  • About me (i.e. to establish some basic credibility)
  • A bit of a recap on 2016 to discuss the big picture
  • A brief history explaining how computing has become simpler and more casual over time
  • Some terms to help us get on the same page (I updated these terms in the above slides after some feedback)
  • Some challenges and some benefits of conversational UIs
  • An overview of the platforms and the anatomy of a bot (borrowed from Andy Mauro)
  • A sample of bots (primarily on Messenger) from across the landscape
  • Additional resources (notably my Refind links and #ConvComm collection on Product Hunt)

I also previously answered some questions about MessinaBot, what I’m expecting in 2017, and a few other topics in a pre-webinar Q&A.

At the end of the webinar, I offered to take questions, and so I’ll answer them here (some I already covered in the webinar, but I’ll write up answers for completeness).

Q: Isn’t chatbot just another variation of IVR (either DTMF or speech recognition based)?

This question wants to know whether the hype around messaging just means shifting conventional IVR (interactive voice response) systems to a new context, and my answer is no, not really.

If you think of it one way, IVR is a kind of assistive technology (like a screen reader) that helps you navigate a set of known options in a tree. Companies especially tend to get the same questions over and over again, and rather than devote human effort and attention to addressing them, it helps (ideally) both parties to walk the caller down a tree of potential options and try to resolve the request automatically without involving a human. So isn’t that what we’re doing with chatbots?

Some chatbots, yes, for sure. Some chatbots literally take an existing IVR system and port it to a new platform or context. That’s not unusual or unexpected, given how budgets work. But to stop there ignores the potential of these platforms, as well as what advances in AI and machine learning may offer in the way that they adapt to, synthesize, and respond to inbound user queries. Typical IVR systems (the ones I’m familiar with as a consumer anyway) tend to force you down known paths. In messaging, the opportunity lies in a hybrid model: to allow a user to express herself in her own words (to specify an intent) and to have the system respond with a relevant and contextual response, avoiding the multi-layer decision tree workflow altogether. Furthermore, thanks to implicit identity and the interface affordances available in the messaging context, each clarifying question can suggest personalized “quick replies” that provide a range of options that are clearer and more efficient. It’s the emergent possibilities of these platform aspects that specifically open up smarter and more contextual experiences heretofore infeasible.
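To make the hybrid model a little more concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular platform works: the intent names, keywords, and the `respond` function are all hypothetical, and keyword overlap stands in for a real NLP service. The idea is that the user types freely, the bot guesses an intent, and the clarifying question comes back with personalized quick replies instead of a fixed menu tree.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Reply:
    text: str
    quick_replies: list = field(default_factory=list)

# Hypothetical intents, each with a few trigger keywords (assumption).
INTENTS = {
    "track_order": {"track", "package", "delivery", "order"},
    "billing": {"invoice", "charge", "refund", "bill"},
}

def detect_intent(message: str) -> Optional[str]:
    """Pick the intent whose keywords overlap most with the message."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    best, best_score = None, 0
    for intent, keywords in INTENTS.items():
        score = len(words & keywords)
        if score > best_score:
            best, best_score = intent, score
    return best

def respond(message: str, recent_orders: list) -> Reply:
    """Answer in context, offering quick replies drawn from the user's own data."""
    intent = detect_intent(message)
    if intent == "track_order":
        # Implicit identity: we already know this user's recent orders.
        return Reply("Which order do you mean?", list(recent_orders))
    if intent == "billing":
        return Reply("I can help with billing. What do you need?",
                     ["Refund", "Copy of invoice"])
    # No intent matched: offer the top-level options rather than a dead end.
    return Reply("Sorry, I didn't catch that. What can I help with?",
                 ["Track an order", "Billing"])
```

A message like “Where is my package?” skips the menu entirely and goes straight to a question whose options are that user’s own orders.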

Q: I’m looking forward to seeing Whatsapp incorporate bots. It will be a game changer, as even my parents use Whatsapp now. What do you think?

I agree! However, what I’ve heard is that the way WhatsApp was built (to privilege privacy and encryption) makes it very difficult to introduce bots into existing conversation threads. In other words, because messages between the sender and receiver are encrypted, neither WhatsApp nor third parties can access the flow of conversation. From an architectural perspective, it may not be currently possible for WhatsApp to support bots, though I’m quite confident that they’re gearing up to figure out how to support businesses on the platform, somehow.

Q: Could we teach machines programming languages and ask them to create software based on our voice and visual inputs and design elements at some point in the future?

This is a fun question! In some ways, this is already happening. You can create a 3D model of yourself. Microsoft has research that can synthesize your voice from samples of you talking. Adobe’s VoCo can also synthesize speech that sounds like you. There are also emergent products, like Replika, that you can train by feeding them data about you and that in turn can mimic the way you talk. So, in a word, yes, sometime in the not-too-distant future.

Q: If a bot fails to understand the conversation, how many times it should try to understand before it passes the chat to customer service. What would the communication method be, continue the chat (even though the customer might be annoyed) or take it to voice channel?

The right answer depends on your product, customer, and situation, but I’d suggest that if you’re going the route of NLP, you should monitor it actively, and when new user intents surface, fall back to alternative resolution after asking the user to clarify once. It’s possible that you may have the intent trained, but the user specified what they wanted in an uncommon way. It seems reasonable that you’d get a second shot to get it right; after that, the user is likely to be annoyed and bail (depending on the task, of course).
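Here is one way that policy could look in code. This is a rough sketch under my own assumptions: `classify` is a stand-in for whatever NLP service you use, the intent names are made up, and `MAX_CLARIFICATIONS = 1` simply encodes the “clarify once” suggestion. Unrecognized messages are also logged so you can review them and train new intents.

```python
from dataclasses import dataclass
from typing import Optional

MAX_CLARIFICATIONS = 1   # assumption: ask the user to rephrase once, then hand off
UNMATCHED_LOG = []       # messages to review later when monitoring for new intents

@dataclass
class Conversation:
    failed_attempts: int = 0
    handed_off: bool = False

def classify(message: str) -> Optional[str]:
    """Stand-in for a real NLP model (hypothetical); returns an intent or None."""
    text = message.lower()
    if "refund" in text:
        return "refund"
    if "hours" in text:
        return "opening_hours"
    return None

def handle(convo: Conversation, message: str) -> str:
    if convo.handed_off:
        return "An agent will be with you shortly."
    intent = classify(message)
    if intent is not None:
        convo.failed_attempts = 0  # a successful turn resets the counter
        return f"Handling intent: {intent}"
    UNMATCHED_LOG.append(message)
    if convo.failed_attempts < MAX_CLARIFICATIONS:
        convo.failed_attempts += 1
        return "Sorry, I didn't get that. Could you rephrase?"
    convo.handed_off = True        # second miss: fall back to a human
    return "Let me connect you with a person who can help."
```

Whether the hand-off stays in chat or moves to voice is a separate product decision; the sketch only covers when to stop guessing.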

Q: Do you not think that the user has acclimatised to the asynchronicity of messaging and therefore is OK with a business not responding instantaneously? That is how they interact with friends on Messenger for example?

It depends on the business and the users’ expectations. There’s no hard and fast rule; if you’re a local business with conventional hours, it’s reasonable to set up an autoresponder that can set an expectation on when the user will hear from you. However, you should really do the work to receive inbound messages, triage them, and assign a ticket ID so that the user doesn’t have to go out of their way to reinitiate contact with you.
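As a rough sketch of that flow (everything here is hypothetical: the business hours, the `T-00001` ticket format, and the wording of the replies), every inbound message gets a ticket ID, and the autoresponder sets an expectation based on whether you’re open. Weekends and time zones are ignored to keep it short.

```python
import itertools
from dataclasses import dataclass
from datetime import datetime, time

OPEN, CLOSE = time(9, 0), time(17, 0)   # hypothetical business hours
_ticket_numbers = itertools.count(1)    # simple in-memory ticket counter

@dataclass
class Ticket:
    ticket_id: str
    user: str
    text: str
    received_at: datetime

def receive(user: str, text: str, now: datetime):
    """Log the message as a ticket and return (ticket, autoresponse)."""
    ticket = Ticket(f"T-{next(_ticket_numbers):05d}", user, text, now)
    if OPEN <= now.time() < CLOSE:
        reply = (f"Thanks! Someone will reply shortly. "
                 f"Your reference is {ticket.ticket_id}.")
    else:
        reply = (f"We're closed right now and will reply after {OPEN:%H:%M}. "
                 f"Your reference is {ticket.ticket_id}.")
    return ticket, reply
```

The important part is that the ticket exists either way, so the follow-up is on you rather than on the customer.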

The point isn’t necessarily to solve all user requests immediately over messaging, but to be available via these channels, communicate what you’re set up to handle in these channels, and what kind of responsiveness your customers can expect. There’s a reason why Facebook identifies this attribute clearly on Pages:

You can be considered “Very responsive” even if you don’t solve your customers’ problems immediately.

Q: Are you aware of any examples or case studies of businesses using chatbots internally, to reach their employees?

Absolutely! For a list of bots focused on this use case, check out the HR category in Slack’s App Directory and then Slack’s Customer Stories. For a specific example, take a look at Kip Café, a service that makes it easy for teams to order food!

Q: I came a little late to this webinar and would love to have the recording please.

Here you go:

Q: Most of these examples are for Messenger but many don’t know how to do this. How can bot/AI be integrated straight into iMessage like a normal SMS conversation?

That’s a complex question.

First, while SMS and iMessage do occupy the same context on iOS, and Apple has recently expanded iMessage’s capabilities with iMessage Apps… these two things shouldn’t be confused. SMS is a low-information-density channel because each message is limited to 160 characters. iMessage is much richer, but has its own challenges from a cross-platform perspective (i.e. there’s no support for iMessage on Android). Even with iMessage’s new extension model, it’s unclear whether anything but stickers will see much retention and use. We may see some new features announced at WWDC this year, so it’s a little too soon to tell what will happen there next.

I’d also suggest that the privacy model that Apple espouses is more conservative than Facebook’s or Slack’s. As a result, developers have to work within a different paradigm when offering services, which sometimes makes it hard for them to integrate deeply and thus accessibly into Apple’s user experience.

Q: I see a lot more space for Bots being used as “Persona” — It feels like Bots have to mimic a real person to be successful

If you think of Alexa, Siri, Google Assistant, and Cortana as bots, then I can see your point. However, for most bots, a personality can be very challenging to create and may come across as needlessly pedantic. Oftentimes users just want to get something done or retrieve some information, and too much personality can get in the way (think of how you might avoid a used car salesperson). In some cases, the personality is the content or entertainment that someone’s looking for, as with Poncho, PullString’s Humani, Jessie’s Story, or Microsoft’s Zo. But I’d apply personality sparingly and only when it actually serves to deepen the connection, build the brand’s resonance, or is essential to the experience.

Q: What about business models ? Thanks

Well, this is a broad question! Since communication is key to any business, I’d start there and contemplate how more efficient and responsive communication with your customers or clients can improve your business. But then, if you’re thinking of building a bot-focused business, obviously offering something of real value is crucial, like any other business.

But I won’t bore you with banalities. This piece lays out seven options to consider:

Q: Do you think the apps and the bots will coexist? If yes, in which contexts do you see them coexisting?

Absolutely. We’re increasingly living in a multimodal computing environment, and we’ll use different computing “surfaces” depending on what body parts are available to us, as so excellently illustrated by Des Traynor’s post on the benefits of voice UI:

Bill Buxton introduced the concept of a “place-ona”, adapting the concept of a persona (which we all love to hate) to show how a location can place limits on the type of interactions that makes sense. There is no “one best input” or “one best output”. It all depends on where you are, which in turn defines what you have free to use.
At a very simple level, humans have hands, eyes, ears and a voice. (Let’s ignore the ability to ‘feel’ vibrations as that’s alert-only for the moment). Let’s look at some real world scenarios:
• The “in a library wearing headphones” placeona is “hands free, eyes free, voice restricted, ears free”.
• The “cooking” placeona is “hands dirty, eyes free, ears free, voice free”.
• The “nightclub” placeona is “hands free, eyes free, ears busy (you can’t hear), voice busy (you likely can’t speak/can’t be heard)”.
• The “driving” placeona is “hands busy, eyes busy, ears free, voice free”.
Based on the above, you can see which scenarios voice UI are useful in and in general the role of voice as an input mechanism.

This insight applies equally well to bots and apps—sometimes you just want to be able to talk to a customer service rep without downloading an app. Other times you want to coordinate with friends who are already gathered in a chat thread. And other times you want to just scroll through a feed of products or look around a map. It all depends on what the user’s current task is.
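If it helps, the placeona idea quoted above can be modeled as a tiny lookup. This is my own loose sketch, not something from Des’s post: the `Placeona` class and the `suggest_surfaces` rule (voice UI needs a free voice and free ears; text chat or an app needs free hands and free eyes) are simplifying assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Placeona:
    # True means that body part is free to use in this place.
    hands: bool
    eyes: bool
    ears: bool
    voice: bool

# The four scenarios from the quote above.
PLACEONAS = {
    "library": Placeona(hands=True, eyes=True, ears=True, voice=False),
    "cooking": Placeona(hands=False, eyes=True, ears=True, voice=True),
    "nightclub": Placeona(hands=True, eyes=True, ears=False, voice=False),
    "driving": Placeona(hands=False, eyes=False, ears=True, voice=True),
}

def suggest_surfaces(p: Placeona) -> list:
    """Assumed rule of thumb for which computing surfaces fit a placeona."""
    surfaces = []
    if p.voice and p.ears:
        surfaces.append("voice UI")
    if p.hands and p.eyes:
        surfaces.append("text chat or app")
    return surfaces
```

Under these assumptions, cooking and driving point to voice, while the library and the nightclub point to text chat or an app.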

Q: Please can this work for other social media apart from facebook?

If the question is about other platforms besides Facebook Messenger, yes, absolutely. Each platform has its own audience and capabilities though.

Take a look at Andy Mauro’s Platforms Overview slide deck to get a sense of what else is out there.

Q: Chris, you once said in the O’Reilly podcast that the bot “Hi Poncho” sucks or he failed. How would you improve the chatbot?

Poncho is an interesting product. I know the team and respect their work; what they built for Messenger was turned around in an incredibly short period of time before the platform was fully baked (indeed, it’s still being actively worked on!). I think Poncho partially failed because it took content that is conventionally suited for a visual (screen) or audible (radio) presentation and turned it into text, with an awful lot of personality thrown in. The team tried to make the cat engaging and funny, but for many tech journos at least, it was confounding:

Whether Poncho sucked or failed is a matter of perspective, depending on whether people like me are in their target demographic (I may not be!). Certainly many early bots were panned as well, so it wasn’t just them. It’s just that people had such high expectations for what bots should be able to do and instead Facebook touted, among other things, a weather cat as the next great thing. For many people, it wasn’t. As a result, many people are still waiting to be wowed by bots even though it’s incredibly early in this new era of casual computing.

Q: Regarding The Dream here, how close are we?

It looks like work on this ended in 2003, but judging by the use cases, I’d say we’ve arrived at and surpassed what that project intended. It’s unclear how much of the technology to enable The Dream is available as part of the open web platform yet, but certainly the underlying enabling technologies are available via proprietary APIs.

Q: What’s the best example of a chatbot that works with more than one human?

If I understand this question, it’s about a chatbot providing value in a group context. Once again, I think Kip Café provides a straightforward use case where a team in an office wants to order lunch. Kip collects everyone’s preferences from a menu and then places the order on behalf of the group. The use case is familiar and the problem is one that’s hard to solve.

Chris reads every response on Medium or reply on Twitter, so don’t hesitate to let him know what you think.

☞ To hear from him in the future, sign up for his newsletter or talk to his bot.
