Many of you probably know that I have my own bot, and one of its more useful features is its ability to field private messages on my behalf. These messages get sent to me via Messenger, and I respond from within that context. Unlike email, which I can and tend to put off indefinitely (sorry!), messages that I receive via MessinaBot stack up quickly, and therefore need to be addressed serially, as they’re received. This forces me to respond in a relatively timely and brief fashion.
Sometimes though, I get tougher questions that require a bit more thought. Given the frequency of questions that I receive related to bots, I asked Anne Cathrine Saarem whether she’d mind if I published my responses to her questions publicly. Fortunately, she said yes.
She’s working on a research project exploring how people perceive and interact with computers that talk. Specifically, she’s looking at research on anthropomorphism in HCI, HRI, and psychology to see if there are principles or lessons that can be used when designing chatbots. With that in mind, here are my responses to her questions.
What is, in your opinion, the greatest benefit for the user when it comes to using bots in CUIs?
There are several ways to address this question, because it’s misleading to suggest that there is a single “greatest benefit” for bots in conversational interfaces — it really depends on user context and goals, among other things. However, there are two general benefits worth calling out:
CUIs are convenient
The current model for accessing services on mobile devices usually starts with a search of some kind (typically with Google or in an app store), and then requires quickly learning to use a mobile web app, or downloading and installing a native app. A key benefit of conversational runtimes is that the user already has the software necessary to enable them, and can bypass the installation, setup, and configuration steps.
Consider that most users probably already have a messaging app downloaded and authenticated, including Messenger, Telegram, Slack, Skype, Kik, or one of the other platforms. Or, they have an assistant-style bot integrated into their OS (like Siri, Cortana, or the Google Assistant). Consequently, the time from intent generation (“Oh hey, I wonder where my Uber is?”) to fulfillment (or partial satisfaction) is greatly reduced (“Your Uber is three minutes away”). There’s nothing to download, no need to sign in (again), and no new interface to learn. Users can just express what they want in their own words or through a set of guided prompts and execute the request. As more services and brands optimize for conversational channels, the time to intent fulfillment will continue to decrease, leading to an increase in convenience.
CUIs are adaptable
The second general benefit is adaptability through personalization and context-awareness.
What I find so interesting is that we live in a world where apps are designed to fit the device, rather than the user, and certainly not the state of the user. This makes sense from design, engineering, and testing perspectives, but it puts the onus of adapting (learning and mastering the software) on the human, rather than having the software adapt to us.
This is not how humans behave, but for some reason we make an exception for our technological extensions.
Consider these scenarios:
- I text Esther but she responds “Can’t talk, driving”; I’ll hold off on sending her a wall of texts.
- We’re discussing upcoming travel plans on the phone but she wants to do more research. We can easily pick up from where we’ve left off, later.
- I’ve just returned from an international flight and I’m tired and hungry and on my way from the airport, and we’re trying to coordinate on dinner plans for the evening. I don’t have enough headspace to make a decision, and so I ask her to just “pick something for me”.
In each of these cases, because we can either predict, infer, or know the context of the person we’re interacting with, we adapt how we behave based on that information. Today’s apps, in contrast, largely take a one-size-fits-all approach, assuming that everyone is always rational and able to operate intelligently. And yes, some predictability and consistency in interface design is necessary, but services and experiences designed for mobile should do more to adapt themselves and their offerings to the user’s situation and context.
To apply this to CUIs, consider how your bot might support:
- Throttled communication: allow your user to let you know that they’re busy and to pause new messages for some period of time, or until some criterion is met (e.g. the user arrives in a geofenced area, like “home”)
- Stopping and resuming tasks: allow users to pause multi-step tasks and pick up where they left off later; this is especially relevant for complex onboarding processes
- Simplified interaction mode: sometimes your users just aren’t going to be playing with a full deck of cards (they may be sleepy, in a social setting, drinking, etc.). Picking up on their mental acuity may help your bot feel more responsive and empathetic: it can scale back interaction, offer to pause a task (see previous), or make a best-effort attempt to arrive at an outcome automatically (rather than reviewing every option).
To put it simply: if the humans you interact with adapt to your condition and situation, bots should too.
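To make those three behaviors a little more concrete, here is a minimal sketch in Python. It is not tied to any messaging platform: the `Conversation` class, its fields, and the `choose` helper are all hypothetical names I made up for illustration, and the throttle only handles the "pause for some period of time" case (a geofence trigger would need a different condition).

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Conversation:
    """Hypothetical per-user state; not any real platform's API."""
    steps: list[str]                        # ordered prompts of a multi-step task
    next_step: int = 0                      # index of the next prompt to send
    muted_until: Optional[datetime] = None  # set while the user is busy
    simplified: bool = False                # user asked the bot to "just pick"
    queued: list[str] = field(default_factory=list)

    def _is_muted(self, now: datetime) -> bool:
        return self.muted_until is not None and now < self.muted_until

    def mute(self, now: datetime, minutes: int) -> None:
        """Throttled communication: hold new messages for a while."""
        self.muted_until = now + timedelta(minutes=minutes)

    def send(self, now: datetime, text: str) -> Optional[str]:
        """Return the text to deliver now, or queue it and return None if muted."""
        if self._is_muted(now):
            self.queued.append(text)
            return None
        return text

    def flush(self, now: datetime) -> list[str]:
        """Release queued messages once the mute window has passed."""
        if self._is_muted(now):
            return []
        self.muted_until = None
        released, self.queued = self.queued, []
        return released

    def advance(self) -> Optional[str]:
        """Stop and resume: return the next prompt, or None when the task is done."""
        if self.next_step >= len(self.steps):
            return None
        prompt = self.steps[self.next_step]
        self.next_step += 1
        return prompt


def choose(options: list[str], convo: Conversation) -> str:
    """Simplified mode: make a best-effort pick instead of listing every option."""
    if convo.simplified:
        return f"I picked {options[0]} for you."
    return "Which would you like? " + ", ".join(options)
```

Because the position in the task lives in `next_step` rather than in the flow of the chat itself, the user can walk away mid-onboarding and a later call to `advance()` simply picks up at the next prompt. Picking the first option in simplified mode is a placeholder; a real service would rank options by whatever it knows about the user.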
What do you see as the main limitations of bots?
Just because I’m a proponent of bots and conversational software doesn’t mean that I believe they’re a panacea. There are still many challenges to overcome. We’re in the opening innings of the shift to a more adaptive, conversational, and environmental computing paradigm. That said, here are a few issues currently limiting the success of bots:
- Discovery: iMessage has done a good job of attributing stickers inline, which helps these apps spread socially. Messenger currently doesn’t support bots in group threads, but Slack, Telegram, and Kik do. Thus the problem of bot virality and discovery varies from platform to platform, which makes it difficult for newcomers to be found. You can now buy click-to-message ads on Facebook and Google, but of course that requires 💸.
- Multi-player mode: related to the issue raised above, most bots aren’t designed or able to operate in multi-user contexts, with the obvious exception being Slack bots. Given that messaging is fundamentally a shared and social computing context, this will need to change over time.
- Utility: one big perception problem with bots (it’s not just perception, frankly) is that bots don’t really do much that’s useful yet. Perusing the Alexa Skills Store or Botlist, for example, turns up a lot of random stuff that seems esoteric at best. Still, so much experimentation means that someone will eventually get it right.
- Universality: because it’s early days and people are calling messaging the new browser, there’s a lot of hand-wringing about the lack of standards. I don’t think this is as big a problem (or opportunity) as others do, but I am concerned that interaction learnings from one platform may not translate easily to other platforms, leading to user confusion or hindering trust and adoption. Once again, because we’re early, best practices will emerge and likely be adopted across every platform over time.
- Flexibility: Again, in contrast to the web, native messaging interaction widgets are fairly limited. I consider this a good thing, since it forces the complexity to be managed by the service provider, rather than dumped on the user. Still, there are opportunities to expand native widgets to include things like date pickers, keyboard integrations, and more.
- Extensibility: Single sign-on, portable and cross-bot user preferences, notification aggregation, and payments are areas that could improve how bots and conversational services streamline interaction with users. Of course, many of these challenges exist on the web too, and while the rewards for solving these issues will be significant, the dangers are also high.
- Sophistication: many bots are relatively straightforward and don’t handle failure gracefully. More sophisticated bots would learn from their mistakes and correct them in time for the next conversation, and would do so without compromising the boundaries of user privacy. Furthermore, sophisticated conversational services would hand off between modalities seamlessly, switching from voice to messaging to browser without skipping a beat. We’re not quite there yet.
- Contextuality: one reason that bots aren’t able to be more sophisticated is that we’re still in the early days of being able to universally track and monitor a user’s context. While Google’s Awareness API hints at the future here, it’s still relatively circumscribed in the data it offers and in how well it integrates with other contexts, like the connected home or connected car. Watch this space though: a user-centric contextual API platform could emerge that would bring a level of contextualization beyond anything we’ve seen before.
- Talent: as if the above issues weren’t significant enough, there’s a problem with a lack of skilled talent in this domain. Furthermore, traditional tech companies may be at a disadvantage, because the necessary skills for creating engaging and delightful bots are more likely to come from the humanities than computer science. Fortunately, classes are popping up to fill this void, but it may still take some time before applicants with the requisite skills enter the job market.
Which verticals would you recommend for bots?
Every vertical or industry that requires communication for success should consider how bots, automation, and conversational services can improve their accessibility, responsiveness, convenience, and offerings. So: just about all of them.
What do you think is important to keep in mind when designing chat bots?
Do you have a set of principles or methods you would recommend to designers starting out with their first bot?
Designing conversational interactions is different than designing pixel-based interfaces. There are already some great resources out there that go into this topic, so I won’t repeat too much of that here. I will, however, offer some high-level thoughts:
- Think: mobile always: Rather than designing for a static and continuous user context (e.g. a user seated at a workstation), focus on designing flows that can be broken into micro-sequences that fit a user who’s on the go and switching between devices. As I suggested above, being responsive to user context is key, which means being able to stop, start, and continue processes on demand and across platforms.
- Nail your core competency: just because it’s easy for your bot to offer the weather doesn’t mean that you should. Stick to your core use cases (unless you’re building a rival to Siri or Alexa!) and go deep. In the case of gift-granting bot Eva, it turned out that there were so many edge cases and considerations that just offering wine, chocolate, and coffee took up most of the team’s energy. As a result, they homed in on those products and got the experience right, rather than trying to recreate a marketplace like Amazon’s that catered to every whim and fancy.
- Be humanlike, but don’t confuse people: one of the advantages of the conversational context (exception: daily news bots) is that you can layer in humanlike language which is more approachable and relatable. However, it’s important that people have a sense for whether they’re interacting with an automated system or a human. If you intend to offer a hybrid experience with automated and human elements, make it clear which type of actor the user is interacting with. This will help increase trust and set the right expectations. Consider this exchange as a warning against pretending to be a human.
- Be playful but not silly: since we’re still early in natural language generation and understanding, and the safer path is to use structured interactions or quick replies, there are going to be some frustrating moments as we figure out best practices. Add some personality and character to your conversational service (like the occasional emoji), but don’t overdo it. As with most seasoning, a little can go a long way towards a more delightful and fun interaction.
- Solicit feedback: immediately after you’ve completed a task (or failed at it!), ask for some feedback on how the experience could be improved. Be sensitive about asking too forcefully or repeatedly, and accept feedback in the same channel where the task was performed unless you get specific permission, say, to send a brief web-based survey.
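Two of these principles, being clear about who is speaking and asking for feedback without nagging, are easy to sketch in code. The following Python is only an illustration under my own assumptions: `Actor`, `Session`, `label`, and `feedback_prompt` are hypothetical names, and the bracketed prefix stands in for whatever visual treatment a real platform offers.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Actor(Enum):
    """Who is speaking in a hybrid bot-plus-human service."""
    BOT = "automated"
    HUMAN = "human"


@dataclass
class Session:
    """Hypothetical per-task state used to avoid asking for feedback repeatedly."""
    feedback_requested: bool = False


def label(actor: Actor, text: str) -> str:
    """Prefix each message so the user knows what kind of actor sent it."""
    return f"[{actor.value}] {text}"


def feedback_prompt(session: Session, task_succeeded: bool) -> Optional[str]:
    """Ask once, in the same channel, right after the task ends (or fails)."""
    if session.feedback_requested:
        return None  # already asked for this task; don't ask again
    session.feedback_requested = True
    if task_succeeded:
        return label(Actor.BOT, "All done! Anything I could have done better?")
    return label(Actor.BOT, "Sorry, I couldn't finish that. What went wrong for you?")
```

The prompt comes back as an ordinary message for the current thread, so feedback stays in the channel where the task happened; sending a web survey instead would be a separate step that requires the user's permission.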
There’s plenty more that I could write about this topic, but given all the above, I’d love to know what you guys think, especially those of you who are building or have built bots. What’s working for you and what’s not? How have you evolved and changed your services over time? How would you respond to Anne’s prompts above?
Leave a response and let’s keep the conversation going!