What Defines a Successful Bot on Messaging Platforms?
The recent F8 conference, where Facebook introduced new features for its Messenger platform in an attempt to re-ignite it after a “rocky roll-out” and slow adoption, got me thinking about what would make bots more successful.
If you look at the top 5 bots on Product Hunt, 4 of them are platforms for building bots. The hype around bots, which started in 2016 when Facebook introduced a bot-building API for Messenger, has not materialized so far.
I see a few reasons for that.
First, the hype around natural language understanding (NLU) in voice and text interfaces as a substitute for, or a universally better version of, visual interfaces is misleading; see, for example, this great post by Benedict Evans analyzing voice interfaces. It is simply not the case that texting or talking is universally more convenient than seeing and clicking. Think of any use case where you have to reach a decision by analyzing multiple options, parameters and steps, for example finding the cheapest flight in the next 3 months, and ask yourself what type of UI would be more convenient for you personally. The bandwidth of our vision is an order of magnitude higher than that of our typing, reading or hearing (see Moran Cerf’s data in Tim Urban’s recent post).
Second, platform owners, be it Facebook with Messenger or Amazon with Alexa, intentionally or not, positioned bots to developers as a great AI vehicle. This positioning led both developers and users to expect bots to be sophisticated agents capable of complex reasoning. That perception was naturally driven by the expectation of NLU capabilities that comes with messengers and voice platforms. But it turned out that human-level NLU in chat contexts is extremely hard to develop, despite all the great tools made available to developers, such as Wit.ai or Api.ai. It’s even harder to give bots true reasoning capabilities that would account for quickly changing intents in real-life conversations. In many options- and reasoning-heavy cases where bots promised to be a natural and convenient way to transact with businesses, they turned out to be impractical (see Kayak Messenger bot screenshot below).
Moreover, in most of these cases the user experience with bots is inferior to apps, where the UI is mainly in charge of a clean visual representation of options and is not scenario-dependent. Apps may be a few extra clicks away from the user’s current flow, but it’s still worth switching to make an educated decision faster.
No doubt bots can work well in many complex cases, but developing a compelling bot is very hard. See, for example, Seth Rosenberg’s post, which reveals how much effort the first KLM bot required from the Messenger team. Anyone who has tried to develop a bot with NLU elements knows that even the most basic conversations require processing logic for a vast number of possible scenarios, and that even then a very large chunk of real-life user behavior will remain uncovered by the bot’s logic. This is very different from app development, where a developer doesn’t have to think through all possible permutations of the user’s actions. Facebook et al. democratized bot creation a great deal and users saw a flood of bots, but the majority of bots remain low-quality products (that deservedly don’t see traction) because bot logic is so hard to develop.
Lastly, even for good bots, utility will most probably be quite limited, leaving a lot of functionality to traditional apps. This is why better bots, for example Epytome Stylist on Messenger, are very transparent about the bot’s functionality up front and leverage all available platform tools to mix NLU with traditional visual interfaces.
Despite the limitations and an unimpressive start, I believe bots as a new UI platform have great potential for boosting the revenue and efficiency of various businesses if leveraged in certain use cases. I’m thinking of 2 broad categories.
The first category is more consumer-oriented and relates to cases where a bot automates relatively simple actions that currently require users to do multiple app switches and clicks. I’m essentially talking about aggregation and streamlining of well-defined scenarios (“save me clicks”). Imagine you’re looking for the fastest or cheapest way to get from point A to point B. Your current routine could include looking at possible routes, traffic jams and travel time estimates in the Google Maps, Waze or Moovit apps, then switching to Uber and/or Lyft to check ETAs and prices, and maybe even checking Zipcar availability or a weather app to decide how best to commute. Worst of all, you’re doing the same routine almost every time you need to commute. A single-function bot, provided with location information, could make this task much easier for you by just presenting the top 3 options (cheapest, fastest, most convenient) and executing your choice in a “no questions asked” fashion. The first KLM bot, built in collaboration between the KLM and Messenger teams, falls into this category.
If you’re working on a use case like that — ping me, maybe we could invest in your company. :)
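To make the “save me clicks” commute scenario concrete, here is a minimal Python sketch of the aggregation step. It assumes the bot has already collected quotes from several providers; the `CommuteOption` fields, the provider labels, the sample numbers and the use of transfer count as a proxy for “convenience” are all my own illustrative assumptions, not any real provider’s API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommuteOption:
    """One quote collected by the bot (hypothetical fields)."""
    provider: str
    price_usd: float
    minutes: int
    transfers: int  # assumed proxy for convenience: fewer is better


def top_three(options):
    """Pick the cheapest, fastest and most convenient option.

    Ties are broken by travel time (or by price for the fastest pick).
    """
    if not options:
        raise ValueError("no commute options to choose from")
    return {
        "cheapest": min(options, key=lambda o: (o.price_usd, o.minutes)),
        "fastest": min(options, key=lambda o: (o.minutes, o.price_usd)),
        "most convenient": min(options, key=lambda o: (o.transfers, o.minutes)),
    }


# Made-up sample quotes, for illustration only.
options = [
    CommuteOption("ride-hail A", 18.0, 22, 0),
    CommuteOption("ride-hail B", 15.5, 25, 0),
    CommuteOption("public transit", 2.75, 48, 2),
    CommuteOption("car share", 11.0, 30, 1),
]

picks = top_three(options)
for label, option in picks.items():
    print(f"{label}: {option.provider} (${option.price_usd:.2f}, {option.minutes} min)")
```

The point of the sketch is that the hard part is not the ranking itself but the aggregation: once the quotes are in one place, the bot only has to show three buttons and execute the one you tap.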
The second category is more business-oriented. It is about automating repetitive and relatively basic interactions between customers and businesses (if they are not automated already): bookings, payments, FAQ answer retrieval, etc. Aggregation may add value for consumers here, when a bot is also a UI aggregator across various providers (for example, buying movie tickets for any theater in town), but it is not a must. Many of the current success stories of enterprise bots fall into this category, with customer service functionality.
An important edge case in this category is when a bot is just a substitute for typing in a Google search query or a URL, or for simply looking at the home screen of an app.
In this case conversing with bots doesn’t add much value over common visual interfaces. Take, for example, the hyped fintech bots for so-called conversational banking, where you can ask a bot for your account balance. Your banking app is probably a much more convenient way to quickly get a snapshot of your finances and to make transactions: simply way less clicking!
An argument in favor of this edge case is that a bot can keep your current conversational context uninterrupted and save you the effort of switching apps. This was probably the rationale behind extensions, an option to access a service provider’s functionality from your current conversation, which Facebook has just added to Messenger, following a similar move by iMessage.
In many cases, e.g. OpenTable, extensions just open pages from the service’s website in the Messenger shell.
Extensions are much less about conversational bots and NLU and much more about bringing traditional UI into your conversations seamlessly. In some cases the idea of inviting a third party into your conversation to do something may feel quite weird because it would interrupt the conversation flow. In others, when it’s about sharing content or making a collective decision, e.g. sharing a song or ordering takeaway for the team, it may feel natural and convenient.
Extensions can “save clicks” and be a natural part of conversations. They can help Messenger become a more powerful ecosystem, similar to WeChat. But one of the major bot challenges remains true for extensions: creating a seamless mechanism for discovering relevant bots is hard. Time will tell if the “bot store” offered by Facebook solves the challenge. I’d personally love to see the store evolve from just a pop-up tab into a seamless in-conversation search engine, similar to the mention mechanism of messengers, e.g. adding a Spotify bot to a conversation by typing “@Spotify” or, more broadly, “@Music” for a default bot in the category.
As much as I’d love to see more progress in general AI and digital reasoning, it seems more likely that AI will keep developing in “verticals”: not just business verticals, but also tech verticals like object recognition in videos. This means that for bots/extensions to see real user adoption, developers will need to 0) carefully think about the bot’s added value vs. traditional channels, 1) manage users’ expectations by being upfront and explicit about the bot’s purpose, functionality and limitations, 2) keep NLU functionality under tight control and 3) in many cases incorporate human handoff into bot flows. I see these conditions as essential for building trust and true scaling.
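Points 2) and 3) above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the NLU layer is assumed to return an intent name with a confidence score, and the intent names, the `NluResult` type and the 0.8 threshold are all hypothetical rather than taken from Wit.ai, Api.ai or any other real service.

```python
from dataclasses import dataclass

# Hypothetical whitelist: the bot only acts on intents it was built for.
SUPPORTED_INTENTS = {"check_flight_status", "get_boarding_pass"}

# Illustrative cut-off; a real bot would tune this on its own traffic.
CONFIDENCE_THRESHOLD = 0.8


@dataclass(frozen=True)
class NluResult:
    """Assumed shape of what an NLU service returns for one message."""
    intent: str
    confidence: float


def route(result: NluResult) -> str:
    """Let the bot answer only when the intent is supported and confident.

    Everything else is handed to a human instead of letting the bot guess.
    """
    if (
        result.intent in SUPPORTED_INTENTS
        and result.confidence >= CONFIDENCE_THRESHOLD
    ):
        return f"bot:{result.intent}"
    return "human_handoff"


print(route(NluResult("check_flight_status", 0.93)))  # handled by the bot
print(route(NluResult("check_flight_status", 0.41)))  # too uncertain
print(route(NluResult("rebook_flight", 0.97)))        # outside the bot's scope
```

The design choice here is deliberately conservative: the bot does a small number of things and escalates everything else, which is also what makes it possible to be upfront with users about its limitations.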
Would love to hear your thoughts on the future of bots for messengers!