Humans vs Bots in Customer Support: Musings on the start of the corporate machine learning bot race

There’s an intensifying drumbeat of suggestion that artificial intelligence is approaching human-level intelligence. In this post I investigate the frontiers of current-generation enterprise bots and their most common use case: customer support.

Why do enterprises deploy the technology for this use case? What has our experience with these bots been so far, and what are they good at? Where do they fall short compared to humans?

So far, it seems, bots have been much better than humans at knowing a company’s products. They have been mediocre at understanding language and very bad at answering. They have also lacked knowledge of the customers’ perspective, of their actual needs and challenges, and the ability to interact with a customer. These limits put a ceiling on their performance that was below human level.

Now new technologies are about to be deployed. As dialogue engines such as Rasa Core become more popular, we will see the next generation of bots get much better at answering, at understanding the customers’ perspective and at interacting with a customer.

Based on current training rates of Rasa Core clients, I expect enterprise bots to surpass humans in customer support within a year. Based on training rates from other Rasa Core clients, I also expect other corporate functions such as enterprise sales, contract management and HR operations to see superior bot performance in the coming years.

---

Messenger platforms with billions of users are pushing heavily for conversational AI. But despite all of their efforts, human-to-human chat remains the overwhelmingly dominant use case of these platforms.

Yet, after 9 months of seeing startups and enterprises build on top of Rasa, it’s become clearer to me what messaging bots are being built in the enterprise and where we stand. Individual cases in business directories, education or chat commerce are the exception. Early on, we’ve seen only one super-popular use case for enterprise bots using Rasa NLU: customer support & FAQs. And that’s not only true for Rasa NLU but also for “all-in-one” enterprise bot platforms like Reply.ai or customer service automation tools like Digital Genius. There are hundreds of “startups” (that are really more like software agencies) building out customer support for corporate clients.

Investigating the killer use case of the first generation of enterprise bots: customer support & FAQs

So the question is: why? In my role as Rasa’s angel investor, it’s been interesting to take a closer look at this customer support boom and investigate the frontiers of current-generation enterprise bots.

Reason 1: Customer support is the first viable use case for rule-based enterprise bots

It’s likely that so far we have only seen the ‘pre-app store’ days of conversational AI, where bots built on some early, simple tools offer limited utility to users under specific circumstances.

The first generation setup: ML-based NLU and rule-based dialogue

In the first generation, bot developers deployed NLP tools such as Api.ai, wit.ai or Rasa NLU. What these tools do is take “natural language” (that is, human language) and process it into a structured representation that software can understand.

The example above illustrates a bot that looks at a message from a human (“I’m looking for a Mexican restaurant in the center of town.”). With the help of the NLP tool, it is able to identify that the human is looking for a restaurant (intent: restaurant_search), the type of cuisine (“Mexican”) and the location (“center”).
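To make the idea concrete, here is a toy sketch of what an NLU tool does: it maps raw text to an intent plus entities. Real tools (Rasa NLU, wit.ai, Api.ai) use trained ML models; this keyword matcher is purely illustrative, and the output shape is an assumption modeled loosely on what such tools return.

```python
# Toy NLU: map a raw message to a structured intent + entities.
# A real tool learns this from training data; here we just match keywords.

CUISINES = {"mexican", "italian", "chinese"}
LOCATIONS = {"center", "north", "south", "east", "west"}

def parse(message: str) -> dict:
    tokens = message.lower().replace(".", "").replace(",", "").split()
    entities = {}
    for tok in tokens:
        if tok in CUISINES:
            entities["cuisine"] = tok
        if tok in LOCATIONS:
            entities["location"] = tok
    intent = "restaurant_search" if "restaurant" in tokens else "unknown"
    return {"text": message, "intent": intent, "entities": entities}

result = parse("I'm looking for a Mexican restaurant in the center of town.")
# result["intent"] == "restaurant_search";
# result["entities"] == {"cuisine": "mexican", "location": "center"}
```

Everything downstream of the NLU step operates on this structured dictionary rather than on the raw sentence.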

There’s a UX layer here where the human’s expectations were set (e.g. calling the bot “restaurant finder” and adding illustrations relating to food) that trained the human to talk within a very specific context.

For the user, this is a little akin to walking through a decision tree. For the experience to work, the user can never stray from the pre-defined paths. The user has to ask the bot exactly the right questions and adapt to the bot in order to navigate.

For users the first generation bot experience is akin to walking through a decision tree

If the bot has enough information (through text or through additional signals such as location data from the phone), then the bot answers based on predefined rules and texts (e.g. “I can recommend you [restaurant name + address]”) based on a call to an API such as Foursquare’s.
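The rule-based dialogue layer on top of the NLU output can be sketched like this. This is a minimal illustration, not any particular product’s code: the `find_restaurant` stub and the venue it returns are invented stand-ins for a real API call such as Foursquare’s venue search.

```python
# Sketch of a first-generation, rule-based dialogue layer:
# once the NLU output fills all required slots, hard-coded rules answer.

def find_restaurant(cuisine: str, location: str) -> dict:
    # Stand-in for a real venue-search API call (e.g. Foursquare).
    return {"name": "La Cocina", "address": "12 Main Street"}

def respond(parsed: dict) -> str:
    if parsed["intent"] != "restaurant_search":
        return "Sorry, I can only help you find restaurants."
    slots = parsed["entities"]
    if "cuisine" not in slots:          # missing slot -> ask for it
        return "What kind of cuisine are you looking for?"
    if "location" not in slots:
        return "Which part of town?"
    venue = find_restaurant(slots["cuisine"], slots["location"])
    return f"I can recommend {venue['name']}, {venue['address']}."

reply = respond({"intent": "restaurant_search",
                 "entities": {"cuisine": "mexican", "location": "center"}})
# → "I can recommend La Cocina, 12 Main Street."
```

Note how every branch is written out by hand: this is exactly the decision-tree rigidity described above, and why the user can never stray from the predefined paths.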

What’s been built in the last two years is thousands of bots that have been trained with machine learning well enough to:

1) understand human language, in each node of a decision tree, in very specific contexts.

2) answer according to very specific rules based on the most recent conversation.

This, of course, sounds very much like the customer support call center experience we have all had.

In other words, the popularity of the customer support use case can be explained by the limits of the first generation setup: ML-based NLU and rule-based dialogue. No other use case is as close to the decision tree paradigm.

There are some industry voices, such as Facebook’s, that argue that this ML-based NLU/rule-based dialogue setup is a ‘good enough’ experience for customer support. But as Rasa’s CTO & co-founder Alan Nichol says in his deep-dive, there are many more challenges to be solved, even for customer support.

In my own view, despite messaging user bases in the billions, it’s much more likely that these are the ‘pre-app store’ days of conversational AI. Bots are being built with a very early generation of tools, and these tools have not been powerful enough to enable more.

Reason 2: Corporations already have experience in advanced automation of customer support

Enterprises have been viewing customer support from the perspective of process automation for a long time. They understand the problem from this angle: they have run many IT projects around it, they have staff who know how to implement the use case, and they know what the organization gets out of it.

Most of the enterprises I talked to about this see their customer support bot efforts as another step in process automation. Businesses have long been engaging in process automation, a strategy to automate processes through software in order to contain costs and set up organizational standards. They are very comfortable with it.

For use cases other than customer support/FAQs, enterprises often go through months of internal coordination just to agree on a briefing. For customer support, in contrast, they have already established the predefined rules and procedures needed, so they can start straight away. Extending these to customer support bots feels natural to corporations.

Bots are available 24/7 at low cost, they won’t quit their job and they won’t ever forget any information they’ve been taught.

Very often, the customer support FAQs they have already created become the basis document for the rules the bots follow. In fact, this case is so frequent that Microsoft has created a simple service that translates FAQs into bots.

https://qnamaker.ai/
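The core mechanic of turning an FAQ document into a bot can be sketched in a few lines: match the incoming question against stored Q/A pairs and return the answer of the closest question. Services like QnA Maker use far more robust matching; the word-overlap scoring and the FAQ entries below are simplifying assumptions for illustration.

```python
# Minimal FAQ-to-bot sketch: retrieve the stored answer whose question
# shares the most words with the user's question.

FAQ = {
    "How do I reset my password?": "Click 'Forgot password' on the login page.",
    "Where can I see my invoices?": "Invoices are under Account > Billing.",
}

def answer(question: str) -> str:
    def words(s: str) -> set:
        return set(s.lower().strip("?").split())
    q = words(question)
    # Pick the known question with the largest word overlap.
    best = max(FAQ, key=lambda known: len(q & words(known)))
    return FAQ[best]

print(answer("how can I reset a password"))
# → Click 'Forgot password' on the login page.
```

This also makes the limitation obvious: such a bot can only hand back a stored answer, it cannot act, which is exactly the lost-credit-card problem discussed below.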

The current generation of customer support rule-based dialogue fits the way corporations think.

Even knowledgeable VCs who are very negative about bots, such as Data Collective’s Bradford Cross (“bots will go bust in 2017”), see an enterprise bot case for process automation.

But as mentioned above, automating simple one-question/one-answer requests akin to FAQs has its limits. If you as a consumer have lost your credit card, you don’t want an FAQ answer telling you where to get a credit card. You want to get a new credit card.

Reason 3: Enterprises want to improve the quality of customer support and have a hard time doing so

Pretty much all enterprises want to improve the quality of their customer support. Bots provide a new, if limited, approach to doing so. Bots are available 24/7, won’t quit their job and are excellent at remembering a company’s products.

Improving customers’ happiness with their customer support experience (as measured via NPS, for example) is really hard for companies to do. There’s a natural ceiling most of them hit and can’t go beyond, regardless of the merits of their products or the amount of resources an enterprise throws at customer support.

The natural ceiling of customer support quality is tied to the short time customer support reps stay in their jobs

It turns out that this ceiling is tied to the very short life cycle of customer support employees. Very often they stay in their jobs for only a couple of months on average. In the idealized example illustrated above it’s 4 months, the figure I have heard most often from Rasa NLU customers. In the beginning, the quality of the support is fairly low while the customer support person is in training. We have all been through the understandable “Sorry, it’s my first day” excuses. As training kicks in, the quality of the support gets better.

But after a few months on average, CS representatives become demotivated and the quality of their work starts dropping before they leave the company, along with the knowledge they’ve gained about the company’s products and, more importantly, its customers.

The more complex a product portfolio and its fit for specific customer segments get, the bigger the associated challenges. If the internal customer support FAQ covers over 100 products and 200+ website URLs (as is common in insurance), then the customer support person will likely never reach, within those 4 months, the point of being able to communicate the specific product-customer fits and address related customer problems well. It’s a know-how barrier large corporations can’t easily cross.

Even the current bots already improve upon this situation. Bots are available 24/7 at low cost, won’t quit their job and are excellent at remembering a company’s products. What they’re not good at, yet, is reacting to what the customer is actually saying when they stray from the predefined path or change the topic.

Rasa Core and the start of the corporate machine learning bot race

For us, launching a machine learning-based dialogue engine such as Rasa Core is a very big step into this next generation. Bots will not only be OK at understanding human language, but will also start answering in much smarter ways, allowing for more complex, multi-turn dialogues. They will gain knowledge of the customers’ perspective, of their actual needs and challenges, and the ability to interact. They will be able to gain and apply knowledge of company policies, such as those on bank accounts, and of how to interact with clients.

Rasa Core example — training a bot on a policy how to answer an account comparison question

Let’s take a closer look at the illustrated example above to briefly explain what a dialogue engine such as Rasa Core does.

In the example a bank bot makes suggestions for what it should do next by extrapolating from patterns it has seen in previous conversations. In this case the model is 80% sure that it should give the user a breakdown of the interest rates of different savings accounts. The bot provides its most likely next actions to the trainer, and asks for feedback to improve the model.

Rather than codifying company know-how as an explicit set of rules, it is taught through direct interaction with the bot.

In this interactive learning mode, the developer provides step-by-step feedback on what their bot decided to do. It’s kind of like reinforcement learning, but with feedback on every single step.

When your bot chooses the wrong action, you tell it what the right one would have been. The model updates itself immediately (so you are less likely to encounter the same mistake again) and once you finish, the conversation gets logged to a file and added to your training data.
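The feedback loop can be sketched with a deliberately tiny policy. Rasa Core itself trains a neural dialogue policy; the frequency-counting `ToyPolicy` below, along with the state string and action names, are invented for illustration, but the loop (predict with a confidence, let the trainer confirm or correct, update immediately) is the same idea.

```python
# Toy sketch of the interactive-learning loop: the policy extrapolates
# from past conversations, reports its confidence, and updates itself
# the moment the trainer gives feedback.

from collections import Counter, defaultdict

class ToyPolicy:
    def __init__(self):
        self.memory = defaultdict(Counter)  # state -> counts of next actions

    def predict(self, state):
        counts = self.memory[state]
        if not counts:
            return None, 0.0
        action, n = counts.most_common(1)[0]
        return action, n / sum(counts.values())

    def learn(self, state, correct_action):
        # Trainer feedback becomes training data immediately.
        self.memory[state][correct_action] += 1

policy = ToyPolicy()
state = "user asked: compare savings accounts"

# Seed with previous conversations: in 4 of 5, the right move was
# showing the interest-rate breakdown.
for seen in ["show_interest_rates"] * 4 + ["ask_clarification"]:
    policy.learn(state, seen)

action, confidence = policy.predict(state)
# → ('show_interest_rates', 0.8): 80% sure, so the bot proposes this
# action to the trainer and asks for confirmation before proceeding.
```

Correcting a wrong prediction is just another `learn` call, which is why, unlike a rule book, the model never accumulates contradictory rules.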

In a rule-based approach, at this moment you would have added a (50th or 101st) rule which would very likely conflict with a previous rule and create an error. With Rasa Core you’ve instantly resolved an edge case, without staring at your code for ages figuring out what went wrong.

Over time this builds up a database of real conversations that codifies the company’s knowledge of how users interact with your corporation’s different processes, such as account openings at your bank.

We think that we are at the start of an age where corporations will start to compete on training neural nets in the interaction know-how across their core corporate functions.

In 1.5 years, for example, we will have insurance companies whose customer support quality/NPS scores are much higher than those of their non-ML brethren.

Prediction: Companies with dialogue training will outperform companies without it in customer support within the next year
This illustration shows how a bot that keeps getting trained on real-world customer interactions can boost the quality of CS to new levels. Over time, the bot accumulates the knowledge of dozens of CS employees and can reliably help customers navigate through hundreds of different edge cases.

Non-ML insurance companies will be stuck at the old ceiling of customer support quality. They will continue to onboard new employees every 4 months and will never cross that ceiling.

Some of their ML counterparts have already been using Rasa Core in beta. They are on their way to being significantly better than their non-ML counterparts within a year.

Rasa Core has already been used by hundreds of devs in the closed beta over the last 3 months. We have seen companies build on top of multiple enterprise data sets: customers (CRM), employees (HCM), and enterprise assets (ERP/Financials). We think that with Rasa Core, all these data sets will go through a similar ML race scenario. Examples of verticals where we see our clients building include sales, HR operations and IT helpdesks.

We expect Rasa Core to be used by thousands of devs soon and are excited to see what they will build.

Thanks to Alex, Alan, Raj, Danielle, Regina.