Building a truly conversational chatbot takes more than 30 minutes
Many chatbot developers and chatbot development tool providers claim that one can build a chatbot in 30, 10, 7, or even 5 minutes. While it is indeed possible to build a basic chatbot in a few minutes, the result is well in line with the effort. Typically, such quickly built chatbots provide a very simple tree-like directed dialog with limited choices in the form of buttons or a list of options. More often than not (most of the time, really), just answering the question in text is enough to break the chatbot. Some understand a few text queries, but no sophisticated natural language understanding (NLU) can be built into a chatbot in mere minutes.
Building a truly conversational chatbot requires significantly more effort and the journey is much longer. This post is meant to describe part of our experience building a conversational chatbot in the context of an omnichannel customer service contact center, and more specifically, building our own chatbot dialogue design tools. Other posts from our team will explore several different topics, from development and integration challenges to benchmarking NLU engines.
Conversational Chatbot Requirements
Early on in our project, we defined a series of goals and objectives for all aspects that were in scope. Overall, we wanted to build a fully conversational chatbot, integrated in an omnichannel contact center solution, with the following main characteristics:
- Allow the user to complete a task from beginning to end: the first use case we have been working on is an address change/moving chatbot for a utilities organisation, where the user can change their address, schedule an appointment with a technician, select a communication channel to be notified prior to the technician visit, and confirm and validate the information.
- Provide a conversational experience to the user: this means proposing a natural, non-linear dialogue, where the user is able to answer questions in their own words, switch topics, interrupt the flow with a query, etc. In other words, mixed initiative dialogues.
- On the other hand, the dialogue should be directive enough for users to know exactly what to do, and to bring them back to the main path when they deviate from the task or when issues occur.
- Understand a wide variety of natural language queries, in the scope of the chatbot’s use cases.
- Support buttons when they allow for a more efficient selection, but always support user-typed responses.
- Build human-assisted chatbot support: to monitor NLU performance and make corrections on the fly, to inject responses to the user in lieu of the chatbot for unsupported queries, to decide when to transfer the user to a human agent, etc.
- Support global “commands”, for example, allowing the user to ask for a human agent at any time.
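The global "commands" requirement can be illustrated with a minimal sketch. This is not our actual implementation: the intent names, the `GLOBAL_COMMANDS` table, and the `next_action` function are all hypothetical, and the example only assumes that the NLU layer returns an intent name with a confidence score. The idea is simply that global commands are checked before any rule tied to the current step of the dialogue.

```python
from dataclasses import dataclass


@dataclass
class NluResult:
    """Engine-independent NLU output: an intent name plus a confidence score."""
    intent: str
    confidence: float


# Hypothetical global commands, mapped to the action they trigger.
GLOBAL_COMMANDS = {
    "request_agent": "transfer_to_agent",
    "help": "show_help",
}


def next_action(result: NluResult, current_step: str) -> str:
    """Pick the action for this turn, giving global commands priority."""
    # A global command applies wherever the user is in the dialogue.
    if result.intent in GLOBAL_COMMANDS:
        return GLOBAL_COMMANDS[result.intent]
    # Otherwise the input is handed to the step currently in progress.
    return f"handle:{current_step}"


print(next_action(NluResult("request_agent", 0.93), "collect_new_address"))
print(next_action(NluResult("provide_address", 0.88), "collect_new_address"))
```

In a real dialogue manager the current step would also need to be remembered, so that the conversation can resume where it left off once the global command has been handled.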
Before we actually started creating our dialogues, we defined the main use cases that we wanted to support, in the scope of an address change/moving chatbot. We also defined a persona for our chatbot, as well as the main user archetypes. We then wrote a series of sample dialogues, in persona, to illustrate happy paths and alternate paths that we wanted to make sure we supported in initial versions. This methodology, inspired by voice user interface design, is also appropriate for designing chatbot interactions, especially when one of the goals is to create conversational interactions.
The next step was to design dialogues per se, and this is where we quickly realized that a tree-like dialogue structure was too restrictive and not appropriate to illustrate mixed initiative interactions. Most out-of-the-box chatbot development solutions on the market offer a dialogue design interface consisting of, basically, boxes and arrows. As soon as the dialogue deviates ever so slightly from a linear path, the flows become entangled like spaghetti. Other development solution providers offer more sophisticated tools to define dialogues, but most fall short when trying to define complex mixed initiative interactions.
We came to the early conclusion that we needed to develop an in-house dialogue manager in order to meet our requirements.
Dialogue Manager Requirements
To design and develop mixed initiative conversational dialogues, we needed a model that would give us a lot of flexibility, but that could also be used by non-developers. Why? Because we believe that user interface designers, especially VUI designers, have what it takes to write natural chatbot dialogues, but many do not have a developer’s background.
With these objectives in mind, a list of requirements was put together:
- User input should be intent- and entity-based, and the NLU layer should be engine-independent.
- Rules can be based on context: taking into account where we are in the dialogue, what we know about the user at that point, what task or subtask is in progress, what information we are collecting, etc.
- The DM should support contextual interpretation of generic entities, for example, cardinal or ordinal numbers.
- Provide the ability to disambiguate conflicting/ambiguous input.
- Provide the ability to confirm uncertain input (and define thresholds).
- Provide the ability to go back and forth in the dialogue to change or validate previously entered information.
- Provide reusable generic dialogue patterns, e.g. “collect & confirm”.
- Multilingual support: the core dialogue must be language-independent, and multilingual NLU and prompts must be supported.
- Handle multiple elements in a response without having to define multiple rules, whether intents or entities.
- Provide tools that enable the dialogue designer to run and test the dialogue independently.
This list of requirements is evolving as we move forward in our project, and the dialogue manager is also evolving, following an iterative process. Tools are also being added to our portfolio, among them a unit testing suite and sophisticated NLU benchmarking tools.
Although we are still in the process of improving our tools and methodology, the resulting dialogues that we have been able to create so far via this dialogue manager are complex and flexible, and we are getting closer to meeting our initial conversational chatbot requirements.
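To make two of the dialogue manager requirements more concrete, here is a small sketch of confirming uncertain input against thresholds and of interpreting a generic ordinal in context. It is an illustration under stated assumptions, not the dialogue manager itself: the threshold values, the `ORDINALS` table, the `offered_options` context key, and the function names are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class NluResult:
    """Engine-independent NLU output: intent, confidence, and entities."""
    intent: str
    confidence: float
    entities: Dict[str, str] = field(default_factory=dict)


# Hypothetical thresholds; real values would be tuned per use case.
ACCEPT_THRESHOLD = 0.80
CONFIRM_THRESHOLD = 0.50


def decide(result: NluResult) -> str:
    """Accept confident input, confirm uncertain input, reprompt otherwise."""
    if result.confidence >= ACCEPT_THRESHOLD:
        return "accept"
    if result.confidence >= CONFIRM_THRESHOLD:
        return "confirm"
    return "reprompt"


# Generic ordinal words mapped to list positions (assumed vocabulary).
ORDINALS = {"first": 0, "second": 1, "third": 2, "last": -1}


def resolve_ordinal(result: NluResult,
                    context: Dict[str, List[str]]) -> Optional[str]:
    """Interpret an ordinal entity against the options currently on offer."""
    word = result.entities.get("ordinal")
    options = context.get("offered_options", [])
    if word not in ORDINALS or not options:
        return None
    index = ORDINALS[word]
    if index >= len(options):
        # The user named a position that was not offered.
        return None
    return options[index]


slots = ["Monday 9:00", "Tuesday 14:00"]
answer = NluResult("select_option", 0.62, {"ordinal": "second"})
print(decide(answer))
print(resolve_ordinal(answer, {"offered_options": slots}))
```

The same ordinal, "second", would resolve to a different value if the dialogue were offering notification channels instead of appointment slots, which is why the interpretation belongs in the dialogue manager rather than in the NLU engine.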
Oh, has it been 30 minutes already?
Our experience building truly conversational chatbots is quite different from what is advertised, and I think that claiming a chatbot can be built in minutes creates expectations that such a simplistic approach cannot possibly meet. Building a conversational chatbot is not much easier than creating a natural language speech application, if one truly wants to offer a real conversational experience. If the goal is to create a basic chatbot with buttons, it has to be advertised as such. Otherwise, we are all shooting ourselves in the foot by pretending to offer a so-called conversational experience and grossly underdelivering.