An Answer to “Anatomy of a chatbot”

Or why “develop a chatbot in 10 minutes” is simply a false promise!

“Develop a chatbot in 10 minutes!” We’ve all read those catchy headlines! And I’m a bit tired of them. They create false expectations, they make false promises, and in the longer term they may damage this nascent industry. That’s what happened in the speech recognition industry, and it took years to recover.

In an attempt to build a more realistic business case for their platform, Hristo Borisov from Progress compared the cost of developing a chatbot from scratch using NLU services like Wit.ai, Api.ai, or LUIS, with the darvin.ai platform. Unfortunately, the comparison falls short in several respects. Let me address some of them here and show that developing chatbots is not as simple as many say.

In fact, I see two main problems with the post.

First, claiming the application can be developed 16x faster is highly misleading. The comparison ignores most of the development activities involved in building a chatbot. Building the dialogue itself is just a fraction of the total effort and cost. (In the comparison, the integration costs are simply ignored!) Here are some of those missing activities:

  • Discovery. Unless your customers know exactly what they want, you will need a discovery phase to determine their needs and how a chatbot can benefit their organization as a complement to other channels.
  • Design of the conversational user interface (CUI). This should go without saying, but it was completely left out of the evaluation. Choosing the right persona, setting the tone of the conversation, scripting the conversation, etc. are fundamental to the success of your chatbot. And please, do yourself a favor and hire a real CUI designer to do that, not the first available mobile developer in search of a job (now that apps are dead ;-)…
  • Initial data gathering/extraction/labelling for training the natural language understanding (NLU/NLP) module. Ideally, you would start with some data from available chat sessions or other relevant sources. Otherwise, you will have to provide a first approximation for the training data (generating sentences from a more compact representation, like regular expressions or grammars, can help) and carefully tune and optimize the NLU once in production.
  • Interaction with the customer & project management. Of course, you’re gonna be doing your development in agile mode, with scrum meetings, and the like. As a rule of thumb, you can add 15% overhead to the project just for project management.
  • Setting up your various environments. Please, don’t tell me you develop in your production environment! You’re not such a risk taker, right? You will set up multiple environments (at least dev, staging, prod). And how do you migrate your chatbot from dev to staging or from staging to prod? You may need to write some scripts to do that.
  • Testing! How will you test your chatbot? Will you just use Facebook Messenger and do some anecdotal tests? Or will you set up automated tests? How do you make sure modifications to the dialogue structure do not break your chatbot? These considerations must be well thought out at the start of the project, not as an afterthought.
  • Post-deployment tuning and optimization. You cannot just release your chatbot into the wild and forget about it. Unless you are Google or Facebook and can rely on terabytes of data to train your NLU, you will always encounter unexpected input from chatbot users. You will need to collect interactions and perform (semi-)manual annotations of your data prior to benchmarking your system.
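
To make the data gathering point a bit more concrete, here is a minimal sketch of generating a first approximation of NLU training data from a compact representation. Everything in it is an assumption for illustration: the `check_balance` intent, the templates, the slot values, and the function names are all hypothetical, and a real project would export the result in whatever format its NLU service expects.

```python
import re
from itertools import product

# Hypothetical templates for one intent; {slot} placeholders get expanded.
TEMPLATES = {
    "check_balance": [
        "{ask} my {account} balance",
        "how much money is in my {account} account",
    ],
}

# Hypothetical slot values used to fill the placeholders.
SLOTS = {
    "ask": ["what is", "show me", "tell me"],
    "account": ["checking", "savings"],
}


def expand(template, slots):
    """Yield every sentence obtained by filling the template's slots."""
    # Collect slot names in order of appearance, without duplicates.
    names = list(dict.fromkeys(re.findall(r"\{(\w+)\}", template)))
    for values in product(*(slots[name] for name in names)):
        yield template.format(**dict(zip(names, values)))


def generate_training_data(templates, slots):
    """Return a list of (sentence, intent) pairs for all templates."""
    return [
        (sentence, intent)
        for intent, intent_templates in templates.items()
        for template in intent_templates
        for sentence in expand(template, slots)
    ]


if __name__ == "__main__":
    for sentence, intent in generate_training_data(TEMPLATES, SLOTS):
        print(f"{intent}\t{sentence}")
```

Data generated this way is only a starting point. It reflects what the designer imagines users will say, not what they actually say, which is exactly why the post-deployment tuning step cannot be skipped.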

But I think the more important issue is that the post completely ignores the main problem with chatbots today. It’s not that they take so long to develop, but that most of them provide a terrible user experience as soon as you go beyond simple menus and allow freeform text. (My colleague Linda wrote a nice post about that, by the way.) NLU is far from being a solved problem, and a highly tuned NLU system is required in order to offer a great UX. Add to that the fact that robust dialogue management is still an active research field (I have not yet seen a chatbot with graceful and effective error recovery).

Note that all the missing activities above aim at developing chatbots with a great CUI.

The real question is: are we just, as an industry, trying to provide tools that accelerate the development of bad chatbots? I don’t think that’s a worthwhile goal!

By Dominique Boucher