Photo Credit: Jeremy Thomas

How to implement Human-in-the-loop for chatbots and natural language interfaces

Natural interfaces, with users expressing their needs in natural language, are the new way to engage customers.

What do they all, including chatbots like Facebook M and meeting schedulers, have in common? Behind the scenes, algorithms and humans are working in tandem to answer requests.

Handling a natural language request is typically a 3-step process:

Understand -> Act -> Respond

Let’s describe it with a simple flight information example, where the user asks “What’s the status of FR 8542?”.

  1. Understand. User is interested in a flight status and the flight is “FR 8542”.
  2. Act. Some databases or APIs are interrogated to find flight status.
  3. Respond. An answer is written using the information found during the previous steps.
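
The three steps can be sketched in a few lines of Python. This is a minimal illustration rather than a production design: the regular expression, the `FLIGHT_STATUS` dictionary standing in for a real database or API, and all function names are assumptions made for this example.

```python
import re
from typing import Optional

# Hypothetical stand-in for a real flight database or API.
FLIGHT_STATUS = {"FR8542": "on time"}

# Assumed flight number pattern: two letters, an optional space, 1 to 4 digits.
FLIGHT_RE = re.compile(r"\b([A-Za-z]{2})\s?(\d{1,4})\b")


def understand(request: str) -> Optional[dict]:
    """Step 1: extract the intent and the flight number from the text."""
    if "status" not in request.lower():
        return None
    match = FLIGHT_RE.search(request)
    if match is None:
        return None
    flight = (match.group(1) + match.group(2)).upper()
    return {"intent": "flight_status", "flight": flight}


def act(parsed: dict) -> Optional[str]:
    """Step 2: query the data source for the flight status."""
    return FLIGHT_STATUS.get(parsed["flight"])


def respond(parsed: dict, status: Optional[str]) -> str:
    """Step 3: write an answer from what the previous steps found."""
    if status is None:
        return f"Sorry, I could not find flight {parsed['flight']}."
    return f"Flight {parsed['flight']} is {status}."


def handle(request: str) -> str:
    """Run the full Understand -> Act -> Respond chain."""
    parsed = understand(request)
    if parsed is None:
        return "Sorry, I did not understand the request."
    return respond(parsed, act(parsed))
```

Each of the two "Sorry" branches is a place where, in a Human-in-the-loop setup, the request could be handed to a person instead of returning an apology.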

Having people intervene in any of these steps is called Human-in-the-loop. The goal of Human-in-the-loop is to provide a good user experience when algorithms are not up to the task, while collecting data so that algorithms can improve and reduce human intervention.

There are different reasons for human intervention. Let’s look at some of them.

When humans intervene

Photo Credit: Danielle MacInnes
  • Algorithms don’t understand. Simpler interfaces could use regular expressions (regex) instead of machine learning. A regex could be programmed to parse “Hello”, “Hi”, and “Hey”, but if the user says “Ciao” the algorithms would not understand it and the request would get routed to a human. Where machine learning is used to understand requests, algorithms may be designed to also provide a confidence level. Below a certain threshold (e.g. 80% confidence of having understood the user), humans intervene.
  • Algorithms don’t know how to perform the action. There may be actions that are not automated yet, or actions that can’t yet be automated. In Facebook M, you can ask for a drawing of a flower, and a person will draw it and send you a picture.
  • Action not successful. Going back to the flight example, if the interrogation of the database does not return a flight status, then it could be that algorithms did not understand something properly or the user made a typo. There can be several failure processes put in place, routing to humans being the easiest to start with.
  • Request is out of scope. What if, in our flight status interface, a user asks to change their reservation?
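
The four situations above can be expressed as explicit routing rules. The sketch below is one possible shape, assuming a hypothetical `Understanding` result that carries an intent and a confidence score. The 0.8 threshold mirrors the 80% example, while the intent names and the `KNOWN_INTENTS` and `AUTOMATED_INTENTS` sets are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Route(Enum):
    ALGORITHM = "algorithm"
    HUMAN = "human"


@dataclass
class Understanding:
    intent: Optional[str]  # None when nothing was recognized
    confidence: float      # between 0.0 and 1.0


CONFIDENCE_THRESHOLD = 0.8                         # assumed, from the 80% example
KNOWN_INTENTS = {"flight_status", "draw_picture"}  # hypothetical in-scope requests
AUTOMATED_INTENTS = {"flight_status"}              # hypothetical automated actions


def route(understanding: Understanding,
          action_succeeded: Optional[bool] = None) -> Tuple[Route, str]:
    """Decide who handles the request and record the reason why."""
    if understanding.intent is None or understanding.confidence < CONFIDENCE_THRESHOLD:
        return Route.HUMAN, "not understood"
    if understanding.intent not in KNOWN_INTENTS:
        return Route.HUMAN, "out of scope"
    if understanding.intent not in AUTOMATED_INTENTS:
        return Route.HUMAN, "action not automated"
    if action_succeeded is False:
        return Route.HUMAN, "action failed"
    return Route.ALGORITHM, "ok"
```

Returning the reason together with the route is a deliberate choice here: it keeps a record of why a human was needed, which is the kind of data that can later help reduce intervention.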

Companies using Human-in-the-loop are betting that they can reduce the human factor over time by improving algorithms. Consider the case of real-time interactions, like a chat. How could you scale such a service if algorithms don’t improve over time?

Algorithms can be improved only if the process of collecting data is implemented properly. That’s why the way humans intervene is so important.

How to implement Human-in-the-loop correctly


The Human-in-the-loop process should be implemented correctly from the beginning.

Humans can be supervisors, providing guidance and teaching the algorithms, or colleagues, carrying out the tasks instead of algorithms.

It may be tempting to start with colleagues doing all the activities and no algorithms until you collect enough data. There are several drawbacks to this approach:

  • No way to handle peaks. If you have algorithms, you have the option to have them handle some users autonomously during peak times.
  • Organization misalignment. You will have people who got used to performing colleague tasks and who will now need to learn to work differently, as supervisors. These changes will take place when load and user expectations are already high.
  • A dataset from which algorithms cannot learn. Your dataset of interactions will be raw, without the additional information the algorithms need in order to learn, so you will need people to process the dataset before it can be fed to the algorithms. You may use other algorithms for this task, but algorithms make mistakes too, and introducing errors into the training dataset is not a good idea.

In contrast, Supervisors override algorithms in a way that is informative and can be fed directly back to algorithms for learning. Supervisors usually act through a dedicated interface. Here are some examples:

  • Understanding the user. Let’s say an algorithm can’t find the flight number in the user request. The supervisor would check the request and add the flight number to the other structured information already found by the algorithm. The supervisor interface would record this human activity and add the annotated request either to the training dataset, so that algorithms can learn from it, or to the test dataset, so that it can be used to check whether they learned. A good supervisor interface would let humans go further in helping the algorithms, for example by highlighting the part of the request from which the information was taken, tagging requests as “out of scope” to discard them from the training, or tagging them as “field not implemented” to pinpoint missing algorithm features.
  • Taking action. Algorithms can output a tentative plan of how they would perform the action (e.g. how they would structure the query to the database) and, if required, humans could change that plan. Again, the supervisor interface should enable humans to change the plan in a way that can be easily fed back to the algorithms.
  • Generating the answer. The simplest way to supervise answer generation is to have the algorithm suggest its top 3 answers, like Google Inbox Smart Reply, and have humans choose the most appropriate one. If none is correct, humans could amend the best answer, with the differences being tracked.
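
To make the first example more concrete, here is a rough sketch of what a supervisor interface might store when a human adds a missing flight number. Every name in it (`Annotation`, `record_correction`, `dataset_for`, the 20% test share) is an assumption for illustration; the text above does not prescribe a schema.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class Annotation:
    request: str
    fields: Dict[str, str]  # structured information, e.g. {"flight": "FR8542"}
    corrected_by_human: List[str] = field(default_factory=list)
    spans: Dict[str, Tuple[int, int]] = field(default_factory=dict)  # highlighted text
    tags: List[str] = field(default_factory=list)


def record_correction(algo_fields: Dict[str, str], request: str, name: str,
                      value: str, highlight: Optional[str] = None,
                      tags: Iterable[str] = ()) -> Annotation:
    """Merge a supervisor's correction into the algorithm's output."""
    ann = Annotation(request=request, fields=dict(algo_fields), tags=list(tags))
    ann.fields[name] = value
    ann.corrected_by_human.append(name)
    if highlight is not None:
        start = request.find(highlight)
        if start >= 0:
            # Remember where in the request the information was taken from.
            ann.spans[name] = (start, start + len(highlight))
    return ann


def assign_split(request: str, test_share: float = 0.2) -> str:
    """Deterministically assign a request to the training or test dataset."""
    digest = hashlib.sha256(request.encode("utf-8")).digest()
    return "test" if digest[0] / 256 < test_share else "train"


def dataset_for(ann: Annotation) -> Optional[str]:
    """Out-of-scope requests are discarded; all others are split."""
    if "out of scope" in ann.tags:
        return None
    return assign_split(ann.request)
```

Hashing the request text, rather than drawing a random number, is one way to keep the same request from ending up in both datasets if it is annotated twice.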

As a general rule, humans do not perform the full task but rather make some changes to the algorithms’ output, so that those changes can be recorded in a structured way. The changed output becomes the input for the next step, and the process continues as usual.
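
One way to record such changes in a structured form is to store the difference between what the algorithm proposed and what the human actually sent. The sketch below uses Python’s standard `difflib` module on word tokens; the `track_amendment` name and the layout of the returned records are assumptions for illustration.

```python
import difflib
from typing import Dict, List


def track_amendment(suggested: str, final: str) -> List[Dict[str, str]]:
    """Return the word-level edits that turn the suggestion into the final answer."""
    before, after = suggested.split(), final.split()
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    edits = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            continue  # unchanged words carry no correction signal
        edits.append({
            "op": op,  # "replace", "delete" or "insert"
            "removed": " ".join(before[i1:i2]),
            "added": " ".join(after[j1:j2]),
        })
    return edits


# Example: the supervisor amends a suggested answer before sending it.
edits = track_amendment("Flight FR8542 is on time.", "Flight FR8542 is delayed.")
```

An empty list means the human accepted the suggestion unchanged, which is itself useful feedback for the algorithm.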

There are also other kinds of human intervention, which are less operational and more strategic.


Photo credit: REFE

Every interaction in natural language is an opportunity to hear the voice of the customer. Customer requests, when analyzed as a whole and acted upon, can drive growth, increase engagement, and generate new opportunities. Here are some of the opportunities.

  • Manage out of scope. How out of scope is managed can be very important for customer satisfaction. The basic response is telling the user that the request can’t be satisfied. Going further, integrating other services or pointing the user to a solution can result in good karma points.
  • Meet unsatisfied needs. This is a special case of the previous one. If you run an e-commerce business with a natural language interface, users may ask for products and services that you don’t have. You could simply ignore such requests, much like returning a search result with no items, or you could find solutions, such as partnering with other suppliers or creating a waiting list.
  • Understand the user. Everything unexpected coming from the user is something to learn from and a basis for future informed decisions. Because natural interfaces do not constrain the user to a limited set of interactions, they are much more informative. Users could, for example, search in ways that differ from how you structured your databases. Still, every interface brings its own biases, and this should always be taken into account.

Just as Human-in-the-loop has its processes and tools, Business-in-the-loop should be integrated into the process and planned for in advance.

There’s no doubt that Human-in-the-loop will see an increase in adoption. Will your organization be made of Colleagues or Supervisors?

Conversate is a new Artificial Intelligence startup. We built an innovative Natural Language Understanding engine, with state of the art algorithms and enterprise services. Currently in private beta, if you are interested you can sign up at or write us at
