Teach your AI to say NO

Damien GILLES
alter-way-innovation
3 min read · Aug 29, 2019

Automation plays an ever-growing role in society, but it raises a new problem: can we trust these automated systems? Keeping error rates to a bare minimum and detecting edge cases become more and more important as our AIs take part in more critical missions. Moreover, humans tend to follow a machine's decisions blindly, which can make mistakes even more costly. To mitigate these issues, we researched a way for our AIs to call for human help.

Our use case

Alter Way offers application maintenance services to its clients. The current workflow requires clients to fill out a form with a category (demand or anomaly), a title and a description. The issue is then further classified and resolved by our agents, as discussed in our previous article.

To improve this workflow, we are working on automating most of the classification. However, we must ensure that tickets are correctly classified so that issues can be prioritized and the critical ones dealt with first.

A third class: “human”

We realized early on that automating our process could lead to mistakes, and that operators are unlikely to edit a classification once a prediction has been made. In this context, we explored ways to call for human help when our algorithm is uncertain.

Our binary classifier is trained with stochastic gradient descent, so it returns the probability of a ticket belonging to a class. If the probability is over 50%, the ticket is tagged with that class. The naive approach to flagging cases where the algorithm is unsure is to create a third class for tickets whose probability is close to 50%. However, this approach still lets through many errors on issues that our algorithm simply does not understand.
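As a rough illustration, here is a minimal Python sketch of that naive approach using scikit-learn. The toy tickets, the TF-IDF features and the 0.40 to 0.60 uncertainty band are illustrative assumptions, not our production setup.

```python
# Minimal sketch of the naive approach: an SGD-trained logistic model returns a
# probability, and tickets whose probability falls near 50% are routed to a
# third "human" class. Data, features and thresholds are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Tiny stand-in dataset (hypothetical): "anomaly" vs "demand" tickets.
train_texts = [
    "production site returns 500 errors since this morning",
    "database connection timeouts on the backend",
    "please add a new user account for the marketing team",
    "we would like a new export button on the dashboard",
]
train_labels = ["anomaly", "anomaly", "demand", "demand"]

# Logistic loss trained with stochastic gradient descent exposes probabilities.
model = make_pipeline(
    TfidfVectorizer(),
    SGDClassifier(loss="log_loss", random_state=0),
)
model.fit(train_texts, train_labels)

def naive_classify(ticket_text, lower=0.40, upper=0.60):
    """Return 'anomaly' or 'demand', or 'human' when the model is unsure."""
    classes = list(model.classes_)
    p_anomaly = model.predict_proba([ticket_text])[0][classes.index("anomaly")]
    if lower <= p_anomaly <= upper:
        return "human"  # close to 50%: ask a human to qualify the ticket
    return "anomaly" if p_anomaly > 0.5 else "demand"

print(naive_classify("the website is down"))
```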

We found a way out while working with OpenReq, by exploring and benchmarking multiple algorithms that you can find on their GitHub. We realized that while our algorithm and theirs have about the same precision, they make quite different mistakes.

By combining the naive approach with the disagreement between the two algorithms' results, we improved our error rate dramatically. Indeed, our error rate was reduced by 63% by adopting this workflow (SS is Alter Way, RI is OpenReq):

This decision tree reduces our error rate to 3%, while requiring human assistance only 5% of the time.
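For the curious, here is a hedged sketch of how such a combination could look in code. It reuses the toy data and model from the previous snippet, with a Naive Bayes classifier standing in for the OpenReq model; the real decision tree shown above may differ in its exact thresholds and branching.

```python
# Sketch of combining both signals: a ticket goes to a human whenever our model
# is uncertain or the two models disagree. MultinomialNB is only a stand-in for
# the OpenReq classifier; it is not the actual OpenReq implementation.
from sklearn.naive_bayes import MultinomialNB

openreq_model = make_pipeline(TfidfVectorizer(), MultinomialNB())
openreq_model.fit(train_texts, train_labels)

def combined_classify(ticket_text, lower=0.40, upper=0.60):
    """Return 'anomaly', 'demand', or 'human'."""
    classes = list(model.classes_)
    p_anomaly = model.predict_proba([ticket_text])[0][classes.index("anomaly")]
    our_label = "anomaly" if p_anomaly > 0.5 else "demand"
    their_label = openreq_model.predict([ticket_text])[0]

    if lower <= p_anomaly <= upper:
        return "human"   # our model is unsure: ask a human to qualify the ticket
    if our_label != their_label:
        return "human"   # the two models disagree: flag for human review
    return our_label     # confident and in agreement: trust the prediction

print(combined_classify("the website is down"))
```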

In a nutshell

We managed to gain trust in our system by allowing it to refuse to qualify an issue. When it does, we require a human to fully qualify the issue. This mechanism reduced our error rate by 5 percentage points compared to our initial algorithm. This kind of cross-validation is a safety mechanism we must implement to avoid inexplicable and potentially harmful decisions by our automated systems.

Thanks to Jonathan Rivalan for making this topic possible.
