5 Metrics to Use While Evaluating Cognitive Automation Solutions

Published in Ikarus · 6 min read · Jul 13, 2017

Choose the right metrics to evaluate your buys

Moore’s Law is popularly summarized as computing power doubling roughly every 18 months, which suggests that the rate of technological change is exponential. New technologies keep opening up new ways to tackle old problems: Machine Learning (ML), or cognitive, solutions are being applied to problems ranging from healthcare to waste management. The concern, however, is that although we are adapting to technological change faster than before, we still lag behind the rate at which technology changes. In some cases an adoption lag can be beneficial: it makes sense to hold out for an iPhone that promises more than a better camera and faster processing.

But in other situations, such as when an organization fails to capitalize on early-stage ML adoption, the lag becomes a problem: the organization misses the opportunity to build a competitive advantage in the long run.

That is not to say that early-stage adoption is simple. Decision makers in an organization face multiple challenges, such as:

· The technology is not easily understood by all decision makers

· It can be hard to determine the right use case for the technology

· The ROI becomes a challenge to calculate given the lack of defined metrics

These challenges can explain decision makers’ growing concerns about large-scale Cognitive Process Automation (CPA) projects. Business owners lack clarity about the real value of the project and find their expectations unmet.

While the first two challenges are context-specific, we have designed a technical guide to support your understanding of important metrics for assessing ML/cognitive solutions. These metrics provide an objective, evaluative frame that lets you see the solution beyond its glossy sales pitch or a brand name from another domain.

To provide a little background for these metrics, it helps to think of cognitive automation solutions as predictive ML algorithms. For instance, an ML solution can help doctors at a hospital by predicting whether an X-ray shows a fracture. Similarly, a chatbot can be used to ‘predict’ responses to user queries.

Hence, the first two metrics you should be looking at are:

1. Precision

Precision (often loosely called accuracy) is a measure of the quality of the solution: of the work the solution attempts, how much is done correctly. For example, say you have 100 documents that need to be sorted. Your ML solution correctly sorts 60 of them, incorrectly sorts 30 of them, and ignores 10 of them. In this case, the ML solution’s precision is:

Precision = correctly sorted / (correctly sorted + incorrectly sorted) = 60 / (60 + 30) ≈ 67%

Going back to the chatbot example, let us assume that the chatbot receives 500 queries. It answers correctly 400 times, incorrectly 50 times, and does not respond at all 50 times. In this case, the cognitive solution’s precision is:

Precision = correct answers / (correct answers + incorrect answers) = 400 / (400 + 50) ≈ 89%

2. Recall

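Recall is a measure of the amount of work the system can do correctly: of all the work submitted to the solution, how much is done correctly. Continuing with our example of the ML solution that sorts 100 documents, the recall is:

Recall = correctly sorted / total documents = 60 / 100 = 60%

Similarly, let us again consider the example of the customer service chatbot designed to answer customers’ queries. The recall for the chatbot is:

Recall = correct answers / total queries = 400 / 500 = 80%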
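The two worked examples above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the definitions used in this article (precision counts only the cases the solution attempted, recall counts all cases); the function name `precision_recall` is ours, chosen for illustration.

```python
def precision_recall(correct, incorrect, skipped):
    """Return (precision, recall) as defined in this article.

    precision = correct / cases the solution attempted
    recall    = correct / all cases it was given
    """
    attempted = correct + incorrect
    total = attempted + skipped
    precision = correct / attempted if attempted else 0.0
    recall = correct / total if total else 0.0
    return precision, recall


# Document sorter: 60 correct, 30 incorrect, 10 ignored.
doc_p, doc_r = precision_recall(correct=60, incorrect=30, skipped=10)

# Chatbot: 400 correct, 50 incorrect, 50 unanswered.
bot_p, bot_r = precision_recall(correct=400, incorrect=50, skipped=50)

print(f"Document sorter: precision={doc_p:.1%}, recall={doc_r:.1%}")
print(f"Chatbot:         precision={bot_p:.1%}, recall={bot_r:.1%}")
```

Note that when `skipped` is zero the two numbers are identical, which is the situation in a typical manual process.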

A Common Pitfall: At Ikarus, we have often noticed that buyers misunderstand precision and recall. This largely happens because in a typical manual process there are no cases where the output (‘prediction’) is not available: a person always produces some result for every case. Hence, in a manual process precision and recall are the same. But understanding the difference between these terms is crucial to realizing the ROI from cognitive solutions.

Tradeoff between Precision and Recall: Solutions can be trained to improve both quality and quantity of work done (i.e., precision and recall). However, after a certain point, a tradeoff will need to be made between precision and recall.

Higher Precision: For business cases where the value of quality is higher, higher precision should be chosen. A good example would be many processes in the healthcare domain, given the high overall costs of wrong diagnosis or treatments. Cognitive solutions designed for healthcare should be optimized for higher precision so that the ML solution responds only when it can give a correct output.

Higher Recall: Processes that are low-risk and high-volume can be optimized for high recall. Document sorting is one such process, where it is important for an organization to prevent backlogs in its incoming communication by sorting as much mail as possible. The business tradeoff in this case is that a solution that sorts only a few documents, all of them correctly, is less beneficial than one that quickly sorts many documents for further review.
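One common way this tradeoff shows up in practice is through a cutoff on how sure the solution must be before it responds. The sketch below uses made-up data (ten hypothetical predictions, each with a confidence score and a flag for whether it was correct) to illustrate how raising the cutoff tends to raise precision and lower recall. It is an illustration of the idea, not a description of any particular product.

```python
# Hypothetical (confidence, was_correct) pairs for ten predictions.
predictions = [
    (0.95, True), (0.90, True), (0.85, True), (0.80, False), (0.75, True),
    (0.60, True), (0.55, False), (0.40, True), (0.30, False), (0.20, False),
]


def evaluate(threshold):
    """Precision and recall if the solution only responds at or above `threshold`."""
    answered = [ok for conf, ok in predictions if conf >= threshold]
    correct = sum(answered)  # True counts as 1
    precision = correct / len(answered) if answered else 0.0
    recall = correct / len(predictions)
    return precision, recall


for threshold in (0.2, 0.5, 0.8):
    p, r = evaluate(threshold)
    print(f"threshold={threshold:.1f}  precision={p:.2f}  recall={r:.2f}")
```

With this sample data, the strictest cutoff gives the highest precision and the lowest recall, which is the choice a healthcare process would lean towards; the most lenient cutoff suits a high-volume sorting process.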

3. Confidence Level

So far, we have spoken about the big-picture metrics. Confidence level is a metric for individual predictions: it is a score that tells you how sure the solution is about each prediction. For instance, let’s say you are extracting information from documents such as invoices, and want different pieces of information like ‘date’, ‘quantity’, ‘item description’, etc. You can require a high confidence level for the most critical fields in your process to improve precision, while accepting a lower confidence level for non-critical fields so that the overall recall doesn’t suffer.
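A minimal sketch of this idea for the invoice example follows. The field names, threshold values, and the `route_fields` helper are all hypothetical; a real solution will expose confidence scores in its own format.

```python
# Hypothetical per-field thresholds: strict for critical fields, lenient otherwise.
FIELD_THRESHOLDS = {"date": 0.95, "quantity": 0.95, "item_description": 0.60}
DEFAULT_THRESHOLD = 0.90  # assumed fallback for fields not listed above


def route_fields(extracted):
    """Split extracted fields into auto-accepted values and fields for manual review.

    `extracted` maps a field name to a (value, confidence) pair.
    """
    accepted, needs_review = {}, []
    for field, (value, confidence) in extracted.items():
        if confidence >= FIELD_THRESHOLDS.get(field, DEFAULT_THRESHOLD):
            accepted[field] = value
        else:
            needs_review.append(field)
    return accepted, needs_review


invoice = {
    "date": ("2017-07-13", 0.98),
    "quantity": ("12", 0.91),
    "item_description": ("A4 paper, 500 sheets", 0.72),
}
accepted, needs_review = route_fields(invoice)
print("Accepted:", accepted)
print("Needs review:", needs_review)
```

Here the quantity is sent for manual review because 0.91 is below the strict cutoff for critical fields, while the item description is accepted at a lower confidence.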

4. Throughput

Throughput is a measure of the processing speed of the solution. It can be defined as the number of documents/cases the solution can process in a day. Typically, throughput can be adjusted to meet requirements by changing the solution’s computing resources.
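As a rough planning aid, throughput can be estimated with simple arithmetic. The sketch below assumes that capacity scales linearly with the number of workers (machines or processes), which is a simplification; the rates used are illustrative numbers, not measurements.

```python
import math


def daily_throughput(docs_per_hour_per_worker, workers, hours_per_day=24.0):
    """Documents processed per day, assuming linear scaling across workers."""
    return docs_per_hour_per_worker * workers * hours_per_day


def workers_needed(target_docs_per_day, docs_per_hour_per_worker, hours_per_day=24.0):
    """Smallest number of workers that meets the daily target under the same assumption."""
    per_worker_per_day = docs_per_hour_per_worker * hours_per_day
    return math.ceil(target_docs_per_day / per_worker_per_day)


print(daily_throughput(docs_per_hour_per_worker=200, workers=2))
print(workers_needed(target_docs_per_day=25000, docs_per_hour_per_worker=200))
```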

5. Retraining

Retraining is a useful metric for businesses where data changes frequently. In those cases, the cognitive solution requires retraining so that precision and recall scores don’t drop. Before embarking on any robotic process automation (RPA) or CPA project, decision makers should anticipate the scenarios in which the data changes. This can help you answer and prepare for questions like: Will the software learn by itself, or will engineering effort be required? If so, who provides the retraining effort? What scenarios require it? How frequently will the solution require retraining?
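One simple way to decide when retraining is due is to compare recent precision and recall against the scores measured at go-live. The sketch below is an assumption about how such a check could look; the `needs_retraining` name, the 5-point tolerance, and the sample scores are all illustrative.

```python
def needs_retraining(baseline, recent, tolerance=0.05):
    """Flag retraining when recent precision or recall falls more than
    `tolerance` below the baseline measured at go-live."""
    return any(
        baseline[metric] - recent[metric] > tolerance
        for metric in ("precision", "recall")
    )


baseline = {"precision": 0.90, "recall": 0.80}
stable_month = {"precision": 0.89, "recall": 0.78}
drifted_month = {"precision": 0.82, "recall": 0.79}

print(needs_retraining(baseline, stable_month))
print(needs_retraining(baseline, drifted_month))
```

In this example the second month would be flagged, because precision has fallen by eight points while the tolerance is five.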

Additional metrics like computation requirements, data validity, data diversity, etc., are context-driven and may not apply to your process or organization. However, the metrics listed above are valid across all processes and industries that use predictive cognitive solutions. As a decision maker for your organization, it would be beneficial to map your process to these metrics and prioritize the ones that hold the most value for your process. This would help you create a personalized framework that evaluates different solutions against your own list of benchmarks.
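Such a personalized framework can be as simple as a weighted scorecard. The sketch below is one possible way to build it, not a prescribed method: the weights, vendor names, and scores (each normalized to a 0 to 1 scale) are invented for illustration and should be replaced with your own priorities and measurements.

```python
# Hypothetical weights reflecting how much each metric matters to your process.
weights = {
    "precision": 0.4,
    "recall": 0.2,
    "confidence": 0.1,
    "throughput": 0.1,
    "retraining": 0.2,
}

# Hypothetical scores per solution, each normalized to the range 0..1.
solutions = {
    "Vendor A": {"precision": 0.9, "recall": 0.6, "confidence": 0.8,
                 "throughput": 0.7, "retraining": 0.5},
    "Vendor B": {"precision": 0.7, "recall": 0.9, "confidence": 0.6,
                 "throughput": 0.9, "retraining": 0.8},
}


def weighted_score(scores, weights):
    """Weighted average of the metric scores; weights need not sum to 1."""
    total_weight = sum(weights.values())
    return sum(weights[m] * scores[m] for m in weights) / total_weight


ranking = sorted(solutions, key=lambda name: weighted_score(solutions[name], weights),
                 reverse=True)
for name in ranking:
    print(f"{name}: {weighted_score(solutions[name], weights):.2f}")
```

Changing the weights changes the ranking, which is the point: the same two solutions can rank differently for a healthcare process than for a mail-sorting process.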

The key is, thus, to not let technical speak or slick marketing impede your ability to make an informed decision about any investments into a cognitive automation solution for your organization.

Ikarus is a software company that develops solutions to automate manual repetitive tasks for businesses. We have created a Cognitive Process Automation framework which leverages advanced machine learning algorithms to work on unstructured text documents like invoices, forms, contracts and emails. Using our solutions, businesses can automate their processes with high accuracy and significant savings in cost and time.
Subscribe to our newsletter here: http://eepurl.com/cL2Etr


Written by Team Ikarus