Approachable AI — Explainability (part 2)

Jeff Kimmel
Published in Elipsa
6 min read · Oct 13, 2020
(connect the dots of the black box with explainable AI)

Elipsa’s mission is to create Approachable AI, enabling organizations to scale by empowering their business users to assume the role of the data scientist through a no-code solution. The three pillars of this mission are useability, explainability, and accessibility. This is a three-part series on those pillars and how a focus on each of these will enable broader adoption of predictive analytics and a faster journey from data to insight.

Explainability

In the first post, we discussed how useability is a hindrance preventing companies from scaling their AI capabilities. Equally important, if not more so, than an easy-to-use system for creating models is the ability to interpret and understand the results. Even organizations that have data scientists often struggle to get their hard work into production because of an AI language barrier. In other words, the typical business user does not understand the model references or statistical metrics that are common in data science vocabulary. On top of this, many models still appear to be black boxes. Many organizations cannot implement a machine learning model if they cannot explain how it arrived at its results. That requirement is sometimes internal, but it often comes from regulators, and companies are reluctant to incorporate a black-box product into their processes out of concern for regulatory backlash.

The second pillar of Approachable AI is thus explainability. The elipsa platform tackles this issue by defining the results of a model in terms that business users understand, and then looking inside the black box to shine some light on why the algorithm arrives at a given prediction.

Business Terms

Precision, recall, F1, accuracy. These metrics are integral to evaluating a machine learning classification model. However, chances are that accuracy is the only one of the four the typical business user is familiar with: how many predictions the model got right out of how many total predictions. Accuracy intuitively makes sense to the business as a metric; we want to increase the percentage of answers we get right. Unfortunately, relying solely on accuracy can result in very poor models because accuracy does not tell the full story. In fact, an over-reliance on accuracy can lead to what is referred to as overfitting. We will not get into too much technical detail on overfitting, but think of it as memorizing the answers to old tests while having no ability to correctly answer new, unseen questions.
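
For readers who like to see the numbers, here is a small illustrative sketch of the four metrics using Python and scikit-learn on made-up predictions. It is purely for illustration and is not how any particular platform computes its results:

```python
# Illustrative only: the four classification metrics on a toy set of predictions.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # actual outcomes (1 = the event we care about)
y_pred = [1, 0, 0, 0, 0, 0, 1, 1, 0, 0]  # model predictions

print("accuracy :", accuracy_score(y_true, y_pred))   # share of all answers that were right
print("precision:", precision_score(y_true, y_pred))  # of the predicted events, how many were real
print("recall   :", recall_score(y_true, y_pred))     # of the real events, how many were caught
print("f1       :", f1_score(y_true, y_pred))         # balance of precision and recall
```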

Looking at that point in a different way, most data sets are imbalanced. In other words, when trying to predict an event, your data is often lopsided toward containing more of one result than another. If you were trying to perform predictive maintenance on a piece of machinery using sensors and other information, your data would almost certainly show a working machine far more often than a failed one. So, let's say the machine part failed 10% of the time in your data. A model relying too heavily on accuracy could overfit by leaning too heavily on the data showing the machine performing well. As a result, you could run all the data through the model and it could predict 100% of the time that the machine is operating fine. We know that is not the case, but statistically the model is still 90% accurate even though it did not predict a single failure correctly.
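
That arithmetic is easy to verify. The toy snippet below (illustrative only, with made-up data) shows a "model" that always predicts the machine is fine scoring 90% accuracy while catching zero failures:

```python
# Illustrative only: always predicting "fine" on data with 10% failures
# still yields 90% accuracy, yet not a single failure is caught.
y_true = [0] * 90 + [1] * 10   # 0 = running fine, 1 = failure (10% of readings)
y_pred = [0] * 100             # the "model" always predicts "fine"

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
failures_caught = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))

print(f"accuracy: {accuracy:.0%}")            # 90%
print(f"failures caught: {failures_caught}")  # 0
```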

This is where additional statistics come in handy for producing high-performing models, but it is also where legacy systems begin to get over-complicated for the typical business user. The elipsa approach is to automate the optimization of the model against the more advanced metrics behind the scenes and then present the results in terms that users understand. We show accuracy but also introduce the concepts of minority-class accuracy and error. In addition to showing the overall accuracy, elipsa takes the event class that is underrepresented in the data (the part failure in the example above) and reports the model's accuracy on that class alone. This gives the business user confidence not only that the model is strong as a whole but also shows how effective it is on the class for which it has fewer examples. In addition to presenting accuracy in multiple ways, we look at the error rate of our predictions. Each prediction in the elipsa system comes with a predicted answer coupled with a confidence score indicating how much conviction the model has in that prediction. For the predictions the model got wrong, the error rate shows how high that conviction was. How confident was it in its wrong answer? A lower error does not fix the fact that the model got it wrong, but it gives the user confidence that the model is more likely to get similar data correct in the future.
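
To make those two ideas concrete, here is a rough sketch in plain Python of how a minority-class accuracy and an average confidence on wrong answers could be computed. The function names, formulas, and numbers are illustrative assumptions, not the elipsa platform's actual code:

```python
# Illustrative sketch: accuracy measured only on the underrepresented class,
# plus the average confidence the model had on the answers it got wrong.
def minority_class_accuracy(y_true, y_pred, minority_label=1):
    idx = [i for i, t in enumerate(y_true) if t == minority_label]
    return sum(y_pred[i] == minority_label for i in idx) / len(idx)

def average_error_confidence(y_true, y_pred, confidence):
    wrong = [confidence[i] for i in range(len(y_true)) if y_pred[i] != y_true[i]]
    return sum(wrong) / len(wrong) if wrong else 0.0

y_true     = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred     = [1, 0, 1, 0, 0, 1, 0, 0]
confidence = [0.9, 0.8, 0.6, 0.55, 0.95, 0.7, 0.85, 0.9]  # model's conviction per prediction

print("minority-class accuracy:", minority_class_accuracy(y_true, y_pred))
print("avg confidence on wrong answers:", average_error_confidence(y_true, y_pred, confidence))
```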

(sample model results)

Transparency

With model results put into terms that the business user can understand, they are more likely to feel confident that the model is coming to the correct answer. However, that still leaves the concern of how the model got to that conclusion. Unlike a human working through the process manually, the computer cannot explain in plain language how it arrived at the answer. This black-box approach and lack of transparency is a primary reason why organizations are in some cases reluctant to incorporate AI into their existing processes and, in the case of regulated industries, why they often are not allowed to.

The elipsa platform seeks to increase the explainability of models in terms that business users understand by explaining aspects of the model as a whole and further explaining each prediction it makes. For model transparency, we focus on providing more clarity around the predictors used by the model and showing which predictors are the most relevant to model performance. We do this by automating a process that tests each predictor against the final model to show how much the model improved by including that specific predictor. This provides the business user with key insights into which predictors actually have the strongest influence on the predictions the model makes.
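
The post does not spell out the exact mechanics, but the general idea of measuring how much each predictor helps can be sketched with a generic ablation approach: retrain without each predictor and compare performance. Everything below, including the model choice, column names, and synthetic data, is a hypothetical illustration rather than elipsa's method:

```python
# Rough sketch of predictor importance by ablation: drop each predictor,
# retrain, and see how much cross-validated performance falls without it.
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def predictor_importance(X: pd.DataFrame, y, model_factory=RandomForestClassifier):
    baseline = cross_val_score(model_factory(), X, y, cv=5).mean()
    importance = {}
    for col in X.columns:
        score_without = cross_val_score(model_factory(), X.drop(columns=[col]), y, cv=5).mean()
        importance[col] = baseline - score_without  # improvement from including this predictor
    return dict(sorted(importance.items(), key=lambda kv: kv[1], reverse=True))

# Hypothetical sensor data standing in for a real predictive-maintenance set.
X_arr, y = make_classification(n_samples=200, n_features=4, n_informative=2, random_state=0)
X = pd.DataFrame(X_arr, columns=["temperature", "vibration", "pressure", "humidity"])
print(predictor_importance(X, y))
```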

In addition to analyzing the predictors on the model as a whole, we focus on explaining each prediction for the user. For each prediction, the platform decomposes the data predictors to explain how each value contributed, positively or negatively, to the predicted answer and confidence value. As an example, you could build a model that uses age, education level, and salary to predict whether a prospect will buy a luxury vehicle. A prospective buyer's age and education level might contribute positively toward predicting a purchase, while a below-average salary weighs the confidence down, resulting in a prediction that the prospect will not end up buying a vehicle.
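
A heavily simplified version of that kind of per-prediction breakdown can be illustrated with a linear model, where each predictor's contribution is its coefficient times how far the prospect's value sits from the average. This is a toy sketch with made-up data, not the attribution method elipsa actually uses:

```python
# Toy illustration of per-prediction explanations with a logistic regression:
# each predictor's contribution to the log-odds is its coefficient times the
# distance of the prospect's value from the training average.
import numpy as np
from sklearn.linear_model import LogisticRegression

features = ["age", "education_level", "salary_k"]   # salary in thousands
X_train = np.array([[45, 4, 120], [30, 2, 60], [52, 5, 150],
                    [28, 3, 55], [40, 4, 45], [35, 5, 95]])
y_train = [1, 0, 1, 0, 0, 1]   # 1 = bought a luxury vehicle (made-up data)

model = LogisticRegression().fit(X_train, y_train)

prospect = np.array([38, 5, 48])   # well educated, but below-average salary
contribs = model.coef_[0] * (prospect - X_train.mean(axis=0))

for name, c in zip(features, contribs):
    direction = "pushes toward buy" if c > 0 else "pushes toward no buy"
    print(f"{name:16s} {direction} ({c:+.2f})")
print("predicted probability of buying:", model.predict_proba([prospect])[0, 1])
```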

Conclusion

AI, and specifically machine learning and predictive analytics, has come a long way over the past few years. However, we are still not at the point where organizations can trust the models without hesitation. To overcome this hesitation, they need to build trust through understanding. As a result, the lack of explainability in legacy systems has significantly slowed the adoption of AI across the enterprise. This focus on explainability is the second pillar of Approachable AI. Our focus is on creating predictive models that business users can understand, so that they build confidence in using those models and begin to benefit from these advanced technologies to relieve key pain points in their processes.
