Published in Nerd For Tech

What is Machine Learning Model Interpretation?

Nowadays, with the growing use of machine learning models in industry, finding the most appropriate model is not an easy task. Accuracy, precision, or recall alone may not reflect how useful a model is in the real world. So, to find the most suitable model, a new and active research topic has emerged: the interpretation of machine learning models.

[Image: a calm beach, representing the calm you feel once you can trust your machine learning model. Caption: Keep calm, now you can trust your ML model!]

The interpretation of machine learning models is a new and active field that studies how a model works and how it produces its output. It is tied to the trustworthiness of a model. In other words, this topic answers the questions “Why should I trust this model?” and “How can I predict the model’s output?”

In this article, we explain the main concepts of this field and how to go beyond the classic metrics of accuracy, precision, and recall. Before we start, I should mention that this article is based on a paper by Xuhong Li, Haoyi Xiong, and colleagues [1].

Interpretation Algorithms

Interpretation algorithms are used to reveal how a model makes a decision. They highlight the parts of the input data that contribute most to a particular output. For example, they can show how much the snow in a picture of a wolf influences the model to classify the image as a wolf rather than a dog.
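To make this concrete, here is a minimal sketch of one such technique, occlusion sensitivity: hide one patch of the image at a time and measure how much the model’s score drops. The `toy_model` below is a made-up stand-in (it simply scores the brightness of the top-left quadrant, playing the role of the “snow” region); it is not from the referenced paper, only an illustration of the idea.

```python
import numpy as np

def occlusion_importance(model, image, patch=4, baseline=0.0):
    """Occlusion sensitivity: slide a patch over the image and record
    how much the model's score drops when that region is hidden."""
    h, w = image.shape
    base_score = model(image)
    importance = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = baseline  # hide this patch
            importance[i // patch, j // patch] = base_score - model(occluded)
    return importance

# Toy stand-in "model": scores an 8x8 image by the brightness of its
# top-left quadrant (imagine that is where the "snow" pixels are).
def toy_model(img):
    return img[:4, :4].mean()

img = np.ones((8, 8))
imp = occlusion_importance(toy_model, img, patch=4)
# The top-left cell dominates the importance map: hiding that quadrant
# wipes out the score, while hiding any other quadrant changes nothing.
```

A high value in the importance map marks a region the model relies on, which is exactly the kind of evidence that would expose a model classifying wolves by the snow in the background.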

Model Interpretability

Model interpretability addresses the questions “As a human, how can I understand how a model makes decisions?” and “How can I predict the result of a model?” As these questions suggest, assessing a model’s interpretability requires interpretation algorithms. Using interpretation methods gives us a better understanding of how a model works and how it arrives at a result.


Trustworthiness

Trustworthiness here means the degree of trust we can place in a model interpretation. In other words, it indicates how correct the interpretation results are and how well they hold up in general.

The Trade-off

On one side, fully trustworthy models are usually simple models that cannot handle complex real-world problems; in return, their behavior is well understood and they produce no unexpected outputs. On the other side, more powerful models perform better but are harder to predict. Because of this, there is always a trade-off between using the more trusted model and using the model with better performance.

Properties to Describe Interpretation Algorithms

There are various properties that describe different views of interpretation algorithms. Each one highlights a particular aspect of these algorithms. They can be categorized as below:

  • Human understandable: informativeness, plausibility, and satisfaction
  • Trustworthiness: reliability, robustness, trust, fidelity, and transparency
  • Underlying rationale: causality, transparency
  • Model verification: fairness, transferability, privacy
  • etc.

The category named underlying rationale answers the question “How is the model parameterized to generate the output?” As you can see, some properties appear in more than one category, which means they can highlight more than one aspect of interpretation algorithms.

Different Categories of Interpretation Algorithms

There are different types of interpretation algorithms, used in different domains. Some can be applied in more than one domain, while others are applicable to only one.

  • Some methods treat the model as a black box: they only compare the outputs produced for different inputs.
  • Others examine the underlying process of the model, i.e. they trace the model’s parameters in a classification or prediction problem, as with decision trees.
  • The last category tries to find a closed-form solution for a model, for example by formulating the model’s process mathematically (as with SVM or linear regression models).
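As a sketch of the first, black-box category, here is a simple permutation-style importance measure: shuffle one input feature at a time and watch how much the model’s accuracy degrades, using nothing but the model’s inputs and outputs. The `predict` function below is a hypothetical toy classifier invented for this illustration; the technique itself is a standard black-box perturbation method, not a method taken from the referenced paper.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Black-box importance: shuffle one feature at a time and measure
    the average drop in accuracy. Only model inputs/outputs are used."""
    rng = np.random.default_rng(seed)
    base_acc = np.mean(predict(X) == y)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])  # in-place shuffle of column j
            drops.append(base_acc - np.mean(predict(Xp) == y))
        scores[j] = np.mean(drops)
    return scores

# Toy black-box classifier: the label depends only on feature 0.
predict = lambda X: (X[:, 0] > 0).astype(int)
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)
imp = permutation_importance(predict, X, y)
# Shuffling feature 0 destroys the accuracy, so it gets a clearly
# positive score; features 1 and 2 never affect the output.
```

Note that nothing here looks inside the model: the same code works for a neural network, an SVM, or any other predictor, which is exactly what makes this category of methods model-agnostic.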

These are some of the categories of interpretation algorithms we can mention here. Of course, other categorizations exist. If you want to read more about these methods, see the referenced paper.

Thank you for reading this article, and have a great day!


[1] Li, X., Xiong, H., Li, X., Wu, X., Zhang, X., Liu, J., Bian, J. and Dou, D., 2022. Interpretable Deep Learning: Interpretation, Interpretability, Trustworthiness, and Beyond.



Mohammad Amin Dadgar

Master’s degree student in Computer Science, in the field of Artificial Intelligence. Learning machine learning. My GitHub page link: