XAI in meta-learning

Katarzyna Woźnica
Published in ResponsibleML · Nov 26, 2020

Hello!

In our series BASIC XAI with DALEX, you can find more information about Explainable Artificial Intelligence (XAI) methods, but today we take a closer look at the benefits of applying XAI methods in meta-learning. The results presented here come from our article Towards better understanding of meta-features contributions.

What is meta-learning?

Nowadays, the most common approach to developing a predictive model focuses on one specific task, a problem represented as a dataset. Every model is parametrized with a hyperparameter configuration and has a different structure. Each model is evaluated with a loss function or a performance measure such as AUC or RMSE. As the optimal model, we consider the one with the hyperparameter setup that minimizes the loss (or, equivalently, maximizes the performance measure).
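As a minimal sketch of what a single evaluation step looks like (the dataset and hyperparameter values below are illustrative, not the ones from our study):

```python
# Evaluate one (dataset, hyperparameter configuration) pair:
# train a gradient boosting model and score it with AUC.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One hyperparameter configuration (values are illustrative)
config = {"n_estimators": 200, "learning_rate": 0.05, "max_depth": 3}

model = GradientBoostingClassifier(**config).fit(X_train, y_train)
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"AUC for this configuration: {auc:.3f}")
```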

We can explore the hyperparameter space in different ways, for example with random search or sequential model-based optimization. The common point of all these scenarios is that the process needs to be repeated for every problem and dataset independently. You can do it 10,000 times, and for the 10,001st you will start from scratch again.
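For illustration, a random search over an assumed hyperparameter space could be sketched as follows (the search space and budget are hypothetical):

```python
# Random search: sample hyperparameter configurations independently
# and keep the one with the best cross-validated score.
from scipy.stats import randint, uniform
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_breast_cancer(return_X_y=True)

param_distributions = {
    "n_estimators": randint(50, 500),
    "learning_rate": uniform(0.01, 0.3),
    "max_depth": randint(2, 8),
}

search = RandomizedSearchCV(
    GradientBoostingClassifier(),
    param_distributions,
    n_iter=20,          # number of sampled configurations
    scoring="roc_auc",
    cv=3,
    random_state=0,
).fit(X, y)

print(search.best_params_, search.best_score_)
```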

Could we take advantage of previous experience from training models on different tasks and transfer it to new data? This approach to optimization is the foundation of meta-learning: we want to learn how to learn an optimal model for a new dataset. The intuition behind this is straightforward: we assume that similar datasets/problems should have similar learning curves. So one of the proposed approaches is building a meta-model that predicts the performance measure, i.e., the outcome of the evaluation step, for every dataset and hyperparameter configuration. A simplified schema of the meta-learning concept is shown below.

by Katarzyna Woźnica

Meta-features

Inputs to the meta-model are called meta-features. Basically, meta-features are numeric or categorical values describing tasks. The components of this meta-vector come from both the dataset and the nature of the model.

  • We need to assess the similarity between datasets: a popular approach is a set of statistical and information-theoretic properties. An alternative to them are landmarkers: the performance of simple, effective algorithms on the dataset (see the sketch after this list). This kind of feature tries to capture complex structure in the data that statistical properties miss.
  • Next to the dataset, we need to characterize the model parametrization. The obvious choice is passing hyperparameter values as inputs to the meta-model.
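A hedged sketch of computing landmarkers: the choice of simple learners below (1-NN, a decision stump, naive Bayes) is an assumption for illustration, not the exact set used in the article.

```python
# Landmarkers: cross-validated scores of fast, simple learners,
# used as meta-features describing a dataset.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

landmark_models = {
    "1nn": KNeighborsClassifier(n_neighbors=1),       # assumed landmarker
    "stump": DecisionTreeClassifier(max_depth=1),     # assumed landmarker
    "naive_bayes": GaussianNB(),                      # assumed landmarker
}

landmarkers = {
    name: np.mean(cross_val_score(model, X, y, scoring="roc_auc", cv=3))
    for name, model in landmark_models.items()
}
print(landmarkers)
```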

Our meta-model

We collected the performance measures of gradient boosting models with 100 different hyperparameter configurations on 20 diverse datasets from the OpenML repository.

  1. Every dataset was mapped onto a numeric vector of statistical meta-features; we then merged it with the hyperparameter configuration and used the model performance as the meta-response.
  2. As the meta-model, we trained a gradient boosting algorithm (a minimal sketch follows below).
  3. XAI methods were applied to gain insight into the structure of the meta-model.
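Putting these steps together, a minimal sketch of the pipeline might look like the following; the meta-feature names and the tiny meta-dataset are hypothetical placeholders, not our real data.

```python
# Sketch of the meta-model pipeline: each row of the meta-dataset is
# (dataset meta-features + hyperparameter configuration) -> performance.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical meta-dataset: one row per (dataset, configuration) pair.
meta_data = pd.DataFrame({
    # statistical meta-features of the dataset (illustrative names)
    "n_rows": [569, 569, 1000, 1000],
    "n_features": [30, 30, 20, 20],
    "landmarker_1nn": [0.95, 0.95, 0.81, 0.81],
    # hyperparameters of the evaluated gradient boosting model
    "learning_rate": [0.05, 0.10, 0.05, 0.10],
    "max_depth": [3, 5, 3, 5],
    # meta-response: performance of that configuration on that dataset
    "auc": [0.98, 0.97, 0.85, 0.88],
})

X_meta = meta_data.drop(columns="auc")
y_meta = meta_data["auc"]

# The meta-model itself is also a gradient boosting algorithm.
meta_model = GradientBoostingRegressor().fit(X_meta, y_meta)
```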

Meta-features importance

The first aspect is the examination of the influence of individual meta-variables. We used permutation-based measures of variable importance. This investigation helps to identify potentially noisy meta-features and may inform the decision to exclude them from the next generation of meta-models.

Lengths of the bars correspond to the permutational feature importance; colors indicate the origin of the meta-features: red for model hyperparameters, green for landmarkers, and blue for statistical properties of datasets.
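With the dalex package, a permutation-based importance plot like the one above can be produced for a meta-model such as the sketch from the previous snippet (meta_model, X_meta, and y_meta are the hypothetical objects defined there):

```python
# Permutation-based meta-feature importance with dalex.
import dalex as dx

explainer = dx.Explainer(meta_model, X_meta, y_meta, label="meta-model")

# model_parts() permutes each meta-feature and measures the loss increase.
importance = explainer.model_parts()
importance.plot()
```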

Besides the overall impact of explanatory meta-variables, we can apply more XAI methods, for instance a Ceteris Paribus profile showing the effect of a selected meta-predictor. Our approach is universal and generic for the explainable analysis of any meta-learning model; exemplary results for our meta-model can be found in the article.
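A Ceteris Paribus profile can be sketched with the same explainer (again assuming the hypothetical objects from the snippets above; the chosen variable is illustrative):

```python
# Ceteris Paribus profile: vary one meta-feature for a single
# (dataset, configuration) observation while holding the rest fixed.
observation = X_meta.iloc[[0]]
cp_profile = explainer.predict_profile(observation, variables=["learning_rate"])
cp_profile.plot()
```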
