XAI in the jungle of competing frameworks for machine learning

Przemyslaw Biecek
Oct 18, 2019

Recently I faced a tough challenge. We were doing a proof of concept for champion-challenger analysis combined with XAI exploration, together with a business partner whose models are written as if-else queries in Salesforce. Part of our team was eager to build ML challengers in mlr (R), while others preferred scikit-learn (Python). As if that were not enough, I wanted to try h2o AutoML (Java with wrappers in R/Python) as a benchmark. The data was already cleaned and preprocessed, so the training phase was relatively easy in each framework. But how do you cross-compare models created in four different frameworks?

We need more adapters, not more standards

The answer is: you need a single place from which you can access all of these models. Then you will need a library for contrastive model comparisons. DALEX and DALEXtra are lifesavers in such cases.

DALEX and DALEXtra wrap models and create standardized interfaces. Other XAI libraries, such as ingredients and auditor, can work on top of these uniform wrappers.

DALEX is an R package that creates a uniform interface for model exploration. Its main function, explain(), takes a model and additional metadata and returns a wrapper that exposes functions for calculating predictions, calculating residuals, and operating on the data and the target variable (for example, feature permutations). All of these elements are needed for model exploration. Once the wrapper is created, one can use XAI tools available in various R packages (e.g., ingredients, iBreakDown, auditor, modelStudio, shapper and vivo) without worrying about which framework was used to develop the model. It doesn't matter if it's R, Python, Java or any future framework.
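To make the idea concrete, here is a minimal sketch of such a wrapper in plain Python. It is not the DALEX implementation: the `Explainer` class, its method names, the two toy models and the data are all hypothetical and exist only to illustrate how predictions, residuals and feature permutations can sit behind one interface.

```python
import random


class Explainer:
    """Hypothetical uniform wrapper around any model (not the DALEX code)."""

    def __init__(self, model, data, y, predict_function, label):
        self.model = model
        self.data = data
        self.y = y
        self.predict_function = predict_function  # (model, rows) -> list of floats
        self.label = label

    def predict(self, rows=None):
        rows = self.data if rows is None else rows
        return self.predict_function(self.model, rows)

    def residuals(self):
        return [obs - pred for obs, pred in zip(self.y, self.predict())]

    def rmse(self, rows=None):
        preds = self.predict(rows)
        squared = sum((o - p) ** 2 for o, p in zip(self.y, preds))
        return (squared / len(self.y)) ** 0.5

    def permutation_importance(self, column, seed=0):
        """Increase in RMSE after shuffling the values of one feature."""
        rng = random.Random(seed)
        values = [row[column] for row in self.data]
        rng.shuffle(values)
        permuted = [row[:column] + [v] + row[column + 1:]
                    for row, v in zip(self.data, values)]
        return self.rmse(permuted) - self.rmse()


def champion_rules(row):
    # Toy if-else rule standing in for a legacy champion model.
    return 1.0 if row[0] > 50 else 0.0


class LinearChallenger:
    # Toy challenger with a different calling convention (score, not predict).
    def __init__(self, weights, intercept):
        self.weights = weights
        self.intercept = intercept

    def score(self, row):
        return self.intercept + sum(w * x for w, x in zip(self.weights, row))


data = [[20.0, 1.0], [40.0, 0.0], [60.0, 1.0], [80.0, 0.0]]
y = [0.0, 0.0, 1.0, 1.0]

champion = Explainer(champion_rules, data, y,
                     lambda model, rows: [model(r) for r in rows],
                     label="if-else champion")
challenger = Explainer(LinearChallenger([0.02, 0.0], -0.5), data, y,
                       lambda model, rows: [model.score(r) for r in rows],
                       label="linear challenger")

# The same exploration code now works for both models.
for explainer in (champion, challenger):
    print(explainer.label, round(explainer.rmse(), 3),
          round(explainer.permutation_importance(0), 3))
```

The only model-specific piece is the small predict function passed to the constructor; everything downstream talks to the wrapper.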

The figure below shows an example model exploration created in modelStudio. The internal structure of the explored model is separated from the model interface. In the same way we can explore a model created in R, python or h2o.

The modelStudio package creates an interactive dashboard for model exploration. It is based on DALEX wrappers, so it is independent of model internals and can be used to explore R, scikit-learn, keras or h2o models. Here you can play with a demo.

Smart model wrappers

Preparation of a wrapper is automated for the most common ML frameworks. You can specify your own function for calculating predictions or residuals, but you do not have to: for most frameworks this information can be extracted automatically. Below we present an example for a randomForest model from the randomForest package. The explain() function knows how to wrap objects of the randomForest class, so the wrapper definition is reduced to specifying a model, validation data and a unique label that will be used in explanations. DALEXtra is an extension pack with predefined wrappers for scikit-learn, keras, mljar, h2o and mlr models.
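One way to picture this automation is dispatch on the class of the model. The sketch below uses Python's `functools.singledispatch` for that purpose; it illustrates the idea only and does not describe DALEX internals. The model classes, the `default_predict` function and this simplified `explain` are hypothetical.

```python
from functools import singledispatch


class ThresholdRule:
    # Toy model exposing apply(row).
    def __init__(self, cutoff):
        self.cutoff = cutoff

    def apply(self, row):
        return 1.0 if row[0] > self.cutoff else 0.0


class MeanModel:
    # Toy model exposing predict(rows).
    def __init__(self, mean):
        self.mean = mean

    def predict(self, rows):
        return [self.mean for _ in rows]


@singledispatch
def default_predict(model, rows):
    # Fallback for classes without a registered predict function.
    raise TypeError("no predict function known for "
                    + type(model).__name__ + "; pass one explicitly")


@default_predict.register(ThresholdRule)
def _(model, rows):
    return [model.apply(row) for row in rows]


@default_predict.register(MeanModel)
def _(model, rows):
    return model.predict(rows)


def explain(model, data, y, label, predict_function=None):
    """Simplified, hypothetical explain(): guess the predict function by class."""
    chosen = predict_function or default_predict
    return {
        "label": label,
        "data": data,
        "y": y,
        "predict": lambda rows=None: chosen(model, data if rows is None else rows),
    }


data = [[30.0], [70.0]]
y = [0.0, 1.0]

rule = explain(ThresholdRule(50), data, y, label="rules")
mean = explain(MeanModel(0.5), data, y, label="mean")
# An unknown class still works when the user supplies the predict function.
custom = explain(object(), data, y, label="custom",
                 predict_function=lambda model, rows: [0.0 for _ in rows])
```

Supporting a new framework then means registering one more predict function, which is the adapter approach the heading above argues for.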

For R models, the explain() function creates a wrapper with a uniform interface. You can train your model in any framework and later let DALEX/DALEXtra work out how to transform it into a standardized wrapper. Here is the full example.
For Python models, one first needs to serialize the model to a pkl file. The file is then read by the DALEXtra::explain_scikitlearn() function via the reticulate package. Here you will find more examples for Python models.
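The Python side of that hand-off is a plain pickle round trip. The sketch below uses a parameter dictionary as a stand-in for a fitted estimator so that it runs without scikit-learn; the file name and the stand-in object are assumptions. With a real fitted model, the same `pickle.dump` call would be used.

```python
import os
import pickle
import tempfile

# Stand-in for a fitted estimator (assumption: a real project would pickle
# the fitted scikit-learn object itself).
fitted_model = {"intercept": -0.5, "weights": [0.02, 0.0]}

# Serialize the model to a .pkl file that another process can read.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as handle:
    pickle.dump(fitted_model, handle)

# Reading it back verifies that the file is complete and loadable.
with open(path, "rb") as handle:
    restored = pickle.load(handle)

print(path, restored == fitted_model)
```

Pickle files are generally sensitive to package versions, so the environment that loads the file should match the one that produced it as closely as possible.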

As an example, the figure below presents partial dependency profiles for four models overlaid in a single plot. The average model behaviour is quite similar, except for very low values of the age variable, for which catboost gives higher predictions than gbm. This plot was generated with a single plot function that takes four partial dependency explanations as arguments. Each explanation knows how to access its model (created with Python, mlr or Java) through an explainer.

Partial dependency profiles for four models created in four different frameworks. A common interface to models created in different frameworks makes it easier to cross-compare them. Here is the full example.
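The calculation behind such a profile is simple once every model sits behind the same predict interface: fix the variable of interest at each grid value for all observations and average the predictions. A minimal sketch follows; the two toy models, the data and the grid are hypothetical and are not the models from the figure.

```python
def partial_dependence(predict, data, column, grid):
    """Average prediction when `column` is set to each grid value in turn."""
    profile = []
    for value in grid:
        modified = [row[:column] + [value] + row[column + 1:] for row in data]
        preds = predict(modified)
        profile.append(sum(preds) / len(preds))
    return profile


# Two toy models behind the same predict(rows) interface.
def rules_predict(rows):
    return [1.0 if row[0] > 50 else 0.0 for row in rows]


def linear_predict(rows):
    return [0.01 * row[0] + 0.1 * row[1] for row in rows]


# Column 0 plays the role of age; column 1 is a second feature.
data = [[20.0, 1.0], [40.0, 0.0], [60.0, 1.0], [80.0, 0.0]]
age_grid = [10.0, 30.0, 50.0, 70.0, 90.0]

models = {"rules": rules_predict, "linear": linear_predict}
profiles = {label: partial_dependence(fn, data, 0, age_grid)
            for label, fn in models.items()}

# Profiles share one grid, so they can be overlaid in one plot.
for label, profile in profiles.items():
    print(label, [round(value, 2) for value in profile])
```

Because `partial_dependence` only ever calls `predict`, it does not need to know which framework produced the model.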

In the DALEXtra vignette you will find more examples of how to create and use wrappers for different frameworks.


For champion-challenger analysis it makes sense to compare models created in different frameworks. After all, it is an exploration in which you want to try different tools and compare their strengths and weaknesses. Your task or problem could gain a lot from using technology or framework X. To facilitate such comparisons you need wrappers that are easy to create and that can be used for cross-model comparisons. DALEX and DALEXtra create such wrappers. They provide an abstraction over the internal structure of predictive models. You can build your own package for model exploration on top of this abstraction, as ingredients, iBreakDown, auditor and modelStudio do.


DALEXtra is maintained by Szymon Maksymiuk, who is also a contributor to DALEX. Both tools are developed as part of the DrWhy.AI framework. This description was greatly improved thanks to comments from Hubert Baniecki, Wojciech Kretowicz, Anna Kozak and Alicja Gosiewska.


