That’s tricky with trees of depth greater than 1. I could offer suggestions here, but I think it would be better to just reference more complete discussions of fairness like http://research.google.com/bigpicture/attacking-discrimination-in-ml/ or https://towardsdatascience.com/a-tutorial-on-fairness-in-machine-learning-3ff8ba1040cb
Glad you found it interesting!
If you use the TreeExplainer object then it should be fast, and it can get all the info directly from the model. If you use the model-agnostic KernelExplainer then it will be slower because it has to use sampling to locally approximate the model for each explanation.
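Roughly, the two paths look like this (a minimal sketch, assuming a fitted model named `model` and a feature matrix `X`, which are just placeholder names here):

```python
import shap

# Fast path: TreeExplainer reads the trees directly from the model
# (XGBoost, LightGBM, CatBoost, scikit-learn tree ensembles, ...).
tree_explainer = shap.TreeExplainer(model)
tree_shap_values = tree_explainer.shap_values(X)

# Model-agnostic path: KernelExplainer only sees the prediction function,
# so it samples perturbed inputs to locally approximate the model around
# each example, which is much slower per explanation.
kernel_explainer = shap.KernelExplainer(model.predict, X[:100])
kernel_shap_values = kernel_explainer.shap_values(X[:10])
```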
The core algorithm is implemented in C++. I know that XGBoost has a Java wrapper, but it doesn’t look like the SHAP values are exposed through that interface. It would probably not be hard to update XGBoost’s Java bindings to support this if you wanted to; you would need to add a predContribs boolean option to the predict function.
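For comparison, this is what the existing Python bindings already expose via the `pred_contribs` flag on `Booster.predict` (a small sketch with toy data; the Java bindings would need an equivalent option):

```python
import numpy as np
import xgboost as xgb

# Toy data just to illustrate the call.
X = np.random.rand(200, 5)
y = (X[:, 0] > 0.5).astype(int)
dtrain = xgb.DMatrix(X, label=y)

booster = xgb.train({"objective": "binary:logistic"}, dtrain, num_boost_round=50)

# pred_contribs=True returns one SHAP value per feature plus a bias column,
# so the result has shape (n_samples, n_features + 1) and each row sums to
# the model's raw margin prediction for that sample.
shap_contribs = booster.predict(dtrain, pred_contribs=True)
```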
Great questions! For the reference values in Kernel SHAP we allow the user to provide a background dataset that is used to represent “missing”. So if you pass the mean value for each feature as the background then that will be used to represent missing, but you can also pass more samples if you want to integrate over several samples and not just use…
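As a rough illustration of those two choices (again assuming placeholder names `model` and `X`, with `X` a pandas DataFrame):

```python
import shap

# Option 1: a single "average" row as the background, so missing features
# are represented by the per-feature mean.
background_mean = X.mean().values.reshape(1, -1)
explainer_mean = shap.KernelExplainer(model.predict, background_mean)

# Option 2: several background samples, so the effect of a missing feature
# is integrated over those samples rather than a single reference point.
# shap.kmeans summarizes a larger dataset into k weighted background points.
background_set = shap.kmeans(X, 50)
explainer_set = shap.KernelExplainer(model.predict, background_set)

shap_values = explainer_set.shap_values(X.iloc[:5])
```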
The ratio of samples to features matters more for training the model than for interpreting it. So if you have a model that you trust even in the presence of high dimensionality, then the same interpretation methods will work in theory.
Thanks! Yes, it works for regression just like for classification. The only difference is that the units of the feature importances will be different (they won’t be log odds values anymore; they will be in the same units as your regression labels).
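Here is a quick sketch of the regression case, using a scikit-learn random forest and the California housing data just as an example:

```python
import shap
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor

# Fit a regressor; the labels are house prices, so the SHAP values come out
# in those same units rather than in log odds.
X, y = fetch_california_housing(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=50).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:100])

# For each row, explainer.expected_value + shap_values[i].sum() equals the
# model's prediction, all expressed in the regression target's units.
```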
Yes, you can use the sum of the absolute values as a measure of the relative importance of the features across the whole dataset. This is exactly what the shap package does when it makes bar plots (except it doesn’t normalize).
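For example (a hedged sketch, assuming `shap_values` is the (n_samples, n_features) array returned by an explainer and `X` the matching feature matrix):

```python
import numpy as np
import shap

# Sum of absolute SHAP values per feature across the whole dataset.
global_importance = np.abs(shap_values).sum(axis=0)
ranking = np.argsort(global_importance)[::-1]  # most important features first

# The built-in bar plot gives the same ordering; it shows the mean |SHAP value|
# per feature (the sum divided by the number of samples), without normalizing.
shap.summary_plot(shap_values, X, plot_type="bar")
```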