Using KS statistic as a model evaluation metric in Scikit-Learn’s GridSearchCV

Xiao Wei
1 min read · Apr 4, 2019

The Kolmogorov–Smirnov (KS) statistic is frequently used in response and credit risk modeling. Sklearn doesn’t provide this metric in its list of supported scores under sklearn.metrics.

You can, however, define your own custom metrics for model evaluation. By combining SciPy’s ks_2samp with sklearn.metrics.make_scorer, you can create a custom scorer that can be used in GridSearchCV.

Below I have created scorers for ROC AUC, the KS statistic, and log loss that can be used in grid search. You can pass as many different metrics as you like to the ‘scoring’ parameter of GridSearchCV; all of them are computed during cross validation, but only the one named in the ‘refit’ parameter is used to select the best parameters and refit the final model.

In other words, your primary evaluation metric will be whichever one you name in the ‘refit’ parameter.
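A minimal sketch of what this can look like is below. The ks_stat helper and the toy logistic-regression grid are illustrative assumptions, not the exact code from the original gist; note also that newer scikit-learn versions replace make_scorer’s needs_proba=True with response_method="predict_proba".

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.metrics import make_scorer, roc_auc_score, log_loss
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import make_classification


def ks_stat(y_true, y_proba):
    """KS statistic: maximum distance between the predicted-score
    distributions of the positive and negative classes."""
    return ks_2samp(y_proba[y_true == 1], y_proba[y_true != 1]).statistic


# needs_proba=True so each scorer receives predicted probabilities
# (for binary problems, the probability of the positive class).
scorers = {
    "roc_auc": make_scorer(roc_auc_score, needs_proba=True),
    "ks_stat": make_scorer(ks_stat, needs_proba=True),
    # log loss: lower is better, so flip the sign with greater_is_better=False
    "log_loss": make_scorer(log_loss, greater_is_better=False, needs_proba=True),
}

# Toy data and model purely for illustration.
X, y = make_classification(n_samples=1000, random_state=0)

grid = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10]},
    scoring=scorers,      # all three metrics are computed per fold
    refit="ks_stat",      # but best_params_ are chosen by the KS statistic
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```

The per-fold results for all three metrics remain available in grid.cv_results_, so you can still inspect ROC AUC and log loss even though KS drives the refit.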
