Kolmogorov-Smirnov Diagnostics.
In predictive modeling, it is important to check whether a model can distinguish between events and non-events. The Kolmogorov-Smirnov (KS) statistic is a performance measure of a model's discriminatory power.
It is the maximum difference between the cumulative distribution of events and the cumulative distribution of non-events.
It is a very popular metric used in credit risk and response modeling.
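As a minimal sketch of that definition, the function below ranks records by score and tracks the gap between the two cumulative curves. The function name `ks_statistic` and the sample scores and labels are illustrative assumptions, not part of the tool described here.

```python
def ks_statistic(scores, labels):
    """Maximum gap between the cumulative share of events and non-events.

    scores: predicted probabilities (higher = more likely an event)
    labels: 1 for event, 0 for non-event
    """
    n_events = sum(labels)
    n_non_events = len(labels) - n_events
    if n_events == 0 or n_non_events == 0:
        raise ValueError("need at least one event and one non-event")

    # Walk through the records from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    cum_events = cum_non_events = 0
    best_gap = 0.0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            cum_events += 1
        else:
            cum_non_events += 1
        # Compare the curves only after all records tied at this score are counted.
        is_last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if is_last_of_tie:
            gap = abs(cum_events / n_events - cum_non_events / n_non_events)
            best_gap = max(best_gap, gap)
    return best_gap


# Made-up scores and labels, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
ks = ks_statistic(scores, labels)
print(f"KS = {ks:.2f}")
```

A model that ranks every event above every non-event gives a KS of 1, while a model whose scores carry no information gives a KS near 0.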
Input:
To run Kolmogorov-Smirnov Diagnostics, select the binary target variable (coded as 0 and 1) and the predictor variables (numeric only), then select the function using the following path:
Machine Learning → Regression Analysis (Non-linear) → Kolmogorov-Smirnov Diagnostics
Application & Interpretation
Using the logistic model, each record is scored with a probability of the event. The complete sample is then sorted in decreasing order of probability and divided into 10 groups (deciles) or 20 groups (demi-deciles). The cumulative % of events and non-events is calculated for each group, and the KS for each group is the difference between the two. KS for the overall population is calculated as below:
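KS statistic = max over i = 1 to n of { Cumulative % of responders in groups 1 to i - Cumulative % of non-responders in groups 1 to i }
Here n is the number of groups (10 or 20), and responders and non-responders are the events and non-events.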
The higher the KS, the better the model separates events from non-events.
The example below illustrates how KS is calculated for a logistic model:
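The grouped calculation can be sketched in a few lines of Python. The scores and labels here are made up for illustration and are not output from the tool. The function name `ks_table` and the rule for splitting records into near-equal groups are assumptions, and the tool may handle ties or uneven group sizes differently.

```python
def ks_table(scores, labels, n_groups=10):
    """Return (rows, ks); each row is
    (group, cumulative % events, cumulative % non-events, difference)."""
    # Sort records in decreasing order of predicted probability.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    total_events = sum(label for _, label in ranked)
    total_non_events = n - total_events
    if total_events == 0 or total_non_events == 0:
        raise ValueError("need at least one event and one non-event")

    rows = []
    cum_events = cum_non_events = 0
    for g in range(n_groups):
        # Integer boundaries give groups of near-equal size (assumed rule).
        start = g * n // n_groups
        end = (g + 1) * n // n_groups
        group = ranked[start:end]
        events = sum(label for _, label in group)
        cum_events += events
        cum_non_events += len(group) - events
        cum_event_pct = 100.0 * cum_events / total_events
        cum_non_event_pct = 100.0 * cum_non_events / total_non_events
        rows.append((g + 1, cum_event_pct, cum_non_event_pct,
                     cum_event_pct - cum_non_event_pct))

    # Overall KS is the largest per-group difference.
    ks = max(row[3] for row in rows)
    return rows, ks


# Made-up data: 20 scored records, 8 events and 12 non-events.
scores = [(20 - i) / 20 for i in range(20)]
labels = [1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]

rows, ks = ks_table(scores, labels, n_groups=10)
for group, ev, non_ev, diff in rows:
    print(f"{group:>2}  {ev:6.2f}  {non_ev:6.2f}  {diff:6.2f}")
print(f"KS = {ks:.2f}")
```

With this data the largest gap appears in group 4, where 75% of events but only about 16.67% of non-events have been captured, so the KS is about 58.33. Passing `n_groups=20` gives the demi-decile version.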
See Also
Log Odds Ratio, Odds Ratio, Logistic Regression, Hosmer-Lemeshow Goodness of Fit.