Default Prediction with Machine Learning

Christoph Hirnschall
Published in All About Advanon, May 29, 2019

This blog post gives a brief introduction to using machine learning for default prediction and summarizes the results of our paper "Grabit: Gradient tree-boosted Tobit models for default prediction", published in the Journal of Banking & Finance.

Predicting company defaults

Default prediction has been of major interest to both researchers and practitioners in the financial sector for almost a century. Early proposals for statistical default prediction focused primarily on linear classification models such as linear discriminant analysis (Altman, 1968) or logistic regression (Ohlson, 1980). More recently, to account for nonlinear dependencies between features and default events, a range of machine learning models, such as neural networks, classification trees, and ensemble methods, have been applied to default prediction (see e.g. Brown, 2012).

A common problem in default prediction is class imbalance between defaults and non-defaults; i.e. typically only a small fraction of observations in a given dataset are defaults. This poses a problem for machine learning models, as the number of observations of defaults might be too small to identify patterns in the data and use them to accurately predict future defaults.

Using auxiliary data for default prediction

In many cases, financial institutions collect additional information about the performance of a company or loan, such as delays in repayment or changes in ratings. Intuitively, such auxiliary data is closely related to default events, since loans with large delays in repayment are likely “closer” to default than loans without any delays. Yet, traditional classification methods typically neglect this additional information and only consider binary default events.

To combine the binary default data with continuous auxiliary data, we build on the Tobit model, a commonly used censored regression model. From an economic point of view, the Tobit model learns the default potential of a company, as represented for example by the number of delay days. In the model, a default occurs if the default potential exceeds a so-called default threshold. Note that auxiliary information such as the number of delay days is not observed for default cases, so the exact value of the default potential is only known below the default threshold (see Figure 1).

Figure 1: (Left) binary default classification; (right) linear regression on delay days for non-default cases
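
For readers who prefer notation, the setup described above can be written as a latent-variable model. This is a simplified sketch with a single upper threshold; the symbols (default potential Y*, observed auxiliary value Y, default indicator D, default threshold y_u, coefficients β, noise scale σ) are chosen here for illustration and do not appear elsewhere in this post.

```latex
\begin{aligned}
Y^* &= x^\top \beta + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2)
      && \text{(latent default potential)} \\
Y   &= \min(Y^*,\, y_u)
      && \text{(observed value, censored at the threshold)} \\
D   &= \mathbf{1}\{\, Y^* \ge y_u \,\}
      && \text{(default indicator)}
\end{aligned}
```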

Importantly, the Tobit loss is asymmetric for defaulted observations: the loss keeps decreasing as the predicted default potential moves further above the default threshold. Intuitively, this leads to the desired behavior of predicting a higher default potential for default cases, instead of penalizing predictions above the threshold. Yet, the Tobit model only considers linear dependencies on the features and cannot learn complex nonlinear relations in the data.
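
To make the asymmetry concrete, here is a minimal sketch of a per-observation Tobit loss (negative log-likelihood) with a single upper threshold and Gaussian noise. The function names, the 90-day threshold, and the value of `sigma` are assumptions made for illustration only; they are not taken from the paper or from Advanon's data.

```python
import math


def normal_cdf(z):
    # Standard normal CDF via erfc, which stays accurate in the lower tail.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tobit_loss(pred, y, defaulted, threshold, sigma=1.0):
    """Negative log-likelihood of one observation (upper-censored Tobit).

    pred      -- predicted default potential
    y         -- observed auxiliary value (e.g. delay days); ignored for defaults
    defaulted -- True if the observation is a default (censored at threshold)
    """
    if defaulted:
        # We only know the potential is at or above the threshold,
        # so the likelihood is P(potential >= threshold).
        p = normal_cdf((pred - threshold) / sigma)
        return -math.log(max(p, 1e-300))
    # Non-default: ordinary Gaussian regression loss on the observed value.
    z = (y - pred) / sigma
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + 0.5 * z * z


# Hypothetical numbers: threshold of 90 delay days, noise scale of 10 days.
loss_default_at_100 = tobit_loss(100.0, None, True, 90.0, sigma=10.0)
loss_default_at_120 = tobit_loss(120.0, None, True, 90.0, sigma=10.0)
loss_nondefault_low = tobit_loss(10.0, 30.0, False, 90.0, sigma=10.0)
loss_nondefault_high = tobit_loss(50.0, 30.0, False, 90.0, sigma=10.0)
```

For the defaulted observation, predicting 120 gives a smaller loss than predicting 100, although both are above the threshold. For the non-default with 30 observed delay days, predictions of 10 and 50 are penalized equally, as in ordinary least squares.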

The Grabit model: applying gradient tree boosting to the Tobit model

We overcome this restriction by applying gradient tree boosting to the Tobit model. The resulting Grabit model combines the best of both worlds: the ability of gradient tree boosting to learn nonlinear dependencies and interactions in the data, and the Tobit model's natural way of combining auxiliary data with binary default events. Compared to other state-of-the-art models, the Grabit model achieves significantly better performance both in a simulation study (Figure 2) and on Advanon’s loan dataset (Figure 3).
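
The idea of boosting trees on a Tobit loss can be illustrated with a toy sketch. The code below is not the implementation used in the paper: it assumes a single feature, depth-one trees (stumps), plain gradient steps with a fixed learning rate, a fixed `sigma`, and only an upper threshold. All names and the data are made up for illustration.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def neg_gradient(pred, y, defaulted, threshold, sigma):
    # Negative gradient of the Tobit loss with respect to the prediction.
    if defaulted:
        z = (pred - threshold) / sigma
        # Always positive: defaults keep pushing the prediction upwards.
        return normal_pdf(z) / (sigma * max(normal_cdf(z), 1e-300))
    return (y - pred) / sigma ** 2


def fit_stump(x, r):
    # Least-squares stump on one feature (needs at least two distinct x values).
    best = None
    for s in sorted(set(x))[:-1]:
        left = [ri for xi, ri in zip(x, r) if xi <= s]
        right = [ri for xi, ri in zip(x, r) if xi > s]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((ri - lm) ** 2 for ri in left) + sum((ri - rm) ** 2 for ri in right)
        if best is None or sse < best[0]:
            best = (sse, s, lm, rm)
    return best[1:]


def fit_grabit_sketch(x, y, defaulted, threshold, sigma=1.0, n_rounds=200, lr=0.1):
    # Start from the mean of the observed (non-default) auxiliary values.
    observed = [yi for yi, d in zip(y, defaulted) if not d]
    init = sum(observed) / len(observed)
    preds = [init] * len(x)
    stumps = []
    for _ in range(n_rounds):
        r = [neg_gradient(p, yi, d, threshold, sigma)
             for p, yi, d in zip(preds, y, defaulted)]
        s, lm, rm = fit_stump(x, r)
        stumps.append((s, lm, rm))
        preds = [p + lr * (lm if xi <= s else rm) for p, xi in zip(preds, x)]

    def predict(xi):
        return init + lr * sum(lm if xi <= s else rm for s, lm, rm in stumps)

    return predict


# Made-up data: x is a risk feature, y is the delay in units of ten days.
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
y = [0.0, 0.5, 0.2, 1.0, 2.0, 3.0, 5.0, 7.0, 9.0, 9.0]  # y is ignored for defaults
defaulted = [False] * 8 + [True] * 2
predict = fit_grabit_sketch(x, y, defaulted, threshold=9.0)
```

The fitted `predict` function returns a default potential for a new feature value. Observations can then be ranked by this potential, or by the implied probability of exceeding the threshold.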

Figure 2: Results from a simulation study on the impact of the class imbalance ratio on the performance of the Grabit model and other approaches.
Figure 3: Performance comparison of different models on Advanon’s loan dataset.

Interestingly, if the auxiliary variable is independent of the decision function, i.e., the auxiliary data contains no additional information, the Grabit model still performs as well as the best competing binary classifier in our simulation study. Further, we observe that the Grabit model also outperforms other models on larger datasets if the decision function is sufficiently complex, for example because of strong nonlinearities or interactions among predictors.

At Advanon, efficient and accurate default prediction is at the core of our business, and we are committed to further research in this area. With our research efforts, we hope to increase the efficiency and fairness of the loan market for the benefit of both SMEs and investors. Other fields of application for our algorithm are yet to be explored. Potentially, the Grabit model could even help make predictions more reliable in rather unpredictable areas of everyday life, such as rainfall (Sanso and Guenni, 1999; Sigrist et al., 2012).

This research project was done jointly by Prof. Fabio Sigrist from the Lucerne University of Applied Sciences and Arts and Christoph Hirnschall from Advanon as part of an Innosuisse project under grant number “25746.1 PFES-ES”.
