Default Prediction with Machine Learning

This blog post gives a brief introduction to using machine learning for default prediction and summarizes the results of our paper "Grabit: Gradient tree-boosted Tobit models for default prediction", published in the Journal of Banking & Finance.

Predicting company defaults

A common problem in default prediction is class imbalance between defaults and non-defaults: typically, only a small fraction of the observations in a given dataset are defaults. This poses a problem for machine learning models, as the number of observed defaults might be too small to identify patterns in the data and use them to accurately predict future defaults.

Using auxiliary data for default prediction

To combine the binary default data with continuous auxiliary data, we build on the Tobit model, a commonly used censored regression model. From an economic point of view, the Tobit model learns the default potential of a company, represented, for example, by the number of delay days. In the model, a default occurs if the default potential exceeds a so-called default threshold. Note that auxiliary information such as the number of delay days is not observed for default cases, so the exact value of the default potential is only known below the default threshold (see Figure 1).

Figure 1: (Left) binary default classification; (right) linear regression on delay days for non-default cases

Importantly, the Tobit loss is asymmetric for defaulted observations: the loss keeps decreasing as the prediction moves past the default threshold. Intuitively, this leads to the desired behavior of predicting a higher default potential for default cases, instead of penalizing predictions above the threshold. However, the Tobit model considers only linear dependencies on the features and cannot learn complex nonlinear relations in the data.
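
To make the asymmetry concrete, here is a minimal sketch of a per-observation Tobit loss, written as a negative log-likelihood with censoring from above at the default threshold. It assumes Gaussian noise with a known standard deviation; the function names, the threshold of 90 delay days, and sigma = 10 are hypothetical values chosen for illustration and are not taken from the paper.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tobit_loss(pred, y, threshold, sigma=1.0):
    """Negative log-likelihood of one observation under an upper-censored Tobit model.

    pred:      predicted default potential (e.g. predicted delay days)
    y:         observed value; y >= threshold marks a default (censored case)
    threshold: default threshold (assumed known)
    sigma:     noise standard deviation (assumed known in this sketch)
    """
    if y >= threshold:
        # Default: we only know the default potential is at or above the threshold,
        # so the likelihood is the probability mass above the threshold.
        return -math.log(normal_cdf((pred - threshold) / sigma))
    # Non-default: the auxiliary value is observed exactly, so this is a Gaussian loss.
    return math.log(sigma) + 0.5 * math.log(2.0 * math.pi) + (y - pred) ** 2 / (2.0 * sigma ** 2)


threshold = 90.0  # hypothetical default threshold in delay days
sigma = 10.0      # hypothetical noise level

# Loss for a defaulted observation at predictions below, at, and above the threshold.
default_losses = [tobit_loss(p, threshold, threshold, sigma) for p in (70.0, 90.0, 110.0, 130.0)]

# Loss for a non-default with 30 observed delay days at predictions around that value.
non_default_losses = [tobit_loss(p, 30.0, threshold, sigma) for p in (10.0, 30.0, 50.0)]

print("default:    ", default_losses)
print("non-default:", non_default_losses)
```

For the defaulted observation, the loss shrinks monotonically as the prediction rises, including beyond the threshold. For the non-default, the loss is an ordinary squared-error-type loss that is smallest when the prediction equals the observed number of delay days.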

The Grabit model: applying gradient tree boosting to the Tobit model

Figure 2: Results from a simulation study on the impact of the class imbalance ratio on the performance of the Grabit model and other approaches.
Figure 3: Performance comparison of different models on Advanon’s loan dataset.

Interestingly, if the auxiliary variable is independent of the decision function, i.e., the auxiliary data contains no additional information, the Grabit model still performs as well as the best competing binary classifier in our simulation study. Further, we observe that the Grabit model also outperforms other models on larger datasets if the decision function is sufficiently complex, for example when it has strong nonlinearities or interactions among predictors.

At Advanon, efficient and accurate default prediction is at the core of our business, and we are committed to further research in this area. With our research efforts, we hope to increase the efficiency and fairness of the loan market for the benefit of both SMEs and investors. Other fields of application for our algorithm are yet to be explored. Potentially, the Grabit model could even help make predictions more reliable in rather unpredictable areas of everyday life, such as rainfall (Sanso and Guenni, 1999; Sigrist et al., 2012).

This research project was done jointly by Prof. Fabio Sigrist from the Lucerne University of Applied Sciences and Arts and Christoph Hirnschall from Advanon as part of an Innosuisse project under grant number “25746.1 PFES-ES”.

