The Logistic Sigmoid function as the Bayesian Binary Classifier
Why it makes sense to use the logistic sigmoid function to get the probability of belonging to one of two classes.
A bit of background: I work in financial risk. One thing we do a lot is predict which of two classes a customer will fall into: default or no default. In these models, we always find the sigmoid function in use. Why is that? Is there a mathematical justification? It turns out there is, thanks to Bayes' theorem!
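First we write down Bayes' theorem for two classes. Given data x, the probability of belonging to class C_1 is given by equation 1:
$$p(C_1 \mid x) = \frac{p(x \mid C_1)\, p(C_1)}{p(x \mid C_1)\, p(C_1) + p(x \mid C_2)\, p(C_2)} \tag{1}$$
Next, we recall the logistic sigmoid function, see equation 2:
$$\sigma(a) = \frac{1}{1 + e^{-a}} \tag{2}$$
Let's define a as the log of the ratio between the two terms in the denominator of equation 1:
$$a = \ln \frac{p(x \mid C_1)\, p(C_1)}{p(x \mid C_2)\, p(C_2)}$$
Taking the exponential of -a, we have equation 3:
$$e^{-a} = \frac{p(x \mid C_2)\, p(C_2)}{p(x \mid C_1)\, p(C_1)} \tag{3}$$
Plugging equation 3 into equation 2, we get the following:
$$\sigma(a) = \frac{1}{1 + \dfrac{p(x \mid C_2)\, p(C_2)}{p(x \mid C_1)\, p(C_1)}} = \frac{p(x \mid C_1)\, p(C_1)}{p(x \mid C_1)\, p(C_1) + p(x \mid C_2)\, p(C_2)} = p(C_1 \mid x)$$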
Voilà! We have shown the link between Bayes' theorem and the logistic sigmoid function.
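To make the link concrete, here is a minimal numerical check in Python. It is only a sketch under assumptions that are not part of the derivation: the class-conditional densities are taken to be Gaussians, and the means, standard deviations, and prior are made-up illustrative values. It computes the posterior for C_1 once directly from Bayes' theorem and once as the sigmoid of a, where a is the log of the ratio p(x|C_1)p(C_1) / (p(x|C_2)p(C_2)).

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def sigmoid(a):
    """Logistic sigmoid: 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))


# Hypothetical setup (illustrative values, not from the article).
PRIOR_C1 = 0.3
PRIOR_C2 = 1.0 - PRIOR_C1


def joint_c1(x):
    """p(x | C_1) * p(C_1), assuming a Gaussian class-conditional."""
    return gaussian_pdf(x, mean=2.0, std=1.0) * PRIOR_C1


def joint_c2(x):
    """p(x | C_2) * p(C_2), assuming a Gaussian class-conditional."""
    return gaussian_pdf(x, mean=0.0, std=1.5) * PRIOR_C2


def posterior_bayes(x):
    """p(C_1 | x) computed directly from Bayes' theorem."""
    return joint_c1(x) / (joint_c1(x) + joint_c2(x))


def posterior_sigmoid(x):
    """p(C_1 | x) computed as sigmoid(a), with a the log of the joint ratio."""
    a = math.log(joint_c1(x) / joint_c2(x))
    return sigmoid(a)


if __name__ == "__main__":
    for x in (-1.0, 0.5, 1.0, 3.0):
        print(x, posterior_bayes(x), posterior_sigmoid(x))
```

The two columns printed for each x should agree up to floating-point rounding, whatever densities and priors are plugged in, because the identity does not depend on the Gaussian assumption.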
The key insight, especially for the risk analyst, is that it makes sense to use the sigmoid function in models that predict the probability of default (and in any binary classification model, for that matter).
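For a risk analyst, one practical way to read the result is to split a into two parts: the log of the likelihood ratio p(x|default) / p(x|no default) plus the log of the prior odds of default. The sketch below illustrates this with hypothetical numbers; the function name `default_probability`, the likelihood ratio of 4, and the base default rates are all assumptions chosen for illustration, not figures from any real portfolio.

```python
import math


def sigmoid(a):
    """Logistic sigmoid: 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))


def default_probability(likelihood_ratio, base_default_rate):
    """Posterior probability of default as sigmoid(a).

    likelihood_ratio  : p(x | default) / p(x | no default), hypothetical input
    base_default_rate : prior p(default) in the portfolio, hypothetical input
    """
    prior_log_odds = math.log(base_default_rate / (1.0 - base_default_rate))
    a = math.log(likelihood_ratio) + prior_log_odds
    return sigmoid(a)


if __name__ == "__main__":
    # Same evidence about the customer, two different portfolio base rates.
    for rate in (0.02, 0.10):
        print(rate, default_probability(likelihood_ratio=4.0, base_default_rate=rate))
```

With a likelihood ratio of 1 (the data say nothing either way), the function simply returns the base default rate; stronger evidence moves the probability away from that prior.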