Understanding Logistic Regression using Log Odds

Published in

Geek Culture

3 min read · May 2, 2021

I had to adjust my thinking when it came to logistic regression, because it models a probability rather than a mean and involves a non-linear transformation. In this article, I will explain the log odds interpretation of logistic regression mathematically, and also run a simple logistic regression model with real data.

The odds of an event are the probability that it happens divided by the probability that it doesn’t. For example, if P(success) = 0.8 and P(failure) = 0.2, the odds of success are 0.8/0.2 = 4.
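The odds calculation above can be sketched in a few lines of Python. The helper names `odds` and `log_odds` are my own, chosen for illustration; the log odds is simply the natural logarithm of the odds.

```python
import math


def odds(p: float) -> float:
    """Odds of an event with probability p: p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1.0 - p)


def log_odds(p: float) -> float:
    """Natural log of the odds (also called the logit of p)."""
    return math.log(odds(p))


# The example from the text: P(success) = 0.8, so the odds are about 4
# (up to floating-point rounding).
odds_success = odds(0.8)
print(odds_success)

# A probability of 0.5 means even odds (1 to 1), so the log odds are 0.
print(log_odds(0.5))
```

Probabilities above 0.5 give odds greater than 1 and positive log odds; probabilities below 0.5 give odds less than 1 and negative log odds.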

We use logistic regression to model a binary outcome variable (y is either 0 or 1). Because y behaves like a Bernoulli random variable, we want to consider 𝑃(𝑦=1|𝑥) and 𝑃(𝑦=0|𝑥). Starting from the generalized linear model specification 𝐸(𝑦|𝑥)=𝑓(𝑥′𝛽), we can derive the conditional mean for a binary outcome:

𝐸(𝑦|𝑥)=𝑃 (𝑦=1|𝑥)×1+𝑃 (𝑦=0|𝑥)×0=𝑃 (𝑦=1|𝑥)
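As a quick sanity check on this identity, here is a minimal simulation sketch. It assumes a fixed probability of 0.8 (borrowed from the odds example, not estimated from data), and the function name `bernoulli_mean` is hypothetical. The sample mean of many 0/1 draws should land close to the probability of drawing a 1.

```python
import random


def bernoulli_mean(p: float, n: int, seed: int = 0) -> float:
    """Draw n Bernoulli(p) values (0 or 1) and return their sample mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    draws = [1 if rng.random() < p else 0 for _ in range(n)]
    return sum(draws) / n


p = 0.8  # assumed P(y=1|x) for illustration

# Exact expectation from the formula: P(y=1)*1 + P(y=0)*0
exact = p * 1 + (1 - p) * 0

# Monte Carlo estimate of the same expectation
estimate = bernoulli_mean(p, n=100_000)

print(exact, estimate)
```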

Therefore, the expectation we are modeling is a probability: 𝑃(𝑦=1|𝑥). However, if we model it directly as a linear combination of the independent variables and parameters, 𝑃(𝑦=1|𝑥)=𝑥′𝛽, it does not work, because a probability must lie between 0 and 1 while 𝑥′𝛽 can take any real value.
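To make the bounding problem concrete, here is a small sketch with one predictor. The coefficients `beta0` and `beta1` and the inputs `xs` are made-up values for illustration, not estimates from any data. The linear predictor falls outside [0, 1] for some inputs, whereas passing it through the logistic function (the inverse of the log odds) always gives a valid probability.

```python
import math


def linear_predictor(x: float, beta0: float, beta1: float) -> float:
    """x'beta for a single predictor plus an intercept."""
    return beta0 + beta1 * x


def logistic(z: float) -> float:
    """Map any real number z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients and inputs, chosen only to show the issue.
beta0, beta1 = 0.2, 0.3
xs = [-2.0, 0.0, 1.0, 4.0]

linear = [linear_predictor(x, beta0, beta1) for x in xs]
probs = [logistic(z) for z in linear]

for x, z, p in zip(xs, linear, probs):
    in_bounds = 0.0 <= z <= 1.0
    print(f"x={x:5.1f}  x'beta={z:5.2f}  valid probability? {in_bounds}  logistic={p:.3f}")

# Taking the log odds of the logistic output recovers the linear predictor,
# which is why the model is linear on the log odds scale.
recovered = [math.log(p / (1.0 - p)) for p in probs]
```

With these values, x = -2 gives a "probability" of about -0.4 and x = 4 gives about 1.4 under the linear specification, neither of which is meaningful.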

Written by Shuangyuan (Sharon) Wei