Adjusted R-Squared: Formula Explanation
As the name suggests, Adjusted R-Squared is an adjusted version of R-Squared. The question arises why we need to adjust R-Squared.
In this article, we will see why Adjusted R-Squared is needed, break down its formula, and look at how each term affects the value of Adjusted R-Squared.
I encourage you to read my article R-Squared: Formula Explanation first. It will help you understand R-Squared.
Let's start by answering our very first question, “why do we need to adjust R-Squared?”. For that, we need to discuss the drawbacks of R-Squared.
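R-Squared is well suited to simple linear regression. In multiple linear regression, however, R-Squared never decreases when another independent variable is added, even if that variable is insignificant, and in practice it almost always creeps up. Adjusted R-Squared, in contrast, penalizes every added variable and increases only when the new variable improves the fit enough to outweigh that penalty, that is, when it is significant and actually affects the dependent variable.
The formula is:
Adjusted R² = 1 - [(1-R²)(N-1)/(N-p-1)]
where N is the number of observations and p is the number of independent variables.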
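As a minimal sketch, the adjustment can be written as a small Python function. The function name `adjusted_r2` and the sample numbers are illustrative assumptions, not taken from any library or dataset. Here `n` is the number of observations and `p` is the number of independent variables.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-Squared: 1 - (1 - R2) * (N - 1) / (N - p - 1)."""
    if n - p - 1 <= 0:
        # The denominator must be positive for the formula to make sense.
        raise ValueError("need n > p + 1")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical model: R2 = 0.80, 50 observations, 5 independent variables.
print(round(adjusted_r2(0.80, 50, 5), 4))
```

With these made-up inputs the adjusted value comes out slightly below the plain R-Squared of 0.80, which is the penalty for using 5 variables.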
Case 1: When independent features are insignificant
When the added features are insignificant, R² barely changes, so (1-R²) stays almost the same. As p increases, the denominator (N-p-1) becomes smaller, so the whole term [(1-R²)(N-1)/(N-p-1)] becomes larger, and when this larger term is subtracted from 1, Adjusted R² becomes smaller.
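To see Case 1 in numbers, here is a small sketch with made-up values. It assumes N = 30 observations and that each extra, insignificant feature lifts R² by only 0.001. The helper `adjusted_r2` and all numbers are hypothetical.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-Squared: 1 - (1 - R2) * (N - 1) / (N - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


n = 30
# Hypothetical R2 values: each extra (insignificant) feature adds only 0.001.
r2_by_p = {1: 0.600, 2: 0.601, 3: 0.602, 4: 0.603, 5: 0.604}

adjusted = {p: adjusted_r2(r2, n, p) for p, r2 in r2_by_p.items()}
for p, value in adjusted.items():
    print(f"p={p}  R2={r2_by_p[p]:.3f}  Adjusted R2={value:.4f}")
```

R² rises a little on every row, while Adjusted R² falls on every row, because the shrinking denominator (N-p-1) outweighs the tiny gain in R².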
Case 2: When independent features are significant
When the features are significant, R² increases, so (1-R²) becomes smaller. The factor [(N-1)/(N-p-1)] still grows a little with p, but if the drop in (1-R²) outweighs that growth, their product becomes smaller, and when this smaller term is subtracted from 1, Adjusted R² becomes larger.
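Case 2 can be checked the same way. The sketch below assumes N = 30 and a second feature that lifts R² from 0.60 to 0.75. These values and the helper `adjusted_r2` are hypothetical and chosen only to illustrate the direction of the change.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-Squared: 1 - (1 - R2) * (N - 1) / (N - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


n = 30
before = adjusted_r2(0.60, n, p=1)  # model with one feature
after = adjusted_r2(0.75, n, p=2)   # a significant second feature is added

print(f"before: {before:.4f}  after: {after:.4f}")
print("Adjusted R2 increased:", after > before)
```

Here the drop in (1-R²) from 0.40 to 0.25 is much larger than the growth of (N-1)/(N-p-1) from 29/28 to 29/27, so Adjusted R² goes up.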
Adjusted R-Squared is Negative or Zero
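Adjusted R-Squared can be zero or negative in two conditions:
- R² is very small or close to zero. (Put R²=0 in the Adjusted R-Squared formula and you get -p/(N-p-1), which is negative for any p ≥ 1. More generally, Adjusted R-Squared is zero when R² = p/(N-1) and negative when R² is below that.)
- p is large relative to N, so (N-p-1) is small and the factor (N-1)/(N-p-1) is large. (When N is less than or equal to p+1, the denominator is zero or negative and the formula is no longer meaningful. In a real-world scenario p should be well below N.)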
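Both conditions can be checked with a short sketch. The helper `adjusted_r2` and the numbers (N = 30 with p = 3 or p = 25) are hypothetical and chosen only for illustration.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-Squared: 1 - (1 - R2) * (N - 1) / (N - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


n = 30

# Condition 1: R2 = 0 gives -p / (N - p - 1), which is negative.
at_zero_r2 = adjusted_r2(0.0, n, p=3)

# R2 = p / (N - 1) is the break-even point where Adjusted R2 is zero
# (up to floating-point rounding).
at_threshold = adjusted_r2(3 / (n - 1), n, p=3)

# Condition 2: many features relative to N push the value below zero
# even when R2 itself looks high.
many_features = adjusted_r2(0.80, n, p=25)

print(at_zero_r2, at_threshold, many_features)
```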
Final Thoughts
R-Squared has drawbacks when it comes to multiple regression. Adjusted R-Squared overcomes them by modifying the formula to penalize the number of independent variables. Its formula is easy to understand once we break it down and study the impact of each term separately.