Coefficient Interpretations of Linear Models

Wendy Hu
4 min read · Feb 4, 2023


We will interpret the coefficients of the following linear models using intuitive and concrete examples.

Simple Linear Regression

A simple linear regression line fit through data points. Source: An Introduction to Statistical Learning

Used when the response variable is continuous and there is only one explanatory variable → models the value of the response variable

When X is continuous, the model is: sales = 0.1 + 0.2 * TV spending

  • Intercept: sales is $0.1 when TV spending is $0
  • Slope: sales increases by $0.2 for a one-unit increase in TV spending

When X is categorical, the model is: sales = 0.1 + 0.2 * is TV spending (1 for TV spending, 0 for no TV spending)

  • Intercept: sales is $0.1 when there is no TV spending
  • Slope: sales increases by $0.2 for TV spending relative to no TV spending
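As a minimal sketch of the two readings above, the snippet below evaluates the example model directly. The function name `predict_sales` and the spending values 10, 11, 50, and 51 are illustrative assumptions, not part of the original example.

```python
def predict_sales(x, intercept=0.1, slope=0.2):
    """Predicted sales for the example model: sales = 0.1 + 0.2 * x."""
    return intercept + slope * x

# Intercept: the prediction when x is 0.
sales_at_zero = predict_sales(0)

# Slope (continuous x): the change in the prediction for a one-unit
# increase in x is the same wherever we start on the line.
change_from_10 = predict_sales(11) - predict_sales(10)
change_from_50 = predict_sales(51) - predict_sales(50)

# Slope (categorical x): x is an indicator, 1 for TV spending and 0 for
# no TV spending, so the slope is the gap between the two groups.
group_gap = predict_sales(1) - predict_sales(0)

print(sales_at_zero, round(change_from_10, 4), round(change_from_50, 4), round(group_gap, 4))
```

Whether x is continuous or an indicator, the slope is always "the difference in predicted sales between x + 1 and x"; for an indicator, the only possible step is from 0 to 1.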

Multiple Linear Regression

A multiple linear regression plane fit through data points. Source: An Introduction to Statistical Learning

Used when the response variable is continuous and there are multiple explanatory variables → models the value of the response variable

When X is continuous, the model is: sales = 0.1 + 0.2 * TV spending + 0.3 * radio spending

  • Intercept: sales is $0.1 when TV spending is $0 and radio spending is $0
  • Slope 1: sales increases by $0.2 for a one-unit increase in TV spending, all else being equal
  • Slope 2: sales increases by $0.3 for a one-unit increase in radio spending, all else being equal

When X is categorical, the model is: sales = 0.1 + 0.2 * is TV spending + 0.3 * is radio spending (1 for TV spending, 0 for no TV spending, 1 for radio spending, 0 for no radio spending)

  • Intercept: sales is $0.1 when there is no TV spending and no radio spending
  • Slope 1: sales increases by $0.2 for TV spending relative to no TV spending, all else being equal
  • Slope 2: sales increases by $0.3 for radio spending relative to no radio spending, all else being equal
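The phrase "all else being equal" can be checked numerically. The sketch below, with a hypothetical `predict_sales` helper and arbitrary spending levels, raises one variable by one unit while holding the other fixed.

```python
def predict_sales(tv, radio, b0=0.1, b_tv=0.2, b_radio=0.3):
    """Predicted sales for the example model: 0.1 + 0.2 * tv + 0.3 * radio."""
    return b0 + b_tv * tv + b_radio * radio

# Raise TV by one unit while radio is held fixed at two different levels.
tv_effect_low_radio = predict_sales(6, 2) - predict_sales(5, 2)
tv_effect_high_radio = predict_sales(6, 40) - predict_sales(5, 40)

# Raise radio by one unit while TV is held fixed.
radio_effect = predict_sales(5, 3) - predict_sales(5, 2)

# With indicator inputs (1 = spending, 0 = no spending), the same
# coefficients are gaps between groups.
tv_gap = predict_sales(1, 0) - predict_sales(0, 0)

print(round(tv_effect_low_radio, 4), round(tv_effect_high_radio, 4),
      round(radio_effect, 4), round(tv_gap, 4))
```

Without an interaction term, the TV effect is the same at every radio level, which is why a single number per slope is enough.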

Multiple Linear Regression with Interactions

A multiple linear regression with interactions plane. Source: https://online.stat.psu.edu/stat501/book/export/html/947

Used when the response variable is continuous and there are multiple explanatory variables and interaction term(s) → models the value of the response variable

When X is continuous, the model is: sales = 0.1 + 0.2 * TV spending + 0.3 * radio spending + 0.4 * TV spending * radio spending

The model simplifies to: sales = 0.1 + (0.2 + 0.4 * radio spending) * TV spending + 0.3 * radio spending

  • Slope 1: sales increases by $(0.2 + 0.4 * radio spending) for a one-unit increase in TV spending, all else being equal; the increase is $0.2 only when radio spending is $0
  • Slope 2: sales increases by $(0.3 + 0.4 * TV spending) for a one-unit increase in radio spending, all else being equal; the increase is $0.3 only when TV spending is $0
  • Interaction: the sales gain per unit of TV spending grows by $0.4 for each one-unit increase in radio spending, and the sales gain per unit of radio spending grows by $0.4 for each one-unit increase in TV spending
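To see how the interaction term makes each effect depend on the other variable, here is a small sketch. The helper names `tv_effect` and `radio_effect` are made up for illustration. Regrouping the model gives a TV effect of 0.2 + 0.4 * radio spending and, by the same regrouping, a radio effect of 0.3 + 0.4 * TV spending.

```python
def predict_sales(tv, radio):
    """Example model with an interaction: 0.1 + 0.2*tv + 0.3*radio + 0.4*tv*radio."""
    return 0.1 + 0.2 * tv + 0.3 * radio + 0.4 * tv * radio

def tv_effect(radio, tv=5):
    """Change in predicted sales for one more unit of TV at a fixed radio level."""
    return predict_sales(tv + 1, radio) - predict_sales(tv, radio)

def radio_effect(tv, radio=5):
    """Change in predicted sales for one more unit of radio at a fixed TV level."""
    return predict_sales(tv, radio + 1) - predict_sales(tv, radio)

# The TV effect is 0.2 + 0.4 * radio, so it grows with radio spending.
for radio in (0, 1, 2):
    print("radio =", radio, "TV effect =", round(tv_effect(radio), 4))

# The radio effect is 0.3 + 0.4 * tv, so it grows with TV spending.
for tv in (0, 1, 2):
    print("tv =", tv, "radio effect =", round(radio_effect(tv), 4))
```

The interaction coefficient 0.4 is the amount by which one variable's effect changes when the other variable goes up by one unit.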

When X is both categorical and continuous, the model is: sales = 0.1 + 0.2 * TV spending + 0.3 * is radio spending + 0.4 * TV spending * is radio spending

The model simplifies to: sales = (0.1 + 0.3) + (0.2 + 0.4) * TV spending if there is radio spending; sales = 0.1 + 0.2 * TV spending if there is no radio spending

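  • Interaction: a one-unit increase in TV spending increases sales by $0.6 when there is radio spending but by only $0.2 when there is none; the interaction coefficient 0.4 is the difference between these two slopes, and 0.3 is the difference between the two intercepts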
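With a categorical variable in the interaction, the model describes two separate lines. The following sketch recovers each line's intercept and slope from the single combined equation; `predict_sales` and `has_radio` are illustrative names for the model and the radio indicator.

```python
def predict_sales(tv, has_radio):
    """Example model: 0.1 + 0.2*tv + 0.3*has_radio + 0.4*tv*has_radio.

    has_radio is an indicator: 1 for radio spending, 0 for none.
    """
    return 0.1 + 0.2 * tv + 0.3 * has_radio + 0.4 * tv * has_radio

# Intercept of each line: the prediction at tv = 0.
intercept_without_radio = predict_sales(0, 0)
intercept_with_radio = predict_sales(0, 1)

# Slope of each line: the change for one more unit of TV spending.
slope_without_radio = predict_sales(11, 0) - predict_sales(10, 0)
slope_with_radio = predict_sales(11, 1) - predict_sales(10, 1)

print("no radio:", round(intercept_without_radio, 4), round(slope_without_radio, 4))
print("radio:   ", round(intercept_with_radio, 4), round(slope_with_radio, 4))
```

The coefficient on the indicator shifts the intercept, and the coefficient on the product term shifts the slope.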

Polynomial Regression

A 2nd degree polynomial regression line (blue) fit through data points. Source: An Introduction to Statistical Learning

Used when the response variable is continuous and there is a non-linear relationship between the explanatory variable(s) and the response variable → models the value of the response variable

The model is: sales = 0.1 + 0.2 * TV spending + 0.3 * TV spending ^ 2

  • Slope: the effect of a one-unit increase in TV spending depends on the current level of TV spending. The slope is not constant: at any point it equals 0.2 + 2 * 0.3 * TV spending = 0.2 + 0.6 * TV spending, so it grows as TV spending grows
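A quick way to see the changing slope is to compare the effect of one extra unit of TV spending at different starting points. This is a sketch with hypothetical helper names; `instantaneous_slope` is the derivative of the example model, and `unit_change` is the actual difference between two predictions one unit apart.

```python
def predict_sales(tv):
    """Example model: 0.1 + 0.2 * tv + 0.3 * tv^2."""
    return 0.1 + 0.2 * tv + 0.3 * tv ** 2

def instantaneous_slope(tv):
    """Derivative of the model with respect to tv: 0.2 + 2 * 0.3 * tv."""
    return 0.2 + 0.6 * tv

def unit_change(tv):
    """Change in predicted sales when tv goes from tv to tv + 1."""
    return predict_sales(tv + 1) - predict_sales(tv)

for tv in (0, 1, 10):
    print("tv =", tv,
          "slope =", round(instantaneous_slope(tv), 4),
          "one-unit change =", round(unit_change(tv), 4))
```

The one-unit change is slightly larger than the instantaneous slope because the curve keeps getting steeper over the unit interval; algebraically it equals 0.5 + 0.6 * tv for this model.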

Logistic Regression

A logistic regression curve fit through data points. Source: An Introduction to Statistical Learning

Used when the response variable is categorical → models the probability that the response variable belongs to a certain class

The model is: log odds of default = -0.1 + 0.2 * balance - 0.3 * income - 0.4 * is student (1 for student and 0 for non-student)

  • Slope 1: log odds of default increases by 0.2 for a one-unit increase in balance, all else being equal
  • Slope 2: log odds of default decreases by 0.3 for a one-unit increase in income, all else being equal
  • Slope 3: log odds of default is lower by 0.4 for students relative to non-students, all else being equal
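Log odds are hard to read directly, so it can help to convert them. The sketch below uses only the standard library, with function names chosen for illustration. It exponentiates each coefficient to get an odds ratio and uses the logistic function to turn log odds into a probability.

```python
import math

COEFFICIENTS = {"balance": 0.2, "income": -0.3, "is_student": -0.4}
INTERCEPT = -0.1

def log_odds(balance, income, is_student):
    """Example model: -0.1 + 0.2*balance - 0.3*income - 0.4*is_student."""
    return (INTERCEPT
            + COEFFICIENTS["balance"] * balance
            + COEFFICIENTS["income"] * income
            + COEFFICIENTS["is_student"] * is_student)

def probability(z):
    """Logistic function: converts log odds z into a probability."""
    return 1 / (1 + math.exp(-z))

# exp(coefficient) is the factor by which the odds are multiplied
# for a one-unit increase in that variable, all else being equal.
odds_ratios = {name: math.exp(coef) for name, coef in COEFFICIENTS.items()}
for name, ratio in odds_ratios.items():
    print(name, "odds ratio =", round(ratio, 3))

# The same 0.2 step in log odds moves the probability by different
# amounts depending on where it starts.
step_near_half = probability(0.2) - probability(0.0)
step_near_one = probability(3.2) - probability(3.0)
print(round(step_near_half, 4), round(step_near_one, 4))
```

A coefficient is a constant shift in log odds and a constant multiplier on the odds, but the resulting change in probability is not constant; it is largest when the probability is near 0.5.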

References

An Introduction to Statistical Learning with Applications in R, Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani
