TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.



Ordinary Least Squares Regression

The definitive mathematical guide.

8 min read · Jan 14, 2021


Ordinary least squares (OLS) regression is a standard technique everyone should be familiar with. We motivate the linear model from the perspective of the Gauss-Markov Theorem, distinguish between the overdetermined and underdetermined cases, and apply OLS regression to a wine quality dataset.

Contents

  1. The Linear Model
  2. The Gauss-Markov Theorem
  3. The Underdetermined and Overdetermined Cases
  4. Analyzing the Red Wines Dataset
  5. Summary

The Linear Model

The linear model assumes the following ansatz:
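The equation itself did not survive in this copy of the article. A plausible form, written in assumed notation (the symbols x for the feature vector, w for the weight vector, b for the constant term, and ŷ for the predicted label are illustrative choices, not necessarily the author's), is:

```latex
\hat{y} = \mathbf{w}^{\top}\mathbf{x} + b
```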

The dependent variable is related to the independent variable by a multiplication and the addition of a constant term. In other words, the predicted label is a linear combination of the features plus a constant. However, without loss of generality, we may drop the constant term, because it can be absorbed into the linear combination as follows:
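The original equation is missing here as well. One way to write the absorption step, assuming x denotes the feature vector, w the weights, b the constant term, and a tilde the extended vectors (all of this notation is an assumption), is:

```latex
\hat{y}
  = \mathbf{w}^{\top}\mathbf{x} + b
  = \begin{pmatrix} \mathbf{w} \\ b \end{pmatrix}^{\top}
    \begin{pmatrix} \mathbf{x} \\ 1 \end{pmatrix}
  = \tilde{\mathbf{w}}^{\top}\tilde{\mathbf{x}}
```

The last entry of the extended feature vector is always 1, so the last entry of the extended weight vector plays the role of the constant term.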

We have extended the feature with a dummy constant 1, and concatenated the unknown variables to be learned into a single unknown vector. Now given a full training set of data features and labels, we can fit the data, or learn the optimal…
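The dummy-feature trick described above can be sketched in NumPy. This is a minimal illustration on synthetic, noise-free data, not the author's code: the names `X`, `y`, `true_w`, `true_b`, and `X_ext` are hypothetical, and `np.linalg.lstsq` is used here as one common least-squares solver.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 50 samples, 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([2.0, -1.0, 0.5])   # hypothetical ground-truth weights
true_b = 4.0                          # hypothetical ground-truth constant term
y = X @ true_w + true_b               # noise-free labels

# Extend every feature vector with a dummy constant 1.
X_ext = np.hstack([X, np.ones((X.shape[0], 1))])

# Solve the least-squares problem for the single concatenated unknown vector.
theta, residuals, rank, singular_values = np.linalg.lstsq(X_ext, y, rcond=None)

# The last entry corresponds to the dummy feature, i.e. the constant term.
w_hat, b_hat = theta[:-1], theta[-1]
print("weights:", w_hat)
print("constant term:", b_hat)
```

Because the labels here contain no noise, the recovered weights and constant term match the values used to generate the data up to floating-point error. With real data the fit would only approximate the labels.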


Published in TDS Archive


Written by Alex Powell

I write about data science, stats, ML, software, programming, and computing.
