Normal Equation

6/20 Machine Learning via Stanford notes


The normal equation is another way to choose the parameters theta. It solves for them analytically, theta = (X^t*X)^-1 * X^t*y, in one step instead of many iterations.


X includes x0 (a column of 1s) as its first column.
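
A minimal NumPy sketch of the normal equation, assuming the usual closed form theta = (X^t*X)^-1 * X^t*y. The toy data and the function name normal_equation are made up for illustration and are not from the lecture.

```python
import numpy as np

def normal_equation(X, y):
    """Return theta solving (X^t X) theta = X^t y."""
    # Solving the linear system gives the same theta as multiplying by
    # the explicit inverse, but is numerically better behaved.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Invented toy data: y = 1 + 2*x exactly.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

# Prepend x0, the column of ones, so theta[0] acts as the intercept.
X = np.column_stack([np.ones_like(x), x])

theta = normal_equation(X, y)
print(theta)  # approximately [1. 2.]
```

Note that no feature scaling and no learning rate are needed here.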

this reminds me of something in stat102b… what is it??

Pros and Cons

Gradient descent

  1. requires many iterations
  2. works well even when the number of features n is large
  3. need to choose the learning rate alpha


Normal Equation

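  1. no need to choose alpha
  2. no iterations, theta is computed in one step
  3. slow when the number of features n is large (roughly n > 10,000), because solving with the (n+1) x (n+1) matrix X^t*X costs about O(n^3)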
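
To make the comparison concrete, here is a rough sketch that fits the same toy data both ways. The data, alpha = 0.1, and the 5000-iteration budget are arbitrary choices for illustration, not values from the course.

```python
import numpy as np

# Invented toy data: y = 3 - 2*x exactly, with x spread over [-1, 1].
m = 50
x = np.linspace(-1.0, 1.0, m)
y = 3.0 - 2.0 * x
X = np.column_stack([np.ones(m), x])  # x0 column of ones first

# Normal equation: one linear solve, no alpha, no iterations.
theta_ne = np.linalg.solve(X.T @ X, X.T @ y)

# Batch gradient descent on the squared-error cost:
# needs a learning rate and many passes over the data.
alpha = 0.1
theta_gd = np.zeros(2)
for _ in range(5000):
    grad = X.T @ (X @ theta_gd - y) / m  # gradient of (1/2m) * sum of squared errors
    theta_gd -= alpha * grad

print(theta_ne)
print(theta_gd)
```

With only two parameters both approaches land on essentially the same theta. The trade-off in the lists above only starts to matter once n is large enough that the solve becomes expensive.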

What if X^t*X is non-invertible?

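  • there are redundant features (linearly dependent columns, e.g. the same size measured in feet and in meters)
  • too many features, i.e. fewer examples than features (use regularization or delete features)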
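
A small sketch of the redundant-feature case, assuming a pseudo-inverse is an acceptable stand-in for the inverse when X^t*X is singular. The data is invented, and the rcond=1e-10 cutoff is an arbitrary choice for this example.

```python
import numpy as np

# Invented data with a redundant feature: x2 is exactly 2 * x1,
# so the columns of X are linearly dependent and X^t X is singular.
x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = 2.0 * x1
y = 1.0 + 2.0 * x1
X = np.column_stack([np.ones_like(x1), x1, x2])

A = X.T @ X

# The pseudo-inverse drops the (near-)zero singular values instead of
# dividing by them, so it still returns a usable theta.
theta = np.linalg.pinv(A, rcond=1e-10) @ X.T @ y

print(theta)
print(X @ theta)  # predictions still match y
```

The theta returned is one of infinitely many that fit equally well; deleting x2 or adding regularization would make the solution unique.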