The Gradient of the Linear Regression Cost Function
The cost function for linear regression is:
J(theta) = (1/2m) * (X * theta - y)^T * (X * theta - y)
where X is the m by (n+1) design matrix, theta is the (n+1) by 1 parameter vector, y is the m by 1 target vector, and ^T denotes the transpose operator.
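To make the shapes concrete, here is a minimal NumPy sketch of this cost computation. The function name compute_cost and the use of 1-D arrays for theta and y are illustrative assumptions, not part of the original text.

```python
import numpy as np

def compute_cost(X, theta, y):
    """J(theta) = (1/2m) * (X*theta - y)^T * (X*theta - y), vectorized."""
    m = y.shape[0]               # number of training examples
    residuals = X @ theta - y    # prediction error for each example
    return (residuals @ residuals) / (2 * m)

# Example: m = 3 examples, bias column of ones prepended to one feature
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])
theta = np.zeros(2)
print(compute_cost(X, theta, y))  # 14 / 6 = 2.333...
```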
To calculate the gradient of J(theta), we take the derivative of J(theta) with respect to each element of theta. All of these partial derivatives can be computed at once using matrix-vector multiplication, as follows:
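As a sketch of that vectorized computation, the snippet below evaluates the standard gradient (1/m) * X^T * (X * theta - y) and checks it against a finite-difference approximation of the cost above; compute_gradient is an illustrative name, not from the original text.

```python
def compute_gradient(X, theta, y):
    """grad J(theta) = (1/m) * X^T * (X*theta - y), all partials in one product."""
    m = y.shape[0]
    return (X.T @ (X @ theta - y)) / m

# Sanity check: compare each partial derivative to a central finite difference
eps = 1e-6
g = compute_gradient(X, theta, y)
for j in range(theta.size):
    t_plus, t_minus = theta.copy(), theta.copy()
    t_plus[j] += eps
    t_minus[j] -= eps
    approx = (compute_cost(X, t_plus, y) - compute_cost(X, t_minus, y)) / (2 * eps)
    print(j, g[j], approx)  # the two values should agree closely
```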