# Linear Regression

Single and Multiple Independent Variables (Features)

For instance, we may want to predict a student's marks given how many hours they study (a single feature), or the price of a house given many features such as its area and its proximity to a market, a hospital, and so on.

The parameter vector is θ = [θ₀, θ₁, …, θₙ], where θ₀ is c (the intercept), θ₁ is m (the slope), and the remaining θ values weight the other features. The model's prediction (hypothesis) is then hθ(x) = θ₀ + θ₁x₁ + … + θₙxₙ.

Gradient descent is a first-order iterative optimization algorithm for finding a local minimum of a differentiable function. The idea is to take repeated steps in the opposite direction of the gradient (or approximate gradient) of the function at the current point, because this is the direction of steepest descent.
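
Concretely, the cost the code below minimizes is the mean squared error, and each gradient descent step updates every parameter θⱼ simultaneously (α is the learning rate; the constant factor of 2 from differentiating the square is conventionally folded into α, which is what the code does):

```latex
J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} - h_\theta(x^{(i)}) \right)^2
\qquad
\theta_j \leftarrow \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
```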

```python
import numpy as np

def hypothesis(x, theta):
    # prediction for a single example: dot product of theta and x
    y_ = 0.0
    n = x.shape[0]
    for i in range(n):
        y_ += theta[i] * x[i]
    return y_

def error(X, y, theta):
    # mean squared error over all m examples
    e = 0.0
    m = X.shape[0]
    for i in range(m):
        y_ = hypothesis(X[i], theta)
        e += (y[i] - y_) ** 2
    return e / m

def gradient(X, y, theta):
    m, n = X.shape
    grad = np.zeros((n,))
    # for all values of j
    for j in range(n):
        # sum over all examples
        for i in range(m):
            y_ = hypothesis(X[i], theta)
            grad[j] += (y_ - y[i]) * X[i][j]
    # out of loops
    return grad / m

def gradient_descent(X, y, learning_rate=0.1, max_epochs=300):
    m, n = X.shape
    theta = np.zeros((n,))
    error_list = []

    for i in range(max_epochs):
        e = error(X, y, theta)
        error_list.append(e)

        # Gradient Descent: update every parameter
        grad = gradient(X, y, theta)
        for j in range(n):
            theta[j] = theta[j] - learning_rate * grad[j]
    return theta, error_list
```
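
A minimal usage sketch for the marks-vs-hours example (the synthetic data and hyperparameters here are illustrative assumptions, not from the article). Note that X needs a leading column of ones so that θ₀ acts as the intercept:

```python
import numpy as np

np.random.seed(0)
m = 100
hours = np.random.uniform(0, 10, (m, 1))              # single feature: hours studied
marks = 5.0 + 3.0 * hours[:, 0] + np.random.randn(m)  # true line: c = 5, m = 3, plus noise

X = np.hstack([np.ones((m, 1)), hours])               # bias column so theta[0] is the intercept
theta, error_list = gradient_descent(X, marks, learning_rate=0.03, max_epochs=1000)
print(theta)  # should approach [5.0, 3.0] as the values in error_list shrink
```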

This loop-based method is quite slow! We can speed it up considerably by vectorizing the code, replacing the explicit Python loops with NumPy matrix operations.

```python
import numpy as np

def hypothesis(X, theta):
    # predictions for all examples at once
    return np.dot(X, theta)

def error(X, y, theta):
    # mean squared error, computed without explicit loops
    y_ = hypothesis(X, theta)
    m = X.shape[0]
    return np.sum((y - y_) ** 2) / m

def gradient(X, y, theta):
    # gradient for all parameters at once: X^T (y_ - y) / m
    y_ = hypothesis(X, theta)
    m = X.shape[0]
    return np.dot(X.T, (y_ - y)) / m

def gradient_descent(X, y, learning_rate=0.1, max_iter=300):
    n = X.shape[1]
    theta = np.zeros((n,))
    error_list = []

    for i in range(max_iter):
        e = error(X, y, theta)
        error_list.append(e)

        # Gradient Descent: one vectorized update for the whole theta
        grad = gradient(X, y, theta)
        theta = theta - learning_rate * grad

    return theta, error_list
```
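
To see the speed-up, we can time the two versions side by side. A rough sketch, assuming the loop-based functions above were kept under a hypothetical renamed alias (gradient_descent_loop) so the two sets of definitions do not shadow each other; the data and timings are illustrative:

```python
import time
import numpy as np

np.random.seed(1)
m, n = 500, 6
X = np.hstack([np.ones((m, 1)), np.random.rand(m, n - 1)])
y = np.dot(X, np.ones(n)) + 0.1 * np.random.randn(m)

t0 = time.time()
gradient_descent_loop(X, y, learning_rate=0.1, max_epochs=300)  # hypothetical alias of the loop version
t1 = time.time()
gradient_descent(X, y, learning_rate=0.1, max_iter=300)         # vectorized version above
t2 = time.time()
print(f"loops: {t1 - t0:.2f}s  vectorized: {t2 - t1:.2f}s")     # vectorized is typically far faster
```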

# R2 Score

What is goodness of fit for a linear model? The R2 score answers this: it is the fraction of the variance in y that the model explains, R² = 1 − (sum of squared residuals) / (total sum of squares around the mean). A score of 1 (100% in the code below) means a perfect fit, while 0 means the model predicts no better than the mean of y.

```python
import numpy as np

def r2Score(y, y_):
    # residual sum of squares
    num = np.sum((y - y_) ** 2)
    # total sum of squares around the mean
    deno = np.sum((y - y.mean()) ** 2)
    score = 1 - num / deno
    return score * 100  # expressed as a percentage
```
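
A quick sanity check of r2Score on hand-picked values (illustrative, not from the article): perfect predictions give 100, and predicting the mean everywhere gives 0:

```python
import numpy as np

y = np.array([3.0, 5.0, 7.0, 9.0])

print(r2Score(y, y.copy()))                   # 100.0 : perfect fit
print(r2Score(y, np.full_like(y, y.mean())))  # 0.0   : no better than the mean
```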
