Linear Regression With Ordinary Least Squares Method and Gradient Descent From Scratch in Python

Linear regression models the linear relationship between real-valued predictors and a target variable. We look for the function that best represents the data.

Oct 21, 2020 · 4 min read

Introduction

Linear regression is the first regression model most of us learn, often as early as high school. We look for the regression line that best fits samples drawn from a population, traditionally using the most common statistical method, ordinary least squares (OLS), to estimate the model parameters. However, the estimating equations become complex as more independent variables are included in the model. Gradient descent (GD) is another option, widely applied to training a range of models, and it is simpler to implement for a linear regression model.

Here, for simplicity, we build linear regression from scratch for the simple (single-predictor) case.

OLS

The ordinary least squares method is a non-iterative method that fits a model by minimizing the sum of squared errors. A list of assumptions must be satisfied when applying OLS.

Assumptions:

• The function must be linear in the parameters and the error term.
• Error terms are independent of each other and of all independent variables.
• Error terms are normally distributed with zero mean and constant variance.
• No independent variables are perfectly correlated.

The coefficient estimates for simple linear regression are derived as follows:

1. Compute the sum of squared errors, SSE.

2. Differentiate the SSE with respect to each parameter.

3. Set the derivatives equal to zero and solve the simultaneous equations to obtain the parameter estimates.

This process becomes lengthy and complicated as more independent variables are included, since more estimating equations must be derived.
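For the simple one-predictor case, the three steps above yield the familiar closed-form estimates: the slope is the sample covariance of x and y divided by the sample variance of x, and the intercept follows from the means. A minimal sketch (function and variable names are mine):

```python
import numpy as np

def ols_fit(x, y):
    """Closed-form OLS estimates for y = b0 + b1 * x.

    Obtained by setting the derivatives of the SSE with respect to
    b0 and b1 to zero and solving the two simultaneous equations.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # b1 = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2)
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    b0 = y_mean - b1 * x_mean
    return b0, b1

# Example: noiseless data from y = 2 + 3x, so OLS recovers the line exactly
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x
b0, b1 = ols_fit(x, y)
print(b0, b1)  # → 2.0 3.0
```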

OLS from Scratch

Objective Function / Loss function

The loss function is the cost associated with prediction error, where error is the difference between our predictions and the true values. We usually square the error for ease of computing derivatives. The mean squared error (MSE) loss function is applied in the gradient descent method below.

Gradient Descent

Gradient descent is a popular optimization method. We work iteratively to find the coefficients of the regression line that minimize the loss. The logic behind gradient descent is like a ball rolling down a curve: to reach the bottom, it must move in the direction opposite to the slope. The scenario is illustrated below.

What step size should we take to ensure we do not miss the bottom? This is the challenge in the GD method. When the step size is too large, we overshoot the destination; when it is too small, we take too long to reach it.

1. Compute the gradient of the loss function.

2. Select an appropriate learning rate and randomly initialize the parameters of the linear regression function.

3. Update the parameters by moving against the gradient, scaled by the learning rate.

4. Repeat until the loss is within our acceptable level or the parameters converge.
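The four steps above can be sketched as follows. The learning rate, iteration count, and convergence tolerance are illustrative choices, not prescriptions:

```python
import numpy as np

def gd_fit(x, y, lr=0.05, n_iters=5000, tol=1e-8):
    """Fit y = b0 + b1 * x by gradient descent on the MSE loss."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    rng = np.random.default_rng(0)
    b0, b1 = rng.normal(size=2)                # step 2: random initial parameters
    for _ in range(n_iters):
        error = (b0 + b1 * x) - y
        grad_b0 = 2.0 * error.mean()           # step 1: dMSE/db0
        grad_b1 = 2.0 * (error * x).mean()     # step 1: dMSE/db1
        b0 -= lr * grad_b0                     # step 3: move against the gradient
        b1 -= lr * grad_b1
        if max(abs(grad_b0), abs(grad_b1)) < tol:   # step 4: stop on convergence
            break
    return b0, b1

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x
b0, b1 = gd_fit(x, y)
print(round(b0, 3), round(b1, 3))  # close to (2, 3)
```

A learning rate that is too large for this data (say, 0.5) makes the updates overshoot and diverge, which is exactly the step-size trade-off described above.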

Model from Scratch by Python
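The code for this section did not survive extraction. Below is a hedged reconstruction of what a from-scratch model might look like, bundling the OLS and GD fits discussed above into one class; the class and method names are my own, not the author's:

```python
import numpy as np

class SimpleLinearRegression:
    """Simple linear regression y = b0 + b1 * x, fit by OLS or gradient descent."""

    def __init__(self):
        self.b0 = 0.0
        self.b1 = 0.0

    def fit_ols(self, x, y):
        # Non-iterative: closed-form solution of the normal equations.
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        self.b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        self.b0 = y.mean() - self.b1 * x.mean()
        return self

    def fit_gd(self, x, y, lr=0.05, n_iters=5000):
        # Iterative: repeated parameter updates against the MSE gradient.
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        self.b0, self.b1 = 0.0, 0.0
        for _ in range(n_iters):
            error = self.predict(x) - y
            self.b0 -= lr * 2.0 * error.mean()
            self.b1 -= lr * 2.0 * (error * x).mean()
        return self

    def predict(self, x):
        return self.b0 + self.b1 * np.asarray(x, dtype=float)

# On clean data both fits land on (almost) the same line
x = np.linspace(0.0, 4.0, 20)
y = 2.0 + 3.0 * x
ols = SimpleLinearRegression().fit_ols(x, y)
gd = SimpleLinearRegression().fit_gd(x, y)
print(ols.b0, ols.b1)                      # (2, 3) up to float error
print(round(gd.b0, 3), round(gd.b1, 3))    # approximately (2, 3)
```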

Summary

Looking at the gradient descent visualization above, the rate at which the fitted line rotates and shifts appears to slow as it approaches the final result. This matches the rolling-ball illustration: as the ball approaches the bottom, the gradient decreases, and hence the update size (delta in Fig 13) decreases. As discussed, OLS is a single pass in which the data are substituted into the derived equations to obtain the parameter estimates directly, while GD runs iteratively until it arrives at a result satisfying the required condition. Clearly, OLS becomes tougher to apply as the feature dimension increases.

The Startup


Written by

Data.
