Exploring Linear Regression

Joseph B. Driven
2 min read · Apr 24, 2017

--

What is linear regression? Linear regression is the study of linear relationships between variables. It models the relationship between a dependent variable y and one or more explanatory variables denoted x. Simple linear regression uses only one explanatory variable. Multiple linear regression uses more than one explanatory variable.

There are three primary uses for regression analysis: causal analysis, predicting an effect, and trend forecasting.

Causal analysis

  • Regression may be used to measure the strength of the effect that the independent x variable has on the dependent y variable.
  • Typical questions: how strong is the relationship between dose and effect, between advertising spend and sales, or between age and income?

Predicting an effect

  • It can help us determine how much the dependent y variable changes with a change in one or more independent x variables.
  • How much additional y results from each additional unit of x?

Trend forecasting

  • It can predict future values, outcomes and trends.
  • What will the price of gas be 6 months from now?

The linear regression line equation is y = mx + b. x is the explanatory variable and y is the dependent variable. The slope of the line is m, and b is the intercept (the value of y when x = 0). m and b are unknown and must be estimated from a dataset.

Solving for the unknowns m and b can be accomplished by hand; however, the equations are long and daunting. Thankfully, plenty of programs and software packages are available today, so a linear regression should never have to be computed by hand. With that said, if you find yourself a little more on the old school side and want to try your hand at it, the values for m and b can be solved by the equations found here.
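For readers who want to see what the software is doing, here is a minimal sketch of the standard least-squares formulas for the slope and intercept in plain Python. The function name fit_line and the sample numbers are made up for illustration; they are not taken from the linked equations.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations (x with y) and sum of squared x deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    m = sxy / sxx              # slope
    b = mean_y - m * mean_x    # intercept: the line passes through the means
    return m, b


# Made-up data that lies exactly on y = 2x + 1
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
m, b = fit_line(xs, ys)
print(m, b)  # 2.0 1.0
```

Because the sample points sit exactly on a line, the fit recovers the slope and intercept exactly; real data will scatter around the fitted line.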

The correlation coefficient R shows how closely the line fits the points found in any given experiment. The closer R is to 1 (or to -1 when the line slopes downward), the better the line fits the data. Again, if you like knowing how to calculate this by hand, the equation can be found here. However, just like with the slope of the line and the intercept discussed above, it is highly recommended that you simply let the software do the work for you.
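As a rough illustration of the correlation calculation, the sketch below computes the Pearson correlation coefficient in plain Python. The function name pearson_r and the data are assumptions chosen for the example, not values from the article.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient R for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R is undefined when one variable is constant")

    return sxy / math.sqrt(sxx * syy)


# Made-up data: y generally rises with x, but not perfectly
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))  # 0.7746
```

An R of about 0.77 indicates a fairly strong positive linear relationship; points lying exactly on an upward-sloping line would give R = 1.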

Let’s use our knowledge of linear regression to predict future values in the example found here.

For further questions or additional resources, you can visit Khan Academy or review the sklearn documentation.
