What does it mean? (stats pun)

Sainjeev Srikantha
Published in Human Systems Data
Feb 28, 2017

After reading chapter 3 on linear regression, I learned more about simple and multiple linear regression. So, what did I learn?

I found the example of advertising sales for three advertising mediums to be an appropriate one, and it kept me engaged in the reading despite the heavy statistical jargon. What I found most interesting was how to assess how well a model fits the data. It was interesting to see the breakdown of how sales were affected by one advertising medium, for example TV, but incorporating the other advertising mediums to create a multiple linear regression was even more interesting. Something that I did not think about before was that adding variables increases R². For example, when TV, radio, and newspaper advertising are used together to predict sales, there is a higher R² value than when just TV and radio are used. This would suggest that newspaper is an important advertising medium. Think again. In the multiple regression, newspaper advertising is actually not significant once TV and radio are in the model, and the R² value is higher with the three advertising mediums only because another variable, newspaper, is added: R² never goes down when a predictor is added, even a useless one. This can be misleading when analyzing data, and it shows the importance of attacking the data from multiple points of view.
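To convince myself of the R² point, here is a minimal sketch in Python. It does not use the chapter's advertising data. The numbers are simulated, the coefficients are made up, and the `newspaper` column is built to have no effect on `sales` at all. The helper `r_squared` is my own illustrative function, not something from the chapter.

```python
import numpy as np

def r_squared(X, y):
    """Fit ordinary least squares with an intercept and return R^2."""
    A = np.column_stack([np.ones(len(y)), X])      # add an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    residuals = y - A @ coef
    ss_res = np.sum(residuals ** 2)                # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)           # total variation
    return 1.0 - ss_res / ss_tot

# Simulated budgets (assumed values, not the real advertising data).
rng = np.random.default_rng(0)
n = 200
tv = rng.uniform(0, 300, n)
radio = rng.uniform(0, 50, n)
newspaper = rng.uniform(0, 100, n)   # unrelated to sales by construction
sales = 3.0 + 0.05 * tv + 0.2 * radio + rng.normal(0, 1.5, n)

r2_two = r_squared(np.column_stack([tv, radio]), sales)
r2_three = r_squared(np.column_stack([tv, radio, newspaper]), sales)

print(f"TV + radio:             R^2 = {r2_two:.5f}")
print(f"TV + radio + newspaper: R^2 = {r2_three:.5f}")
```

The three-variable R² should come out at least as large as the two-variable one, even though newspaper was simulated to be useless, which is why a higher R² alone is not evidence that a new variable matters.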

After reading the chapter, I looked further into regression. According to Dr. Peter Flom, there are four points to keep in mind about the assumptions of multiple regression: 1) the assumptions are about the errors from the model, 2) the errors are the difference between the predicted value of the dependent variable and the actual value of the dependent variable, 3) the errors from the model are normally distributed, and 4) the errors have constant variance (The Advantages & Disadvantages of Multiple Regression). Keeping these assumptions in mind when analyzing a multiple regression can help avoid some of the pitfalls of data analysis, but it is important to run a simple regression as well as a multiple regression.
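The assumptions above are all statements about the errors, so one rough way to look at them is to compute the errors and summarize them. The sketch below uses simulated data with made-up coefficients, and the two checks (skewness for normality, a spread ratio for constant variance) are crude stand-ins I chose for illustration, not formal tests and not anything prescribed by the reference.

```python
import numpy as np

# Simulated data with two predictors (assumed values for illustration only).
rng = np.random.default_rng(1)
n = 200
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 10, n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(0, 1.0, n)

# Fit the multiple regression by least squares.
A = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
fitted = A @ coef

# Errors: difference between the actual and the predicted values.
errors = y - fitted

# Rough normality check: skewness should be near 0 for normal errors.
z = (errors - errors.mean()) / errors.std()
skewness = np.mean(z ** 3)

# Rough constant-variance check: compare the spread of the errors for the
# lower half and the upper half of the fitted values (ratio near 1 is good).
order = np.argsort(fitted)
low, high = errors[order[: n // 2]], errors[order[n // 2:]]
spread_ratio = high.std() / low.std()

print(f"skewness of errors: {skewness:.3f}")
print(f"spread ratio (high/low fitted): {spread_ratio:.3f}")
```

A plot of the errors against the fitted values would show the same thing visually, and is probably easier to explain to an audience than either number.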

The big advantage of multiple regression is that it is commonly known and used within research domains, but communicating its results can be troublesome. If I had not taken a statistics class before, I would have been lost when reading chapter 3, which is why, to better connect with the audience, it is important to use simpler statistics or a graph as well and not rely only on multiple regression.

External References

http://sciencing.com/advantages-disadvantages-multiple-regression-model-12070171.html
