Physics can help you understand regression better

Saliha Akca-Hobbins
Published in Human Systems Data · 3 min read · Feb 28, 2017

If you come from a STEM major into a social science program, at first you think that you don’t know anything. At least that is how I felt. It is a completely new area.

I have been trained for years to be a physicist. Since I came to a different discipline, I thought that years of training didn’t mean anything in a social science major. But this week’s reading showed me that I was wrong. Physics can help you to understand social studies better.

The reading assignment was about linear regression. It was quite math-intensive, which made it easier for me to read. After I completed the reading, I realized that linear regression analysis is actually the well-known “slope analysis” of physics labs. Basically, the “slope equation” is renamed “linear regression” in social science. The terms may be different, but the logic is the same.

Slope equation: Y = f(X) + e, where f(X) = AX + B

A = slope (the tangent of the angle θ between the line and the X axis)

B = Y intercept, the value of Y where X = 0

e = mean-zero random error (usually, we ignore this value in physics labs)
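To make the slope equation concrete, here is a minimal sketch in Python with NumPy. The data are made up (a true slope of 2.0, a true intercept of 1.0, and a noise level I picked for illustration), so treat it as a toy lab exercise rather than anything from the reading.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical lab data: true slope A = 2.0, true intercept B = 1.0
true_A, true_B = 2.0, 1.0
x = np.linspace(0.0, 10.0, 50)
e = rng.normal(loc=0.0, scale=0.5, size=x.size)  # mean-zero random error
y = true_A * x + true_B + e

# A least-squares fit of a degree-1 polynomial returns [slope, intercept]
A_hat, B_hat = np.polyfit(x, y, deg=1)
print(f"estimated slope A = {A_hat:.3f}, intercept B = {B_hat:.3f}")
```

The estimates should land close to the values used to generate the data, which is exactly what we do by hand with a ruler and graph paper in a physics lab.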

In the reading, A and B are called coefficients, and the hypothesis tests are stated in terms of these coefficients. In other words, it is an analysis of the “tangent” function. For example:

H0 = Null hypothesis: there is no relationship between the two variables, which mathematically means the slope is equal to zero, or tan(θ) = 0.

H1 = There is a relationship between the two variables, which means the slope is not equal to zero (tan(θ) is not zero; it can have a positive or negative value).

Moreover, if the slope is negative, there is a negative correlation between the two variables; if it is positive, there is a positive correlation between X and Y.
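The hypothesis test on the slope can be sketched in a few lines. The helper name `slope_t_statistic` and the data are my own illustrative assumptions; the t-statistic itself is the standard one (the estimated slope divided by its standard error), and a value of |t| well above roughly 2 is the usual rough signal that the slope is not zero.

```python
import numpy as np

def slope_t_statistic(x, y):
    """Return (slope, t) for the simple regression y = A*x + B + e."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    A, B = np.polyfit(x, y, deg=1)
    residuals = y - (A * x + B)
    # Residual variance uses n - 2 degrees of freedom (two fitted coefficients)
    sigma2 = np.sum(residuals**2) / (n - 2)
    se_A = np.sqrt(sigma2 / np.sum((x - x.mean()) ** 2))
    return A, A / se_A

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 40)
# Hypothetical data generated with a negative slope
y_down = -1.5 * x + 4.0 + rng.normal(0.0, 1.0, x.size)

A, t = slope_t_statistic(x, y_down)
r = np.corrcoef(x, y_down)[0, 1]  # Pearson correlation between x and y
print(f"slope = {A:.3f}, t = {t:.1f}, correlation r = {r:.3f}")
```

In this toy data set the slope and the correlation come out with the same (negative) sign, and the t-statistic is far from zero, so we would reject H0.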

Although linear regression has a fancy name, I realized that it is nothing more than slope analysis of two variables. Up to this point, it was easy to understand. But how about “multiple linear regression”? The name might sound a bit scary, but I realized that it is one of the well-known methods from vectors class: vector and matrix analysis. Basically,

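Y = [vector of X values] · [vector of B coefficients], where Y depends on the set of X values and B coefficients.

In other words, each Y value is described as the dot product of the vector of X values and the vector B. Stacking the X vectors as the rows of a matrix gives all the Y values at once as a matrix-vector product.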
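Here is a small sketch of the matrix form, again in Python with NumPy. Each fitted Y value is one row of the X matrix multiplied into the coefficient vector (an inner product). The two predictors and the coefficient values are invented for illustration; they are not the Advertising data from the reading.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100

# Two hypothetical predictors (stand-ins for, say, TV and radio budgets)
x1 = rng.uniform(0.0, 10.0, n)
x2 = rng.uniform(0.0, 5.0, n)

# Design matrix: the column of ones carries the intercept
X = np.column_stack([np.ones(n), x1, x2])
beta_true = np.array([3.0, 0.5, 2.0])  # made-up [intercept, b1, b2]

# Y is the matrix-vector product X @ beta plus mean-zero noise
y = X @ beta_true + rng.normal(0.0, 0.3, n)

# Least-squares estimate of the coefficient vector
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_fit = X @ beta_hat  # fitted values: one inner product per row of X
print("estimated coefficients:", np.round(beta_hat, 2))
```

With two predictors the fitted values lie on a plane in three dimensions, which is the kind of picture the reading uses.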

Although I learned all these methods in physics classes, the examples in the reading helped me visualize social science applications. For example, Figure 3.5 (James, Witten, Hastie, & Tibshirani, 2013, p. 81) helps you visualize a space whose dimensions are TV, Radio, and Sales, where TV and Radio are the predictors and the fitted regression is shown as a plane in three dimensions.

I realized that, in the end, all disciplines use the same mathematical models under different names. For more information, you can watch a video on the matrix notation of multiple regression. The video explains the mathematical method behind the multiple regression calculation. If you are interested in the mathematical meaning of multiple regression, the beginning of the video (0–5 min) will give you the basic information you need.

Reference:

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning (Vol. 6). New York: Springer.
