Geometric Interpretation of the Correlation between Two Variables
Pearson’s r and its geometric interpretation
Data exploration is the first, and a crucial, step in analyzing a dataset. During this process, you use statistical methods and visualization techniques to investigate the data and extract meaningful information from it. Understanding the linear correlations among the variables is an important part of this, and correlation coefficients are a great help.
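One way to quantify the linear correlation between two variables is the Pearson correlation coefficient (also called Pearson’s r). It takes a value between -1 and 1, where -1 indicates a perfect negative linear correlation, 1 indicates a perfect positive linear correlation, and 0 indicates no linear correlation. The formula for Pearson’s r is:

$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} $$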
where n is the sample size, x_i and y_i are the sample points, and x̄ and ȳ are the means of the samples. Pearson’s r is essentially the covariance divided by the product of the standard deviations.
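As a minimal sketch of the definition above, the following Python computes Pearson’s r in two ways: directly from the sums, and as the sample covariance divided by the product of the sample standard deviations. The function names and the small dataset are made up for illustration, and the code assumes neither sample is constant (otherwise the denominator is zero).

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from the sum formula: centered cross-products divided by
    the product of the root sums of squared deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with n >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_sq_x = sum((x - x_bar) ** 2 for x in xs)
    sum_sq_y = sum((y - y_bar) ** 2 for y in ys)
    return numerator / math.sqrt(sum_sq_x * sum_sq_y)

def pearson_r_from_cov(xs, ys):
    """The same quantity written as covariance / (std_x * std_y).
    The 1 / (n - 1) factors cancel, so the result matches pearson_r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    std_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    std_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    return cov / (std_x * std_y)

# Hypothetical sample data, chosen only to keep the arithmetic easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_formula = pearson_r(xs, ys)
r_cov = pearson_r_from_cov(xs, ys)
print(r_formula, r_cov)  # both approximately 0.7746
```

Both routes give the same number because the normalizing factors in the covariance and the standard deviations cancel.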
Another way to express correlation is geometric and uses linear algebra. To understand it, you first need the geometric form of the dot product.
In a geometric sense, the dot product tells you how much of the vector a points in the same direction as the vector b. To measure this, you project a onto b. However, taking the dot product of a with b directly would scale the result by the length of b; hence, you first normalize b so that it has length 1. A vector of length 1 is called a unit vector, and the unit vector u in the direction of b is obtained by dividing b by its own magnitude:

$$ \mathbf{u} = \frac{\mathbf{b}}{|\mathbf{b}|} $$
The magnitude of a vector is calculated by squaring each component, adding the squares, and taking the square root of the sum. Once you have the unit vector u, which points in the same direction as b, you can project a onto it. As the following diagram shows, the projection of a onto u (the result of a · u) has the signed length |a|cos θ, where θ is the angle between the two vectors. This follows from the basic trigonometric ratio (cosine of θ = adjacent over hypotenuse).
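The steps just described (magnitude, unit vector, projection) can be sketched in a few lines of Python. The helper names `magnitude`, `unit`, and `dot` and the example vectors are illustrative choices, not part of any library.

```python
import math

def magnitude(v):
    """Square each component, add the squares, take the square root."""
    return math.sqrt(sum(c * c for c in v))

def unit(v):
    """Return the vector of length 1 pointing in the same direction as v."""
    m = magnitude(v)
    if m == 0:
        raise ValueError("the zero vector has no direction")
    return [c / m for c in v]

def dot(a, b):
    """Algebraic dot product: sum of component-wise products."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical example vectors.
a = [3.0, 4.0]
b = [2.0, 0.0]

u = unit(b)                                   # unit vector along b
scalar_projection = dot(a, u)                 # component of a in the direction of b
cos_theta = scalar_projection / magnitude(a)  # adjacent over hypotenuse

print(u, scalar_projection, cos_theta)
```

Here a has length 5 and b lies along the first axis, so the projection of a onto u is simply the first component of a.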
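The resulting scalar a · u can also be understood as the component of a in the direction of b:

$$ \mathbf{a} \cdot \mathbf{u} = |\mathbf{a}| \cos\theta $$

From earlier, we know that the unit vector in the direction of b can be written as b/|b|. Therefore, the above formula can also be written as:

$$ \mathbf{a} \cdot \frac{\mathbf{b}}{|\mathbf{b}|} = |\mathbf{a}| \cos\theta $$

Multiplying both sides by |b|, we get:

$$ \mathbf{a} \cdot \mathbf{b} = |\mathbf{a}|\,|\mathbf{b}| \cos\theta $$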
This is the geometric expression of the dot product. The dot product itself scales with the magnitude of b, but keep in mind that only the direction of b matters when projecting a onto b.
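A quick numerical check of both points, as a sketch: the algebraic dot product agrees with |a||b|cos θ when θ is measured independently, and stretching b changes the dot product but not the projection of a onto b. The vectors and the scale factor of 10 are arbitrary, and the angle is computed with `math.atan2`, which works here only because the example is two-dimensional.

```python
import math

def magnitude(v):
    return math.sqrt(sum(c * c for c in v))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical 2-D vectors.
a = [3.0, 4.0]
b = [2.0, 1.0]

# Angle between a and b, measured from their directions rather than from the dot product.
theta = math.atan2(a[1], a[0]) - math.atan2(b[1], b[0])

algebraic = dot(a, b)                                      # sum of products
geometric = magnitude(a) * magnitude(b) * math.cos(theta)  # |a||b|cos(theta)

# Stretch b by a factor of 10: same direction, different length.
b_scaled = [10.0 * c for c in b]
projection = dot(a, b) / magnitude(b)
projection_scaled = dot(a, b_scaled) / magnitude(b_scaled)

print(algebraic, geometric)           # agree up to floating-point rounding
print(dot(a, b_scaled))               # the dot product grows with |b|
print(projection, projection_scaled)  # the projection does not
```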
One way to quantify the correlation between two vectors is the cosine of the angle θ between them. Rearranging the formula for the dot product, the correlation coefficient r can be expressed as:

$$ r = \cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}|\,|\mathbf{b}|} $$
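Doesn’t this look awfully familiar? Indeed, if a and b are the mean-centered data vectors, with components a_i = x_i - x̄ and b_i = y_i - ȳ, then writing out the dot product and the two magnitudes component by component gives exactly the formula for Pearson’s r:

$$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} $$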
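The role of mean-centering is easy to see numerically. In the sketch below (helper names and data are illustrative assumptions), the cosine of the angle between the centered vectors matches Pearson’s r for the same data, while the cosine between the raw, uncentered vectors gives a different number.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def magnitude(v):
    return math.sqrt(dot(v, v))

def cosine(a, b):
    """Cosine of the angle between a and b: a.b / (|a||b|)."""
    return dot(a, b) / (magnitude(a) * magnitude(b))

def center(values):
    """Subtract the mean from every component."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# Hypothetical sample data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

cos_centered = cosine(center(xs), center(ys))  # this is Pearson's r
cos_raw = cosine(xs, ys)                       # plain cosine similarity

print(cos_centered)  # approximately 0.7746
print(cos_raw)       # approximately 0.9597
```

Without centering, both vectors have all-positive components and therefore point in broadly the same direction regardless of how the values co-vary, which inflates the cosine.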
As the following graph of cosine shows, cos(θ) takes the value -1 at θ = π, 0 at θ = π/2 and 3π/2, and 1 at θ = 0 and 2π.
When the angle between the vectors is acute (0° < θ < 90°), a · b is positive because cos(θ) is positive, and the two vectors are positively correlated. At a right angle (θ = 90°), a · b is zero because cos(θ) is zero, and there is no linear correlation between the vectors. Finally, at an obtuse angle (90° < θ < 180°), a · b is negative because cos(θ) is negative, and the two vectors are negatively correlated.
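The three cases can be illustrated with small vectors whose components already sum to zero, so that they are mean-centered and the cosine equals the correlation. The vectors and the helper `angle_degrees` are hypothetical and chosen to give round angles.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def angle_degrees(a, b):
    """Angle between a and b in degrees; the cosine is clamped to [-1, 1]
    to guard against floating-point rounding before calling acos."""
    c = max(-1.0, min(1.0, cosine(a, b)))
    return math.degrees(math.acos(c))

# All vectors are already centered (components sum to zero).
a = [-1.0, 0.0, 1.0]
b_acute = [-2.0, 1.0, 1.0]   # positively correlated with a
b_right = [1.0, -2.0, 1.0]   # uncorrelated with a
b_obtuse = [2.0, -1.0, -1.0] # negatively correlated with a

for name, b in [("acute", b_acute), ("right", b_right), ("obtuse", b_obtuse)]:
    print(name, round(angle_degrees(a, b), 6), round(cosine(a, b), 6))
```

The angles come out at roughly 30°, 90°, and 150°, with correlations of about 0.866, 0, and -0.866 respectively.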
This is quite intuitive: strongly positively correlated vectors point in similar directions, while negatively correlated vectors point in opposite directions.