What are Covariance and Correlation coefficients and their significance?
Covariance and correlation are very helpful in understanding the relationship between two continuous variables. Covariance tells whether both variables vary in the same direction (positive covariance) or in opposite directions (negative covariance). Its numerical value depends on the units of the variables and is hard to interpret on its own, so in practice the sign is the useful part. Correlation, on the other hand, also measures how strong the linear relationship is, on a fixed scale from -1 to +1. A correlation of 0 means there is no linear relationship between the variables; some other functional relationship may still exist.
Let’s understand these terms in detail:
Covariance:
In the study of covariance, only the sign matters. A positive value shows that both variables vary in the same direction, and a negative value shows that they vary in opposite directions.
Covariance between two variables x and y can be calculated as follows:
$$S_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
Where:
- x̄ is the sample mean of x
- ȳ is the sample mean of y
- x_i and y_i are the values of x and y for the i-th record in the sample
- n is the number of records in the sample
Significance of the formula:
- Numerator: for each record, the deviation of x from its mean multiplied by the deviation of y from its mean, summed over all records.
- Unit of covariance: the unit of x multiplied by the unit of y.
- Hence, if we change the units of the variables, the covariance takes a new value, but its sign remains the same.
- Therefore, the numerical value of covariance is difficult to interpret by itself. If it is positive, both variables vary in the same direction; if it is negative, they vary in opposite directions.
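The points above can be checked with a small sketch. It assumes the sample covariance with an n - 1 denominator; the function name `sample_covariance` and the height/weight numbers are made up purely for illustration.

```python
def sample_covariance(x, y):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


# Hypothetical data: heights in metres and weights in kilograms.
height_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weight_kg = [50.0, 58.0, 66.0, 77.0, 89.0]

cov_m_kg = sample_covariance(height_m, weight_kg)

# The same heights expressed in centimetres instead of metres.
height_cm = [h * 100 for h in height_m]
cov_cm_kg = sample_covariance(height_cm, weight_kg)

# The value grows by a factor of 100, but the sign stays positive.
print(cov_m_kg, cov_cm_kg)
```

Changing the unit of height rescales the covariance by the same factor, which is why the magnitude alone says little about how strongly the variables are related.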
Correlation:
Covariance only tells us the direction, which is not enough to understand the relationship completely. So we divide the covariance by the product of the standard deviations of x and y and get the correlation coefficient, which varies between -1 and +1.
- Values of -1 and +1 mean that the two variables have a perfect linear relationship.
- A negative value means the variables tend to vary in opposite directions; the closer the coefficient is to -1, the stronger the linear relationship.
- A positive value means the variables tend to vary in the same direction; the closer the coefficient is to +1, the stronger the linear relationship.
- If the correlation coefficient is 0, there is no linear relationship between the variables, although some other functional relationship could exist.
- If there is no relationship at all between two variables, the correlation coefficient will be 0. The converse does not hold: a coefficient of 0 only rules out a linear relationship.
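To make these cases concrete, here is a minimal sketch using the standard Pearson formula. The helper name `pearson_r` and the data are assumptions for illustration only. It shows a perfect positive line, a perfect negative line, and a parabola where y is fully determined by x and yet the correlation is zero.

```python
import math


def pearson_r(x, y):
    """Pearson correlation; the n - 1 factors cancel, so plain sums suffice."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


x = [-3, -2, -1, 0, 1, 2, 3]

r_up = pearson_r(x, [2 * xi + 1 for xi in x])           # increasing straight line
r_down = pearson_r(x, [-0.5 * xi + 4 for xi in x])      # decreasing straight line
r_parabola = pearson_r(x, [xi ** 2 for xi in x])        # y = x^2, symmetric about 0

print(r_up, r_down, r_parabola)
```

The parabola case is the one to remember: y depends entirely on x, but because the relationship is not linear (and the data is symmetric around zero), the coefficient comes out as 0.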
Correlation between x and y can be calculated as follows:
$$r_{xy} = \frac{S_{xy}}{S_x \, S_y}$$
Where:
- S_xy is the covariance between x and y.
- S_x and S_y are the standard deviations of x and y respectively.
- r_xy is the correlation coefficient.
- The correlation coefficient is a dimensionless quantity. Hence, even if we change the units of x and y, the value of the coefficient remains the same.
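As a rough sketch of these definitions, the snippet below computes r_xy as S_xy divided by S_x times S_y and checks that it does not change when the units change. The function names and the height/weight data are hypothetical, and the n - 1 (sample) convention is assumed for both covariance and standard deviation.

```python
import math


def sample_covariance(x, y):
    """S_xy with the n - 1 (sample) denominator."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


def sample_std(x):
    """S_x: the covariance of x with itself is its variance."""
    return math.sqrt(sample_covariance(x, x))


def correlation(x, y):
    """r_xy = S_xy / (S_x * S_y)."""
    return sample_covariance(x, y) / (sample_std(x) * sample_std(y))


# Hypothetical data: heights in metres and weights in kilograms.
height_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weight_kg = [50.0, 58.0, 66.0, 77.0, 89.0]

# The same measurements in centimetres and grams.
height_cm = [h * 100 for h in height_m]
weight_g = [w * 1000 for w in weight_kg]

r_original = correlation(height_m, weight_kg)
r_rescaled = correlation(height_cm, weight_g)

# Both values agree up to floating-point rounding.
print(r_original, r_rescaled)
```

The unit factors appear in both the numerator and the denominator, so they cancel out. That is what makes the coefficient comparable across different pairs of variables.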
Let’s understand the significance of the correlation coefficient with the help of the graph below:
Originally published at http://ashutoshtripathi.com on January 15, 2019.