What is correlation?
The term “correlation” refers to a relationship or association between two variables. It measures how closely the two variables are related.
How can it be measured?
The correlation between two variables is measured using the “correlation coefficient”, which ranges from −1 to +1. The sign gives the direction of the relationship and the absolute value gives its strength. As a common rule of thumb, an absolute value below 0.5 is called a weak correlation, a value between 0.5 and 0.7 a moderate correlation, and a value above 0.7 a strong correlation.
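The rule of thumb above can be sketched as a small helper. This is only an illustration: the function name `correlation_strength` is hypothetical, and placing the boundary values 0.5 and 0.7 themselves in the "moderate" band is an assumption, since the text does not say which band they belong to.

```python
def correlation_strength(r: float) -> str:
    """Label a correlation coefficient using the rule-of-thumb cut-offs.

    Assumption: exactly 0.5 and exactly 0.7 are treated as "moderate".
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")

    magnitude = abs(r)  # strength ignores the sign (direction)
    if magnitude < 0.5:
        return "weak"
    if magnitude <= 0.7:
        return "moderate"
    return "strong"


if __name__ == "__main__":
    # Made-up coefficients, only to show the labels.
    for value in (0.12, -0.63, 0.91):
        print(value, correlation_strength(value))
```

The sign is dropped before the comparison because a coefficient of −0.8 describes a relationship just as strong as +0.8, only in the opposite direction.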
Two commonly used correlation coefficients are:
- Pearson Correlation Coefficient:
It is used when the data are normally distributed. It evaluates the linear relationship between two continuous variables and is usually denoted by “r”.
Assumptions:
- Scale of measurement should be interval or ratio
- Variables should be approximately normally distributed
- The association should be linear
- There should be no outliers in the data
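As a minimal sketch of how the Pearson coefficient can be computed, the function below uses the standard definition: the sum of the products of the deviations from the means, divided by the product of the square roots of the summed squared deviations. The name `pearson_r` and the sample data are hypothetical and chosen only for illustration; in practice a statistics library would normally be used instead.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")

    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Deviations of each observation from its variable's mean.
    dev_x = [xi - mean_x for xi in x]
    dev_y = [yi - mean_y for yi in y]

    numerator = sum(a * b for a, b in zip(dev_x, dev_y))
    spread_x = math.sqrt(sum(a * a for a in dev_x))
    spread_y = math.sqrt(sum(b * b for b in dev_y))

    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return numerator / (spread_x * spread_y)


if __name__ == "__main__":
    # Made-up example: hours studied vs. test score.
    hours = [1, 2, 3, 4, 5]
    scores = [52, 55, 61, 64, 70]
    print(round(pearson_r(hours, scores), 3))
```

Because the calculation works on the raw values, a single extreme observation can change the result noticeably, which is why the no-outliers assumption matters here.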
- Spearman Correlation Coefficient:
It is used when the data are not normally distributed. It evaluates relationships that are monotonic (i.e., a change in one variable is generally associated with a change in a specific direction in the other variable). The Spearman coefficient is computed from the ranks of the values instead of the original values, and it is usually denoted by “rs”.
Assumptions:
- Scale of measurement must be ordinal (or interval, ratio)
- Data must be in the form of matched pairs
- The association must be monotonic (i.e., variables increase in value together, or one increases while the other decreases)
- Outliers have less influence than they do on the Pearson coefficient because ranks are used, but extreme values should still be checked
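The rank-based idea can be sketched as follows: replace each value by its rank (tied values share the average of their ranks, a common convention assumed here), then apply the usual Pearson formula to the ranks. The names `average_ranks` and `spearman_rs` and the sample data are hypothetical, for illustration only.

```python
import math


def average_ranks(values):
    """1-based ranks of the values; tied values get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman coefficient: the Pearson formula applied to the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be matched pairs (at least 2)")

    rank_x, rank_y = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rank_x) / n, sum(rank_y) / n
    dev_x = [r - mean_x for r in rank_x]
    dev_y = [r - mean_y for r in rank_y]

    numerator = sum(a * b for a, b in zip(dev_x, dev_y))
    denominator = math.sqrt(sum(a * a for a in dev_x)) * math.sqrt(
        sum(b * b for b in dev_y)
    )
    if denominator == 0:
        raise ValueError("rs is undefined when a variable is constant")
    return numerator / denominator


if __name__ == "__main__":
    # Made-up example: y grows as the cube of x, so the relationship
    # is monotonic but not linear.
    x = [1, 2, 3, 4, 5]
    y = [1, 8, 27, 64, 125]
    print(round(spearman_rs(x, y), 3))
```

In this example the two variables always increase together, so their ranks agree exactly even though the raw values do not fall on a straight line.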
References:
- https://www.jmp.com/en_in/statistics-knowledge-portal/what-is-correlation.html
- https://www.cuemath.com/correlation-coefficient-formula/
- https://statistics.laerd.com/statistical-guides/spearmans-rank-order-correlation-statistical-guide-2.php