Significance test for correlation coefficient

After calculating a correlation coefficient, it is usually reasonable to check its significance. Even if the variables are uncorrelated, the correlation coefficient computed from a finite sample will almost never be exactly zero. An exactly zero sample correlation coefficient is even less probable than exactly 500 heads in 1000 coin tosses.
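The point above can be illustrated with a short simulation. This is only a sketch: the helper names `pearson` and `count_exact_zeros`, the sample size, and the seed are hypothetical choices made for illustration and are not part of ALGLIB. It computes the probability of exactly 500 heads in 1000 fair tosses and counts how often two independent normal samples give a correlation coefficient of exactly zero.

```python
import math
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Probability of exactly 500 heads in 1000 fair coin tosses (binomial formula).
p_500_heads = math.comb(1000, 500) / 2 ** 1000

def count_exact_zeros(trials=2000, n=20, seed=1):
    """Count trials where two independent normal samples give r == 0 exactly."""
    rng = random.Random(seed)
    zeros = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [rng.gauss(0, 1) for _ in range(n)]
        if pearson(xs, ys) == 0.0:
            zeros += 1
    return zeros

print(f"P(exactly 500 heads in 1000 tosses) = {p_500_heads:.4f}")
print("independent samples with r exactly 0:", count_exact_zeros())
```

The coin-toss probability is small but clearly positive, while an exactly zero coefficient from continuous data is essentially never observed. This is why a significance test, rather than a comparison with zero, is needed.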

The algorithms presented on this page perform three tests of correlation coefficient significance. The two-tailed test checks the null hypothesis of zero correlation between two variables. The left-tailed test checks the null hypothesis of non-negative correlation (i.e. the correlation coefficient is greater than or equal to 0). The right-tailed test checks the null hypothesis of non-positive correlation (i.e. the correlation coefficient is less than or equal to 0).
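Written formally, with $\rho$ denoting the population correlation coefficient (the symbol is introduced here for illustration and does not appear in the original text), the three tests described above compare the following hypotheses:

```latex
\begin{aligned}
\text{two-tailed:}   &\quad H_0:\ \rho = 0    &&\text{against}\quad H_1:\ \rho \neq 0 \\
\text{left-tailed:}  &\quad H_0:\ \rho \geq 0 &&\text{against}\quad H_1:\ \rho < 0 \\
\text{right-tailed:} &\quad H_0:\ \rho \leq 0 &&\text{against}\quad H_1:\ \rho > 0
\end{aligned}
```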

The significance test for Pearson's correlation coefficient is performed by the PearsonCorrelationSignificance subroutine. This subroutine requires the samples to be normal, because the tails of the distribution of Pearson's correlation coefficient have been calculated for normal samples only. If the samples differ slightly from the normal distribution, the test is still applicable, but its results will be only approximate. As the deviation increases, the results become less credible. Therefore, if you are not confident that the samples are close enough to normal, it is better to use a non-parametric correlation coefficient (Spearman's rank correlation coefficient) and the corresponding test, which does not require sample normality. That test is performed by the SpearmanRankCorrelationSignificance subroutine.
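To make the three tail probabilities concrete, here is a minimal sketch of the textbook test for normal samples, which uses the statistic t = r * sqrt((n - 2) / (1 - r^2)) and Student's t-distribution with n - 2 degrees of freedom. It is an assumption that this matches what PearsonCorrelationSignificance computes internally; the function names `student_t_cdf` and `pearson_significance` are hypothetical, and the CDF is obtained by simple numerical integration rather than by a library routine.

```python
import math

def student_t_cdf(t, df, steps=2000):
    """P(T <= t) for Student's t with df degrees of freedom.

    Integrates the density from 0 to t with Simpson's rule (steps must be even).
    """
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3

def pearson_significance(r, n):
    """Return (both_tails, left_tail, right_tail) p-values for coefficient r.

    Assumes normal samples of size n >= 3 and -1 < r < 1.
    """
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    t = r * math.sqrt((n - 2) / (1 - r * r))
    p = student_t_cdf(t, n - 2)
    both_tails = 2 * min(p, 1 - p)   # H0: correlation is zero
    left_tail = p                    # H0: correlation >= 0
    right_tail = 1 - p               # H0: correlation <= 0
    return both_tails, left_tail, right_tail

both, left, right = pearson_significance(0.6, 20)
print(f"two-tailed p = {both:.4f}, left = {left:.4f}, right = {right:.4f}")
```

A small p-value in a given tail is evidence against the corresponding null hypothesis. Because the result is computed as 1 - p, very small tail probabilities lose precision in this sketch; a production implementation would evaluate the tails directly.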

As noted above, the significance test for rank correlation does not depend on the sample distribution. Another advantage of the non-parametric correlation coefficient is that it is less affected by outliers. If the sample size is small, one large outlier can inflate Pearson's correlation coefficient and lead to the wrong conclusion. The impact of an outlier on Spearman's rank correlation coefficient is bounded from above regardless of the outlier's size, which makes it particularly useful when processing noisy data.
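The effect of a single outlier can be seen on a small made-up data set. The sketch below is illustrative only: the data and the helper names `pearson`, `ranks`, and `spearman` are hypothetical, and Spearman's coefficient is computed as Pearson's coefficient of the ranks (ties receive the average rank).

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman's rank correlation: Pearson's coefficient of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Six weakly related points (made-up data), then one added outlier.
x = [1, 2, 3, 4, 5, 6]
y = [3, 6, 1, 5, 2, 4]

for outlier in (None, 100, 1000):
    xs = x + ([outlier] if outlier is not None else [])
    ys = y + ([outlier] if outlier is not None else [])
    print(f"outlier={outlier}: pearson={pearson(xs, ys):+.4f}, "
          f"spearman={spearman(xs, ys):+.4f}")
```

With the outlier added, Pearson's coefficient moves close to 1, while Spearman's coefficient changes by a limited amount and stays the same whether the outlier is 100 or 1000, since only its rank matters.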

Manual entries

C++ correlationtests subpackage   
C# correlationtests subpackage   

This article is intended for personal use only.

ALGLIB® - numerical analysis library, 1999-2017.
ALGLIB is a registered trademark of the ALGLIB Project.