
Principal component analysis

Principal Component Analysis (PCA) is a dimension reduction method that transfers the data to a new orthogonal basis whose axes are oriented along the directions of maximum variance of the input data set. The variance is maximal along the first axis of the new basis; the second axis maximizes the variance subject to being orthogonal to the first axis, and so forth, with the last axis having the least variance of all. Such a transformation permits information to be reduced by rejecting the coordinates that correspond to the directions with minimum variance. If one of the basis vectors has to be rejected, it should preferably be the vector along which the input data set varies least.
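The construction described above can be sketched for the two-dimensional case, where the eigen-decomposition of the covariance matrix has a closed form. This is only an illustrative sketch in plain Python, not the ALGLIB implementation; the names `pca_2d` and `project` and the sample data are assumptions made for the example.

```python
import math

def pca_2d(points):
    """PCA of 2-D points; returns the mean and (variance, axis) pairs, largest variance first."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Closed-form eigen-decomposition of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)  # angle of the maximum-variance axis
    axis1 = (math.cos(theta), math.sin(theta))
    axis2 = (-math.sin(theta), math.cos(theta))   # orthogonal to axis1
    return (mx, my), [(half_trace + radius, axis1), (half_trace - radius, axis2)]

def project(points, mean, axis):
    """Coordinates of the centered points along one axis of the new basis."""
    return [(x - mean[0]) * axis[0] + (y - mean[1]) * axis[1] for x, y in points]

# Hypothetical data: points close to the line y = 2x with a small alternating offset.
points = [(float(t), 2.0 * t + 0.1 * (-1) ** i) for i, t in enumerate(range(-5, 6))]

mean, components = pca_2d(points)
(var1, axis1), (var2, axis2) = components

# Rejecting the second (minimum-variance) axis leaves one coordinate per point.
reduced = project(points, mean, axis1)

print("variances along the new axes:", var1, var2)
print("first axis:", axis1)
```

For this data set almost all of the variance falls on the first axis, which runs nearly parallel to the line y = 2x, so dropping the second coordinate loses very little.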

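It may be noted that the PCA is based on the following assumptions: that the dimension of the data can be efficiently reduced by a linear transformation, and that most of the information is contained in the directions along which the variance of the input data is maximal.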

As is evident, these conditions are by no means always met. For example, if the points of an input set lie on the surface of a hypersphere, no linear transformation can reduce the dimension (a nonlinear transformation, however, can easily cope with this task). This disadvantage is shared by all linear algorithms, and it can be overcome by using additional dummy variables that are nonlinear functions of the input data set elements (the so-called "kernel trick").
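The hypersphere example can be checked numerically in its simplest form, a circle in the plane. The sketch below is an illustration under stated assumptions (hypothetical helper names, 12 evenly spaced points on the unit circle): it compares the variances along the principal axes with the variance of the radius obtained from a nonlinear change to polar coordinates.

```python
import math

def principal_variances(points):
    """Variances along the two principal axes of 2-D points (largest first)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    half_trace = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    return half_trace + radius, half_trace - radius

def sample_variance(values):
    """Unbiased sample variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

# Hypothetical data: 12 points evenly spaced on the unit circle.
angles = [2 * math.pi * k / 12 for k in range(12)]
circle = [(math.cos(a), math.sin(a)) for a in angles]

# Linear view: the two principal variances coincide, so no axis can be rejected.
lin_var1, lin_var2 = principal_variances(circle)

# Nonlinear view: in polar coordinates the radius is constant,
# so the angle alone describes every point.
radii = [math.hypot(x, y) for x, y in circle]
radius_variance = sample_variance(radii)

print("principal variances:", lin_var1, lin_var2)
print("variance of the radius:", radius_variance)
```

Both principal variances come out equal, so rejecting either axis discards half of the variance, whereas the radius carries essentially no variance and can be dropped.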

The second disadvantage of the PCA is that the directions maximizing variance do not always maximize information. The page on the LDA subroutines gives an example of such a task, in which the maximum-variance variable carries almost no information, whilst the minimum-variance variable permits the classes to be wholly separated. In this case, the PCA will give preference to the first (less informative) variable. This drawback is closely connected to the fact that the PCA does not perform linear separation of classes, linear regression or other similar operations; it merely permits the input vector to be best restored on the basis of partial information about it. All additional information pertaining to the vector (such as the assignment of an image to one of the classes) is ignored.
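A small synthetic case makes this drawback concrete. The data below are hypothetical and are not the example from the LDA page: two classes share the same wide spread along x and differ only by a small shift along y. The helper `principal_axes` is likewise an assumption made for this sketch.

```python
import math

def principal_axes(points):
    """Mean and the two principal axes of 2-D points (maximum-variance axis first)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    axis1 = (math.cos(theta), math.sin(theta))
    axis2 = (-math.sin(theta), math.cos(theta))
    return (mx, my), axis1, axis2

def project(points, mean, axis):
    """Coordinates of the centered points along one axis."""
    return [(x - mean[0]) * axis[0] + (y - mean[1]) * axis[1] for x, y in points]

# Hypothetical classes: identical wide spread in x, small class-dependent shift in y.
class_a = [(float(t), 0.1) for t in range(-5, 6)]
class_b = [(float(t), -0.1) for t in range(-5, 6)]

# PCA sees only the pooled points; the class labels are ignored.
mean, axis1, axis2 = principal_axes(class_a + class_b)

a1, b1 = project(class_a, mean, axis1), project(class_b, mean, axis1)
a2, b2 = project(class_a, mean, axis2), project(class_b, mean, axis2)

# Along the first (maximum-variance) axis the class ranges overlap.
overlap_on_axis1 = min(a1) <= max(b1) and min(b1) <= max(a1)
# Along the second (minimum-variance) axis the classes are disjoint.
separated_on_axis2 = max(b2) < min(a2) or max(a2) < min(b2)

print("first axis:", axis1)
print("classes overlap on axis 1:", overlap_on_axis1)
print("classes separated on axis 2:", separated_on_axis2)
```

Keeping only the first principal coordinate here would discard exactly the coordinate that distinguishes the classes, because the class labels never enter the computation.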

This article is intended for personal use only.

ALGLIB® - numerical analysis library, 1999-2017.
ALGLIB is a registered trademark of the ALGLIB Project.