Linear regression is one of the best-known regression methods. Its main advantages are simplicity and speed. Its one disadvantage is that it is unsuitable for inherently nonlinear problems.

## Linear Regression in ALGLIB

Operations with linear models are performed in two stages:

- Build a linear model by calling one of the construction subroutines (which one depends on the problem being solved). The result is a LinearModel structure containing the fitted model.
- Work with the model (process data, copy or serialize the model, and so on).
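The two-stage workflow can be illustrated with a minimal Python sketch. It does not call ALGLIB: `LinearModel` here is a stand-in dataclass, and `build_linear_model`, `process`, and `solve` are hypothetical helpers written for this example. For brevity, the fit uses the normal equations rather than SVD.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class LinearModel:
    """Hypothetical stand-in: one weight per variable plus an intercept."""
    weights: list
    intercept: float


def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def build_linear_model(xy, n_vars):
    """Stage 1: least-squares fit; each row of xy is [x_1, ..., x_M, y]."""
    rows = [list(r[:n_vars]) + [1.0] for r in xy]  # constant column for the intercept
    ys = [r[n_vars] for r in xy]
    k = n_vars + 1
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    w = solve(gram, rhs)
    return LinearModel(weights=w[:n_vars], intercept=w[n_vars])


def process(model, x):
    """Stage 2: evaluate the model on one input vector."""
    return model.intercept + sum(w * v for w, v in zip(model.weights, x))


# Noise-free data generated from y = 2*x1 - x2 + 3.
xy = [[0, 0, 3], [1, 0, 5], [0, 1, 2], [1, 1, 4], [2, 1, 6]]

model = build_linear_model(xy, n_vars=2)       # stage 1: construction
prediction = process(model, [3.0, 2.0])        # stage 2: data processing
restored = LinearModel(**json.loads(json.dumps(asdict(model))))  # stage 2: serialization round trip
```

Because the data are noise-free, `prediction` should agree with 2·3 − 2 + 3 = 7 up to rounding, and the model restored from JSON behaves identically to the original.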

The linear regression algorithm in the ALGLIB package is based on singular value decomposition (SVD). It includes several improvements over the classical approach:

- First, before fitting, the ALGLIB package standardizes the variables. This improves the condition number and avoids some of the problems that arise when variables are poorly scaled.
- Second, the ALGLIB package computes a cross-validation estimate of the generalization error using a fast algorithm. This algorithm uses the Sherman-Morrison formula to update the SVD of the task matrix when one row is left out of the training set. The fast algorithm computes the generalization error in $O(N \cdot M)$ time (where $N$ is the size of the training set and $M$ is the number of variables). For comparison, solving a linear regression problem takes $O(N \cdot M^2)$ time, whereas straightforward leave-one-out cross-validation takes $O(N^2 \cdot M^2)$ time.
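The idea behind the fast estimate can be sketched in a few lines of Python. For ordinary least squares, the Sherman-Morrison formula implies that the residual obtained when row i is left out equals e_i / (1 − h_i), where e_i is the residual of the full fit and h_i = x_iᵀ (XᵀX)⁻¹ x_i is the leverage of that row. The sketch below checks this identity against a brute-force refit. It is an illustration under stated assumptions, not ALGLIB's implementation: it uses the normal equations instead of SVD, skips standardization, makes no attempt to reach the complexity quoted above, and all function names are hypothetical.

```python
def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit(rows, ys):
    """Return (X^T X, w) where w solves the normal equations."""
    k = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return gram, solve(gram, rhs)


def predict(w, row):
    return sum(wi * xi for wi, xi in zip(w, row))


def loo_fast(rows, ys):
    """Leave-one-out residuals from a single fit: e_i / (1 - h_i)."""
    gram, w = fit(rows, ys)
    out = []
    for row, y in zip(rows, ys):
        g = solve(gram, row)                         # (X^T X)^{-1} x_i
        h = sum(gi * xi for gi, xi in zip(g, row))   # leverage h_i
        out.append((y - predict(w, row)) / (1.0 - h))
    return out


def loo_brute(rows, ys):
    """Leave-one-out residuals by refitting the model N times."""
    out = []
    for i in range(len(rows)):
        _, w = fit(rows[:i] + rows[i + 1:], ys[:i] + ys[i + 1:])
        out.append(ys[i] - predict(w, rows[i]))
    return out


# Each row is [x1, x2, 1.0]; the trailing 1.0 is the intercept column.
rows = [[0.0, 1.0, 1.0], [1.0, 0.5, 1.0], [2.0, 2.0, 1.0],
        [3.0, 1.5, 1.0], [4.0, 3.0, 1.0], [5.0, 2.5, 1.0]]
ys = [1.1, 2.3, 2.9, 4.2, 4.8, 6.1]

fast = loo_fast(rows, ys)
brute = loo_brute(rows, ys)
cv_rmse = (sum(e * e for e in fast) / len(fast)) ** 0.5  # cross-validation RMS error
```

On this small made-up data set the two lists of residuals should agree to within rounding error, while `loo_fast` fits the model only once.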

Two features of the fast cross-validation estimate deserve mention. First, the fast formula applies only to non-degenerate problems. When the task is degenerate, the dimension of the problem has to be reduced first using principal component analysis. This is done automatically when necessary, but it roughly doubles the running time. Second, there is a subtler kind of degeneracy: the problem is non-degenerate as a system of linear equations, yet the Sherman-Morrison formula is inapplicable to some vectors of the training set (for some otherwise ordinary-looking vectors it leads to division by zero). In theory, a training set may contain up to $M+1$ such "defective" elements out of the $N$ available, although in most cases there are none. Defective vectors are excluded from the cross-validation error estimate, which distorts the result somewhat. This is expected behavior, not a bug, which is worth knowing if you test the fast cross-validation algorithm and happen to run into such a vector.
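A "defective" vector can be reproduced with a small Python sketch. In the leave-one-out shortcut e_i / (1 − h_i), a row whose leverage h_i equals 1 makes the denominator vanish. This happens, for example, when one variable is nonzero in a single row only: the full problem is still non-degenerate, but removing that row makes it degenerate. The code below is an illustration under assumptions (normal equations instead of SVD, a hypothetical tolerance `tol`, hypothetical function names) and simply skips such rows, mirroring the behavior described above.

```python
def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def loo_residuals(rows, ys, tol=1e-9):
    """Return (leave-one-out residuals, indices of skipped defective rows)."""
    k = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    w = solve(gram, rhs)
    kept, skipped = [], []
    for i, (row, y) in enumerate(zip(rows, ys)):
        g = solve(gram, row)                         # (X^T X)^{-1} x_i
        h = sum(gi * xi for gi, xi in zip(g, row))   # leverage h_i
        if 1.0 - h < tol:
            skipped.append(i)                        # formula would divide by ~0
            continue
        e = y - sum(wi * xi for wi, xi in zip(w, row))
        kept.append(e / (1.0 - h))
    return kept, skipped


# Columns: x, d, intercept. Variable d is nonzero only in row 3,
# so that row has leverage 1 and cannot be left out.
rows = [[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [2.0, 0.0, 1.0],
        [3.0, 1.0, 1.0], [4.0, 0.0, 1.0]]
ys = [0.9, 2.1, 2.9, 7.0, 5.1]

kept, skipped = loo_residuals(rows, ys)
```

Here only row 3 is skipped, so the error estimate is averaged over the remaining four rows. This is the source of the slight distortion mentioned above.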

*This article is intended for personal use only.*

ALGLIB® - numerical analysis library, 1999-2017.

ALGLIB is a registered trademark of the ALGLIB Project.

Policies for this site: privacy policy, trademark policy.
