Contents
Main
Site map
Links
Site and author
News
Contact

Levenberg-Marquardt algorithm for multivariate optimization

The Levenberg-Marquardt Method is recommended, if such a function as F = f 2(x,...,x) + ... + f 2(x,...,x) needs to be minimized. The algorithm combines advantages of the steepest descent method (that is, minimization along the direction of the gradient) with the Newton method (that is, using a quadratic model to speed up the process of finding the minimum of a function). This algorithm obtained its operating stability from the steepest descent method, and adopted its accelerated convergence in the minimum vicinity from the Newton method. Standard implementation of the Levenberg-Marquardt algorithm (LMA), its drawbacks, and the updated algorithm version in the ALGLIB package are discussed below. The reader is recommended to familiarise himself/herself with the algorithm description on Wikipedia or in Numerical Recipes, prior to studying the following part. It is assumed hereinafter that the reader is familiar with the general principles of operating the Levenberg-Marquardt algorithm.

Implementation of the Levenberg-Marquardt Method in ALGLIB

The traditional version of the Levenberg-Marquardt algorithm supposes using the Jacobian matrix to make a quadratic model, taking one iteration step per the model, and the construction of a subsequent new model. There are a number of disadvantages in this approach, such as the algorithm's general inefficiency, high demands for hardware resources, and a decrease in convergence towards the minimum when solving problems of a certain class. These shortcomings are fully detailed below:

  • The effect of several factors accounts for the inefficiency of the LMA. First, the quadratic model of a function in the original algorithm is updated whenever a new iteration step is taken, whereas the model's updating is a very time consuming procedure. The model can be reused, and several successive steps can be taken. There will be a decline in the efficiency of such steps. However, this disadvantage will be outweighed by a reduction of step cost due to the elimination of the need to update the model too often. Second, the quadratic model construction during initial optimization periods is just unpractical. In any case, even if it is built, all the same the LMA will be taking iteration steps pertinent to the steepest descent method.
  • High demands for hardware resources. The number of functions M is very often much greater than the dimension of task N. The Jacobian matrix storage requires O(M·N) of computer memory, whereas the overall capacity of the resultant quadratic model is usually a mere O(N·N). This difference may practically range from a few times to several dozens of times. If the Hessian of function F is known, then the quadratic model construction on its basis may substantially bring down the requirements for computer memory, without any significant change in the speed. With reference to some tasks (such as the problems involved in making neural network classifiers), the Hessian can be calculated much more quickly than the Jacobian.
  • Convergence towards the minimum. As shown in Numerical Recipes, the fact that a number of second-order terms is neglected accounts for simplicity of the Levenberg-Marquardt algorithm. In case the minimum is reached with a low value F (good fit), the neglected terms cancel each other out, and the algorithm convergence is similar to the non-truncated quadratic one. However, if the minimum is reached in the range of high values F (bad fit), the convergence will become much worse. Using the precise Hessian (if available) will allow the problem to be solved.

Implemented in the ALGLIB package is an updated algorithm version that is adjusted subject to the afore-mentioned observations. The new algorithm may use more comprehensive information on the task, including values of function f, the Jacobian matrix, the gradient of function F, and the Hessian of function F. The following algorithm versions may be used, depending on the information that is available:

  • FJ (function+Jacobian) - The original Levenberg-Marquardt algorithm.
  • FGJ (function+gradient+Jacobian) - A hybrid algorithm with quadratic model reuse. The L-BFGS algorithm is used at the first stage to provide for quick convergence towards the range of comparatively adequate function values, whereupon the Levenberg-Marquardt method is used. The quadratic model built is used after each iteration as a preconditioner for several iteration steps of the L-BFGS algorithm.
  • FGH (function+gradient+Hessian) - A hybrid algorithm using the precise Hessian and reusing a quadratic model. It is an optimal choice for the most part (if there is an opportunity to choose between using either the Jacobian or the Hessian).

Use of Algorithm and Reverse Communication

The optimization algorithm shall obtain values of a function/gradient/... during its operation. This problem is solved in most program packages by transferring the pointer to the function (C++, Delphi) or delegate (C#) which is used to calculate function value/gradient/Hessian.

The ALGLIB package, differently from other libraries, makes use of reverse communication to solve this problem. When a value (or derivatives) of a function needs/need to be calculated, the algorithm state is stored within a special structure, control is returned to the calling program, which makes all calculations and recalls the computing subroutine.

Thus, the optimization algorithm is operated in accordance in the following order:

  1. LMState data structure preparation by means of one of the algorithm initialization subrotuines (MinLMFJ, MinLMFGJ, MinLMFGH).
  2. MinLMIteration subroutine call.
  3. If False is returned from the subroutine, then the algorithm operation is completed, and the minimum is found (the minimum itself can be obtained by calling the MinLMResults subprogram).
  4. If True is returned from the subroutine, the latter will make a request for information on the function. The function/gradient/Jacobian/Hessian shall be calculated according to the field of the LMState structure, which is set as True (this issue is fully detailed below).
  5. The MinLMIteration subroutine needs to be called again after the requested information is loaded into the LMState structure.

The following fields of the LMState structure are used in order to exchange information with the user:

  • LMState.X[0..N-1] – An array storing information on coordinates of the point x
  • LMState.F – The value of function F(x) should be stored in this field (if it is requested)
  • LMState.FI[0..M-1] – The vector of functions f(x) should be stored in this field (if it is requested)
  • LMState.G[0..N-1] – The gradient grad F(x) should be stored in this field (if it is requested)
  • LMState.J[0..M-1,0..N-1] – The Jacobian should be stored in this field (if it is requested)
  • LMState.H[0..N-1,0..N-1] – The Hessian matrix H(F) should be stored in this field (if it is requested)

The MinLMIteration subprogram can set one, and only one, of the following fields to True, subject to what exactly needs to be calculated:

  • NeedF – the value of function F(x) needs to be calculated (Not to be confused with vector f)
  • NeedFG – the value of function F(x) and gradient grad F(x) need to be calculated
  • NeedFiJ – the vector of functions F(x) and Jacobian need to be calculated
  • NeedFGH – the value of function F(x), gradient grad F(x) and Hessian H(F)}} need to be calculated

The table given below will show which of the fields can be set by different LMA versions.

         NeedF   NeedFG  NeedFiJ NeedFGH
MinLMFJ     X               X           
MinLMFGJ    X       X       X           
MinLMFGH    X       X               X

Manual entries

C++ minlm.h   
C# minlm.cs   
MPFR minlm.h   
Delphi minlm.pas   
FreePascal minlm.pas   
VBA minlm.bas   

This article is intended for personal use only.

Download ALGLIB

C#

C# source.

alglib-2.4.0.csharp.zip

 

C++

C++ source.

alglib-2.4.0.cpp.zip

 

C++, multiple precision arithmetic

C++ source. MPFR/GMP is used.

GMP source is available from gmplib.org. MPFR source is available from www.mpfr.org.

alglib-2.4.0.mpfr.zip

 

FreePascal

FreePascal source.

alglib-2.4.0.freepascal.zip

 

Delphi

Delphi source.

alglib-2.4.0.delphi.zip

 

Visual Basic

VBA source.

alglib-2.4.0.vb6.zip

 


 
 
Sergey Bochkanov, Vladimir Bystritsky
Copyright © 1999-2010