The Levenberg-Marquardt Method is recommended, if such a function as F = f_{1 }^{ 2}(x_{1 },...,x_{n }) + ... + f_{m }^{ 2}(x_{1 },...,x_{n }) needs to be minimized. The algorithm combines advantages of the steepest descent method (that is, minimization along the direction of the gradient) with the Newton method (that is, using a quadratic model to speed up the process of finding the minimum of a function). This algorithm obtained its operating stability from the steepest descent method, and adopted its accelerated convergence in the minimum vicinity from the Newton method. Standard implementation of the Levenberg-Marquardt algorithm (LMA), its drawbacks, and the updated algorithm version in the ALGLIB package are discussed below. The reader is recommended to familiarise himself/herself with the algorithm description on Wikipedia or in Numerical Recipes, prior to studying the following part. It is assumed hereinafter that the reader is familiar with the general principles of operating the Levenberg-Marquardt algorithm.
The most frequent use of the Levenberg-Marquardt algorithm is for solution of nonlinear regression problems. Well, nothing prevents you from using minlm subpackage (which is described here) as nonlinear least squares solver. However, ALGLIB package offers specialized interface for such problems which is included into lsfit subpackage. Working with specialized interface is more convenient that using underlying optimization algorithm directly.
However, interface implemented by lsfit subpackage is somewhat slower than original algorithm described here because of additional level of abstraction it provides. It is especially important for small-scale problems (1-5 parameters to fit) with very cheap functions/gradients.
What to choose - performance or convenience? If you are new to ALGLIB and want to quickly get some code which works, we recommend you to use nonlinear least squares solver built on top of Levenberg-Marquardt optimizer. You will save some development time and you will be able to qiuckly build working prototype.
However, if you need high performance on a small scale problems, we recommend you to work directly with underlying optimizer (which is described here). Or alternatively you can start from lsfit subpackage (specialized interface) and, after getting familiar with ALGLIB, switch to minlm subpackage (underlying optimizer).
Levenberg-Marquardt algorithm is also used to find minimum of functions which can be represented as sum of squares:
Although we can use nonlinear CG or L-BFGS algorithms to optimize F(x), Levenberg-Marquardt method allows us to make use of our knowledge about internal structure of F.
Last, less known application of the Levenberg-Marquardt algorithm is optimization of general form functions, i.e. functions which can not be decomposed into the sum of squares. We can use Levenberg-Marquardt algorithm if we have Hessian of F(x) and we want to use it for optimization. In this case Levenberg-Marquardt algorithm behaves like Trust Region Newton method.
The following algorithm versions may be used, depending on the information that is available:
Version name (written in bold) is a suffix which is appended to the minlmcreate
- name of function used to create optimizer.
I.e., following functions can be called by ALGLIB users: minlmcreatev, minlmcreatevj, minlmcreatefgh.
What scheme to choose? For a quick start, we recommend to use V scheme first (minlmcreatev function), because it is very simple to implement. Only function vector f is needed, and that's all. No Jacobian, no other higher order information. You calculate function vector, ALGLIB solves all differentiation-related issues.
After you've got familiar with ALGLIB, you can add Jacobian calculation and switch to VJ scheme. Jacobian-free optimization is easy to setup, but its performance is too low for time-critical applications. Furthermore, numerical differentiation doesn't allow us to find minimum with accuracy significantly larger than the numerical differentiation step. So if you need high performance or high accuracy, you should implement analytic Jacobian.
Note #1
If you optimize general form function with LM, then you should start straight from the FGH scheme and implement all at once - function, gradient, Hessian.
ALGLIB package offers four kinds of stopping conditions:
You can set one or several conditions by calling minlmsetcond function. After algorithm termination you can examine completion code to determine what condition was satisfied at the last point.
We recommend you to use first criterion - small value of gradient norm. This criterion guarantees that algorithm will stop only near the minimum, independently if how fast/slow we converge to it. Second and third criteria are less reliable because sometimes algorithm makes small steps even when far away from minimum. For example, it can be the case when we don't have analytic Jacobian or with different quadratic model reuse strategies.
Note #2
You should not expect that algorithm will be terminated by and only by stopping criterion you've specified.
For example, algorithm may take step which will lead it exactly to the function minimum - and it will be terminated by first criterion (gradient norm is zero),
even when you told it to "make 100 iterations no matter what".
Leveberg-Marquardt algorithm supports optimization with boundary constraints, i.e. constraints of the form l_{i }≤x_{i }≤u_{i }. Boundary constraints can be set with minlmsetbc function. These constraints are handled very efficiently - computational overhead for having N constraints is just O(N) additional operations per function evaluation. Finally, these constraints are always exactly satisfied. We won't calculate function at points outside of the interval given by [l_{i },u_{i }]. Optimization result will be inside [l_{i },u_{i }] or exactly at its boundary.
After creation and tuning of the optimizer object you can begin optimization using minlmoptimize function. It accepts as parameters optimizer object and callbacks which calculate function/gradient. Optimization result can be obtained using minlmresults function.
ALGLIB Reference Manual includes several examples on Levenberg-Marquardt algorithm:
They cover the most typical use cases for LM. You can copy example source to your IDE, compile, run, modify it. We recommend to read these examples before writing your own code.
When you solve a sequence of optimization problems with same N/M, you can allocate new optimizer object every time you want to solve new problem. However, creation of the new optimizer leads to extensive memory allocation, which slows down optimization process. More efficient solution is to use minlmrestartfrom function, which restarts algorithm from the new point (and with new function) without additional dynamic allocations.
ALGLIB implementation of the Levenberg-Marquardt algorithm includes acceleration which can be used to improve optimizer performance.
Original Levenberg-Marquardt algorithm builds quadratic model of a function, makes one step and then builds 100% new quadratic model. This algorithm is used by default in the VJ and FGH versions of ALGLIB LM. However, building quadratic model from scratch can be time consuming process, which leads us to the next idea.
The idea is that we update Jacobian using secant update (also known as Broyden's rank one update) instead of recalculating it using analytic or numerical differentiation. Secant update is constructed using function vector at the new point. Such updated Jacobian is less accurate, so our next step will be somewhat worse than step using exact Jacobian. But is still will decrease function value. As result, we make more steps (and solve more linear systems on each step), but Jacobian is reused several times before recalculation. It is obvious that we can benefit from such strategy only when the cost of the Jacobian evaluation is higher that the linear solver cost.
This strategy is turned on by calling minlmsetacctype(state,1)
and can be used with any optimization scheme which relies on function vector (V, VJ, VGJ).
It is turned on by default when we use V-scheme.
In this case Jacobian is calculated using numerical differentiation, which is very slow process, and first strategy is always beneficial.
In other cases acceleration should be explicitly turned on by minlmsetacctype call.
This article is intended for personal use only.