Inverse distance weighting is a scattered data interpolation algorithm. Several variations of the algorithm exist, differing in both conceptual and implementation aspects. The ALGLIB package contains a local version of the inverse distance weighting algorithm, which generates a C1-continuous interpolant and has *O(N·logN)* construction complexity and *O(logN)* interpolation complexity.

IMPORTANT: this algorithm is considered obsolete and has been superseded by improved RBF models. Although it is still supported for backward compatibility, we recommend that you use the ALGLIB implementation of RBFs for scattered interpolation/fitting.

1 Inverse distance weighting: Shepard's method

2 Modified Shepard's method: scattered data interpolation

3 Modified Shepard's method: scattered data fitting

4 Tuning: *Nw* and *Nq*

5 Tuning: nodal functions

6 Downloads section

Consider the simplest form of inverse distance weighting interpolation before proceeding to the more complex algorithm implemented in ALGLIB. Let *x _{i }* be points in
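The classic (global) Shepard interpolant is just an inverse-distance weighted average of all nodal values. A minimal sketch follows; the inverse-square weight *1/d²* is one common choice of weight function, assumed here for illustration:

```python
# A minimal sketch of the classic (global) Shepard interpolant. The
# inverse-square weight 1/d^2 is a common choice of weight function;
# the exponent is an assumption here, not fixed by the method.

def shepard(nodes, values, x, eps=1e-12):
    """Interpolate at point x from scattered (node, value) pairs."""
    num, den = 0.0, 0.0
    for xi, yi in zip(nodes, values):
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        if d2 < eps:              # x coincides with a node: return it exactly
            return yi
        w = 1.0 / d2              # inverse-square distance weight
        num += w * yi
        den += w
    return num / den

# Usage: 2D nodes; the interpolant reproduces nodal values exactly.
nodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
values = [1.0, 2.0, 3.0]
print(shepard(nodes, values, (0.0, 0.0)))   # exactly 1.0 at a node
print(shepard(nodes, values, (0.5, 0.5)))   # → 2.0 (all nodes equidistant)
```

Note that every node contributes to every evaluation, which is the source of the *O(N)* interpolation cost and the outlier sensitivity discussed below.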

Such an algorithm has both benefits and drawbacks. Its benefits are:

- Extreme simplicity of implementation
- Lack of tunable parameters
- Ability to work in N-dimensional space
- Ability to interpolate scattered data and to work on ANY grid. Multicollinearity is not a problem, algorithm may work even when all points lie in a low-dimensional subspace. When properly implemented, it may even work with coinciding nodes!

However, it has a number of important drawbacks:

- Low performance on large datasets: interpolation complexity is *O(N)*.
- The algorithm gives too much weight to distant nodes. Their total weight may be larger than the weight of nearby nodes. This is more pronounced in high-dimensional spaces.
- Shepard's method is a global method, which is a problem by itself. The interpolant is too sensitive to distant outliers.
- *f(x)* is flat at *x _{i }*, i.e. it has zero derivative at the nodes.

For these reasons the original Shepard's method is rarely used in practice.

The ALGLIB package contains a more advanced version of the inverse distance weighting algorithm - modified Shepard's method, first proposed by Robert J. Renka ('Multivariate Interpolation of Large Sets of Scattered Data', 1988). The modified interpolant has the following form:

Modified Shepard's method differs from the original algorithm in the following aspects:

- Only *N _{w }* points are used for interpolation - a set of *N _{w }* nearest neighbors of *x*, which is denoted by *K*.
- Weight functions *W _{i }(x)* have become more complex. Now they are equal to zero at the boundary of the sphere surrounding *K*.
- Instead of constants *y _{i }* we've got nodal functions *Q _{i }(x)*. These functions may be quadratic, linear or constant (at your choice). *Q _{i }(x)* is obtained as a result of weighted least squares fitting on a set of *N _{q }* nearest neighbors of *x _{i }*, with the constraint *Q _{i }(x _{i })=y _{i }*. Formula (2) is used to calculate weights.
- kd-trees are used for efficient NN search with *O(logN)* complexity.
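The scheme above can be sketched in a few lines. This is a simplified illustration, not ALGLIB's implementation: neighbor search is brute force instead of a kd-tree, and constant nodal functions (the original Shepard choice) stand in for the linear/quadratic ones, so only the Renka-style weight construction is shown:

```python
# A sketch of the local (modified Shepard) scheme with constant nodal
# functions. Brute-force neighbor search stands in for the kd-tree; the
# weight ((R - d)/(R*d))^2 vanishes on the boundary of the sphere
# enclosing the Nw nearest neighbors, following Renka's construction.
import math

def modified_shepard(nodes, values, x, nw=5, eps=1e-12):
    # find the nw nearest neighbors of x (brute force; ALGLIB uses kd-trees)
    dists = sorted(
        (math.dist(x, xi), yi) for xi, yi in zip(nodes, values)
    )[:nw]
    r = dists[-1][0]              # radius of the enclosing sphere
    num, den = 0.0, 0.0
    for d, yi in dists:
        if d < eps:
            return yi             # coinciding node: interpolate exactly
        w = ((r - d) / (r * d)) ** 2   # zero at the sphere boundary
        num += w * yi
        den += w
    return num / den

nodes = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 2), (3, 3)]
values = [0.0, 1.0, 1.0, 2.0, 4.0, 6.0]
print(modified_shepard(nodes, values, (0.2, 0.3), nw=4))
```

Because the weight is zero at the boundary sphere, each neighbor's influence fades out smoothly as it leaves the neighborhood, which is what makes the interpolant continuous despite being local.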

The main benefits of the modified algorithm are:

- Ability to work in N-dimensional space
- Ability to interpolate scattered data and to work on ANY grid. Multicollinearity is not a problem, the algorithm may work even when all points lie in a low-dimensional subspace. When properly implemented, it may even work with coinciding nodes!
- *f(x)* is C1-continuous (except for several special cases, see below)
- Good performance even for large datasets (achieved due to an efficient NN search algorithm). Interpolant construction has *O(N·logN)* complexity, interpolation itself has *O(logN)* complexity.
- Absence of "flat spots" near interpolation nodes (when quadratic or linear nodal functions are used)
- Locality of the interpolation algorithm. *f(x)* depends only on the nearest neighbors of *x*, which significantly improves quality of interpolation.
- Easy generalization to the regression task, i.e. ability to work with noisy data (see below).
- Only two tunable parameters: *N _{w }* and *N _{q }* (see below).

However, several drawbacks may be noted too:

- Modified Shepard's method is faster than the original inverse distance weighting algorithm only for large datasets (hundreds of points) and only in low-dimensional spaces (2-5 dimensions).
- Although the algorithm may work on any grid, its performance degrades when points lie in a low-dimensional subspace. In such cases a slow SVD-based least squares solver is used instead of the faster QR-based one.
- In some *rare* cases *f(x)* may have discontinuities. A discontinuity will appear when more than *N _{w }* nearest neighbors of *x* are equidistant from *x*. In such cases the k-NN search algorithm will choose neighbors at random, depending on roundoff errors or the order in which they were extracted from the search structure. However, *f(x)* will remain bounded, and increasing *N _{w }* will decrease the probability of such situations. For instance, the lower bound for *N _{w }* is set to prevent such issues on a Cartesian grid.

However, these drawbacks do not outweigh the benefits of the modified Shepard's algorithm, so we can recommend it as a standard tool for multidimensional interpolation (either scattered or regular).

Modified Shepard's method can be slightly modified to fit noisy data. The only modification is the calculation of *Q _{i }(x)*. First, the constraint *Q _{i }(x _{i })=y _{i }* is dropped, so nodal functions approximate their neighborhoods in the least squares sense instead of passing exactly through the nodes.

**Note #1**

Formulas (1), (2) and (3) are left unchanged. All modifications are related to the calculation of *A*, *b* and *c* from (3).

This algorithm should be applied only to noisy problems. In the absence of noise it has no benefits over the interpolation algorithm.

The *N _{w }* parameter controls algorithm locality. Too large a value will make the interpolant too global (slow and unable to reproduce local changes in the function). On the other hand, too small *N _{w }* will lead to a sharp and inaccurate interpolant (for example, *N _{w }=1* is just nearest neighbor interpolation). A good *N _{w }* is usually slightly larger than *max(1.5·N _{q }, 2 ^{D }+1)*.

*N _{q }* controls another aspect of locality: the number of nodes used to build the nodal functions *Q _{i }(x)*. A good nodal function must pass through *(x _{i }, y _{i })* and should approximately reproduce the function in the neighborhood of *x _{i }*. Too small *N _{q }* will make us unable to build a good approximation, but too large *N _{q }* will lead to a *Q _{i }(x)* which reproduces the *global* behavior of the function but is not very good where it is really needed: in the sphere of influence of *x _{i }*. A good *N _{q }* is usually 1.5 times larger than the number of free parameters of the nodal function: *1+D* for a linear nodal function, *(D+2)·(D+1)/2* for a quadratic one.
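These rules of thumb are easy to turn into code. The sketch below computes suggested values for dimension *D*; the 1.5x factor and the *max(…)* bound are the heuristics quoted above, not hard requirements:

```python
# Rule-of-thumb defaults for Nq and Nw from the heuristics above, for
# dimension d. These are starting points, not hard requirements.
import math

def suggested_parameters(d, nodal="quadratic"):
    # number of free parameters of the nodal function
    params = {"linear": 1 + d, "quadratic": (d + 2) * (d + 1) // 2}[nodal]
    nq = math.ceil(1.5 * params)               # ~1.5x the coefficient count
    nw = math.ceil(max(1.5 * nq, 2 ** d + 1))  # "slightly larger" lower bound
    return nq, nw

print(suggested_parameters(2, "linear"))      # (5, 8) for D=2
print(suggested_parameters(2, "quadratic"))   # (9, 14)
```

Note how quickly the *2 ^{D }+1* term grows: in high dimensions it, rather than *1.5·N _{q }*, dominates the choice of *N _{w }*.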

The situation is somewhat different for fitting tasks. The meaning of *N _{w }* is still the same, but

The user can choose between four types of nodal functions:

- __constant nodal function__. This function is used in the original Shepard's method and is left only for evaluation purposes.
- __linear nodal function__, whose coefficients are calculated using least squares fitting. This function provides better accuracy than a constant function.
- __quadratic nodal function__, whose coefficients are calculated using least squares fitting. This function provides the best quality as long as you have enough data to robustly calculate its coefficients. If your data are sparse, a linear nodal function may perform better.
- __"fast" linear nodal function__, whose coefficients are calculated with a fast algorithm (gradient interpolation) instead of a least squares-based algorithm. This is a special nodal function type for problems where speed is more important than quality. Coefficients of such a nodal function don't minimize a least squares-like error function, but usually they provide moderate precision. However, this kind of nodal function is less robust and noise-tolerant than the previous ones, so you should use it only when you can test its quality/stability.
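To make the least squares construction concrete, here is a sketch of building one linear nodal function under the interpolation constraint *Q _{i }(x _{i })=y _{i }*: write *Q _{i }(x) = y _{i } + g·(x−x _{i })* and fit the gradient *g* over the neighbors. For brevity the fit is unweighted and 2D; ALGLIB additionally weights each neighbor by formula (2):

```python
# A sketch of fitting one linear nodal function Q_i(x) in 2D under the
# constraint Q_i(x_i) = y_i: Q_i(x) = y_i + g.(x - x_i), with the gradient
# g fitted by (unweighted, for brevity) least squares over the neighbors.

def fit_linear_nodal(center, y_center, neighbors, y_neighbors):
    """Return the gradient (gx, gy) of Q_i via 2x2 normal equations."""
    sxx = sxy = syy = bx = by = 0.0
    for (px, py), yj in zip(neighbors, y_neighbors):
        dx, dy = px - center[0], py - center[1]
        r = yj - y_center                 # residual after the constraint
        sxx += dx * dx; sxy += dx * dy; syy += dy * dy
        bx += dx * r;  by += dy * r
    det = sxx * syy - sxy * sxy           # determinant of the 2x2 system
    return ((syy * bx - sxy * by) / det, (sxx * by - sxy * bx) / det)

# Usage: data sampled from f(x,y) = 1 + 2x + 3y; the fit recovers g = (2, 3).
center, yc = (0.0, 0.0), 1.0
nbrs = [(1, 0), (0, 1), (1, 1), (-1, 0.5)]
ys = [1 + 2 * x + 3 * y for x, y in nbrs]
gx, gy = fit_linear_nodal(center, yc, nbrs, ys)
print(round(gx, 6), round(gy, 6))   # 2.0 3.0
```

A quadratic nodal function works the same way, just with *(D+2)·(D+1)/2 − 1* free coefficients instead of *D*, which is why it needs a larger *N _{q }* and degenerates to the SVD-based solver when the neighbors are nearly collinear.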

As for fitting problems, the user can choose between only two types of nodal functions: linear and quadratic. A quadratic nodal function is still the best choice - as long as you have enough data to fight the noise. If the noise level is too high, or the data are too sparse, a linear nodal function may be a better option.

*This article is licensed for personal use only.*

ALGLIB Project offers you two editions of ALGLIB:

**ALGLIB Free Edition**:

- delivered for free
- offers full set of numerical functionality
- extensive algorithmic optimizations
- no low level optimizations
- non-commercial license

**ALGLIB Commercial Edition**:

- flexible pricing
- offers full set of numerical functionality
- extensive algorithmic optimizations
- high performance (SMP, SIMD)
- commercial license with support plan

Links to download sections for Free and Commercial editions can be found below:

C++ library.

Delivered with sources.

Monolithic design.

Extreme portability.


C# library with native kernels.

Delivered with sources.

VB.NET and IronPython wrappers.

Extreme portability.


Delphi wrapper around C core.

Delivered as precompiled binary.

Compatible with FreePascal.


CPython wrapper around C core.

Delivered as precompiled binary.
