# Interpolation and Extrapolation

This course belongs to the track Numerical Algorithms in the Department of Scientific Computing in the School of Computer Science.

In this course, students will learn how to approximate discrete data. Different representations and techniques for the approximation are discussed.

## Discrete Data

Discrete data arises from measurements of ${\displaystyle f(x)}$ or when ${\displaystyle f(x)}$ is difficult or expensive to evaluate.

In general, discrete data is given by a set of nodes ${\displaystyle x_{i}}$, ${\displaystyle i=1\dots n}$ and the function ${\displaystyle f(x)}$ evaluated at these points: ${\displaystyle f_{i}=f(x_{i})}$, ${\displaystyle i=1\dots n}$.

In the following, we will refer to the individual nodes and function values as ${\displaystyle x_{i}}$ and ${\displaystyle f_{i}}$ respectively. The vectors of the ${\displaystyle n}$ nodes or function values are written as ${\displaystyle {\underline {\mathbf {x} }}}$ and ${\displaystyle {\underline {\mathbf {f} }}}$ respectively.

We will also always assume that the ${\displaystyle x_{i}}$ are distinct.

## Polynomial Interpolation

Although there are many ways of representing a function ${\displaystyle f(x)}$, polynomials are the tool of choice for interpolation. As the Taylor series expansion of a function ${\displaystyle f(x)}$ around ${\displaystyle 0}$ shows, every sufficiently smooth (analytic) function has a polynomial series representation

${\displaystyle f(x)=f(0)+f'(0)x+{\frac {1}{2}}f''(0)x^{2}+{\frac {1}{6}}f'''(0)x^{3}+\dots =\sum _{n=0}^{\infty }\underbrace {{\frac {1}{n!}}f^{(n)}(0)} _{=a_{n}}x^{n}=\sum _{n=0}^{\infty }a_{n}x^{n}}$

in which the ${\displaystyle a_{n}={\frac {1}{n!}}f^{(n)}(0)}$ are the exact weights.

If a function ${\displaystyle f(x)}$ is assumed to be continuous and differentiable, and its derivatives ${\displaystyle f^{(n)}(0)}$ are zero or of negligible magnitude for increasing ${\displaystyle n}$, it is therefore reasonable to approximate the function with a polynomial.
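As an illustration, the following Python sketch (the helper name `taylor_sin` is ours, not a library function) truncates the Taylor series of ${\displaystyle \sin(x)}$ around ${\displaystyle 0}$ and compares it with the exact value:

```python
import math

def taylor_sin(x, terms=6):
    """Truncated Taylor series of sin around 0 (only odd powers survive)."""
    return sum((-1)**k * x**(2*k + 1) / math.factorial(2*k + 1)
               for k in range(terms))

# Near 0, a few terms already match sin closely, because the higher
# derivatives enter with the rapidly shrinking weights 1/n!.
print(abs(taylor_sin(0.5) - math.sin(0.5)))  # tiny truncation error
```

With six terms (powers up to ${\displaystyle x^{11}}$) the truncation error at ${\displaystyle x=0.5}$ is already far below machine precision of typical measurements.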

Given ${\displaystyle n}$ distinct nodes ${\displaystyle x_{i}}$, ${\displaystyle i=1\dots n}$ at which the function values ${\displaystyle f_{i}=f(x_{i})}$ are known, we can construct an interpolating polynomial ${\displaystyle g(x)}$ such that ${\displaystyle g(x_{i})=f_{i}}$ for all ${\displaystyle i=1\dots n}$. Such an interpolating polynomial of degree ${\displaystyle \leq n-1}$ is unique. However, infinitely many interpolating polynomials of degree ${\displaystyle \geq n}$ can be found.


## The Vandermonde Matrix

By definition as given above, our polynomial

${\displaystyle g(x)=\sum _{i=0}^{n-1}a_{i}x^{i}}$

must interpolate the function ${\displaystyle f(x)}$ at the ${\displaystyle n}$ nodes ${\displaystyle x_{i}}$. This leads to the following system of ${\displaystyle n}$ linear equations in the coefficients ${\displaystyle a_{i}}$:

${\displaystyle {\begin{aligned}f(x_{1})&=a_{0}+a_{1}x_{1}+a_{2}x_{1}^{2}+\dots +a_{n-1}x_{1}^{n-1}\\f(x_{2})&=a_{0}+a_{1}x_{2}+a_{2}x_{2}^{2}+\dots +a_{n-1}x_{2}^{n-1}\\&\;\;\vdots \\f(x_{n})&=a_{0}+a_{1}x_{n}+a_{2}x_{n}^{2}+\dots +a_{n-1}x_{n}^{n-1}\end{aligned}}}$

We can rewrite this system of linear equations in matrix notation as

${\displaystyle \mathbf {M} \mathbf {a} =\mathbf {f} }$

where ${\displaystyle \mathbf {a} }$ is the vector of the coefficients ${\displaystyle a_{i}}$, ${\displaystyle i=0\dots n-1}$, and ${\displaystyle \mathbf {M} }$ is the so-called moment matrix or Vandermonde matrix with

${\displaystyle \mathbf {M} ={\begin{bmatrix}1&x_{1}&x_{1}^{2}&\dots &x_{1}^{n-1}\\1&x_{2}&x_{2}^{2}&\dots &x_{2}^{n-1}\\\vdots &\vdots &\vdots &\ddots &\vdots \\1&x_{n}&x_{n}^{2}&\dots &x_{n}^{n-1}\end{bmatrix}}}$

or

${\displaystyle \mathbf {M} _{i,j}=x_{i}^{j-1}}$

This linear system of equations can be solved for ${\displaystyle \mathbf {a} }$ using Gaussian elimination or other methods.

The following shows how this can be done in Matlab or GNU Octave.

```matlab
% declare our function f
f = @(x) sin(2*pi*x);

% number of interpolation nodes
n = 5;

% declare the nodes x_i as random in [0,1]
x = rand(n,1);

% create the moment matrix
M = zeros(n);
for i = 1:n
    M(:,i) = x.^(i-1);
end

% solve for the coefficients a
a = M \ f(x);

% compute g(x) at the nodes
g = ones(n,1) * a(1);
for i = 2:n
    g = g + x.^(i-1) * a(i);
end

% how good was the approximation?
g - f(x)
```

However practical it may seem, the Vandermonde matrix is not well suited for more than just a few nodes. Due to its special structure, the condition number of the Vandermonde matrix grows rapidly with ${\displaystyle n}$.

If in the above example we set ${\displaystyle n=10}$ and compute ${\displaystyle \mathbf {a} }$ with

```matlab
% solve for the coefficients a
a = inv(M) * f(x)
```

the approximation is accurate to only about 10 digits. For ${\displaystyle n=20}$ it is accurate to only one digit.
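This loss of accuracy can be made visible directly. The following Python/NumPy sketch (a translation of the idea above, using equispaced instead of random nodes so the numbers are reproducible) prints the condition number of the moment matrix for growing ${\displaystyle n}$:

```python
import numpy as np

# Condition number of the Vandermonde (moment) matrix M[i, j] = x_i^j
# for equispaced nodes in [0, 1].
conds = {}
for n in (5, 10, 20):
    x = np.linspace(0, 1, n)
    conds[n] = np.linalg.cond(np.vander(x, increasing=True))
    print(n, conds[n])  # grows rapidly with n
```

Roughly speaking, a condition number of ${\displaystyle 10^{k}}$ costs about ${\displaystyle k}$ digits of accuracy in the computed coefficients, which matches the behaviour described above.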

We must therefore consider other means of computing and/or representing ${\displaystyle g(x)}$.

## Lagrange Interpolation

Given a set of ${\displaystyle n}$ nodes ${\displaystyle x_{i}}$, ${\displaystyle i=1\dots n}$, we define the ${\displaystyle k}$th Lagrange polynomial as the polynomial of degree ${\displaystyle n-1}$ that satisfies

${\displaystyle \ell _{k}(x_{i})={\begin{cases}0\quad &i\neq k\\1&i=k\end{cases}}}$


The first condition, namely that ${\displaystyle \ell _{k}(x_{i})=0}$ for ${\displaystyle i\neq k}$, can be satisfied by construction. Since every polynomial of degree ${\displaystyle n-1}$ can be represented as the product

${\displaystyle g(x)=c\prod _{i=1}^{n-1}(x-\xi _{i})}$

where the ${\displaystyle \xi _{i}}$ are the roots of ${\displaystyle g(x)}$ and ${\displaystyle c}$ is some constant, we can construct ${\displaystyle \ell _{k}^{\star }(x)}$ as

${\displaystyle \ell _{k}^{\star }(x)=\prod _{i=1,i\neq k}^{n}(x-x_{i})}$.

The polynomial ${\displaystyle \ell _{k}^{\star }(x)}$ is of degree ${\displaystyle n-1}$ and satisfies the first condition, namely that ${\displaystyle \ell _{k}^{\star }(x_{i})=0}$ for ${\displaystyle i\neq k}$.

Since, by construction, the ${\displaystyle n-1}$ roots of ${\displaystyle \ell _{k}^{\star }(x)}$ are at the nodes ${\displaystyle x_{i}}$, ${\displaystyle i\neq k}$, we know that the value of ${\displaystyle \ell _{k}^{\star }(x_{k})}$ cannot be zero. This value, however, should be ${\displaystyle 1}$ by definition. We can enforce this by scaling ${\displaystyle \ell _{k}^{\star }(x)}$ by ${\displaystyle \ell _{k}^{\star }(x_{k})^{-1}}$:

${\displaystyle \ell _{k}(x)={\frac {\ell _{k}^{\star }(x)}{\ell _{k}^{\star }(x_{k})}}={\frac {\prod _{i=1,i\neq k}^{n}(x-x_{i})}{\prod _{i=1,i\neq k}^{n}(x_{k}-x_{i})}}}$

If we insert any ${\displaystyle x_{i}}$, ${\displaystyle i\neq k}$, into the above equation, the numerator becomes ${\displaystyle 0}$ and the first condition is satisfied. If we insert ${\displaystyle x_{k}}$, the numerator and the denominator are the same and we get ${\displaystyle 1}$, which means that the second condition is also satisfied.

We can now use the Lagrange polynomials ${\displaystyle \ell _{k}(x)}$ to construct a polynomial ${\displaystyle g(x)}$ of degree ${\displaystyle n-1}$ which interpolates the ${\displaystyle n}$ function values ${\displaystyle f_{i}}$ at the points ${\displaystyle x_{i}}$.

Consider that, given the definition of ${\displaystyle \ell _{k}(x)}$,

${\displaystyle f_{k}\ell _{k}(x_{i})={\begin{cases}f_{k}\quad &i=k\\0&i\neq k\end{cases}}}$.

Therefore, if we define ${\displaystyle g(x)}$ as

${\displaystyle g(x)=\sum _{k=1}^{n}f_{k}\ell _{k}(x)}$

then ${\displaystyle g(x_{i})=f_{i}}$ for all ${\displaystyle i=1\dots n}$. Since the ${\displaystyle \ell _{k}(x)}$ are all of degree ${\displaystyle n-1}$, the weighted sum of these polynomials is also of degree ${\displaystyle n-1}$.

Therefore, ${\displaystyle g(x)}$ is the polynomial of degree ${\displaystyle n-1}$ which interpolates the ${\displaystyle n}$ values ${\displaystyle f_{i}}$ at the nodes ${\displaystyle x_{i}}$.
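A direct, unoptimized Python sketch of this construction (the function name `lagrange_interp` is ours, not a library routine):

```python
def lagrange_interp(x_nodes, f_vals, x):
    """Evaluate the Lagrange form of the interpolating polynomial at x."""
    n = len(x_nodes)
    g = 0.0
    for k in range(n):
        # l_k(x) = product over i != k of (x - x_i) / (x_k - x_i)
        lk = 1.0
        for i in range(n):
            if i != k:
                lk *= (x - x_nodes[i]) / (x_nodes[k] - x_nodes[i])
        g += f_vals[k] * lk
    return g

nodes = [0.0, 0.5, 1.0]
vals = [1.0, 2.0, 0.0]
# The interpolant reproduces the data at each node.
print(lagrange_interp(nodes, vals, 0.5))  # → 2.0
```

Note that changing the data `vals` requires no refactorization: the basis polynomials ${\displaystyle \ell _{k}(x)}$ depend only on the nodes.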

The representation with Lagrange polynomials has some advantages over the monomial representation ${\displaystyle g(x)=\sum _{i=0}^{n-1}a_{i}x^{i}}$ described in the previous section.

First of all, once the Lagrange polynomials have been computed over a set of nodes ${\displaystyle x_{i}}$, computing the interpolating polynomial for a new set of function values ${\displaystyle f_{i}}$ is trivial, since these only need to be multiplied with the Lagrange polynomials, which are independent of the ${\displaystyle f_{i}}$.

Another advantage is that the representation can easily be extended if additional points are added. The Lagrange polynomial ${\displaystyle \ell _{k}^{+}(x)}$, obtained by adding a point ${\displaystyle x_{n+1}}$ to the interpolation, is

${\displaystyle \ell _{k}^{+}(x)=\ell _{k}(x){\frac {x-x_{n+1}}{x_{k}-x_{n+1}}}}$.

Only the new polynomial ${\displaystyle \ell _{n+1}^{+}(x)}$ would need to be re-computed from scratch.
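The update formula can be checked numerically. In this Python sketch (`lagrange_basis` is our helper, not a library function), ${\displaystyle \ell _{0}^{+}}$ computed via the update agrees with ${\displaystyle \ell _{0}}$ computed from scratch over the extended node set:

```python
import math

def lagrange_basis(nodes, k, x):
    """Evaluate l_k(x) over the given nodes."""
    return math.prod((x - nodes[i]) / (nodes[k] - nodes[i])
                     for i in range(len(nodes)) if i != k)

nodes = [0.0, 0.5, 1.0]
new_node = 2.0
x = 0.3

# l_0^+(x) = l_0(x) * (x - x_new) / (x_0 - x_new)
l0_updated = lagrange_basis(nodes, 0, x) * (x - new_node) / (nodes[0] - new_node)

# l_0 computed directly over all four nodes
l0_direct = lagrange_basis(nodes + [new_node], 0, x)

print(abs(l0_updated - l0_direct))  # agrees up to rounding
```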

In the monomial representation, adding a point would involve adding an extra row and column to the Vandermonde matrix and re-computing the solution.

## Newton Interpolation

Given a set of ${\displaystyle n}$ function values ${\displaystyle f_{i}}$ evaluated at the nodes ${\displaystyle x_{i}}$, the Newton polynomial is defined as the sum

${\displaystyle g(x)=\sum _{i=1}^{n}a_{i}n_{i}(x)}$

where the ${\displaystyle n_{i}(x)}$ are the ${\displaystyle n}$ Newton basis polynomials defined by

${\displaystyle n_{1}(x)=1,\qquad n_{i}(x)=\prod _{k=1}^{i-1}(x-x_{k})\quad {\mbox{for }}i>1}$.

An interpolating polynomial of degree ${\displaystyle n}$ can easily be obtained from one of degree ${\displaystyle n-1}$ by adding one more node ${\displaystyle x_{n+1}}$ and adding the term ${\displaystyle a_{n+1}n_{n+1}(x)}$, which has degree ${\displaystyle n}$, to ${\displaystyle g(x)}$; the existing coefficients are unchanged.

The Newton form of the interpolating polynomial is particularly suited to computations by hand, and underlies Neville's algorithm for polynomial interpolation. It is also directly related to the exact formula for the error term ${\displaystyle f(x)-g(x)}$. Any good text in numerical analysis will prove this formula and fully specify Neville's algorithm, usually doing both via divided differences.
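The coefficients ${\displaystyle a_{i}}$ are exactly the divided differences just mentioned. A minimal Python sketch (the helper names are ours) computes them in place and then evaluates the Newton form with Horner-like nesting:

```python
def divided_differences(x, f):
    """Return the Newton coefficients a_i as divided differences."""
    n = len(x)
    coef = list(f)
    for j in range(1, n):
        # Overwrite coef from the back so lower-order differences survive.
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (x[i] - x[i - j])
    return coef

def newton_eval(x_nodes, coef, x):
    """Evaluate g(x) = sum a_i * n_i(x) by nested multiplication."""
    result = coef[-1]
    for i in range(len(coef) - 2, -1, -1):
        result = result * (x - x_nodes[i]) + coef[i]
    return result

x_nodes = [0.0, 1.0, 2.0]
f_vals = [1.0, 3.0, 7.0]   # values of f(x) = x^2 + x + 1
coef = divided_differences(x_nodes, f_vals)
print(newton_eval(x_nodes, coef, 1.5))  # → 4.75 (= 1.5^2 + 1.5 + 1)
```

Adding a fourth node would only append one coefficient; the first three are reused unchanged, as described above.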


## Interpolation error

When interpolating a given function ${\displaystyle f}$ by a polynomial of degree ${\displaystyle n}$ at the nodes ${\displaystyle x_{0},\ldots ,x_{n}}$ we get the error

${\displaystyle f(x)-p_{n}(x)=f[x_{0},\ldots ,x_{n},x]\prod _{i=0}^{n}(x-x_{i})}$

where

${\displaystyle f[x_{0},\ldots ,x_{n},x]}$

is the notation for divided differences.

Let ${\displaystyle f}$ be ${\displaystyle n+1}$ times continuously differentiable on a closed interval ${\displaystyle I}$ and let ${\displaystyle p_{n}(x)}$ be a polynomial of degree at most ${\displaystyle n}$ that interpolates ${\displaystyle f}$ at ${\displaystyle n+1}$ distinct points ${\displaystyle x_{i}}$, ${\displaystyle i=0,1,\ldots ,n}$, in that interval. Then for each ${\displaystyle x}$ in the interval there exists ${\displaystyle \xi }$ in that interval such that

${\displaystyle f(x)-p_{n}(x)={\frac {f^{(n+1)}(\xi )}{(n+1)!}}\prod _{i=0}^{n}(x-x_{i})}$

The above error bound suggests choosing the interpolation points ${\displaystyle x_{i}}$ such that the product ${\displaystyle \left|\prod _{i=0}^{n}(x-x_{i})\right|}$ is as small as possible; this consideration leads to the Chebyshev nodes.
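The bound can also be checked numerically. For ${\displaystyle f(x)=\sin(2\pi x)}$ we have ${\displaystyle |f^{(n+1)}(x)|\leq (2\pi )^{n+1}}$, so the right-hand side is computable. This Python sketch (helper names are ours) compares the actual interpolation error at a sample point against the bound:

```python
import math

f = lambda x: math.sin(2 * math.pi * x)
nodes = [i / 5 for i in range(6)]      # 6 equispaced nodes in [0, 1]
vals = [f(t) for t in nodes]

def interp(x):
    # Lagrange form of the interpolating polynomial over the 6 nodes
    return sum(vals[k] * math.prod((x - nodes[i]) / (nodes[k] - nodes[i])
                                   for i in range(6) if i != k)
               for k in range(6))

x = 0.37
w = math.prod(x - t for t in nodes)                    # prod (x - x_i)
bound = (2 * math.pi)**6 / math.factorial(6) * abs(w)  # |f^(6)| <= (2*pi)^6
err = abs(f(x) - interp(x))
print(err <= bound)  # → True
```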

### Proof

Let the error term be

${\displaystyle R_{n}(x)=f(x)-p_{n}(x)}$

and define an auxiliary function ${\displaystyle Y(t)}$ by

${\displaystyle Y(t)=R_{n}(t)-{\frac {R_{n}(x)}{W(x)}}\ W(t)}$

where

${\displaystyle W(t)=\prod _{i=0}^{n}(t-x_{i})}$

and

${\displaystyle W(x)=\prod _{i=0}^{n}(x-x_{i})}$

Since ${\displaystyle f}$ and ${\displaystyle p_{n}}$ agree at the nodes, ${\displaystyle R_{n}(x_{i})=0}$ and ${\displaystyle W(x_{i})=0}$, so we have

${\displaystyle Y(x)=R_{n}(x)-{\frac {R_{n}(x)}{W(x)}}\ W(x)=0}$

and

${\displaystyle Y(x_{i})=R_{n}(x_{i})-{\frac {R_{n}(x)}{W(x)}}\ W(x_{i})=0}$

Then ${\displaystyle Y(t)}$ has at least ${\displaystyle n+2}$ roots. From Rolle's theorem, ${\displaystyle Y^{\prime }(t)}$ has at least ${\displaystyle n+1}$ roots, and by repeated application ${\displaystyle Y^{(n+1)}(t)}$ has at least one root ${\displaystyle \xi }$ in the interval ${\displaystyle I}$.

Differentiating ${\displaystyle Y(t)}$ a total of ${\displaystyle n+1}$ times gives

${\displaystyle Y^{(n+1)}(t)=R_{n}^{(n+1)}(t)-{\frac {R_{n}(x)}{W(x)}}\ (n+1)!}$

Since ${\displaystyle p_{n}(x)}$ is a polynomial of degree at most ${\displaystyle n}$, its ${\displaystyle (n+1)}$st derivative vanishes, so

${\displaystyle R_{n}^{(n+1)}(t)=f^{(n+1)}(t)}$

Thus

${\displaystyle Y^{(n+1)}(t)=f^{(n+1)}(t)-{\frac {R_{n}(x)}{W(x)}}\ (n+1)!}$

Since ${\displaystyle \xi }$ is a root of ${\displaystyle Y^{(n+1)}(t)}$, we have

${\displaystyle Y^{(n+1)}(\xi )=f^{(n+1)}(\xi )-{\frac {R_{n}(x)}{W(x)}}\ (n+1)!=0}$

Therefore

${\displaystyle R_{n}(x)=f(x)-p_{n}(x)={\frac {f^{(n+1)}(\xi )}{(n+1)!}}\prod _{i=0}^{n}(x-x_{i})}$.