Applied linear operators and spectral methods/Lecture 2

Norms in inner product spaces

Finite-dimensional spaces such as $\mathbb {R} ^{n}$ and $\mathbb {C} ^{n}$ have $L_{p}$ norms which are defined as

\lVert \mathbf {x} \rVert _{p}=\left(\sum _{k}|x_{k}|^{p}\right)^{1/p},\quad p=1,2,\dots ,\infty

When $p=1$ , we get the $L_{1}$ norm

\lVert \mathbf {x} \rVert _{1}=\sum _{k}|x_{k}|

When $p=2$ , we get the $L_{2}$ norm

\lVert \mathbf {x} \rVert _{2}={\sqrt {\sum _{k}|x_{k}|^{2}}}={\sqrt {\langle \mathbf {x} ,\mathbf {x} \rangle }}

In the limit as $p\rightarrow \infty$ we get the $L_{\infty }$ norm or the sup norm

\lVert \mathbf {x} \rVert _{\infty }=\max _{k}|x_{k}|

The adjacent figure shows a geometric interpretation of the three norms.

Geometric interpretation of various norms
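As a rough numerical illustration of the three norms, the sketch below evaluates them for one vector using the standard component-wise definition of the $p$-norm. The helper name `p_norm` and the sample vector are assumptions made for this example only.

```python
import math

def p_norm(x, p):
    """Return the p-norm of a sequence of real or complex components."""
    if p == math.inf:
        # sup norm: the largest component magnitude
        return max(abs(xk) for xk in x)
    return sum(abs(xk) ** p for xk in x) ** (1.0 / p)

x = [3.0, -4.0]

norm_1 = p_norm(x, 1)            # |3| + |-4|
norm_2 = p_norm(x, 2)            # sqrt(3^2 + 4^2)
norm_inf = p_norm(x, math.inf)   # max(|3|, |-4|)
norm_50 = p_norm(x, 50)          # a large p lies close to the sup norm

print(norm_1, norm_2, norm_inf, norm_50)
```

For this vector the norms decrease as $p$ grows, and the value at $p=50$ is already very close to the sup norm.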

If a vector space has an inner product then the norm

\lVert \mathbf {x} \rVert ={\sqrt {\langle \mathbf {x} ,\mathbf {x} \rangle }}=\lVert \mathbf {x} \rVert _{2}

is called the induced norm. Clearly, the induced norm is nonnegative and zero only if $\mathbf {x} =\mathbf {0}$ . It is also homogeneous under multiplication by a scalar, i.e., $\lVert \alpha ~\mathbf {x} \rVert =|\alpha |~\lVert \mathbf {x} \rVert$ . You can think of the induced norm as a measure of length for the vector space.

Some useful results that follow from the definition of the norm are discussed below.

Schwarz inequality

In an inner product space

|\langle \mathbf {x} ,\mathbf {y} \rangle |\leq \lVert \mathbf {x} \rVert ~\lVert \mathbf {y} \rVert

Proof

This statement is true if $\mathbf {y} =\mathbf {0}$ .

If $\mathbf {y} \neq \mathbf {0}$ we have

0\leq \lVert \mathbf {x} -\alpha ~\mathbf {y} \rVert ^{2}=\langle (\mathbf {x} -\alpha ~\mathbf {y} ),(\mathbf {x} -\alpha ~\mathbf {y} )\rangle =\langle \mathbf {x} ,\mathbf {x} \rangle -\langle \mathbf {x} ,\alpha ~\mathbf {y} \rangle -\langle \alpha ~\mathbf {y} ,\mathbf {x} \rangle +|\alpha |^{2}~\langle \mathbf {y} ,\mathbf {y} \rangle

for any scalar $\alpha$ . Now

\langle \mathbf {x} ,\alpha ~\mathbf {y} \rangle +\langle \alpha ~\mathbf {y} ,\mathbf {x} \rangle ={\overline {\alpha }}~\langle \mathbf {x} ,\mathbf {y} \rangle +\alpha ~{\overline {\langle \mathbf {x} ,\mathbf {y} \rangle }}=2~{\text{Re}}\left({\overline {\alpha }}~\langle \mathbf {x} ,\mathbf {y} \rangle \right)

Therefore,

\lVert \mathbf {x} \rVert ^{2}-2~{\text{Re}}\left({\overline {\alpha }}~\langle \mathbf {x} ,\mathbf {y} \rangle \right)+|\alpha |^{2}~\lVert \mathbf {y} \rVert ^{2}\geq 0

Let us choose $\alpha$ such that it minimizes the left hand side above. This value is

\alpha ={\cfrac {\langle \mathbf {x} ,\mathbf {y} \rangle }{\lVert \mathbf {y} \rVert ^{2}}}

which gives us

\lVert \mathbf {x} \rVert ^{2}-2~{\cfrac {|\langle \mathbf {x} ,\mathbf {y} \rangle |^{2}}{\lVert \mathbf {y} \rVert ^{2}}}+{\cfrac {|\langle \mathbf {x} ,\mathbf {y} \rangle |^{2}}{\lVert \mathbf {y} \rVert ^{2}}}\geq 0

Therefore,

\lVert \mathbf {x} \rVert ^{2}~\lVert \mathbf {y} \rVert ^{2}\geq |\langle \mathbf {x} ,\mathbf {y} \rangle |^{2}\qquad \square
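The inequality and the minimizing choice of $\alpha$ can be checked numerically. The following is only a sketch: it assumes the inner product on $\mathbb {C} ^{3}$ that is linear in the first argument and conjugate-linear in the second, and the two sample vectors are arbitrary.

```python
import math

def inner(x, y):
    """Inner product, linear in x and conjugate-linear in y."""
    return sum(xk * complex(yk).conjugate() for xk, yk in zip(x, y))

def norm(x):
    """Induced norm sqrt(<x, x>)."""
    return math.sqrt(inner(x, x).real)

x = [1 + 2j, 3 - 1j, 0.5j]
y = [2 - 1j, 1j, 4 + 0j]

lhs = abs(inner(x, y))     # |<x, y>|
rhs = norm(x) * norm(y)    # ||x|| ||y||

# The minimizing alpha from the proof and the residual x - alpha y.
alpha = inner(x, y) / norm(y) ** 2
residual = [xk - alpha * yk for xk, yk in zip(x, y)]

# ||x - alpha y||^2 should equal ||x||^2 - |<x, y>|^2 / ||y||^2.
gap = norm(residual) ** 2

print(lhs, rhs, gap)
```

The residual is orthogonal to $\mathbf {y}$ , which is why this $\alpha$ gives the smallest value of $\lVert \mathbf {x} -\alpha ~\mathbf {y} \rVert$ .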

Triangle inequality

The triangle inequality states that

\lVert \mathbf {x} +\mathbf {y} \rVert \leq \lVert \mathbf {x} \rVert +\lVert \mathbf {y} \rVert

Proof

\lVert \mathbf {x} +\mathbf {y} \rVert ^{2}=\lVert \mathbf {x} \rVert ^{2}+2{\text{Re}}\langle \mathbf {x} ,\mathbf {y} \rangle +\lVert \mathbf {y} \rVert ^{2}

Since ${\text{Re}}\langle \mathbf {x} ,\mathbf {y} \rangle \leq |\langle \mathbf {x} ,\mathbf {y} \rangle |$ , the Schwarz inequality gives

\lVert \mathbf {x} +\mathbf {y} \rVert ^{2}\leq \lVert \mathbf {x} \rVert ^{2}+2\lVert \mathbf {x} \rVert \lVert \mathbf {y} \rVert +\lVert \mathbf {y} \rVert ^{2}=(\lVert \mathbf {x} \rVert +\lVert \mathbf {y} \rVert )^{2}

Hence

\lVert \mathbf {x} +\mathbf {y} \rVert \leq \lVert \mathbf {x} \rVert +\lVert \mathbf {y} \rVert \qquad \square

Angle between two vectors

In $\mathbb {R} ^{2}$ or $\mathbb {R} ^{3}$ we have

\cos \theta ={\cfrac {\langle \mathbf {x} ,\mathbf {y} \rangle }{\lVert \mathbf {x} \rVert \lVert \mathbf {y} \rVert }}

So it makes sense to define $\cos \theta$ in this way for any real inner product space; the Schwarz inequality guarantees that the ratio lies between $-1$ and $1$ .

We then have

\lVert \mathbf {x} +\mathbf {y} \rVert ^{2}=\lVert \mathbf {x} \rVert ^{2}+2~\lVert \mathbf {x} \rVert ~\lVert \mathbf {y} \rVert \cos \theta +\lVert \mathbf {y} \rVert ^{2}
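A small sketch of this definition in $\mathbb {R} ^{4}$ , where no geometric picture is available, is given below. The function names and sample vectors are illustrative assumptions; the clamp on the cosine is only there to guard against round-off.

```python
import math

def inner(x, y):
    """Standard real inner product."""
    return sum(xk * yk for xk, yk in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

def angle(x, y):
    """Angle defined through cos(theta) = <x, y> / (||x|| ||y||)."""
    c = inner(x, y) / (norm(x) * norm(y))
    c = max(-1.0, min(1.0, c))   # keep round-off from leaving [-1, 1]
    return math.acos(c)

x = [1.0, 0.0, 0.0, 0.0]
y = [1.0, 1.0, 0.0, 0.0]
theta = angle(x, y)

# Check the cosine-law identity ||x + y||^2 = ||x||^2 + 2 ||x|| ||y|| cos(theta) + ||y||^2.
s = [xk + yk for xk, yk in zip(x, y)]
lhs = norm(s) ** 2
rhs = norm(x) ** 2 + 2 * norm(x) * norm(y) * math.cos(theta) + norm(y) ** 2

print(theta, lhs, rhs)
```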

Orthogonality

In particular, if $\cos \theta =0$ we have an analog of the Pythagoras theorem.

\lVert \mathbf {x} +\mathbf {y} \rVert ^{2}=\lVert \mathbf {x} \rVert ^{2}+\lVert \mathbf {y} \rVert ^{2}

In that case the vectors are said to be orthogonal.

If $\langle \mathbf {x} ,\mathbf {y} \rangle =0$ then the vectors are said to be orthogonal even in a complex vector space.

Orthogonal vectors have a lot of nice properties.

Linear independence of orthogonal vectors

A set of nonzero orthogonal vectors is linearly independent.

If some linear combination of the vectors ${\boldsymbol {\varphi }}_{i}$ vanishes,

\alpha _{1}~{\boldsymbol {\varphi }}_{1}+\alpha _{2}~{\boldsymbol {\varphi }}_{2}+\dots +\alpha _{n}~{\boldsymbol {\varphi }}_{n}=0

and the ${\boldsymbol {\varphi }}_{i}$ are orthogonal, then taking an inner product with ${\boldsymbol {\varphi }}_{j}$ gives

\alpha _{j}~\langle {\boldsymbol {\varphi }}_{j},{\boldsymbol {\varphi }}_{j}\rangle =0\quad \implies \quad \alpha _{j}=0~\forall j

since

\langle {\boldsymbol {\varphi }}_{i},{\boldsymbol {\varphi }}_{j}\rangle =0\quad {\text{if}}~i\neq j~.

Since the vectors are nonzero, every coefficient must vanish, so the vectors are linearly independent.

Expressing a vector in terms of an orthogonal basis

If we have a basis $\{{\boldsymbol {\varphi }}_{1},{\boldsymbol {\varphi }}_{2},\dots ,{\boldsymbol {\varphi }}_{n}\}$ and wish to express a vector $\mathbf {f}$ in terms of it we have

\mathbf {f} =\sum _{j=1}^{n}\beta _{j}~{\boldsymbol {\varphi }}_{j}

The problem is to find the $\beta _{j}$ s.

If we take the inner product with respect to ${\boldsymbol {\varphi }}_{i}$ , we get

\langle \mathbf {f} ,{\boldsymbol {\varphi }}_{i}\rangle =\sum _{j=1}^{n}\beta _{j}~\langle {\boldsymbol {\varphi }}_{j},{\boldsymbol {\varphi }}_{i}\rangle

In matrix form,

{\boldsymbol {\eta }}={\boldsymbol {B}}~{\boldsymbol {\beta }}

where $B_{ij}=\langle {\boldsymbol {\varphi }}_{j},{\boldsymbol {\varphi }}_{i}\rangle$ and $\eta _{i}=\langle \mathbf {f} ,{\boldsymbol {\varphi }}_{i}\rangle$ .

Generally, getting the $\beta _{j}$ s involves inverting the $n\times n$ matrix ${\boldsymbol {B}}$ . If the basis is orthonormal, ${\boldsymbol {B}}$ is the identity matrix ${\boldsymbol {I_{n}}}$ , because $\langle {\boldsymbol {\varphi }}_{i},{\boldsymbol {\varphi }}_{j}\rangle ={\boldsymbol {\delta }}_{ij}$ , where ${\boldsymbol {\delta }}_{ij}$ is the Kronecker delta, and no inversion is needed.
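To make the general case concrete, here is a minimal sketch for a non-orthogonal basis of $\mathbb {R} ^{2}$ . The basis, the vector $\mathbf {f}$ and the use of Cramer's rule for the $2\times 2$ solve are choices made for this example; for real vectors the order of the arguments in the inner product does not matter.

```python
def inner(x, y):
    """Standard real inner product."""
    return sum(xk * yk for xk, yk in zip(x, y))

phi = [[1.0, 0.0], [1.0, 1.0]]   # a basis of R^2 that is not orthogonal
f = [3.0, 2.0]

# Matrix of inner products B and right-hand side eta_i = <f, phi_i>.
B = [[inner(p_i, p_j) for p_j in phi] for p_i in phi]
eta = [inner(f, p_i) for p_i in phi]

# Solve B beta = eta for the 2x2 case with Cramer's rule.
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
beta = [
    (eta[0] * B[1][1] - B[0][1] * eta[1]) / det,
    (B[0][0] * eta[1] - B[1][0] * eta[0]) / det,
]

# Rebuild f from the coefficients as a consistency check.
reconstructed = [sum(beta[j] * phi[j][k] for j in range(2)) for k in range(2)]

print(B, eta, beta, reconstructed)
```

Here the off-diagonal entries of ${\boldsymbol {B}}$ are nonzero, so the coefficients cannot be read off one at a time.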

If the ${\boldsymbol {\varphi }}_{i}$ s are orthogonal but not necessarily normalized, ${\boldsymbol {B}}$ is diagonal and we have

\beta _{j}={\cfrac {\langle \mathbf {f} ,{\boldsymbol {\varphi }}_{j}\rangle }{\lVert {\boldsymbol {\varphi }}_{j}\rVert ^{2}}}

and the quantity

\mathbf {p} ={\cfrac {\langle \mathbf {f} ,{\boldsymbol {\varphi }}_{j}\rangle }{\lVert {\boldsymbol {\varphi }}_{j}\rVert ^{2}}}~{\boldsymbol {\varphi }}_{j}

is called the projection of $\mathbf {f}$ onto ${\boldsymbol {\varphi }}_{j}$ .

Therefore the sum

$\mathbf {f} =\sum _{j}\beta _{j}~{\boldsymbol {\varphi }}_{j}$

says that $\mathbf {f}$ is just a sum of its projections onto the orthogonal basis.

Let us check whether $\mathbf {p}$ is actually a projection. Let

\mathbf {a} =\mathbf {f} -\mathbf {p} =\mathbf {f} -{\cfrac {\langle \mathbf {f} ,{\boldsymbol {\varphi }}\rangle }{\lVert {\boldsymbol {\varphi }}\rVert ^{2}}}~{\boldsymbol {\varphi }}

Then,

\langle \mathbf {a} ,{\boldsymbol {\varphi }}\rangle =\langle \mathbf {f} ,{\boldsymbol {\varphi }}\rangle -{\cfrac {\langle \mathbf {f} ,{\boldsymbol {\varphi }}\rangle }{\lVert {\boldsymbol {\varphi }}\rVert ^{2}}}~\langle {\boldsymbol {\varphi }},{\boldsymbol {\varphi }}\rangle =0

Therefore $\mathbf {a}$ and ${\boldsymbol {\varphi }}$ are indeed orthogonal.
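The projection formula and the orthogonality check can be tried on a small example. This is a sketch under the assumption of the standard inner product on $\mathbb {R} ^{3}$ ; the basis vectors and $\mathbf {f}$ are arbitrary illustrative values.

```python
def inner(x, y):
    """Standard real inner product."""
    return sum(xk * yk for xk, yk in zip(x, y))

def project(f, phi):
    """Projection of f onto phi: (<f, phi> / ||phi||^2) phi."""
    c = inner(f, phi) / inner(phi, phi)
    return [c * pk for pk in phi]

# An orthogonal (not normalized) basis of R^3.
phi = [[1.0, 1.0, 0.0], [1.0, -1.0, 0.0], [0.0, 0.0, 2.0]]
f = [2.0, 4.0, 6.0]

# f is the sum of its projections onto the basis vectors.
projections = [project(f, p) for p in phi]
total = [sum(p[k] for p in projections) for k in range(3)]

# The remainder a = f - p is orthogonal to the vector projected onto.
a = [fk - pk for fk, pk in zip(f, projections[0])]

print(projections, total, inner(a, phi[0]))
```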

Note that we can normalize ${\boldsymbol {\varphi }}_{i}$ by defining

{\tilde {\boldsymbol {\varphi }}}_{i}={\cfrac {{\boldsymbol {\varphi }}_{i}}{\lVert {\boldsymbol {\varphi }}_{i}\rVert }}

Then the basis $\{{\tilde {\boldsymbol {\varphi }}}_{1},{\tilde {\boldsymbol {\varphi }}}_{2},\dots ,{\tilde {\boldsymbol {\varphi }}}_{n}\}$ is called an orthonormal basis.

It follows from the equation for $\beta _{j}$ that

{\tilde {\beta }}_{j}=\langle \mathbf {f} ,{\tilde {\boldsymbol {\varphi }}}_{j}\rangle

and

\mathbf {f} =\sum _{j=1}^{n}{\tilde {\beta }}_{j}~{\tilde {\boldsymbol {\varphi }}}_{j}

You can think of the vectors ${\tilde {\boldsymbol {\varphi }}}_{i}$ as orthogonal unit vectors in an $n$ -dimensional space.

Biorthogonal basis

However, using an orthogonal basis is not the only way to do things. An alternative that is useful (for instance when using wavelets) is the biorthonormal basis.

The problem in this case is converted into one where, given any basis $\{{\boldsymbol {\varphi }}_{1},{\boldsymbol {\varphi }}_{2},\dots ,{\boldsymbol {\varphi }}_{n}\}$ , we want to find another set of vectors $\{{\boldsymbol {\psi }}_{1},{\boldsymbol {\psi }}_{2},\dots ,{\boldsymbol {\psi }}_{n}\}$ such that

\langle {\boldsymbol {\varphi }}_{i},{\boldsymbol {\psi }}_{j}\rangle =\delta _{ij}

In that case, if

\mathbf {f} =\sum _{j=1}^{n}\beta _{j}~{\boldsymbol {\varphi }}_{j}

it follows that

\langle \mathbf {f} ,{\boldsymbol {\psi }}_{k}\rangle =\sum _{j=1}^{n}\beta _{j}~\langle {\boldsymbol {\varphi }}_{j},{\boldsymbol {\psi }}_{k}\rangle =\beta _{k}

So the coefficients $\beta _{k}$ can easily be recovered. You can see a schematic of the two sets of vectors in the adjacent figure.
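One way to see how such a dual set can be built is the two-dimensional case, where the ${\boldsymbol {\psi }}_{j}$ are the rows of the inverse of the matrix whose columns are the ${\boldsymbol {\varphi }}_{j}$ . The sketch below assumes real vectors in $\mathbb {R} ^{2}$ ; the helper `dual_basis_2d` is a hypothetical name introduced only for this illustration.

```python
def inner(x, y):
    """Standard real inner product."""
    return sum(xk * yk for xk, yk in zip(x, y))

def dual_basis_2d(phi1, phi2):
    """Return (psi1, psi2) with <phi_i, psi_j> = delta_ij for a basis of R^2."""
    det = phi1[0] * phi2[1] - phi2[0] * phi1[1]
    if det == 0:
        raise ValueError("phi1 and phi2 are linearly dependent")
    # Rows of the inverse of the matrix with columns phi1 and phi2.
    psi1 = [phi2[1] / det, -phi2[0] / det]
    psi2 = [-phi1[1] / det, phi1[0] / det]
    return psi1, psi2

phi1, phi2 = [1.0, 0.0], [1.0, 1.0]   # a non-orthogonal basis
psi1, psi2 = dual_basis_2d(phi1, phi2)

# Coefficients of f in the phi basis are recovered by inner products with psi.
f = [3.0, 2.0]
beta = [inner(f, psi1), inner(f, psi2)]
reconstructed = [beta[0] * phi1[k] + beta[1] * phi2[k] for k in range(2)]

print(psi1, psi2, beta, reconstructed)
```

No linear system has to be solved once the ${\boldsymbol {\psi }}_{j}$ are known; each coefficient costs a single inner product.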

Gram-Schmidt orthogonalization

One technique for getting an orthogonal basis is to use the process of Gram-Schmidt orthogonalization.

The goal is to produce an orthogonal set of vectors $\{{\boldsymbol {\varphi }}_{1},{\boldsymbol {\varphi }}_{2},\dots ,{\boldsymbol {\varphi }}_{n}\}$ given a linearly independent set $\{\mathbf {x} _{1},\mathbf {x} _{2},\dots ,\mathbf {x} _{n}\}$ .

We start off by setting ${\boldsymbol {\varphi }}_{1}=\mathbf {x} _{1}$ . Then ${\boldsymbol {\varphi }}_{2}$ is given by subtracting the projection of $\mathbf {x} _{2}$ onto ${\boldsymbol {\varphi }}_{1}$ from $\mathbf {x} _{2}$ , i.e.,

{\boldsymbol {\varphi }}_{2}=\mathbf {x} _{2}-{\cfrac {\langle \mathbf {x} _{2},{\boldsymbol {\varphi }}_{1}\rangle }{\lVert {\boldsymbol {\varphi }}_{1}\rVert ^{2}}}~{\boldsymbol {\varphi }}_{1}

Thus ${\boldsymbol {\varphi }}_{2}$ is clearly orthogonal to ${\boldsymbol {\varphi }}_{1}$ . For ${\boldsymbol {\varphi }}_{3}$ we use

{\boldsymbol {\varphi }}_{3}=\mathbf {x} _{3}-{\cfrac {\langle \mathbf {x} _{3},{\boldsymbol {\varphi }}_{1}\rangle }{\lVert {\boldsymbol {\varphi }}_{1}\rVert ^{2}}}~{\boldsymbol {\varphi }}_{1}-{\cfrac {\langle \mathbf {x} _{3},{\boldsymbol {\varphi }}_{2}\rangle }{\lVert {\boldsymbol {\varphi }}_{2}\rVert ^{2}}}~{\boldsymbol {\varphi }}_{2}

More generally,

{\boldsymbol {\varphi }}_{n}=\mathbf {x} _{n}-\sum _{j=1}^{n-1}{\cfrac {\langle \mathbf {x} _{n},{\boldsymbol {\varphi }}_{j}\rangle }{\lVert {\boldsymbol {\varphi }}_{j}\rVert ^{2}}}~{\boldsymbol {\varphi }}_{j}

If you want an orthonormal set then you can do that by normalizing the orthogonal set of vectors.

We can check that the vectors ${\boldsymbol {\varphi }}_{j}$ are indeed orthogonal by induction. Assume that the vectors ${\boldsymbol {\varphi }}_{j},~j\leq n-1$ , are mutually orthogonal. Pick $k<n$ . Then

\langle {\boldsymbol {\varphi }}_{n},{\boldsymbol {\varphi }}_{k}\rangle =\langle \mathbf {x} _{n},{\boldsymbol {\varphi }}_{k}\rangle -\sum _{j=1}^{n-1}{\cfrac {\langle \mathbf {x} _{n},{\boldsymbol {\varphi }}_{j}\rangle }{\lVert {\boldsymbol {\varphi }}_{j}\rVert ^{2}}}~\langle {\boldsymbol {\varphi }}_{j},{\boldsymbol {\varphi }}_{k}\rangle

Now $\langle {\boldsymbol {\varphi }}_{j},{\boldsymbol {\varphi }}_{k}\rangle =0$ unless $j=k$ , so only the $j=k$ term of the sum survives, and it equals $\langle \mathbf {x} _{n},{\boldsymbol {\varphi }}_{k}\rangle$ . The two remaining terms cancel out, giving $\langle {\boldsymbol {\varphi }}_{n},{\boldsymbol {\varphi }}_{k}\rangle =0$ . Hence the vectors are orthogonal.

Note that you have to be careful while numerically computing an orthogonal basis using the Gram-Schmidt technique because round-off errors accumulate in the terms under the sum, and the computed vectors can gradually lose orthogonality.
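The recursion can be written in a few lines. The following is a minimal sketch of the classical procedure as given by the formulas in this section, assuming real vectors with the standard inner product and a linearly independent input set; the function names are chosen for this example.

```python
import math

def inner(x, y):
    """Standard real inner product."""
    return sum(xk * yk for xk, yk in zip(x, y))

def gram_schmidt(xs):
    """Orthogonalize a linearly independent list of real vectors."""
    phis = []
    for x in xs:
        phi = list(x)
        for p in phis:
            # Subtract the projection of x onto each earlier phi_j.
            coeff = inner(x, p) / inner(p, p)
            phi = [a - coeff * b for a, b in zip(phi, p)]
        phis.append(phi)
    return phis

def normalize(v):
    """Scale v to unit length."""
    n = math.sqrt(inner(v, v))
    return [a / n for a in v]

xs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
phis = gram_schmidt(xs)
orthonormal = [normalize(p) for p in phis]

for i in range(3):
    for j in range(i):
        print(i, j, inner(phis[i], phis[j]))   # each value should be close to zero
```

The printed inner products are zero only up to round-off, which is the effect the note above warns about.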

Linear operators

The object ${\boldsymbol {A}}$ is an operator from ${\mathcal {S}}$ into ${\mathcal {S}}$ if

{\boldsymbol {A}}~\mathbf {x} \equiv {\boldsymbol {A}}(\mathbf {x} )\in {\mathcal {S}}\quad \forall ~\mathbf {x} \in {\mathcal {S}}

The operator is linear if, for all scalars $\alpha$ and all $\mathbf {x} ,\mathbf {y} \in {\mathcal {S}}$ , it satisfies the properties

${\boldsymbol {A}}~(\alpha ~\mathbf {x} )=\alpha ~{\boldsymbol {A}}(\mathbf {x} )$ .
${\boldsymbol {A}}~(\mathbf {x} +\mathbf {y} )={\boldsymbol {A}}(\mathbf {x} )+{\boldsymbol {A}}(\mathbf {y} )$ .

Note that ${\boldsymbol {A}}$ is independent of basis. However, the action of ${\boldsymbol {A}}$ on a basis $\{{\boldsymbol {\varphi }}_{1},{\boldsymbol {\varphi }}_{2},\dots ,{\boldsymbol {\varphi }}_{n}\}$ determines ${\boldsymbol {A}}$ completely since

{\boldsymbol {A}}~\mathbf {f} ={\boldsymbol {A}}~\left(\sum _{j}\beta _{j}~{\boldsymbol {\varphi }}_{j}\right)=\sum _{j}\beta _{j}~{\boldsymbol {A}}({\boldsymbol {\varphi }}_{j})

Since ${\boldsymbol {A}}~{\boldsymbol {\varphi }}_{j}\in {\mathcal {S}}$ we can write

{\boldsymbol {A}}~{\boldsymbol {\varphi }}_{j}=\sum _{i}A_{ij}~{\boldsymbol {\varphi }}_{i}

where the $A_{ij}$ are the entries of the $n\times n$ matrix representing the operator ${\boldsymbol {A}}$ in the basis $\{{\boldsymbol {\varphi }}_{1},{\boldsymbol {\varphi }}_{2},\dots ,{\boldsymbol {\varphi }}_{n}\}$ .

Note the location of the indices here: the sum runs over the first index of $A_{ij}$ , which is not the same as what we get in matrix multiplication. For example, in $\mathbb {R} ^{2}$ , we have

{\boldsymbol {A}}~\mathbf {e} _{2}={\begin{bmatrix}A_{11}&A_{12}\\A_{21}&A_{22}\end{bmatrix}}{\begin{bmatrix}0\\1\end{bmatrix}}={\begin{bmatrix}A_{12}\\A_{22}\end{bmatrix}}=A_{12}~{\begin{bmatrix}1\\0\end{bmatrix}}+A_{22}~{\begin{bmatrix}0\\1\end{bmatrix}}=A_{12}~\mathbf {e} _{1}+A_{22}~\mathbf {e} _{2}=\sum _{i}A_{i2}~\mathbf {e} _{i}
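The index convention means that column $j$ of the matrix holds the components of ${\boldsymbol {A}}~\mathbf {e} _{j}$ . The sketch below builds the matrix this way for a sample operator; the rotation used here and the helper names are hypothetical and serve only as an illustration in the standard basis of $\mathbb {R} ^{2}$ .

```python
def matrix_of(operator, n):
    """Build A with entries A[i][j] such that A e_j = sum_i A[i][j] e_i."""
    A = [[0.0] * n for _ in range(n)]
    for j in range(n):
        e_j = [1.0 if k == j else 0.0 for k in range(n)]
        image = operator(e_j)      # components of A e_j in the standard basis
        for i in range(n):
            A[i][j] = image[i]     # column j holds the image of e_j
    return A

def rotate_90(v):
    """Example operator: counterclockwise rotation by 90 degrees in R^2."""
    return [-v[1], v[0]]

def apply(A, v):
    """Ordinary matrix-vector product."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

A = matrix_of(rotate_90, 2)
v = [2.0, 3.0]

print(A, apply(A, v), rotate_90(v))
```

Because the operator is linear, its action on the basis vectors is enough to reproduce its action on any other vector.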

We will get into more details in the next lecture.

