Applied linear operators and spectral methods/Lecture 2

From Wikiversity


Norms in inner product spaces

For vectors in \mathbb{R}^n or \mathbb{C}^n the Lp norms are defined as


  \lVert\mathbf{x}\rVert_{p} = \left(\sum_{k=1}^n |x_k|^p\right)^{1/p}, \quad p=1, 2, \dots

When p = 1, we get the L1 norm


  \lVert\mathbf{x}\rVert_{1} = \sum_{k=1}^n |x_k|

When p = 2, we get the L2 norm, which is the norm given by the standard inner product


  \lVert\mathbf{x}\rVert_{2} = \sqrt{\sum_{k=1}^n |x_k|^2} = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle}

In the limit as p \rightarrow \infty we get the L_\infty norm or the sup norm


  \lVert\mathbf{x}\rVert_{\infty} = \max_k |x_k|
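
As a rough numerical illustration, the sketch below evaluates the three norms for one vector, assuming the usual componentwise definition \lVert\mathbf{x}\rVert_p = (\sum_k |x_k|^p)^{1/p}. The helper names `p_norm` and `sup_norm` and the sample vector are made up for this example.

```python
def p_norm(x, p):
    """Return (sum_k |x_k|^p)^(1/p) for a finite sequence x and p >= 1."""
    return sum(abs(xk) ** p for xk in x) ** (1.0 / p)


def sup_norm(x):
    """Return max_k |x_k|, the limit of the p-norm as p grows."""
    return max(abs(xk) for xk in x)


x = [3.0, -4.0]                # sample vector chosen for illustration
norm_1 = p_norm(x, 1)          # |3| + |-4|
norm_2 = p_norm(x, 2)          # sqrt(9 + 16)
norm_inf = sup_norm(x)         # max(3, 4)
norm_big_p = p_norm(x, 50)     # already close to the sup norm
```

For this vector the p-norm decreases from 7 at p = 1 towards 4 as p grows, which is the behaviour the limit statement above describes.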

The adjacent figure shows a geometric interpretation of the three norms.

Geometric interpretation of various norms

If a vector space has an inner product then the norm


  \lVert\mathbf{x}\rVert = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle} = \lVert\mathbf{x}\rVert_2

is called the induced norm. Clearly, the induced norm is nonnegative and zero only if \mathbf{x} = \mathbf{0}. It also scales linearly under multiplication by a positive scalar, i.e., \lVert\alpha~\mathbf{x}\rVert = \alpha~\lVert\mathbf{x}\rVert for \alpha > 0. You can think of the induced norm as a measure of length for the vector space.

Some useful results that follow from the definition of the norm are discussed below.

Schwarz inequality

In an inner product space


  |\langle \mathbf{x}, \mathbf{y} \rangle| \le \lVert\mathbf{x}\rVert~\lVert\mathbf{y}\rVert

Proof

This statement is true if \mathbf{y} = \mathbf{0}.

If \mathbf{y} \ne \mathbf{0} we have


  0 \le \lVert\mathbf{x} - \alpha~\mathbf{y}\rVert^2 = \langle (\mathbf{x} - \alpha~\mathbf{y}), (\mathbf{x} - \alpha~\mathbf{y}) \rangle
    = \langle \mathbf{x}, \mathbf{x} \rangle - \langle \mathbf{x}, \alpha~\mathbf{y} \rangle - \langle \alpha~\mathbf{y}, \mathbf{x} \rangle + 
       |\alpha|^2~\langle \mathbf{y}, \mathbf{y} \rangle

Now


  \langle \mathbf{x}, \alpha~\mathbf{y} \rangle + \langle \alpha~\mathbf{y}, \mathbf{x} \rangle = \overline{\alpha}~\langle \mathbf{x}, \mathbf{y} \rangle +
     \alpha~\overline{\langle \mathbf{x}, \mathbf{y} \rangle} = 2~\text{Re}\left(\overline{\alpha}~\langle \mathbf{x}, \mathbf{y} \rangle\right)

Therefore,


  \lVert\mathbf{x}\rVert^2 - 2~\text{Re}\left(\overline{\alpha}~\langle \mathbf{x}, \mathbf{y} \rangle\right) + |\alpha|^2~\lVert\mathbf{y}\rVert^2 \ge 0

Let us choose α such that it minimizes the left hand side above. This value is


  \alpha = \cfrac{\langle \mathbf{x}, \mathbf{y} \rangle}{\lVert\mathbf{y}\rVert^2}

which gives us


  \lVert\mathbf{x}\rVert^2 - 2~\cfrac{|\langle \mathbf{x}, \mathbf{y} \rangle|^2}{\lVert\mathbf{y}\rVert^2} + 
     \cfrac{|\langle \mathbf{x}, \mathbf{y} \rangle|^2}{\lVert\mathbf{y}\rVert^2} \ge 0

Therefore,


  \lVert\mathbf{x}\rVert^2~\lVert\mathbf{y}\rVert^2  \ge  |\langle \mathbf{x}, \mathbf{y} \rangle|^2 \qquad \square
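
A quick numerical sanity check of the inequality can be sketched as follows. It assumes the standard inner product on \mathbb{C}^n, taken linear in the first argument as in the proof above; the function names and the sample vectors are illustrative only.

```python
def inner(x, y):
    """Inner product on C^n, linear in the first argument: sum_k x_k * conj(y_k)."""
    return sum(xk * complex(yk).conjugate() for xk, yk in zip(x, y))


def norm(x):
    """Induced norm sqrt(<x, x>)."""
    return abs(inner(x, x)) ** 0.5


x = [1 + 2j, -1j, 3.0]
y = [2 - 1j, 4.0, 1 + 1j]

lhs = abs(inner(x, y))          # |<x, y>|
rhs = norm(x) * norm(y)         # ||x|| ||y||
schwarz_holds = lhs <= rhs

# Equality case: z is a scalar multiple of x, so the gap should vanish
# up to rounding error.
z = [(2 - 3j) * xk for xk in x]
gap_parallel = norm(x) * norm(z) - abs(inner(x, z))
```

The second part illustrates that equality is attained when one vector is a scalar multiple of the other, which is why the inequality cannot be strict in general.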

Triangle inequality

The triangle inequality states that


  \lVert\mathbf{x} + \mathbf{y}\rVert \le \lVert\mathbf{x}\rVert + \lVert\mathbf{y}\rVert

Proof


  \lVert\mathbf{x} + \mathbf{y}\rVert^2 = \lVert\mathbf{x}\rVert^2 + 2\text{Re}\langle \mathbf{x}, \mathbf{y} \rangle + \lVert\mathbf{y}\rVert^2

From the Schwarz inequality


  \lVert\mathbf{x} + \mathbf{y}\rVert^2 \le \lVert\mathbf{x}\rVert^2 + 2\lVert\mathbf{x}\rVert\lVert\mathbf{y}\rVert + \lVert\mathbf{y}\rVert^2
                     = (\lVert\mathbf{x}\rVert + \lVert\mathbf{y}\rVert)^2

Hence


  \lVert\mathbf{x} + \mathbf{y}\rVert \le \lVert\mathbf{x}\rVert + \lVert\mathbf{y}\rVert \qquad \square

Angle between two vectors

In \mathbb{R}^2 or \mathbb{R}^3 we have


  \cos\theta = \cfrac{\langle \mathbf{x}, \mathbf{y} \rangle}{\lVert\mathbf{x}\rVert \lVert\mathbf{y}\rVert}

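So it makes sense to define \cos\theta in this way for any real inner product space. The Schwarz inequality guarantees that the right hand side lies between -1 and 1, so \theta is well defined for nonzero vectors.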

We then have


  \lVert\mathbf{x} + \mathbf{y}\rVert^2 = \lVert\mathbf{x}\rVert^2 + 2~\lVert\mathbf{x}\rVert~\lVert\mathbf{y}\rVert\cos\theta + \lVert\mathbf{y}\rVert^2
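
The definition can be tried out in a space where there is no picture to draw. The sketch below computes the angle between two vectors in \mathbb{R}^4 and checks the cosine-rule identity above; the vectors and the helper names are assumptions made for this example.

```python
import math


def dot(x, y):
    """Standard inner product on R^n."""
    return sum(xk * yk for xk, yk in zip(x, y))


def norm(x):
    return math.sqrt(dot(x, x))


def angle(x, y):
    """Angle in radians between nonzero real vectors x and y."""
    c = dot(x, y) / (norm(x) * norm(y))
    # Clamp to [-1, 1] to guard against rounding slightly outside the range.
    return math.acos(max(-1.0, min(1.0, c)))


x = [1.0, 0.0, 0.0, 0.0]
y = [1.0, 1.0, 0.0, 0.0]
theta = angle(x, y)

# Check ||x + y||^2 = ||x||^2 + 2 ||x|| ||y|| cos(theta) + ||y||^2.
s = [xk + yk for xk, yk in zip(x, y)]
lhs = norm(s) ** 2
rhs = norm(x) ** 2 + 2 * norm(x) * norm(y) * math.cos(theta) + norm(y) ** 2
```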

Orthogonality

In particular, if cosθ = 0 we have an analog of the Pythagoras theorem.


  \lVert\mathbf{x} + \mathbf{y}\rVert^2 = \lVert\mathbf{x}\rVert^2 + \lVert\mathbf{y}\rVert^2

In that case the vectors are said to be orthogonal.

If \langle \mathbf{x}, \mathbf{y} \rangle = 0 then the vectors are said to be orthogonal even in a complex vector space.

Orthogonal vectors have a lot of nice properties.

Linear independence of orthogonal vectors

  • A set of nonzero orthogonal vectors is linearly independent.

If a linear combination of the vectors \boldsymbol{\varphi}_i vanishes


   \alpha_1~\boldsymbol{\varphi}_1 + \alpha_2~\boldsymbol{\varphi}_2 + \dots + \alpha_n~\boldsymbol{\varphi}_n = 0

and the \boldsymbol{\varphi}_i are orthogonal, then taking an inner product with \boldsymbol{\varphi}_j gives


  \alpha_j~\langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_j \rangle = 0 \quad \implies \quad \alpha_j = 0 ~\forall j

since


  \langle \boldsymbol{\varphi}_i, \boldsymbol{\varphi}_j \rangle = 0 \quad \text{if}~ i \ne j ~.

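Therefore the only vanishing linear combination is the trivial one (the \boldsymbol{\varphi}_j are nonzero, so \langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_j \rangle \ne 0), and the vectors are linearly independent.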

Expressing a vector in terms of an orthogonal basis

If we have a basis \{\boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \dots, \boldsymbol{\varphi}_n\} and wish to express a vector \mathbf{f} in terms of it we have


  \mathbf{f} = \sum_{j=1}^n \beta_j~\boldsymbol{\varphi}_j

The problem is to find the coefficients \beta_j.

If we take the inner product with respect to \boldsymbol{\varphi}_i, we get


  \langle \mathbf{f}, \boldsymbol{\varphi}_i \rangle = \sum_{j=1}^n \beta_j~\langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_i \rangle

In matrix form,


  \boldsymbol{\eta} = \boldsymbol{B}~\boldsymbol{\beta}

where B_{ij} = \langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_i \rangle and \eta_i = \langle \mathbf{f}, \boldsymbol{\varphi}_i \rangle.

Generally, getting the coefficients \beta_j involves inverting the n \times n matrix \boldsymbol{B}.

If the \boldsymbol{\varphi}_is are orthogonal then \boldsymbol{B} is diagonal and we have


  \beta_j = \cfrac{\langle \mathbf{f}, \boldsymbol{\varphi}_j \rangle}{\lVert\boldsymbol{\varphi}_j\rVert^2}

and the quantity


  \mathbf{p} = \cfrac{\langle \mathbf{f}, \boldsymbol{\varphi} \rangle}{\lVert\boldsymbol{\varphi}\rVert^2}~\boldsymbol{\varphi}

is called the projection of \mathbf{f} onto \boldsymbol{\varphi}.

Therefore the sum \mathbf{f} = \sum_j \beta_j~\boldsymbol{\varphi}_j says that \mathbf{f} is just a sum of its projections onto the orthogonal basis.
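
To make the formula for the coefficients concrete, here is a small sketch in \mathbb{R}^3. The basis and the vector \mathbf{f} are invented for illustration; the basis is orthogonal but deliberately not normalized, so the division by \lVert\boldsymbol{\varphi}_j\rVert^2 matters.

```python
def dot(x, y):
    """Standard inner product on R^n."""
    return sum(xk * yk for xk, yk in zip(x, y))


# An orthogonal (not normalized) basis of R^3, chosen for illustration.
basis = [[1.0, 1.0, 0.0], [1.0, -1.0, 0.0], [0.0, 0.0, 2.0]]
f = [3.0, 1.0, 4.0]

# beta_j = <f, phi_j> / ||phi_j||^2, with no matrix inversion needed.
beta = [dot(f, phi) / dot(phi, phi) for phi in basis]

# Rebuild f as the sum of its projections beta_j * phi_j.
f_rebuilt = [sum(b * phi[k] for b, phi in zip(beta, basis)) for k in range(3)]
```

Each coefficient is computed independently of the others, which is the practical payoff of \boldsymbol{B} being diagonal.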

Projection operation.

Let us check whether \mathbf{p} is actually a projection. Let


  \mathbf{a} = \mathbf{f} - \mathbf{p} = \mathbf{f} - \cfrac{\langle \mathbf{f}, \boldsymbol{\varphi} \rangle}{\lVert\boldsymbol{\varphi}\rVert^2}~\boldsymbol{\varphi}

Then,


  \langle \mathbf{a}, \boldsymbol{\varphi} \rangle = \langle \mathbf{f}, \boldsymbol{\varphi} \rangle - 
    \cfrac{\langle \mathbf{f}, \boldsymbol{\varphi} \rangle}{\lVert\boldsymbol{\varphi}\rVert^2}~\langle \boldsymbol{\varphi}, \boldsymbol{\varphi} \rangle = 0

Therefore \mathbf{a} and \boldsymbol{\varphi} are indeed orthogonal.

Note that we can normalize \boldsymbol{\varphi}_i by defining


  \tilde{\boldsymbol{\varphi}}_i = \cfrac{\boldsymbol{\varphi}_i}{\lVert\boldsymbol{\varphi}_i\rVert}

Then the basis \{\tilde{\boldsymbol{\varphi}}_1, \tilde{\boldsymbol{\varphi}}_2, \dots, \tilde{\boldsymbol{\varphi}}_n\} is called an orthonormal basis.

It follows from the equation for βj that


  \tilde{\beta}_j = \langle \mathbf{f}, \tilde{\boldsymbol{\varphi}}_j \rangle

and


  \mathbf{f} = \sum_{j=1}^n \tilde{\beta}_j~\tilde{\boldsymbol{\varphi}}_j

You can think of the vectors \tilde{\boldsymbol{\varphi}}_i as orthogonal unit vectors in an n-dimensional space.

Biorthogonal basis

However, using an orthogonal basis is not the only way to do things. An alternative that is useful (for instance when using wavelets) is the biorthonormal basis.

The problem in this case is converted into one where, given any basis \{\boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \dots, \boldsymbol{\varphi}_n\}, we want to find another set of vectors \{\boldsymbol{\psi}_1, \boldsymbol{\psi}_2, \dots, \boldsymbol{\psi}_n\} such that


  \langle \boldsymbol{\varphi}_i, \boldsymbol{\psi}_j \rangle = \delta_{ij}

In that case, if


  \mathbf{f} = \sum_{j=1}^n \beta_j~\boldsymbol{\varphi}_j

it follows that


  \langle \mathbf{f}, \boldsymbol{\psi}_k \rangle = \sum_{j=1}^n \beta_j~\langle \boldsymbol{\varphi}_j, \boldsymbol{\psi}_k \rangle = \beta_k

So the coefficients βk can easily be recovered. You can see a schematic of the two sets of vectors in the adjacent figure.

Biorthonormal basis
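
A minimal sketch of the idea in \mathbb{R}^2 follows. It assumes the standard inner product and builds the second set of vectors from the inverse of the matrix whose rows are \boldsymbol{\varphi}_1 and \boldsymbol{\varphi}_2; this is one convenient construction for the example, not necessarily the one used in wavelet practice. The function `dual_basis_2d` and the sample vectors are hypothetical.

```python
def dot(x, y):
    """Standard inner product on R^n."""
    return sum(xk * yk for xk, yk in zip(x, y))


def dual_basis_2d(phi1, phi2):
    """Return (psi1, psi2) with <phi_i, psi_j> = delta_ij for a basis of R^2."""
    # Rows of P are phi1 and phi2; the columns of P^{-1} are the dual vectors.
    det = phi1[0] * phi2[1] - phi1[1] * phi2[0]
    if det == 0:
        raise ValueError("phi1 and phi2 are linearly dependent")
    psi1 = [phi2[1] / det, -phi2[0] / det]
    psi2 = [-phi1[1] / det, phi1[0] / det]
    return psi1, psi2


# A non-orthogonal basis of R^2, chosen for illustration.
phi1, phi2 = [1.0, 0.0], [1.0, 1.0]
psi1, psi2 = dual_basis_2d(phi1, phi2)

# f = 2*phi1 + 3*phi2; the coefficients are recovered as <f, psi_k>.
f = [2.0 * a + 3.0 * b for a, b in zip(phi1, phi2)]
beta = [dot(f, psi1), dot(f, psi2)]
```

Here the \boldsymbol{\varphi}_i are not orthogonal to each other, yet the coefficients are still obtained from single inner products.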

Gram-Schmidt orthogonalization

One technique for getting an orthogonal basis is to use the process of Gram-Schmidt orthogonalization.

The goal is to produce an orthogonal set of vectors \{\boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \dots, \boldsymbol{\varphi}_n\} given a linearly independent set \{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n\}.

We start off by setting \boldsymbol{\varphi}_1 = \mathbf{x}_1. Then \boldsymbol{\varphi}_2 is given by subtracting the projection of \mathbf{x}_2 onto \boldsymbol{\varphi}_1 from \mathbf{x}_2, i.e.,


  \boldsymbol{\varphi}_2 = \mathbf{x}_2 - \cfrac{\langle \mathbf{x}_2, \boldsymbol{\varphi}_1 \rangle}{\lVert\boldsymbol{\varphi}_1\rVert^2}~\boldsymbol{\varphi}_1

Thus \boldsymbol{\varphi}_2 is clearly orthogonal to \boldsymbol{\varphi}_1. For \boldsymbol{\varphi}_3 we use


  \boldsymbol{\varphi}_3 = \mathbf{x}_3 - \cfrac{\langle \mathbf{x}_3, \boldsymbol{\varphi}_1 \rangle}{\lVert\boldsymbol{\varphi}_1\rVert^2}~\boldsymbol{\varphi}_1
  - \cfrac{\langle \mathbf{x}_3, \boldsymbol{\varphi}_2 \rangle}{\lVert\boldsymbol{\varphi}_2\rVert^2}~\boldsymbol{\varphi}_2

More generally,


  \boldsymbol{\varphi}_n = \mathbf{x}_n - \sum_{j=1}^{n-1}
    \cfrac{\langle \mathbf{x}_n, \boldsymbol{\varphi}_j \rangle}{\lVert\boldsymbol{\varphi}_j\rVert^2}~\boldsymbol{\varphi}_j

If you want an orthonormal set then you can do that by normalizing the orthogonal set of vectors.

We can check that the vectors \boldsymbol{\varphi}_j are indeed orthogonal by induction. Assume that the vectors \boldsymbol{\varphi}_j, ~ j \le n-1 are mutually orthogonal for some n. Pick k < n. Then


  \langle \boldsymbol{\varphi}_n, \boldsymbol{\varphi}_k \rangle = \langle \mathbf{x}_n, \boldsymbol{\varphi}_k \rangle - \sum_{j=1}^{n-1}
    \cfrac{\langle \mathbf{x}_n, \boldsymbol{\varphi}_j \rangle}{\lVert\boldsymbol{\varphi}_j\rVert^2}~\langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_k \rangle

Now \langle \boldsymbol{\varphi}_j, \boldsymbol{\varphi}_k \rangle = 0 unless j = k, so only the j = k term of the sum survives. That term equals \langle \mathbf{x}_n, \boldsymbol{\varphi}_k \rangle, which cancels the first term on the right, so \langle \boldsymbol{\varphi}_n, \boldsymbol{\varphi}_k \rangle = 0. Hence the vectors are orthogonal.

Note that you have to be careful while numerically computing an orthogonal basis using the Gram-Schmidt technique because rounding errors accumulate in the terms under the sum, and the computed vectors can gradually lose orthogonality.
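
The recursion above can be sketched directly in code. This is a plain transcription of the classical formula for real vectors under the standard inner product, not a numerically robust implementation; the function names and the input vectors are chosen for illustration.

```python
import math


def dot(x, y):
    """Standard inner product on R^n."""
    return sum(xk * yk for xk, yk in zip(x, y))


def gram_schmidt(vectors):
    """Orthogonalize a linearly independent list of real vectors."""
    basis = []
    for x in vectors:
        phi = list(x)
        for prev in basis:
            # Subtract the projection of the original x onto each earlier phi_j.
            coeff = dot(x, prev) / dot(prev, prev)
            phi = [p - coeff * q for p, q in zip(phi, prev)]
        basis.append(phi)
    return basis


def normalize(vectors):
    """Scale each vector to unit length to get an orthonormal set."""
    return [[c / math.sqrt(dot(v, v)) for c in v] for v in vectors]


xs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
phis = gram_schmidt(xs)
units = normalize(phis)
```

A commonly used variant, usually called modified Gram-Schmidt, computes each coefficient from the partially reduced vector `phi` instead of the original `x`. In exact arithmetic the two agree, but the modified form tends to be less sensitive to rounding.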

Linear operators

The object \boldsymbol{A} is a linear operator from \mathcal{S} into \mathcal{S} if


  \boldsymbol{A}~\mathbf{x} \equiv \boldsymbol{A}(\mathbf{x}) \in \mathcal{S} \quad \forall~\mathbf{x}\in\mathcal{S}

A linear operator satisfies the properties

  1. \boldsymbol{A}~(\alpha~\mathbf{x}) = \alpha~\boldsymbol{A}(\mathbf{x}).
  2. \boldsymbol{A}~(\mathbf{x}+\mathbf{y}) = \boldsymbol{A}(\mathbf{x}) + \boldsymbol{A}(\mathbf{y}).

Note that \boldsymbol{A} is independent of basis. However, the action of \boldsymbol{A} on a basis \{\boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \dots, \boldsymbol{\varphi}_n\} determines \boldsymbol{A} completely since


  \boldsymbol{A}~\mathbf{f} = \boldsymbol{A}~\left(\sum_j \beta_j~\boldsymbol{\varphi}_j\right)
          = \sum_j \beta_j~\boldsymbol{A}(\boldsymbol{\varphi}_j)

Since \boldsymbol{A}~\boldsymbol{\varphi}_j \in \mathcal{S} we can write


  \boldsymbol{A}~\boldsymbol{\varphi}_j = \sum_i A_{ij}~\boldsymbol{\varphi}_i

where A_{ij} are the entries of the n \times n matrix representing the operator \boldsymbol{A} in the basis \{\boldsymbol{\varphi}_1, \boldsymbol{\varphi}_2, \dots, \boldsymbol{\varphi}_n\}.

Note the location of the indices here: the sum runs over the first index of A_{ij}, not over the second index as in ordinary matrix-vector multiplication. For example, in \mathbb{R}^2, we have


  \boldsymbol{A}~\mathbf{e}_2 = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} 
  \begin{bmatrix} 0 \\ 1 \end{bmatrix} = 
  \begin{bmatrix} A_{12} \\ A_{22} \end{bmatrix} = 
  A_{12}~ \begin{bmatrix} 1 \\ 0 \end{bmatrix} +
  A_{22}~ \begin{bmatrix} 0 \\ 1 \end{bmatrix} = A_{12}~\mathbf{e}_1 + A_{22}~\mathbf{e}_2 = \sum_i A_{i2}~\mathbf{e}_i
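
The statement that the action on a basis determines the operator can be illustrated with a short sketch. It works in the standard basis of \mathbb{R}^n, where the components of \boldsymbol{A}~\mathbf{e}_j form column j of the matrix. The operator `shear` and the helper names are hypothetical and exist only for this example.

```python
def matrix_of(operator, n):
    """Build A_ij from A e_j = sum_i A_ij e_i in the standard basis of R^n."""
    columns = []
    for j in range(n):
        e_j = [1.0 if k == j else 0.0 for k in range(n)]
        columns.append(operator(e_j))   # components of A e_j form column j
    # Transpose the list of columns into row-major form A[i][j].
    return [[columns[j][i] for j in range(n)] for i in range(n)]


def apply(A, x):
    """Ordinary matrix-vector product: (A x)_i = sum_j A_ij x_j."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]


def shear(x):
    """A hypothetical linear operator on R^2: (x1, x2) -> (x1 + 2*x2, x2)."""
    return [x[0] + 2.0 * x[1], x[1]]


A = matrix_of(shear, 2)
f = [3.0, -1.0]
same_result = apply(A, f) == shear(f)
```

Only the two images `shear(e_1)` and `shear(e_2)` were used to build `A`, yet the matrix reproduces the operator on an arbitrary vector.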

We will get into more details in the next lecture.
