Linear algebra (Osnabrück 2024-2025)/Part I/Lecture 4

In linear algebra, everything is worked out over a field ${}K$, and the reader might think of the real numbers ${}\mathbb {R}$. But, at the moment, only the algebraic properties of ${}\mathbb {R}$ are relevant, so instead one can think of the rational numbers ${}\mathbb {Q}$. From the theory of eigenvalues onward, more specific properties of the field (like the existence of roots) become important.

The "mother of all systems of linear equations" is just one linear equation in one variable of the form

{}ax=b\,,

with given elements ${}a,b$ from a field ${}K$ and unknown ${}x$. There are three possibilities for the solution behavior. For ${}a\neq 0$, we can multiply the equation by the inverse ${}a^{-1}$ of ${}a$, yielding the unique solution

{}x=ba^{-1}={\frac {b}{a}}\,.

Computationally, one can find the solution, as long as one can find the inverse element and can perform the multiplication in the field. For ${}a=0$ , the solution behavior depends on ${}b$ . If ${}b=0$ , then every ${}x\in K$ is a solution; if ${}b\neq 0$ , then there is no solution.
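The three cases can be sketched in code. The following is a minimal illustration over ${}\mathbb {Q}$, using Python's exact `Fraction` type; the function name `solve_single_equation` and the labels it returns are assumptions made for this sketch, not notation from the lecture.

```python
from fractions import Fraction

def solve_single_equation(a, b):
    """Describe the solution set of a*x = b over the rational numbers."""
    a, b = Fraction(a), Fraction(b)
    if a != 0:
        # a is invertible, so x = b * a^{-1} is the unique solution.
        return ("unique", b / a)
    if b == 0:
        # 0*x = 0 holds for every x in the field.
        return ("all", None)
    # 0*x = b with b != 0 can never hold.
    return ("none", None)

print(solve_single_equation(3, 2))  # ('unique', Fraction(2, 3))
print(solve_single_equation(0, 0))  # ('all', None)
print(solve_single_equation(0, 5))  # ('none', None)
```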

Linear systems

First, we give three further introductory examples: one from everyday life, one from geometry, and one from physics. They all lead to systems of linear equations.

Example

At a booth on the Christmas market, there are three different pots of mulled wine. All three contain the ingredients cinnamon, cloves, red wine, and sugar, but the compositions differ. The mixtures of the mulled wines are

G_{1}={\begin{pmatrix}1\\2\\11\\2\end{pmatrix}},\,G_{2}={\begin{pmatrix}2\\2\\12\\3\end{pmatrix}},\,G_{3}={\begin{pmatrix}3\\1\\20\\7\end{pmatrix}}.

Every mulled wine is represented by a four-tuple, where the entries represent the respective shares of the ingredients. The set of all (possible) mulled wines forms a vector space (we will introduce this concept in the next lecture) and the three concrete mulled wines are vectors in this space.

Now suppose that none of the three mulled wines exactly meets our taste; in fact, the wanted mulled wine has the mixture

{}W={\begin{pmatrix}1\\2\\20\\5\end{pmatrix}}\,.

Is there a possibility to get the wanted mulled wine by pouring together the given mulled wines in some way? Are there numbers^[1] ${}a,b,c\in \mathbb {Q}$ such that

{}a{\begin{pmatrix}1\\2\\11\\2\end{pmatrix}}+b{\begin{pmatrix}2\\2\\12\\3\end{pmatrix}}+c{\begin{pmatrix}3\\1\\20\\7\end{pmatrix}}={\begin{pmatrix}1\\2\\20\\5\end{pmatrix}}\,

holds? This vector equation can be expressed by four equations in the "variables" ${}a,b,c$, where the equations come from the rows. When does there exist a solution, when none, when many? These are typical questions of linear algebra.
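Written out row by row (one equation per ingredient), the vector equation above should read

{\begin{matrix}a+2b+3c&=&1\\2a+2b+c&=&2\\11a+12b+20c&=&20\\2a+3b+7c&=&5\,.\end{matrix}}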

Example

Two planes in space intersecting in a line.

Suppose that two planes are given in ${}\mathbb {R} ^{3}$ ,^[2]

{}E={\left\{(x,y,z)\in \mathbb {R} ^{3}\mid 4x-2y-3z=5\right\}}\,

and

{}F={\left\{(x,y,z)\in \mathbb {R} ^{3}\mid 3x-5y+2z=1\right\}}\,.

How can we describe the intersecting line ${}G=E\cap F$ ? A point ${}P=(x,y,z)$ belongs to the intersection line if and only if it satisfies both plane equations. Therefore, both equations,

4x-2y-3z=5\,\,{\text{  and }}\,\,3x-5y+2z=1,

must hold. We multiply the first equation by ${}3$ , and subtract from that four times the second equation, and get

{}14y-17z=11\,.

If we set ${}y=0$, then ${}z=-{\frac {11}{17}}$ and ${}x={\frac {13}{17}}$ must hold. This means that the point ${}P=\left({\frac {13}{17}},\,0,\,-{\frac {11}{17}}\right)$ belongs to ${}G$. In the same way, setting ${}z=0$, we find the point ${}Q=\left({\frac {23}{14}},\,{\frac {11}{14}},\,0\right)$. Therefore, the intersecting line is the line connecting these points. Using ${}Q-P=\left({\frac {209}{238}},\,{\frac {11}{14}},\,{\frac {11}{17}}\right)$ as direction vector, we get

{}G={\left\{\left({\frac {13}{17}},\,0,\,-{\frac {11}{17}}\right)+t\left({\frac {209}{238}},\,{\frac {11}{14}},\,{\frac {11}{17}}\right)\mid t\in \mathbb {R} \right\}}\,.
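As a sanity check, the computation can be repeated with exact rational arithmetic. The sketch below (the helper names `on_E`, `on_F` and `point_on_line` are chosen for illustration only) verifies that ${}P$ and ${}Q$ satisfy both plane equations, recomputes the direction vector ${}Q-P$, and tests a few arbitrarily chosen parameters ${}t$.

```python
from fractions import Fraction as F

def on_E(p):
    """Plane E: 4x - 2y - 3z = 5."""
    x, y, z = p
    return 4 * x - 2 * y - 3 * z == 5

def on_F(p):
    """Plane F: 3x - 5y + 2z = 1."""
    x, y, z = p
    return 3 * x - 5 * y + 2 * z == 1

P = (F(13, 17), F(0), F(-11, 17))
Q = (F(23, 14), F(11, 14), F(0))

# Direction vector of the line through P and Q.
direction = tuple(q - p for p, q in zip(P, Q))

def point_on_line(t):
    """Return P + t * (Q - P)."""
    return tuple(p + t * d for p, d in zip(P, direction))

print(on_E(P) and on_F(P), on_E(Q) and on_F(Q))  # True True
print(direction == (F(209, 238), F(11, 14), F(11, 17)))  # True

# Sample parameters (arbitrary choices) all give points in both planes.
samples = [F(-2), F(1, 3), F(5)]
print(all(on_E(point_on_line(t)) and on_F(point_on_line(t)) for t in samples))  # True
```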

Example

An electrical network consists of several connected wires, which we call the edges of the network in this context. Every edge ${}K_{j}$ has a certain resistance ${}R_{j}$ (depending on the material and the length of the edge). The points ${}P_{n}$ where the edges meet are called the vertices of the network. If we apply a certain electric tension (voltage) to some edges of the network, then a certain current ${}I_{j}$ flows in every edge. The goal is to determine the currents from the data of the network and the voltages.

It is helpful to assign to each edge a fixed direction in order to distinguish the direction of the current in this edge (if the current is in the opposite direction, it gets a minus sign). We call these directed edges. In every vertex of the network, the currents of the adjacent edges come together; therefore, their sum must be ${}0$ . In an edge ${}K_{j}$ , there is a voltage drop ${}U_{j}$ , determined by Ohm's law to be

{}U_{j}=R_{j}\cdot I_{j}\,.

We call a closed, directed sequence of edges in a network a mesh. For such a mesh, the sum of voltages is ${}0$, unless a certain voltage is enforced from "outside".

We list Kirchhoff's laws again.

In every vertex, the sum of the currents equals ${}0$ .
In every mesh, the sum of the voltages equals ${}0$ .
If in a mesh, a voltage ${}V$ is enforced, then the sum of the voltages equals ${}V$ .

Due to "physical reasons", we expect that, given voltages in every edge, there should be a well-defined current in every edge. In fact, these currents can be computed if we translate the stated laws into a system of linear equations and solve this system.

In the example given by the picture, suppose that the edges ${}K_{1},\ldots ,K_{5}$ (with the resistances ${}R_{1},\ldots ,R_{5}$ ) are directed from left to right and that the connecting edge ${}K_{0}$ from ${}A$ to ${}C$ (where the voltage ${}V$ is applied) is directed upwards. The four vertices and the three meshes ${}(A,D,B),\,(D,B,C)$ and ${}(A,D,C)$ yield the system of linear equations

{\begin{matrix}I_{0}&+I_{1}&&-I_{3}&&&=&0\\&&&I_{3}&+I_{4}&+I_{5}&=&0\\-I_{0}&&+I_{2}&&-I_{4}&&=&0\\&-I_{1}&-I_{2}&&&-I_{5}&=&0\\&R_{1}I_{1}&&+R_{3}I_{3}&&-R_{5}I_{5}&=&0\\&&-R_{2}I_{2}&&-R_{4}I_{4}&+R_{5}I_{5}&=&0\\&-R_{1}I_{1}&+R_{2}I_{2}&&&&=&-V\,.\end{matrix}}

Here the ${}R_{j}$ and ${}V$ are given numbers, and the ${}I_{j}$ are the unknowns we are looking for.

We now give the definition of a homogeneous and of an inhomogeneous system of linear equations over a field for a given set of variables.

Definition

Let ${}K$ denote a field, and let ${}a_{ij}\in K$ for ${}1\leq i\leq m$ and ${}1\leq j\leq n$ . We call

{\begin{matrix}a_{11}x_{1}+a_{12}x_{2}+\cdots +a_{1n}x_{n}&=&0\\a_{21}x_{1}+a_{22}x_{2}+\cdots +a_{2n}x_{n}&=&0\\\vdots &\vdots &\vdots \\a_{m1}x_{1}+a_{m2}x_{2}+\cdots +a_{mn}x_{n}&=&0\end{matrix}}

a (homogeneous) system of linear equations in the variables ${}x_{1},\ldots ,x_{n}$ . A tuple ${}(\xi _{1},\ldots ,\xi _{n})\in K^{n}$ is called a solution of the linear system if ${}\sum _{j=1}^{n}a_{ij}\xi _{j}=0$ holds for all ${}i=1,\ldots ,m$ .

If ${}(c_{1},\ldots ,c_{m})\in K^{m}$ is given,^[3] then

{\begin{matrix}a_{11}x_{1}+a_{12}x_{2}+\cdots +a_{1n}x_{n}&=&c_{1}\\a_{21}x_{1}+a_{22}x_{2}+\cdots +a_{2n}x_{n}&=&c_{2}\\\vdots &\vdots &\vdots \\a_{m1}x_{1}+a_{m2}x_{2}+\cdots +a_{mn}x_{n}&=&c_{m}\end{matrix}}

is called an inhomogeneous system of linear equations. A tuple ${}(\zeta _{1},\ldots ,\zeta _{n})\in K^{n}$ is called a solution to the inhomogeneous linear system if ${}\sum _{j=1}^{n}a_{ij}\zeta _{j}=c_{i}$ holds for all ${}i$.

The set of all solutions of the system is called the solution set. In the homogeneous case, this is also called the solution space, as it is indeed, by Lemma 6.11 , a vector space.

A homogeneous system of linear equations always has the so-called trivial solution ${}0=(0,\ldots ,0)$ . An inhomogeneous system does not necessarily have a solution. For a given inhomogeneous linear system of equations, the homogeneous system that arises when we replace the tuple on the right-hand side by the null vector ${}0$ is called the corresponding homogeneous system.
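The defining condition for a solution can be checked mechanically. The following sketch assumes the coefficients ${}a_{ij}$ are stored as a list of rows; the function name `is_solution` and the small sample system are illustrative assumptions, not taken from the lecture.

```python
from fractions import Fraction as F

def is_solution(A, c, xi):
    """Check that sum_j A[i][j] * xi[j] == c[i] holds for every row i."""
    return all(
        sum(a_ij * x_j for a_ij, x_j in zip(row, xi)) == c_i
        for row, c_i in zip(A, c)
    )

# Sample inhomogeneous system: x1 + 2*x2 = 5, 3*x1 + 4*x2 = 6.
A = [[1, 2], [3, 4]]
c = [5, 6]

print(is_solution(A, c, [F(-4), F(9, 2)]))  # True
print(is_solution(A, c, [0, 0]))            # False: the zero tuple fails here
print(is_solution(A, [0, 0], [0, 0]))       # True: trivial solution of the
                                            # corresponding homogeneous system
```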

The following situation describes a more abstract version of Example 4.1 .

Example

Let ${}K$ denote a field, and ${}m\in \mathbb {N}$ . Suppose that in ${}K^{m}$ , there are ${}n$ vectors (or ${}m$ -tuples)

v_{1}={\begin{pmatrix}a_{11}\\a_{21}\\\vdots \\a_{m1}\end{pmatrix}},\,v_{2}={\begin{pmatrix}a_{12}\\a_{22}\\\vdots \\a_{m2}\end{pmatrix}},\ldots ,v_{n}={\begin{pmatrix}a_{1n}\\a_{2n}\\\vdots \\a_{mn}\end{pmatrix}}

given. Let

{}w={\begin{pmatrix}c_{1}\\c_{2}\\\vdots \\c_{m}\end{pmatrix}}\,

be another vector. We want to know whether ${}w$ can be written as a linear combination of the ${}v_{j}$ . Thus, we are dealing with the question whether there are ${}n$ elements ${}s_{1},\ldots ,s_{n}\in K$ such that

{}s_{1}{\begin{pmatrix}a_{11}\\a_{21}\\\vdots \\a_{m1}\end{pmatrix}}+s_{2}{\begin{pmatrix}a_{12}\\a_{22}\\\vdots \\a_{m2}\end{pmatrix}}+\cdots +s_{n}{\begin{pmatrix}a_{1n}\\a_{2n}\\\vdots \\a_{mn}\end{pmatrix}}={\begin{pmatrix}c_{1}\\c_{2}\\\vdots \\c_{m}\end{pmatrix}}\,

holds. This equality of vectors means identity in every component, so that this condition yields a system of linear equations

{\begin{matrix}a_{11}s_{1}+a_{12}s_{2}+\cdots +a_{1n}s_{n}&=&c_{1}\\a_{21}s_{1}+a_{22}s_{2}+\cdots +a_{2n}s_{n}&=&c_{2}\\\vdots &\vdots &\vdots \\a_{m1}s_{1}+a_{m2}s_{2}+\cdots +a_{mn}s_{n}&=&c_{m}.\end{matrix}}
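To make the componentwise comparison concrete, here is a small sketch using the data of the mulled wine example. The helper `linear_combination` and the candidate scalars are assumptions chosen for illustration; the candidate happens to match three of the four components of ${}W$ but not the third, so it is not a solution of the system.

```python
from fractions import Fraction as F

def linear_combination(scalars, vectors):
    """Return s_1*v_1 + ... + s_n*v_n, computed component by component."""
    m = len(vectors[0])
    return [sum(s * v[i] for s, v in zip(scalars, vectors)) for i in range(m)]

G1, G2, G3 = [1, 2, 11, 2], [2, 2, 12, 3], [3, 1, 20, 7]
W = [1, 2, 20, 5]

# Hypothetical candidate for (a, b, c); not claimed to be a solution.
candidate = [F(19, 7), F(-15, 7), F(6, 7)]
mix = linear_combination(candidate, [G1, G2, G3])

# One comparison per row of the vector equation.
print([m_i == w_i for m_i, w_i in zip(mix, W)])  # [True, True, False, True]
print(mix[2])                                    # 149/7
```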

Remark

It might happen that a system of linear equations is given in such a way that there are variables on both sides of the equations, like in

{}3x-4+5y=8z+7x\,,

{}2-4x+z=2y+3x+6\,,

{}4z-3x+2x+3=5x-11y+2z-8\,.

In this case, one first transforms the system to the standard form by moving all variable terms to the left-hand side and all constants to the right-hand side, and then collecting the coefficients of each variable in each equation.
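Carrying this out for the three equations above (variables ordered as ${}x,y,z$) should give the standard form

{}-4x+5y-8z=4\,,

{}-7x-2y+z=4\,,

{}-6x+11y+2z=-11\,.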

Matrices

A system of linear equations can easily be written with a matrix. This allows us to make the manipulations that lead to the solution of such a system without writing down the variables. Matrices are quite simple objects; however, they can represent quite different mathematical objects (e.g., a family of column vectors, a family of row vectors, a linear mapping, a table of physical interactions, a relation, a linear vector field, etc.), which one has to keep in mind in order to prevent wrong conclusions.

Definition

Let ${}K$ denote a field, and let ${}I$ and ${}J$ denote index sets. An ${}I\times J$ -matrix is a mapping

I\times J\longrightarrow K,(i,j)\longmapsto a_{ij}.

If ${}I=\{1,\ldots ,m\}$ and ${}J=\{1,\ldots ,n\}$ , then we talk about an ${}m\times n$ -matrix. In this case, the matrix is usually written as

{\begin{pmatrix}a_{11}&a_{12}&\ldots &a_{1n}\\a_{21}&a_{22}&\ldots &a_{2n}\\\vdots &\vdots &\ddots &\vdots \\a_{m1}&a_{m2}&\ldots &a_{mn}\end{pmatrix}}.

We will usually restrict ourselves to this last situation.

For every ${}i\in I$ , the family ${}a_{ij}$ , ${}j\in J$ , is called the ${}i$ -th row of the matrix, which is usually written as a row tuple (or row vector)

(a_{i1},a_{i2},\ldots ,a_{in}).

For every ${}j\in J$ , the family ${}a_{ij}$ , ${}i\in I$ , is called the ${}j$ -th column of the matrix, usually written as a column tuple (or column vector)

{\begin{pmatrix}a_{1j}\\a_{2j}\\\vdots \\a_{mj}\end{pmatrix}}.

The elements ${}a_{ij}$ are called the entries of the matrix. For ${}a_{ij}$, the number ${}i$ is called the row index, and ${}j$ is called the column index of the entry. The position of the entry ${}a_{ij}$ is where the ${}i$-th row meets the ${}j$-th column. A matrix with ${}m=n$ is called a square matrix. An ${}m\times 1$-matrix is simply a column tuple (or column vector) of length ${}m$, and a ${}1\times n$-matrix is simply a row tuple (or row vector) of length ${}n$. The set of all matrices with ${}m$ rows and ${}n$ columns (and with entries in ${}K$) is denoted by ${}\operatorname {Mat} _{m\times n}(K)$; in case ${}m=n$ we also write ${}\operatorname {Mat} _{n}(K)$.

Two matrices ${}A,B\in \operatorname {Mat} _{m\times n}(K)$ are added by adding corresponding entries. The multiplication of a matrix ${}A$ with an element ${}r\in K$ (a scalar) is also defined entrywise, so

{\begin{pmatrix}a_{11}&a_{12}&\ldots &a_{1n}\\a_{21}&a_{22}&\ldots &a_{2n}\\\vdots &\vdots &\ddots &\vdots \\a_{m1}&a_{m2}&\ldots &a_{mn}\end{pmatrix}}+{\begin{pmatrix}b_{11}&b_{12}&\ldots &b_{1n}\\b_{21}&b_{22}&\ldots &b_{2n}\\\vdots &\vdots &\ddots &\vdots \\b_{m1}&b_{m2}&\ldots &b_{mn}\end{pmatrix}}={\begin{pmatrix}a_{11}+b_{11}&a_{12}+b_{12}&\ldots &a_{1n}+b_{1n}\\a_{21}+b_{21}&a_{22}+b_{22}&\ldots &a_{2n}+b_{2n}\\\vdots &\vdots &\ddots &\vdots \\a_{m1}+b_{m1}&a_{m2}+b_{m2}&\ldots &a_{mn}+b_{mn}\end{pmatrix}}\,

and

{}r{\begin{pmatrix}a_{11}&a_{12}&\ldots &a_{1n}\\a_{21}&a_{22}&\ldots &a_{2n}\\\vdots &\vdots &\ddots &\vdots \\a_{m1}&a_{m2}&\ldots &a_{mn}\end{pmatrix}}={\begin{pmatrix}ra_{11}&ra_{12}&\ldots &ra_{1n}\\ra_{21}&ra_{22}&\ldots &ra_{2n}\\\vdots &\vdots &\ddots &\vdots \\ra_{m1}&ra_{m2}&\ldots &ra_{mn}\end{pmatrix}}\,.
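Both entrywise operations are straightforward to express in code. The sketch below represents a matrix as a list of rows; the names `mat_add` and `scalar_mul` and the sample matrices are assumptions made for this illustration.

```python
def mat_add(A, B):
    """Add two matrices of the same shape entry by entry."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same shape")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(r, A):
    """Multiply every entry of A by the scalar r."""
    return [[r * a for a in row] for row in A]

A = [[1, 2, 3], [4, 5, 6]]
B = [[0, -1, 2], [7, 0, -3]]

print(mat_add(A, B))     # [[1, 1, 5], [11, 5, 3]]
print(scalar_mul(2, A))  # [[2, 4, 6], [8, 10, 12]]
```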

The multiplication of matrices is defined in the following way:

Definition

Let ${}K$ denote a field, and let ${}A$ denote an ${}m\times n$ -matrix and ${}B$ an ${}n\times p$ -matrix over ${}K$ . Then the matrix product

AB

is the ${}m\times p$ -matrix, whose entries are given by

{}c_{ik}=\sum _{j=1}^{n}a_{ij}b_{jk}\,.

A matrix multiplication is only possible when the number of columns of the left-hand matrix equals the number of rows of the right-hand matrix. Just think of the scheme

{}{\begin{pmatrix}R&O&W&R&O&W\end{pmatrix}}{\begin{pmatrix}C\\O\\L\\U\\M\\N\end{pmatrix}}={\begin{pmatrix}RC+O^{2}+WL+RU+OM+WN\end{pmatrix}}\,;

the result is a ${}1\times 1$-matrix. In particular, one can multiply an ${}m\times n$-matrix ${}A$ with a column vector of length ${}n$ (the vector on the right), and the result is a column vector of length ${}m$. The two matrices can also be multiplied with roles interchanged,

{}{\begin{pmatrix}C\\O\\L\\U\\M\\N\end{pmatrix}}{\begin{pmatrix}R&O&W&R&O&W\end{pmatrix}}={\begin{pmatrix}CR&CO&CW&CR&CO&CW\\OR&O^{2}&OW&OR&O^{2}&OW\\LR&LO&LW&LR&LO&LW\\UR&UO&UW&UR&UO&UW\\MR&MO&MW&MR&MO&MW\\NR&NO&NW&NR&NO&NW\end{pmatrix}}\,.
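The defining formula ${}c_{ik}=\sum _{j=1}^{n}a_{ij}b_{jk}$ translates almost literally into code. The following is a minimal sketch with matrices stored as lists of rows; the name `mat_mul` and the numerical row and column are assumptions chosen to mirror the row-times-column scheme on a smaller scale.

```python
def mat_mul(A, B):
    """Product of an m x n matrix A and an n x p matrix B."""
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    # c_ik = sum over j of a_ij * b_jk
    return [
        [sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
        for i in range(m)
    ]

row = [[1, 2, 3]]        # a 1 x 3 matrix
col = [[4], [5], [6]]    # a 3 x 1 matrix

print(mat_mul(row, col))  # [[32]]
print(mat_mul(col, row))  # [[4, 8, 12], [5, 10, 15], [6, 12, 18]]
```

The two products have different shapes (${}1\times 1$ versus ${}3\times 3$), which illustrates that the order of the factors matters.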

Definition

The ${}n\times n$ -matrix

{}E_{n}:={\begin{pmatrix}1&0&\cdots &\cdots &0\\0&1&0&\cdots &0\\\vdots &\ddots &\ddots &\ddots &\vdots \\0&\cdots &0&1&0\\0&\cdots &\cdots &0&1\end{pmatrix}}\,

is called the identity matrix.

The identity matrix ${}E_{n}$ has the property ${}E_{n}M=M=ME_{n}$ , for an arbitrary ${}n\times n$ -matrix ${}M$ . Hence, the identity matrix is the neutral element with respect to matrix multiplication.
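The neutral-element property can be checked on a concrete case. The sketch below is only an illustration for one sample matrix, not a proof; the names `identity` and `mat_mul` and the matrix `M` are assumptions made for this example.

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    """Matrix product via c_ik = sum_j a_ij * b_jk."""
    n, p = len(B), len(B[0])
    return [[sum(row[j] * B[j][k] for j in range(n)) for k in range(p)] for row in A]

M = [[2, -1, 0], [5, 3, 7], [0, 4, 1]]
E3 = identity(3)

print(mat_mul(E3, M) == M)  # True
print(mat_mul(M, E3) == M)  # True
```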

Remark

If we multiply an ${}m\times n$ -matrix ${}A=(a_{ij})_{ij}$ with a column vector ${}x={\begin{pmatrix}x_{1}\\x_{2}\\\vdots \\x_{n}\end{pmatrix}}$ , then we get

{}Ax={\begin{pmatrix}a_{11}&a_{12}&\ldots &a_{1n}\\a_{21}&a_{22}&\ldots &a_{2n}\\\vdots &\vdots &\ddots &\vdots \\a_{m1}&a_{m2}&\ldots &a_{mn}\end{pmatrix}}{\begin{pmatrix}x_{1}\\x_{2}\\\vdots \\x_{n}\end{pmatrix}}={\begin{pmatrix}a_{11}x_{1}+a_{12}x_{2}+\cdots +a_{1n}x_{n}\\a_{21}x_{1}+a_{22}x_{2}+\cdots +a_{2n}x_{n}\\\vdots \\a_{m1}x_{1}+a_{m2}x_{2}+\cdots +a_{mn}x_{n}\end{pmatrix}}\,.

Hence, an inhomogeneous system of linear equations with disturbance vector ${}{\begin{pmatrix}c_{1}\\c_{2}\\\vdots \\c_{m}\end{pmatrix}}$ can be written briefly as

{}Ax=c\,.

Then the manipulations on the equations that do not change the solution set can be replaced by corresponding manipulations on the rows of the matrix. It is not necessary to write down the variables.
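For instance, the two plane equations ${}4x-2y-3z=5$ and ${}3x-5y+2z=1$ from the example on intersecting planes form an inhomogeneous system with ${}m=2$ and ${}n=3$, which in this notation reads

{}{\begin{pmatrix}4&-2&-3\\3&-5&2\end{pmatrix}}{\begin{pmatrix}x\\y\\z\end{pmatrix}}={\begin{pmatrix}5\\1\end{pmatrix}}\,.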

Definition

An ${}n\times n$ -matrix of the form

{\begin{pmatrix}d_{11}&0&\cdots &\cdots &0\\0&d_{22}&0&\cdots &0\\\vdots &\ddots &\ddots &\ddots &\vdots \\0&\cdots &0&d_{n-1\,n-1}&0\\0&\cdots &\cdots &0&d_{nn}\end{pmatrix}}

is called a diagonal matrix.

Definition

Let ${}K$ be a field, and let ${}M=(a_{ij})_{ij}$ be an ${}m\times n$ -matrix over ${}K$ . Then the ${}n\times m$ -matrix

{M^{\text{tr}}}={\left(b_{ij}\right)}_{ij}{\text{ with }}b_{ij}:=a_{ji}

is called the transposed matrix of ${}M$.

The transposed matrix arises by interchanging the roles of the rows and the columns. For example, we have

{}{{\begin{pmatrix}t&n&o&d\\r&s&s&x\\a&p&e&y\end{pmatrix}}^{\text{tr}}}={\begin{pmatrix}t&r&a\\n&s&p\\o&s&e\\d&x&y\end{pmatrix}}\,.
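The letter example can be reproduced with a short sketch. The function name `transpose` is an assumption for this illustration; it relies on Python's built-in `zip` to turn columns into rows.

```python
def transpose(M):
    """Return the transposed matrix: entry (i, j) of the result is M[j][i]."""
    return [list(column) for column in zip(*M)]

M = [list("tnod"), list("rssx"), list("apey")]  # a 3 x 4 matrix of letters
Mt = transpose(M)                               # a 4 x 3 matrix

print(["".join(row) for row in Mt])  # ['tra', 'nsp', 'ose', 'dxy']
print(transpose(Mt) == M)            # True
```

Transposing twice gives back the original matrix, since the roles of rows and columns are interchanged twice.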

Footnotes

[1] In this example, only positive numbers have a practical interpretation. In linear algebra, everything is over a field, so we also allow negative numbers.
[2] Right here, we do not discuss that such equations define a plane. The solution sets are "shifted linear subspaces of dimension two".
[3] Such a vector is sometimes called a disturbance vector of the system.
