Linear algebra (Osnabrück 2024-2025)/Part I/Lecture 5

Solving systems of linear equations

It is not clear a priori what solving a (linear) system of equations is supposed to mean. In any case, the goal is to find a good description of the solution set. If there is only one solution, then we want to find it. If there is no solution at all, we want to detect this in a reasonable way. In general, the solution set of a system of equations is large. In this case, solving the system means identifying "free" variables, for which arbitrary values are allowed, and describing explicitly how the other "dependent" variables can be expressed in terms of the free variables. This is called an explicit description of the solution set.

Linear systems of equations can be solved systematically with the elimination process. With this method, variables are eliminated step by step, until a very simple equivalent linear system in triangular form arises, which can be solved directly (or from which we can deduce that there is no solution). We consider a typical example in many variables.

Example

We want to solve the inhomogeneous linear system

{\begin{matrix}2x&+5y&+2z&&-v&=&3\\3x&-4y&&+u&+2v&=&1\\4x&&-2z&+2u&&=&7\,\end{matrix}}

over ${}\mathbb {R}$ (or over ${}\mathbb {Q}$ ). Firstly, we eliminate ${}x$ by keeping the first row ${}I$ , replacing the second row ${}II$ by ${}II-{\frac {3}{2}}I$ , and replacing the third row ${}III$ by ${}III-2I$ . This yields

{\begin{matrix}2x&+5y&+2z&&-v&=&3\\&-{\frac {23}{2}}y&-3z&+u&+{\frac {7}{2}}v&=&{\frac {-7}{2}}\\&-10y&-6z&+2u&+2v&=&1\,.\end{matrix}}

Now we could eliminate ${}y$ from the (new) third row with the help of the second row. To avoid fractions, we eliminate ${}z$ instead (which also eliminates ${}u$ ). We leave the first and the second row as they are, and we replace the third row ${}III$ by ${}III-2II$ . This yields the system, in a new ordering of the variables,^[1]

{\begin{matrix}2x&+2z&&+5y&-v&=&3\\&-3z&+u&-{\frac {23}{2}}y&+{\frac {7}{2}}v&=&{\frac {-7}{2}}\\&&&13y&-5v&=&8\,.\end{matrix}}

Now we can choose an arbitrary (free) value for ${}v$ . The third row then determines ${}y$ uniquely; we must have

{}y={\frac {8}{13}}+{\frac {5}{13}}v\,.

In the second equation, we can choose ${}u$ arbitrarily; this determines ${}z$ via

{}{\begin{aligned}z&=-{\frac {1}{3}}{\left(-{\frac {7}{2}}-u-{\frac {7}{2}}v+{\frac {23}{2}}{\left({\frac {8}{13}}+{\frac {5}{13}}v\right)}\right)}\\&=-{\frac {1}{3}}{\left(-{\frac {7}{2}}-u-{\frac {7}{2}}v+{\frac {92}{13}}+{\frac {115}{26}}v\right)}\\&=-{\frac {1}{3}}{\left({\frac {93}{26}}-u+{\frac {12}{13}}v\right)}\\&=-{\frac {31}{26}}+{\frac {1}{3}}u-{\frac {4}{13}}v.\end{aligned}}

The first row determines ${}x$ , namely

{}{\begin{aligned}x&={\frac {1}{2}}{\left(3-2z-5y+v\right)}\\&={\frac {1}{2}}{\left(3-2{\left(-{\frac {31}{26}}+{\frac {1}{3}}u-{\frac {4}{13}}v\right)}-5{\left({\frac {8}{13}}+{\frac {5}{13}}v\right)}+v\right)}\\&={\frac {1}{2}}{\left({\frac {30}{13}}-{\frac {2}{3}}u-{\frac {4}{13}}v\right)}\\&={\frac {15}{13}}-{\frac {1}{3}}u-{\frac {2}{13}}v.\end{aligned}}

Hence, the solution set is

{\left\{{\left({\frac {15}{13}}-{\frac {1}{3}}u-{\frac {2}{13}}v,{\frac {8}{13}}+{\frac {5}{13}}v,-{\frac {31}{26}}+{\frac {1}{3}}u-{\frac {4}{13}}v,u,v\right)}\mid u,v\in \mathbb {R} \right\}}.

A particularly simple solution is obtained by setting the free variables ${}u$ and ${}v$ equal to ${}0$ . This yields the special solution

{}(x,y,z,u,v)=\left({\frac {15}{13}},\,{\frac {8}{13}},\,-{\frac {31}{26}},\,0,\,0\right)\,.

The general solution set can also be written as

{\left\{{\left({\frac {15}{13}},{\frac {8}{13}},-{\frac {31}{26}},0,0\right)}+u{\left(-{\frac {1}{3}},0,{\frac {1}{3}},1,0\right)}+v{\left(-{\frac {2}{13}},{\frac {5}{13}},-{\frac {4}{13}},0,1\right)}\mid u,v\in \mathbb {R} \right\}}.

Here,

{\left\{u{\left(-{\frac {1}{3}},0,{\frac {1}{3}},1,0\right)}+v{\left(-{\frac {2}{13}},{\frac {5}{13}},-{\frac {4}{13}},0,1\right)}\mid u,v\in \mathbb {R} \right\}}

is a description of the general solution of the corresponding homogeneous linear system.
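The explicit description found in this example can be checked by substituting it back into the original three equations. The following Python sketch does this with exact rational arithmetic for a few values of the free variables; the names `solution` and `satisfies` are chosen here for illustration and are not part of the lecture.

```python
from fractions import Fraction as F

# Coefficient rows (order x, y, z, u, v) and right-hand sides of the example.
A = [
    [2, 5, 2, 0, -1],
    [3, -4, 0, 1, 2],
    [4, 0, -2, 2, 0],
]
c = [3, 1, 7]


def solution(u, v):
    """Return the solution (x, y, z, u, v) belonging to the free values u, v."""
    u, v = F(u), F(v)
    x = F(15, 13) - F(1, 3) * u - F(2, 13) * v
    y = F(8, 13) + F(5, 13) * v
    z = F(-31, 26) + F(1, 3) * u - F(4, 13) * v
    return (x, y, z, u, v)


def satisfies(point):
    """Check all three equations of the example exactly."""
    return all(
        sum(a * p for a, p in zip(row, point)) == rhs
        for row, rhs in zip(A, c)
    )


# Try a small grid of values for the free variables u and v.
checks = [satisfies(solution(u, v)) for u in range(-2, 3) for v in range(-2, 3)]
print(all(checks))  # prints True
```

Such a check only confirms that the listed tuples are solutions; that there are no further solutions follows from the equivalence of the systems obtained during the elimination.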

Definition

Let ${}K$ denote a field, and let two (inhomogeneous) systems of linear equations with respect to the same set of variables be given. The systems are called equivalent if their solution sets are identical.

Lemma

Let ${}K$ be a field, and let

{\begin{matrix}a_{11}x_{1}+a_{12}x_{2}+\cdots +a_{1n}x_{n}&=&c_{1}\\a_{21}x_{1}+a_{22}x_{2}+\cdots +a_{2n}x_{n}&=&c_{2}\\\vdots &\vdots &\vdots \\a_{m1}x_{1}+a_{m2}x_{2}+\cdots +a_{mn}x_{n}&=&c_{m}\end{matrix}}

be an inhomogeneous system of linear equations over ${}K$ . Then the following manipulations on this system yield an equivalent system.

(1) Swapping two equations.
(2) The multiplication of an equation by a scalar ${}s\neq 0$ .
(3) The omitting of an equation, if it occurs twice.
(4) The duplication of an equation (in the sense of writing down the equation again).
(5) The omitting or adding of a zero row (zero equation).
(6) The replacement of an equation ${}H$ by the equation that arises if we add to ${}H$ another equation ${}G$ of the system.

Proof

Most statements are immediately clear. (2) follows from the fact that if

{}\sum _{i=1}^{n}a_{i}\xi _{i}=c\,

holds, then also

{}\sum _{i=1}^{n}(sa_{i})\xi _{i}=sc\,

holds for every ${}s\in K$ . If ${}s\neq 0$ , then this implication can be reversed by multiplication with ${}s^{-1}$ .

(6). Let ${}G$ be the equation

{}\sum _{i=1}^{n}a_{i}x_{i}=c\,,

and ${}H$ be the equation

{}\sum _{i=1}^{n}b_{i}x_{i}=d\,.

If a tuple ${}(\xi _{1},\ldots ,\xi _{n})\in K^{n}$ satisfies both equations, then it also satisfies the equation ${}H'=G+H$ . And if the tuple satisfies the equations ${}G$ and ${}H'$ , then it also satisfies the equations ${}G$ and ${}H=H'-G$ .

\Box

For finding the solution set of a linear system, the manipulations (2) and (6) are the most important ones. In general, these two steps are combined, and the equation ${}H$ is replaced by an equation of the form ${}H+\lambda G$ (with ${}G\neq H$ ). Here, ${}\lambda \in K$ has to be chosen in such a way that the new equation contains one variable less than the old equation. This process is called elimination of a variable. This elimination is applied not only to one equation, but to all equations except one (suitably chosen) "working row" ${}G$ , and with a fixed "working variable". The following elimination lemma describes this step.

Lemma

Let ${}K$ denote a field, and let ${}S$ denote an (inhomogeneous) system of linear equations over ${}K$ in the variables ${}x_{1},\ldots ,x_{n}$ . Suppose that ${}x$ is a variable which occurs in at least one equation ${}G$ with a coefficient ${}a\neq 0$ . Then every equation ${}H$ , different from ${}G$ ,^[2] can be replaced by an equation ${}H'$ , in which ${}x$ does not occur any more, and such that the new system of equations ${}S'$ that consists of ${}G$ and the equations ${}H'$ is equivalent to the system ${}S$ .

Proof

Changing the numbering, we may assume ${}x=x_{1}$ . Let ${}G$ be the equation

{}ax_{1}+\sum _{i=2}^{n}a_{i}x_{i}=b\,

(with ${}a\neq 0$ ), and let ${}H$ be the equation

{}cx_{1}+\sum _{i=2}^{n}c_{i}x_{i}=d\,.

Then the equation

{}H'=H-{\frac {c}{a}}G\,

has the form

{}\sum _{i=2}^{n}{\left(c_{i}-{\frac {c}{a}}a_{i}\right)}x_{i}=d-{\frac {c}{a}}b\,,

and ${}x_{1}$ does not occur in it. Because of ${}H=H'+{\frac {c}{a}}G$ , the systems are equivalent.

\Box
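The step used in this proof, replacing ${}H$ by ${}H-{\frac {c}{a}}G$ , can be sketched in Python. This is only a minimal sketch: the function name `eliminate` and the representation of an equation as a pair (list of coefficients, right-hand side) are assumptions made here for illustration.

```python
from fractions import Fraction as F


def eliminate(G, H, k):
    """Return H' = H - (c/a) G, where a and c are the coefficients of the
    variable with index k in G and H. An equation is a pair (coeffs, rhs)."""
    (g, b), (h, d) = G, H
    a, c = F(g[k]), F(h[k])
    if a == 0:
        raise ValueError("the working variable must occur in G")
    factor = c / a
    coeffs = [F(hi) - factor * F(gi) for hi, gi in zip(h, g)]
    return coeffs, F(d) - factor * F(b)


# First two rows of the example from the beginning of this lecture,
# with the variables in the order x, y, z, u, v.
row_I = ([2, 5, 2, 0, -1], 3)
row_II = ([3, -4, 0, 1, 2], 1)

# Eliminate x (index 0) from row II, using row I as working row.
new_II = eliminate(row_I, row_II, 0)
print(new_II)
```

The result has the coefficients ${}0,-{\frac {23}{2}},-3,1,{\frac {7}{2}}$ and the right-hand side ${}-{\frac {7}{2}}$ , which agrees with the second row obtained by hand in the example.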

The method of this lemma, called the Gaussian elimination process, can be applied successively in order to obtain a linear system in triangular form.

Theorem

Every (inhomogeneous) system of linear equations over a field ${}K$ can be transformed, by the manipulations described in Lemma 5.3 , to an equivalent linear system of the form

{\begin{matrix}b_{1s_{1}}x_{s_{1}}&+b_{1s_{1}+1}x_{s_{1}+1}&\ldots &\ldots &\ldots &\ldots &\ldots &+b_{1n}x_{n}&=&d_{1}\\0&\ldots &0&b_{2s_{2}}x_{s_{2}}&\ldots &\ldots &\ldots &+b_{2n}x_{n}&=&d_{2}\\\vdots &\ddots &\ddots &\vdots &\vdots &\vdots &\vdots &\vdots &=&\vdots \\0&\ldots &\ldots &\ldots &0&b_{m{s_{m}}}x_{s_{m}}&\ldots &+b_{mn}x_{n}&=&d_{m}\\(0&\ldots &\ldots &\ldots &\ldots &\ldots &\ldots &0&=&d_{m+1}),\end{matrix}}

where, in each row, the first coefficient ${}b_{1s_{1}},b_{2s_{2}},\ldots ,b_{ms_{m}}$ is different from ${}0$ . Here, either ${}d_{m+1}=0$ , and the last row can be omitted, or ${}d_{m+1}\neq 0$ , and then the system has no solution at all.

With the help of renaming the variables, we get an equivalent system of the form

{\begin{matrix}c_{11}y_{1}&+c_{12}y_{2}&\ldots &+c_{1m}y_{m}&+c_{1m+1}y_{m+1}&\ldots &+c_{1n}y_{n}&=&d_{1}\\0&c_{22}y_{2}&\ldots &\ldots &\ldots &\ldots &+c_{2n}y_{n}&=&d_{2}\\\vdots &\ddots &\ddots &\vdots &\vdots &\vdots &\vdots &=&\vdots \\0&\ldots &0&c_{mm}y_{m}&+c_{mm+1}y_{m+1}&\ldots &+c_{mn}y_{n}&=&d_{m}\\(0&\ldots &\ldots &0&0&\ldots &0&=&d_{m+1})\end{matrix}}

with diagonal elements ${}c_{ii}\neq 0$ .

Proof

This follows directly from the elimination lemma, by successively eliminating variables. Elimination is first applied to the first variable (in the given ordering), say ${}x_{s_{1}}$ , which occurs in at least one equation with a coefficient ${}\neq 0$ (if it only occurs in one equation, then this elimination step is already done). This elimination process is repeated as long as the new subsystem (without the working equation used in the elimination step before) has at least one equation with a coefficient for one variable different from ${}0$ . After this, we have in the end only equations without variables, and they are either only zero equations, or there is no solution.

When we set ${}y_{1}=x_{s_{1}},y_{2}=x_{s_{2}},\ldots ,y_{m}=x_{s_{m}}$ , and denote the other variables with ${}y_{m+1},\ldots ,y_{n}$ , then we obtain the described system in triangular form.

\Box

It might happen that the variable ${}x_{1}$ does not appear in the system with a coefficient ${}\neq 0$ , and that, in the elimination process, more than one variable is eliminated at the same time. Then one gets a linear system in echelon form, which can be transformed to a triangular form by a change of variables.
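The successive elimination described in the proof can be sketched in Python as follows. This is a minimal sketch under assumptions: the function name `echelon`, the representation of a system as a list of rows of coefficients followed by the right-hand side, and the use of exact `Fraction` arithmetic are choices made here, not part of the lecture. The variables are always processed in the given order.

```python
from fractions import Fraction as F


def echelon(rows):
    """Bring a system into echelon form, using only row swaps and the
    replacement H -> H - (c/a) G. Each row lists the coefficients of
    x_1, ..., x_n followed by the right-hand side."""
    rows = [[F(x) for x in row] for row in rows]
    n = len(rows[0]) - 1      # number of variables
    r = 0                     # index of the next working row
    pivots = []               # indices s_1 < s_2 < ... of the working variables
    for k in range(n):
        # Look for an equation (not yet used as working row) containing x_k.
        p = next((i for i in range(r, len(rows)) if rows[i][k] != 0), None)
        if p is None:
            continue          # x_k does not occur, try the next variable
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][k] / rows[r][k]
            rows[i] = [h - factor * g for h, g in zip(rows[i], rows[r])]
        pivots.append(k)
        r += 1
    return rows, pivots


# The example from the beginning of this lecture (order x, y, z, u, v).
example = [
    [2, 5, 2, 0, -1, 3],
    [3, -4, 0, 1, 2, 1],
    [4, 0, -2, 2, 0, 7],
]
rows, pivots = echelon(example)
print(pivots)  # prints [0, 1, 2]

# A system without solution: x + y = 1 and 2x + 2y = 3.
rows2, pivots2 = echelon([[1, 1, 1], [2, 2, 3]])
print(rows2[1] == [0, 0, 1])  # prints True
```

Applied to the example, the sketch uses ${}x,y,z$ as working variables, so that ${}u$ and ${}v$ remain free; the intermediate fractions differ from the hand computation, which eliminated ${}z$ instead of ${}y$ in the second step. In the second system, the last row reads ${}0=1$ , which corresponds to the case ${}d_{m+1}\neq 0$ of the theorem.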

Remark

A linear system can be written briefly as

{}Ax=c\,

with an ${}m\times n$ -matrix ${}A$ and an ${}m$ -tuple ${}c$ . The manipulations on the equations that we perform in the elimination procedure can be carried out directly on the matrix, or on the extended matrix that arises when we extend ${}A$ by the column ${}c$ . Essentially, we replace a row by the sum of this row and a multiple of another row. This has the advantage that we do not have to write down the variables. However, one should then not swap the variables. At the end, the resulting matrix in echelon form can be interpreted again as a linear system.
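As an illustration, the first elimination step of the example from the beginning of this lecture can be carried out on the extended matrix alone. The columns correspond to ${}x,y,z,u,v$ in this fixed order; the vertical bar separating the column ${}c$ is a common notational convention that is assumed here and is not used elsewhere in this lecture.

```latex
\left(\begin{array}{ccccc|c}
2 & 5 & 2 & 0 & -1 & 3\\
3 & -4 & 0 & 1 & 2 & 1\\
4 & 0 & -2 & 2 & 0 & 7
\end{array}\right)
\;\xrightarrow{\;II-\frac{3}{2}I,\;\; III-2I\;}\;
\left(\begin{array}{ccccc|c}
2 & 5 & 2 & 0 & -1 & 3\\
0 & -\frac{23}{2} & -3 & 1 & \frac{7}{2} & -\frac{7}{2}\\
0 & -10 & -6 & 2 & 2 & 1
\end{array}\right)
```

Reading the rows of the second matrix back as equations gives exactly the system obtained after the first step of the example.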

Remark

Sometimes, we want to solve a simultaneous system of linear equations of the form

{\begin{matrix}a_{11}x_{1}+a_{12}x_{2}+\cdots +a_{1n}x_{n}&=&c_{1}&(=&d_{1},&=&e_{1},\ldots )\\a_{21}x_{1}+a_{22}x_{2}+\cdots +a_{2n}x_{n}&=&c_{2}&(=&d_{2},&=&e_{2},\ldots )\\\vdots &\vdots &\vdots \\a_{m1}x_{1}+a_{m2}x_{2}+\cdots +a_{mn}x_{n}&=&c_{m}&(=&d_{m},&=&e_{m},\ldots )\,.\end{matrix}}

The goal is to find the solutions of the corresponding inhomogeneous linear systems for different vectors. In principle, we could consider independent linear systems, and solve them. However, it is smarter to perform those manipulations that we do on the left-hand side to achieve upper triangular form, simultaneously with all the vectors on the right-hand side. An important special case, for ${}n=m$ , is when the vectors are the standard vectors, see Method 12.11 .
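The simultaneous treatment of several right-hand sides can be sketched in Python. The sketch below is an illustration under assumptions: the name `solve_many` is hypothetical, the matrix is assumed to be square ( ${}n=m$ ) with a unique solution for every right-hand side, and exact `Fraction` arithmetic is used.

```python
from fractions import Fraction as F


def solve_many(A, rhs_columns):
    """Solve A x = c for several right-hand sides c at once.
    Assumption: A is square and every system has exactly one solution."""
    n = len(A)
    # Extended matrix: the rows of A, followed by all right-hand sides.
    M = [[F(a) for a in A[i]] + [F(col[i]) for col in rhs_columns]
         for i in range(n)]
    # Elimination on the left-hand side, carried along on all right-hand sides.
    for k in range(n):
        p = next(i for i in range(k, n) if M[i][k] != 0)
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            M[i] = [h - factor * g for h, g in zip(M[i], M[k])]
    # Solve the triangular system from bottom to top, once per right-hand side.
    solutions = []
    for j in range(len(rhs_columns)):
        x = [F(0)] * n
        for i in reversed(range(n)):
            s = sum(M[i][t] * x[t] for t in range(i + 1, n))
            x[i] = (M[i][n + j] - s) / M[i][i]
        solutions.append(x)
    return solutions


# Illustrative 2x2 system with the two standard vectors as right-hand sides.
A = [[2, 1], [5, 3]]
e1, e2 = [1, 0], [0, 1]
result = solve_many(A, [e1, e2])
print(result == [[3, -5], [-1, 2]])  # prints True
```

The elimination on the left-hand side is performed only once, no matter how many right-hand sides are given.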

We discuss briefly some further methods to solve a linear system.

Remark

Another method to solve a linear system is the substitution method. Here, the variables are also successively eliminated, but in another way. If we want to eliminate the variable ${}x_{1}$ , then we look at an equation, say ${}G_{1}$ , where ${}x_{1}$ occurs with a coefficient different from ${}0$ . In this equation, we isolate ${}x_{1}$ and get a new equation of the form

G_{1}':\,\,\,x_{1}=F_{1},

where ${}x_{1}$ does not occur in ${}F_{1}$ . Then, in all other equations ${}G_{2},\ldots ,G_{m}$ , we replace the variable ${}x_{1}$ by ${}F_{1}$ , and obtain (after some simplifications) a linear system ${}G_{2}',\ldots ,G_{m}'$ without the variable ${}x_{1}$ , which is, together with ${}G_{1}'$ , equivalent to the original system.
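A small worked example may help. The system ${}x_{1}+2x_{2}=5$ , ${}3x_{1}-x_{2}=1$ is chosen here for illustration only and does not come from the lecture.

```latex
\begin{aligned}
G_{1}&:\; x_{1}+2x_{2}=5, \qquad G_{2}:\; 3x_{1}-x_{2}=1\\
G_{1}'&:\; x_{1}=5-2x_{2} \quad (\text{so } F_{1}=5-2x_{2})\\
G_{2}'&:\; 3(5-2x_{2})-x_{2}=1 \;\Longleftrightarrow\; 15-7x_{2}=1 \;\Longleftrightarrow\; x_{2}=2\\
G_{1}'&:\; x_{1}=5-2\cdot 2=1
\end{aligned}
```

The only solution is therefore ${}(x_{1},x_{2})=(1,2)$ , which can be confirmed by plugging it into ${}G_{1}$ and ${}G_{2}$ .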

Remark

Another method to solve a linear system is the equating method. Here, the variables are also successively eliminated, but in another way. In this method, in every equation ${}G_{i}$ , ${}i=1,\ldots ,m$ , we isolate one fixed variable, say ${}x_{1}$ . Suppose that (after reordering) ${}G_{1},\ldots ,G_{k}$ are the equations where the variable ${}x_{1}$ occurs with a coefficient different from ${}0$ . These equations are brought into the form

G_{i}':\,\,\,x_{1}=F_{i},

where in ${}F_{i}$ , the variable ${}x_{1}$ does not occur. The linear system consisting of

G_{1}',F_{1}=F_{2},F_{1}=F_{3},\ldots ,F_{1}=F_{k},G_{k+1},\ldots ,G_{m}

is equivalent to the original system. We continue with this system without ${}G_{1}'$ .
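As a small worked example (chosen here for illustration, not taken from the lecture), consider the system ${}x_{1}+2x_{2}=5$ , ${}3x_{1}-x_{2}=1$ . Both equations contain ${}x_{1}$ , so ${}k=2$ .

```latex
\begin{aligned}
G_{1}'&:\; x_{1}=5-2x_{2}=F_{1}\\
G_{2}'&:\; x_{1}=\tfrac{1}{3}(1+x_{2})=F_{2}\\
F_{1}=F_{2}&:\; 5-2x_{2}=\tfrac{1}{3}(1+x_{2}) \;\Longleftrightarrow\; 15-6x_{2}=1+x_{2} \;\Longleftrightarrow\; x_{2}=2\\
G_{1}'&:\; x_{1}=5-2\cdot 2=1
\end{aligned}
```

The equation ${}F_{1}=F_{2}$ no longer contains ${}x_{1}$ ; after solving it, ${}G_{1}'$ yields the value of ${}x_{1}$ .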

Remark

The methods described in Theorem 5.5 , Remark 5.8 , and Remark 5.9 to solve a linear system differ with respect to speed, strategic conception, complexity of the coefficients, and error-proneness. In the elimination method, the systematic reduction of the number of variables (reduction of dimension) is obvious, and mistakes are unlikely (except for miscalculations). It is always clear how to continue. However, these advantages only emerge for three or more variables. For two variables, it does not make a difference which method we choose.

The evaluation of the methods also depends on the features of the concrete system. Such features should be taken into account in order to find "short-cuts" to the solution. The choice of a solution method appropriate for the given problem is called adaptivity (a concept which is used in the didactic context with different meanings). If, for example, one row of the system has the form ${}x=3$ , then one should recognize that a part of the solution can be read off immediately, and one should not add any other row to this row. Instead, one should replace ${}x$ by ${}3$ everywhere in the other rows, and then continue. Or: if four equations are given, where in two equations only the variables ${}x$ and ${}y$ appear, and where in the two other equations only the variables ${}z$ and ${}w$ appear, then one should realize that, in principle, two unrelated linear systems are given, each in two variables, and these should be solved independently. Or: it might be that a small subsystem of the system guarantees that there is no solution at all. Then only this has to be worked out; there is no need to consider the other equations. And: consider the exact question! If the question is whether a certain tuple is a solution, then we only have to plug this tuple into the equations; no manipulations are necessary.

Remark

A system of linear inequalities over the rational numbers or over the real numbers is a system of the form

{\begin{matrix}a_{11}x_{1}+a_{12}x_{2}+\cdots +a_{1n}x_{n}&\star &c_{1}\\a_{21}x_{1}+a_{22}x_{2}+\cdots +a_{2n}x_{n}&\star &c_{2}\\\vdots &\vdots &\vdots \\a_{m1}x_{1}+a_{m2}x_{2}+\cdots +a_{mn}x_{n}&\star &c_{m}\,,\end{matrix}}

where ${}\star$ might be ${}\leq$ or ${}\geq$ . It is considerably more difficult to find the solution set of such a system than in the case of equations. In general, it is not possible to eliminate the variables.

Linear system in triangular form

Theorem

Let an inhomogeneous system of linear equations in triangular form

{\begin{matrix}a_{11}x_{1}&+a_{12}x_{2}&\ldots &+a_{1m}x_{m}&\ldots &+a_{1n}x_{n}&=&c_{1}\\0&a_{22}x_{2}&\ldots &\ldots &\ldots &+a_{2n}x_{n}&=&c_{2}\\\vdots &\ddots &\ddots &\vdots &\vdots &\vdots &=&\vdots \\0&\ldots &0&a_{mm}x_{m}&\ldots &+a_{mn}x_{n}&=&c_{m}\\\end{matrix}}

with ${}m\leq n$ over a field ${}K$ be given, where the diagonal elements are all different from ${}0$ . Then the solutions ${}(x_{1},\ldots ,x_{m},x_{m+1},\ldots ,x_{n})$ are in bijection with the tuples ${}(x_{m+1},\ldots ,x_{n})\in K^{n-m}$ .

The ${}n-m$ entries ${}x_{m+1},\ldots ,x_{n}$ can be chosen arbitrarily, and they determine a unique solution, and every solution is of this form.

Proof

This is clear: once the tuple ${}(x_{m+1},\ldots ,x_{n})$ is given, the rows successively determine the other variables, from bottom to top.

\Box

In case ${}m=n$ , there are no free variables, ${}K^{0}=0$ , and the linear system has exactly one solution.
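The procedure in the proof, determining the variables from bottom to top, can be sketched in Python. The function name `back_substitute` and the list-based representation are assumptions made here for illustration. As input we use the triangular system from the example at the beginning of this lecture, with the variables ordered as ${}x,z,y,u,v$ , so that the diagonal entries ${}2,-3,13$ are different from ${}0$ .

```python
from fractions import Fraction as F


def back_substitute(A, c, free):
    """Solve a triangular system with m rows and n >= m variables.
    `free` contains the chosen values for x_{m+1}, ..., x_n."""
    m, n = len(A), len(A[0])
    if len(free) != n - m:
        raise ValueError("exactly n - m free values are needed")
    x = [F(0)] * m + [F(t) for t in free]
    for i in reversed(range(m)):              # from bottom to top
        s = sum(F(A[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = (F(c[i]) - s) / F(A[i][i])
    return x


# Triangular system of the example, variable order x, z, y, u, v.
A = [
    [2, 2, 5, 0, -1],
    [0, -3, F(-23, 2), 1, F(7, 2)],
    [0, 0, 13, 0, -5],
]
c = [3, F(-7, 2), 8]

special = back_substitute(A, c, [0, 0])       # free values u = 0, v = 0
print(special == [F(15, 13), F(-31, 26), F(8, 13), 0, 0])  # prints True
```

Every choice of the two free values yields exactly one solution, in accordance with the theorem; the choice ${}u=v=0$ reproduces the special solution of the example (in the order ${}x,z,y,u,v$ ).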

The superposition principle for linear systems

Theorem

Let ${}M={\left(a_{ij}\right)}_{1\leq i\leq m,1\leq j\leq n}$ denote a matrix over a field ${}K$ . Let ${}c={\left(c_{1},\ldots ,c_{m}\right)}$ and ${}d={\left(d_{1},\ldots ,d_{m}\right)}$ denote two ${}m$ -tuples, and let ${}y={\left(y_{1},\ldots ,y_{n}\right)}\in K^{n}$ be a solution of the linear system

{}Mx=c\,,

and ${}z={\left(z_{1},\ldots ,z_{n}\right)}\in K^{n}$ a solution of the system

{}Mx=d\,.

Then ${}y+z={\left(y_{1}+z_{1},\ldots ,y_{n}+z_{n}\right)}$ is a solution of the system

{}Mx=c+d\,.

Proof

See Exercise 5.19 .

\Box
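The statement can be checked numerically in a concrete case. The following Python sketch is only an illustration and does not replace the proof; the matrix is the coefficient matrix of the example from the beginning of this lecture, while the tuples `y` and `z` and the helper name `lhs` are chosen here arbitrarily.

```python
def lhs(M, x):
    """Evaluate the left-hand sides of the system M x = ... at the tuple x."""
    return [sum(a * t for a, t in zip(row, x)) for row in M]


M = [
    [2, 5, 2, 0, -1],
    [3, -4, 0, 1, 2],
    [4, 0, -2, 2, 0],
]
y = [1, 0, 2, -1, 3]
z = [0, 1, 1, 1, 1]

c = lhs(M, y)   # by construction, y is a solution of M x = c
d = lhs(M, z)   # by construction, z is a solution of M x = d

y_plus_z = [a + b for a, b in zip(y, z)]
c_plus_d = [a + b for a, b in zip(c, d)]

print(lhs(M, y_plus_z) == c_plus_d)  # prints True
```

Here ${}c=(3,8,-2)$ and ${}d=(6,-1,0)$ , and ${}y+z$ indeed solves the system with right-hand side ${}c+d=(9,7,-2)$ .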

Corollary

Let ${}K$ be a field, and let

{\begin{matrix}a_{11}x_{1}+a_{12}x_{2}+\cdots +a_{1n}x_{n}&=&c_{1}\\a_{21}x_{1}+a_{22}x_{2}+\cdots +a_{2n}x_{n}&=&c_{2}\\\vdots &\vdots &\vdots \\a_{m1}x_{1}+a_{m2}x_{2}+\cdots +a_{mn}x_{n}&=&c_{m}\end{matrix}}

be an inhomogeneous linear system over ${}K$ , and let

{\begin{matrix}a_{11}x_{1}+a_{12}x_{2}+\cdots +a_{1n}x_{n}&=&0\\a_{21}x_{1}+a_{22}x_{2}+\cdots +a_{2n}x_{n}&=&0\\\vdots &\vdots &\vdots \\a_{m1}x_{1}+a_{m2}x_{2}+\cdots +a_{mn}x_{n}&=&0\end{matrix}}

be the corresponding homogeneous linear system. If

{}{\left(y_{1},\ldots ,y_{n}\right)}

is a solution of the inhomogeneous system and if

{}{\left(z_{1},\ldots ,z_{n}\right)}

is a solution of the homogeneous system, then

{}{\left(y_{1}+z_{1},\ldots ,y_{n}+z_{n}\right)}

is a solution of the inhomogeneous system.

Proof

This follows immediately from Theorem 5.13 .

\Box

In particular, this means that when ${}L$ is the solution space of a homogeneous linear system, and when ${}y$ is one (particular) solution of an inhomogeneous linear system, then the mapping

L\longrightarrow L',z\longmapsto y+z,

gives a bijection between ${}L$ and the solution set ${}L'$ of the inhomogeneous linear system.

Footnotes

[1] Such a reordering is safe as long as we keep the names of the variables. But if we write down the system in matrix notation without the variables, then one has to be careful and remember the reordering of the columns.
[2] It is enough that these equations have a different index in the system.

