# Nonlinear finite elements/Newton method for finite elements

## Newton's method for nonlinear finite elements

The finite element system of equations is of the form

${\displaystyle K(u)u=f~.}$

In order to write it in the form ${\displaystyle f(x)=0}$, we define the residual

${\displaystyle r(u):=K(u)u-f\qquad \implies \qquad r(u)=0~.}$

Then, the Newton iteration formula becomes

${\displaystyle {u_{r+1}=u_{r}-{\cfrac {r(u_{r})}{\cfrac {dr(u_{r})}{du}}}=u_{r}-T^{-1}(u_{r})~r(u_{r})~.}}$

The slope of the tangent (the tangent stiffness) is

{\displaystyle {\begin{aligned}T(u_{r}):={\cfrac {dr(u_{r})}{du}}&={\cfrac {d}{du}}[K(u)u-f]\ \\&=K(u)+{\cfrac {dK(u)}{du}}u\ \\&=K(u_{r})+{\cfrac {dK(u_{r})}{du}}u_{r}\end{aligned}}}

Therefore the tangent stiffness is

${\displaystyle {T(u_{r}):=K(u_{r})+{\cfrac {dK(u_{r})}{du}}u_{r}~.}}$

So far we have considered Newton's method only for single equations. What changes have to be made for a system of equations such as the ones encountered in FEM?

### Newton's method for a system of equations

For a system of equations, all we have to do is a straightforward extension of the method as it applies to one dimension. In this case, it's easier to think in terms of matrices instead of individual coefficients of the matrices. Thus, we have

${\displaystyle \mathbf {K} (\mathbf {u} )~\mathbf {u} =\mathbf {f} ~.}$

The residual is defined as

${\displaystyle \mathbf {r} (\mathbf {u} ):=\mathbf {K} (\mathbf {u} )~\mathbf {u} -\mathbf {f} ~.}$

Then, the Newton iteration formula is

${\displaystyle {\mathbf {u} _{r+1}=\mathbf {u} _{r}-\mathbf {T} ^{-1}(\mathbf {u} _{r})~\mathbf {r} (\mathbf {u} _{r})}}$

where the tangent stiffness matrix is given by

${\displaystyle {\text{(7)}}\qquad {\mathbf {T} (\mathbf {u} _{r}):={\frac {\partial \mathbf {r} (\mathbf {u} _{r})}{\partial \mathbf {u} _{r}}}=\mathbf {K} (\mathbf {u} _{r})+{\frac {\partial \mathbf {K} (\mathbf {u} _{r})}{\partial \mathbf {u} }}~\mathbf {u} _{r}~.}}$

The iterative procedure is terminated when either the residual is very small or the difference between successive solutions is less than a specified tolerance.

However, both the residual ${\displaystyle \mathbf {r} }$ and the solution ${\displaystyle \mathbf {u} }$ are vectors. We usually compare the ${\displaystyle L_{2}}$ (Euclidean) norm of the vectors with a tolerance ${\displaystyle \epsilon }$. In symbolic form, we check the norm of the residual using

${\displaystyle \lVert \mathbf {r} \rVert _{2}\leq \epsilon \qquad {\text{or}}\qquad {\sqrt {\mathbf {r} \cdot \mathbf {r} }}\leq \epsilon \qquad {\text{or}}\qquad {\sqrt {\sum _{i=1}^{n}r_{i}^{2}}}\leq \epsilon ~.}$

For the difference between successive solutions we check

${\displaystyle {\cfrac {\lVert \mathbf {u} _{r+1}-\mathbf {u} _{r}\rVert _{2}}{\lVert \mathbf {u} _{r+1}\rVert _{2}}}\leq \epsilon \qquad {\text{or}}\qquad {\cfrac {\sqrt {(\mathbf {u} _{r+1}-\mathbf {u} _{r})\cdot (\mathbf {u} _{r+1}-\mathbf {u} _{r})}}{\sqrt {\mathbf {u} _{r+1}\cdot \mathbf {u} _{r+1}}}}\leq \epsilon \qquad {\text{or}}\qquad {\cfrac {\sqrt {\sum _{i=1}^{n}(u_{i}^{r+1}-u_{i}^{r})^{2}}}{\sqrt {\sum _{i=1}^{n}(u_{i}^{r+1})^{2}}}}\leq \epsilon ~.}$

#### How do we calculate the derivative of ${\displaystyle \mathbf {K} }$ with respect to ${\displaystyle \mathbf {u} }$?

In equation (7) we have a term that involves a partial derivative of matrix ${\displaystyle \mathbf {K} }$ with respect to the vector ${\displaystyle \mathbf {u} }$. To see what that means, let us look at the component form of the tangent stiffness matrix.

The tangent stiffness matrix is defined as

${\displaystyle \mathbf {T} :={\frac {\partial \mathbf {r} (\mathbf {u} _{r})}{\partial \mathbf {u} _{r}}}~.}$

Let us consider a ${\displaystyle 3\times 3}$ stiffness matrix and see what the above equation means. The residual is

${\displaystyle {\begin{bmatrix}r_{1}\\r_{2}\\r_{3}\end{bmatrix}}={\begin{bmatrix}K_{11}&K_{12}&K_{13}\\K_{21}&K_{22}&K_{23}\\K_{31}&K_{32}&K_{33}\end{bmatrix}}{\begin{bmatrix}u_{1}\\u_{2}\\u_{3}\end{bmatrix}}-{\begin{bmatrix}f_{1}\\f_{2}\\f_{3}\end{bmatrix}}~.}$

Expanding the above matrix equation out we get

{\displaystyle {\begin{aligned}r_{1}&=K_{11}u_{1}+K_{12}u_{2}+K_{13}u_{3}-f_{1}\\r_{2}&=K_{21}u_{1}+K_{22}u_{2}+K_{23}u_{3}-f_{2}\\r_{3}&=K_{31}u_{1}+K_{32}u_{2}+K_{33}u_{3}-f_{3}\end{aligned}}}

Taking derivatives with respect to ${\displaystyle u_{1}}$, we get

{\displaystyle {\begin{aligned}{\frac {\partial r_{1}}{\partial u_{1}}}&=\left({\frac {\partial K_{11}}{\partial u_{1}}}u_{1}+K_{11}\right)+{\frac {\partial K_{12}}{\partial u_{1}}}u_{2}+{\frac {\partial K_{13}}{\partial u_{1}}}u_{3}\\{\frac {\partial r_{2}}{\partial u_{1}}}&=\left({\frac {\partial K_{21}}{\partial u_{1}}}u_{1}+K_{21}\right)+{\frac {\partial K_{22}}{\partial u_{1}}}u_{2}+{\frac {\partial K_{23}}{\partial u_{1}}}u_{3}\\{\frac {\partial r_{3}}{\partial u_{1}}}&=\left({\frac {\partial K_{31}}{\partial u_{1}}}u_{1}+K_{31}\right)+{\frac {\partial K_{32}}{\partial u_{1}}}u_{2}+{\frac {\partial K_{33}}{\partial u_{1}}}u_{3}\end{aligned}}}

In shorter form, we can write

${\displaystyle {\frac {\partial r_{i}}{\partial u_{1}}}=K_{i1}+{\frac {\partial K_{i1}}{\partial u_{1}}}u_{1}+{\frac {\partial K_{i2}}{\partial u_{1}}}u_{2}+{\frac {\partial K_{i3}}{\partial u_{1}}}u_{3}=K_{i1}+\sum _{m=1}^{3}{\frac {\partial K_{im}}{\partial u_{1}}}u_{m}~.}$

Similarly, taking derivatives with respect to ${\displaystyle u_{2}}$, we get

{\displaystyle {\begin{aligned}{\frac {\partial r_{1}}{\partial u_{2}}}&={\frac {\partial K_{11}}{\partial u_{2}}}u_{1}+\left({\frac {\partial K_{12}}{\partial u_{2}}}u_{2}+K_{12}\right)+{\frac {\partial K_{13}}{\partial u_{2}}}u_{3}\\{\frac {\partial r_{2}}{\partial u_{2}}}&={\frac {\partial K_{21}}{\partial u_{2}}}u_{1}+\left({\frac {\partial K_{22}}{\partial u_{2}}}u_{2}+K_{22}\right)+{\frac {\partial K_{23}}{\partial u_{2}}}u_{3}\\{\frac {\partial r_{3}}{\partial u_{2}}}&={\frac {\partial K_{31}}{\partial u_{2}}}u_{1}+\left({\frac {\partial K_{32}}{\partial u_{2}}}u_{2}+K_{32}\right)+{\frac {\partial K_{33}}{\partial u_{2}}}u_{3}\end{aligned}}}

The short form of the above is

${\displaystyle {\frac {\partial r_{i}}{\partial u_{2}}}=K_{i2}+\sum _{m=1}^{3}{\frac {\partial K_{im}}{\partial u_{2}}}u_{m}~.}$

Finally, taking derivatives with respect to ${\displaystyle u_{3}}$ gives us

{\displaystyle {\begin{aligned}{\frac {\partial r_{1}}{\partial u_{3}}}&={\frac {\partial K_{11}}{\partial u_{3}}}u_{1}+{\frac {\partial K_{12}}{\partial u_{3}}}u_{2}+\left({\frac {\partial K_{13}}{\partial u_{3}}}u_{3}+K_{13}\right)\\{\frac {\partial r_{2}}{\partial u_{3}}}&={\frac {\partial K_{21}}{\partial u_{3}}}u_{1}+{\frac {\partial K_{22}}{\partial u_{3}}}u_{2}+\left({\frac {\partial K_{23}}{\partial u_{3}}}u_{3}+K_{23}\right)\\{\frac {\partial r_{3}}{\partial u_{3}}}&={\frac {\partial K_{31}}{\partial u_{3}}}u_{1}+{\frac {\partial K_{32}}{\partial u_{3}}}u_{2}+\left({\frac {\partial K_{33}}{\partial u_{3}}}u_{3}+K_{33}\right)\end{aligned}}}

The short form is

${\displaystyle {\frac {\partial r_{i}}{\partial u_{3}}}=K_{i3}+\sum _{m=1}^{3}{\frac {\partial K_{im}}{\partial u_{3}}}u_{m}~.}$

Combining the three short forms of the 9 equations, we get

${\displaystyle {{\frac {\partial r_{i}}{\partial u_{j}}}=K_{ij}+\sum _{m=1}^{3}{\frac {\partial K_{im}}{\partial u_{j}}}u_{m}~.}}$

Each of these terms represents one component ${\displaystyle T_{ij}}$ of the matrix ${\displaystyle \mathbf {T} }$, and for a ${\displaystyle n\times n}$ stiffness matrix we get

${\displaystyle {T_{ij}:=K_{ij}+\sum _{m=1}^{n}{\frac {\partial K_{im}}{\partial u_{j}}}u_{m}~.}}$

In matrix notation, the above equation is written as

${\displaystyle {\mathbf {T} :=\mathbf {K} +{\frac {\partial \mathbf {K} }{\partial \mathbf {u} }}~\mathbf {u} ~.}}$

This shows us how to compute the derivative of ${\displaystyle \mathbf {K} }$ with respect to ${\displaystyle \mathbf {u} }$ and hence ${\displaystyle \mathbf {T} }$.

Note that we can form the tangent stiffness matrix over an element and assemble the contributions from each element to get the global tangent stiffness matrix. The reasons are the same as those for the standard global stiffness matrix.