# Least-Squares Method

## Learning project summary

- **Project code:** Least-Squares Method
- **School:** Computer Science
- **Department:** Scientific Computing

## Content summary

A brief introduction to the least-squares method and its statistical meaning.

## Goals

This learning project offers learning activities and some applications of the least-squares method. After working through it, one should understand the purpose of the least-squares method and what it means. Moreover, one should be able to apply some simple least-squares methods to find a good approximating function for a set of data. For a more mathematical explanation, see the page "Least squares".

## Learning materials

### Texts

[1] __Numerical Mathematics and Computing__, Chapter 12.1

[2] __Numerical Methods for Engineers: With Software and Programming Applications__, Chapter 17.3

[3] __Statistics for Management and Economics__, Chapter 17.1

[4] T. Strutz: __Data Fitting and Uncertainty. A practical introduction to weighted least squares and beyond.__ 2nd edition, Springer Vieweg, 2016, ISBN 978-3-658-11455-8.

### Lessons

#### Lesson 1: *Introduction to Least-Squares Method*

The goal of the least-squares method is to estimate the parameters of a function, $f(x)$, so that it fits a set of data, $(x_1, y_1), \dots, (x_m, y_m)$. The method requires that the estimated function deviate as little as possible from the data in the sense of the 2-norm. Broadly speaking, least-squares methods fall into two categories, linear and non-linear. They can be classified further into ordinary least squares (OLS), weighted least squares (WLS), alternating least squares (ALS), and partial least squares (PLS).

To fit a set of data best, the least-squares method minimizes the sum of squared residuals (also called the sum of squared errors, SSE),

$$S = \sum_{i=1}^{m} r_i^2,$$

where $r_i$ is the residual, the difference between an actual data point and the regression line, defined as

$$r_i = y_i - f(x_i),$$

where the $m$ data pairs are $(x_i, y_i)$ and the model function is $f(x_i)$.

Here, we can choose $n$ different parameters for $f(x)$ so that the approximating function fits the data set as well as possible.

For example, in the accompanying graph, the residual of the third point is $r_3 = y_3 - f(x_3)$ (labeled R3 = Y3 - f(X3) in the graph), and $S = \sum_i r_i^2$, the sum of the squared lengths of the red lines, is what we want to minimize.
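The quantity being minimized can be computed directly from its definition. The following is a minimal Python sketch; the function name `sum_squared_residuals`, the sample data, and the candidate model are all illustrative assumptions and do not come from the lesson.

```python
def sum_squared_residuals(f, xs, ys):
    """Return S = sum of r_i**2, where r_i = y_i - f(x_i)."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data and candidate model, chosen only for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 8.0]


def model(x):
    # Candidate straight line f(x) = 2x + 1.
    return 2.0 * x + 1.0


S = sum_squared_residuals(model, xs, ys)
print(S)  # only the last point is off the line, by 1, so S is 1.0
```

A least-squares fit searches over the model's parameters for the values that make this number as small as possible.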

#### Lesson 2: *Linear Least-Squares Method*

The Linear Least-Squares (LLS) Method assumes that the data set falls on a straight line. Therefore, $f(x) = ax + b$, where $a$ and $b$ are constants. However, because of experimental error, some data points might not lie exactly on the line, so there is an error (residual) between the estimated function and the real data. The Linear Least-Squares Method (or approximation) defines the best-fit function as the one that minimizes

$$S = \sum_{i=1}^{n} \left[ y_i - (a x_i + b) \right]^2,$$

where $n$ is the number of data points $(x_i, y_i)$.

The advantages of LLS:

1. If we assume that the errors are independent and normally distributed with zero mean and equal variance, then minimizing S gives the maximum-likelihood estimates of a and b.

2. We can easily use calculus to determine the approximate values of a and b.

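To minimize S, the following conditions must be satisfied: $\frac{\partial S}{\partial a} = 0$ and $\frac{\partial S}{\partial b} = 0$.

Taking the partial derivatives, we obtain $\sum_{i=1}^{n} 2\left[(a x_i + b) - y_i\right] x_i = 0$ and $\sum_{i=1}^{n} 2\left[(a x_i + b) - y_i\right] = 0$.

This system consists of two simultaneous linear equations in the two unknowns a and b. (These two equations are the so-called normal equations.)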

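By rearranging the sums, we find that

$$a = \frac{1}{d}\left[ n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i \right]$$

and

$$b = \frac{1}{d}\left[ \sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i \right],$$

where

$$d = n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2.$$

Thus, the best estimated function for the data set $(x_i, y_i)$, where $i$ is an integer in $[1, n]$, is

$y = ax + b$, with $a$ and $b$ given by the two formulas above.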
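The closed-form solution of the two normal equations translates almost line by line into code. The sketch below assumes the standard formulas $a = (n\sum x_i y_i - \sum x_i \sum y_i)/d$, $b = (\sum x_i^2 \sum y_i - \sum x_i \sum x_i y_i)/d$, with $d = n\sum x_i^2 - (\sum x_i)^2$; the function name `linear_fit` and the sample points are illustrative, not part of the lesson.

```python
def linear_fit(xs, ys):
    """Return (a, b) of the least-squares line y = a*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # d is zero when all x values coincide; no unique line exists then.
    d = n * sum_xx - sum_x ** 2
    if d == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    a = (n * sum_xy - sum_x * sum_y) / d       # slope
    b = (sum_xx * sum_y - sum_x * sum_xy) / d  # intercept
    return a, b


# Hypothetical points lying exactly on y = 2x + 1.
a, b = linear_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)
```

Because the sample points lie exactly on a line, the fit recovers that line's slope and intercept; with noisy data the same function returns the line that minimizes S.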

#### Lesson 3: *Linear Least-Squares Method in matrix form*

We can also represent the estimated linear function with the following model: $y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_m x_m + r$.

It can also be written in matrix form: $\{Y\} = [X]\{A\} + \{R\}$, where $[X]$ is a matrix of coefficients derived from the data set (it need not be square, since its shape depends on the number of variables, $m$, and the number of data points, $n$); the vector $\{Y\}$ contains the values of the dependent variable, $\{Y\}^T = [y_1 \; y_2 \; \cdots \; y_n]$; the vector $\{A\}$ contains the unknown coefficients that we would like to solve for, $\{A\}^T = [a_0 \; a_1 \; \cdots \; a_m]$; and the vector $\{R\}$ contains the residuals, $\{R\}^T = [r_1 \; r_2 \; \cdots \; r_n]$.

To minimize $S = \{R\}^T\{R\} = \sum_{i=1}^{n} r_i^2$, we follow the same method as in Lesson 2, taking the partial derivative with respect to each coefficient and setting it equal to zero. The result is a system of normal equations, which can be written in the following matrix form: $[X]^T[X]\{A\} = [X]^T\{Y\}$.

To solve the system, we have many options, such as the LU method, the Cholesky method, the inverse matrix, and Gauss-Seidel. (In general, the resulting matrix is not diagonally dominant, so the Gauss-Seidel method may converge slowly and is not recommended.)
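As one illustration of these options, the sketch below forms the normal-equation system $[X]^T[X]\{A\} = [X]^T\{Y\}$ and solves it with a Cholesky factorization, which applies because $[X]^T[X]$ is symmetric positive definite whenever the columns of $[X]$ are linearly independent. The function names, the two-variable model, and the data are illustrative assumptions; the code uses plain Python lists rather than a matrix library.

```python
import math


def normal_equations(X, Y):
    """Form [X]^T[X] and [X]^T{Y} from a list-of-rows matrix X and a list Y."""
    cols = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(cols)]
           for i in range(cols)]
    XtY = [sum(row[i] * y for row, y in zip(X, Y)) for i in range(cols)]
    return XtX, XtY


def cholesky_solve(M, v):
    """Solve M a = v for symmetric positive-definite M using M = L L^T."""
    k = len(M)
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][p] * L[j][p] for p in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    # Forward substitution: L z = v.
    z = [0.0] * k
    for i in range(k):
        z[i] = (v[i] - sum(L[i][p] * z[p] for p in range(i))) / L[i][i]
    # Back substitution: L^T a = z.
    a = [0.0] * k
    for i in reversed(range(k)):
        a[i] = (z[i] - sum(L[p][i] * a[p] for p in range(i + 1, k))) / L[i][i]
    return a


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2 with no noise.
# Each row of X is [1, x1, x2]; the leading 1 carries the constant term.
X = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 2, 1]]
Y = [1, 3, 4, 6, 8]

coeffs = cholesky_solve(*normal_equations(X, Y))
print(coeffs)  # approximately [1, 2, 3], up to rounding
```

Forming $[X]^T[X]$ explicitly squares the condition number of the problem, so for badly scaled data a library routine based on QR or SVD is usually the safer choice.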


#### Lesson 4: *Least-Squares Method in statistical view*

From the equation $[X]^T[X]\{A\} = [X]^T\{Y\}$, we can derive the following equation: $\{A\} = \left([X]^T[X]\right)^{-1}[X]^T\{Y\}$.

From this equation, we can determine not only the coefficients but also statistical estimates of their accuracy.

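Using calculus, the following formulas for the coefficients of the straight line $y = ax + b$ can be obtained:

$$a = \frac{s_{xy}}{s_x^2}$$

and

$$b = \bar{y} - a\bar{x},$$

where

$$s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}, \quad s_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}, \quad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \quad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i.$$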

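Moreover, the diagonal and off-diagonal entries of the matrix $\left([X]^T[X]\right)^{-1}$ determine the variances and covariances of the coefficients $a_i$, respectively.

Let $x^{-1}_{i,j}$ denote the entry in row $i$ and column $j$ of $\left([X]^T[X]\right)^{-1}$, and let $a_{i-1}$ be the coefficient corresponding to the diagonal entry $x^{-1}_{i,i}$; then

$$\operatorname{var}(a_{i-1}) = x^{-1}_{i,i}\, s_{y/x}^2$$

and

$$\operatorname{cov}(a_{i-1}, a_{j-1}) = x^{-1}_{i,j}\, s_{y/x}^2,$$

where $s_{y/x}$ is called the standard error of the estimate, and $s_{y/x} = \sqrt{\dfrac{S}{n - (m + 1)}}$, with $S$ the sum of squared residuals, $n$ the number of data points, and $m + 1$ the number of coefficients.

(Here the subscript y/x indicates that the error is for a predicted value of y corresponding to a particular value of x.)

These two quantities have many applications. For example, we can derive upper and lower bounds for the intercept and slope.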
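For a straight line the inverse of $[X]^T[X]$ can be written out by hand, which makes the variance formulas easy to try. The sketch below assumes that $[X]$ has the columns $[1, x]$, so the coefficient order is (intercept, slope), and that the standard error uses $n - 2$ degrees of freedom (two fitted coefficients). The function name, the dictionary keys, and the data are illustrative assumptions.

```python
import math


def line_fit_with_errors(xs, ys):
    """Fit y = slope*x + intercept and estimate the coefficient variances."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Determinant of X^T X = [[n, sum_x], [sum_x, sum_xx]].
    d = n * sum_xx - sum_x * sum_x
    slope = (n * sum_xy - sum_x * sum_y) / d
    intercept = (sum_xx * sum_y - sum_x * sum_xy) / d

    # Squared standard error of the estimate: S / (n - 2) for a line.
    S = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    s2 = S / (n - 2)

    # (X^T X)^(-1) = (1/d) * [[sum_xx, -sum_x], [-sum_x, n]].
    return {
        "slope": slope,
        "intercept": intercept,
        "s2": s2,
        "var_intercept": s2 * sum_xx / d,
        "var_slope": s2 * n / d,
        "cov": -s2 * sum_x / d,
    }


# Hypothetical noisy measurements scattered around a straight line.
result = line_fit_with_errors([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(result["slope"], math.sqrt(result["var_slope"]))
```

The square roots of the two variances are the standard errors of the intercept and slope; multiplying them by a suitable t-value gives the upper and lower bounds mentioned above.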

### Assignments

To better understand how the least-squares method is applied, the first question is solved by applying the LLS equations, and the second is to be solved with a Matlab program.

#### Question 1: Linear Least-Squares Example

The following 8 data points show the relationship between the number of fishermen and the amount of fish (in thousands of pounds) they can catch in a day.

Number of Fishermen | Fish Caught (thousand pounds) |
---|---|
18 | 39 |
14 | 9 |
9 | 9 |
10 | 7 |
5 | 8 |
22 | 35 |
14 | 36 |
12 | 22 |

According to this data set, what function relates the number of fishermen to the amount of fish caught? *Hint: let the number of fishermen be x and the amount of fish caught be y, and use LLS to find the coefficients.*

##### Calculation

By simple calculation and basic statistics, we find:

- $\bar{x}$ = 13
- $\bar{y}$ = 20.625, and
- the following chart

$x$ | $y$ | $x - \bar{x}$ | $y - \bar{y}$ | $(x - \bar{x})(y - \bar{y})$ | $(x - \bar{x})^2$ |
---|---|---|---|---|---|
18 | 39 | 5 | 18.375 | 91.875 | 25 |
14 | 9 | 1 | -11.625 | -11.625 | 1 |
9 | 9 | -4 | -11.625 | 46.5 | 16 |
10 | 7 | -3 | -13.625 | 40.875 | 9 |
5 | 8 | -8 | -12.625 | 101 | 64 |
22 | 35 | 9 | 14.375 | 129.375 | 81 |
14 | 36 | 1 | 15.375 | 15.375 | 1 |
12 | 22 | -1 | 1.375 | -1.375 | 1 |

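Thus, we have $\sum (x_i - \bar{x})(y_i - \bar{y}) = 412$ and $\sum (x_i - \bar{x})^2 = 198$, so the slope is $a = 412 / 198 \approx 2.0808$.

Finally, the intercept is $b = \bar{y} - a\bar{x} = 20.625 - 13a \approx -6.4255$.

Therefore, the linear least-squares line is $y = 2.0808x - 6.4255$.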
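The hand calculation can be checked with a few lines of Python. This sketch uses the eight data points from the question and computes the slope as the ratio of the two deviation sums; the variable names are illustrative.

```python
# Data from Question 1: number of fishermen (x) and fish caught (y).
fishermen = [18, 14, 9, 10, 5, 22, 14, 12]
fish = [39, 9, 9, 7, 8, 35, 36, 22]

n = len(fishermen)
x_bar = sum(fishermen) / n
y_bar = sum(fish) / n

# Sums of the last two columns of the chart.
sum_dxdy = sum((x - x_bar) * (y - y_bar) for x, y in zip(fishermen, fish))
sum_dx2 = sum((x - x_bar) ** 2 for x in fishermen)

a = sum_dxdy / sum_dx2   # slope
b = y_bar - a * x_bar    # intercept

print(x_bar, y_bar)          # 13.0 20.625
print(sum_dxdy, sum_dx2)     # 412.0 198.0
print(f"{a:.4f} {b:.4f}")    # 2.0808 -6.4255
```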

#### Question 2: Nonpolynomial example

We have the following data , where , by a function of the form .

x | 0.23 | 0.66 | 0.93 | 1.25 | 1.75 | 2.03 | 2.24 | 2.57 | 2.87 | 2.98 |
---|---|---|---|---|---|---|---|---|---|---|

y | 0.25 | 0.28 | 0.13 | 0.26 | 0.58 | 1.03 |

Write a Matlab program that uses the least-squares method to obtain the estimated function. *Hint: enter the data in matrix form, and solve the system to obtain the coefficients.*

In the resulting plot, the blue points are the data and the green points are the estimated nonpolynomial function.
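The matrix approach suggested in the hint works for any model that is linear in its coefficients, even when the basis functions themselves are not polynomials. The sketch below is written in Python rather than Matlab and is not the assignment's solution: the basis functions ($\ln x$, $\cos x$, $e^x$), the coefficients, and the synthetic data are all assumptions chosen for illustration, and `fit_basis` is a hypothetical helper name.

```python
import math


def fit_basis(basis, xs, ys):
    """Least-squares coefficients c_j for y ~ sum_j c_j * basis[j](x)."""
    k = len(basis)
    # Design matrix: one row per data point, one column per basis function.
    X = [[g(x) for g in basis] for x in xs]
    # Augmented normal-equation matrix [X^T X | X^T y].
    M = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(X, ys))]
         for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        if M[p][c] == 0:
            raise ValueError("normal equations are singular")
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, k):
            factor = M[r][c] / M[c][c]
            for j in range(c, k + 1):
                M[r][j] -= factor * M[c][j]
    # Back substitution.
    coeffs = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(M[i][j] * coeffs[j] for j in range(i + 1, k))
        coeffs[i] = (M[i][k] - tail) / M[i][i]
    return coeffs


# Hypothetical basis and synthetic, noise-free data (not the assignment's).
basis = [math.log, math.cos, math.exp]
true_coeffs = [0.5, -1.0, 0.05]
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [sum(c * g(x) for c, g in zip(true_coeffs, basis)) for x in xs]

estimated = fit_basis(basis, xs, ys)
print(estimated)  # close to true_coeffs, up to rounding
```

Since the synthetic data contain no noise, the fit recovers the generating coefficients; with measured data the same routine returns the coefficients that minimize the sum of squared residuals.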

## References

[1] Cheney, Ward and Kincaid, David. __Numerical Mathematics and Computing__, Fifth Edition. Belmont: Thomson Learning, 2004.

[2] Chapra, Steven and Canale, Raymond. __Numerical Methods for Engineers: With Software and Programming Applications__, Fourth Edition. McGraw-Hill, 2005.

[3] Keller, Gerald. __Statistics for Management and Economics__, Seventh Edition. Thomson Higher Education, 2005.

## External Links

[1] Least Squares Fitting at MathWorld

[2] Least Squares at Wikipedia
