Least-Squares Method

From Wikiversity

Learning project summary

Content summary

A brief introduction to the least-squares method and its statistical meaning.


This learning project offers learning activities and some applications of the least-squares method. Through this project, one should understand the purpose of the least-squares method and what it means, and should be able to apply simple least-squares techniques to find a good approximation for a function. For a fuller mathematical explanation, visit the page "Least squares".

Learning materials

[1] Cheney and Kincaid, Numerical Mathematics and Computing, Chapter 12.1

[2] Chapra and Canale, Numerical Methods for Engineers: With Software and Programming Applications, Chapter 17.3

[3] Keller, Statistics for Management and Economics, Chapter 17.1

[4] T. Strutz, Data Fitting and Uncertainty: A Practical Introduction to Weighted Least Squares and Beyond, 2nd edition, Springer Vieweg, 2016, ISBN 978-3-658-11455-8


Lesson 1: Introduction to Least-Squares Method

The goal of the least-squares method is to find a good estimate of the parameters of a function f(x) that fits a set of data points (x_i, y_i). The method requires that the estimated function deviate as little as possible from the data in the sense of the 2-norm. Broadly, least-squares methods fall into two categories, linear and nonlinear. We can also classify these methods further: ordinary least squares (OLS), weighted least squares (WLS), alternating least squares (ALS), and partial least squares (PLS).


To best fit a set of data, the least-squares method minimizes the sum of squared residuals (also called the sum of squared errors, SSE),

S = \sum_{i=1}^{n} R_i^2

with R_i, the residual, being the difference between an actual data point and the value of the fitted curve, defined as

R_i = y_i - f(x_i)

where the n data pairs are (x_i, y_i), i = 1, …, n, and the model function is f(x).

Here, we may choose the parameters of f(x) so that the approximating function best fits the data set.

For example, in the graph at right, R_3 = y_3 - f(x_3), and S = \sum_{i=1}^{n} R_i^2, the sum of the squares of the lengths of the red lines, is the quantity we want to minimize.
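The definitions above can be sketched in code. Matlab appears later in this project, but here is a minimal Python sketch; the line and data points are made up for illustration:

```python
def sse(f, points):
    """Sum of squared residuals S = sum over i of (y_i - f(x_i))^2."""
    return sum((y - f(x)) ** 2 for x, y in points)

# Made-up data lying roughly on the line y = 2x + 1:
pts = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2)]
print(sse(lambda x: 2 * x + 1, pts))   # small, since the line fits pts closely
```

A poorly chosen f(x) yields a larger S; the least-squares method searches for the parameters that make S as small as possible.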

Lesson 2: Linear Least-Squares Method

The linear least-squares (LLS) method assumes that the data set falls approximately on a straight line, f(x) = ax + b, where a and b are constants. However, due to experimental error, some data points will not lie exactly on the line; there is a residual between the estimated function and the real data. The linear least-squares method (or approximation) defines the best-fit line as the one that minimizes

S = \sum_{i=1}^{n} (y_i - (a x_i + b))^2

The advantages of LLS:

1. If we assume that the errors have a normal probability distribution, then minimizing S gives the best estimates of a and b.

2. We can easily use calculus to determine the estimated values of a and b.

To minimize S, the following conditions must be satisfied:

\frac{\partial S}{\partial a} = 0 \quad \text{and} \quad \frac{\partial S}{\partial b} = 0

Taking the partial derivatives, we obtain

\frac{\partial S}{\partial a} = \sum_{i=1}^{n} 2\,(y_i - (a x_i + b))(-x_i) = 0, \quad \text{and} \quad \frac{\partial S}{\partial b} = \sum_{i=1}^{n} 2\,(y_i - (a x_i + b))(-1) = 0

This is a system of two simultaneous linear equations in the two unknowns a and b. (These two equations are the so-called normal equations.)

Rearranging the summations, we can easily find that the normal equations become

a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i

a \sum_{i=1}^{n} x_i + b\,n = \sum_{i=1}^{n} y_i

Thus, the best estimated function for the data set (x_i, y_i), for i an integer between [1, n], is

f(x) = ax + b, where

a = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} and b = \frac{\sum y_i - a \sum x_i}{n} = \bar{y} - a\bar{x}.
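The closed-form formulas for a and b translate directly into code. The following Python sketch (the data set is made up, chosen to lie exactly on y = 2x + 1) is an illustration, not a library routine:

```python
def linear_fit(xs, ys):
    """Closed-form OLS slope a and intercept b for y ~ a*x + b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope
    b = (sy - a * sx) / n                           # intercept = ybar - a*xbar
    return a, b

a, b = linear_fit([1, 2, 3, 4], [3, 5, 7, 9])   # points on y = 2x + 1
print(a, b)
```

Because the sample points lie exactly on a line, the fit recovers the slope 2 and intercept 1; with noisy data the same formulas return the minimizer of S.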

Lesson 3: Linear Least-Squares Method in matrix form

We can also represent the estimated linear function in the more general model

y = a_0 + a_1 x_1 + a_2 x_2 + \dots + a_m x_m

It can also be represented in the matrix form

\{Y\} = [X]\{A\} + \{R\}

where [X] is a matrix containing coefficients derived from the data set (it might not be a square matrix: it has one row per data point, n, and one column per unknown, depending on the number of variables m); the vector {Y} contains the values of the dependent variable, y_1, …, y_n; the vector {A} contains the unknown coefficients a_0, …, a_m that we would like to solve for; and the vector {R} contains the residuals r_1, …, r_n.

To minimize S = \{R\}^T \{R\}, we follow the same method as in Lesson 2, taking the partial derivative with respect to each coefficient and setting it equal to zero. As a result, we have a system of normal equations, which can be represented in the matrix form

[X]^T [X] \{A\} = [X]^T \{Y\}

To solve this system, we have many options, such as LU decomposition, Cholesky decomposition, matrix inversion, and Gauss-Seidel iteration. (Generally, the normal equations do not yield diagonally dominant matrices, so the Gauss-Seidel method is not recommended.) (discuss) 11:40, 23 August 2014 (UTC) KB
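The matrix-form procedure of this lesson can be sketched as follows. This illustrative Python version forms [X]^T[X] and [X]^T{Y} explicitly and solves the normal equations with Gaussian elimination, one of the direct options mentioned above; the data are made up:

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [v] for row, v in zip(A, b)]       # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]                       # pivot row swap
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                    # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def lls(X, y):
    """Solve the normal equations X^T X {A} = X^T {Y}."""
    m = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(m)] for i in range(m)]
    Xty = [sum(row[i] * yk for row, yk in zip(X, y)) for i in range(m)]
    return solve(XtX, Xty)

# Straight-line fit: each row of [X] is [1, x_i], so {A} = [b, a].
X = [[1.0, x] for x in [1, 2, 3, 4]]
coeffs = lls(X, [3.1, 4.9, 7.0, 9.0])
print(coeffs)   # close to [1, 2], i.e. y ~ 2x + 1
```

In practice one would use a library solver (for instance a QR or Cholesky routine) rather than hand-rolled elimination, since forming [X]^T[X] explicitly can worsen the conditioning of the problem.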

Lesson 4: Least-Squares Method in statistical view

From the equation [X]^T [X] \{A\} = [X]^T \{Y\}, we can derive the following equation:

\{A\} = \left([X]^T [X]\right)^{-1} [X]^T \{Y\}

From this equation, we can determine not only the coefficients but also their statistical properties.

Using calculus, the following formulas for the coefficients of the straight-line fit can be obtained:

a = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}

b = \bar{y} - a\bar{x}
Moreover, the diagonal and off-diagonal entries of the matrix ([X]^T [X])^{-1} represent (up to a common factor) the variances and covariances, respectively, of the coefficients a_i.

Assume the i-th diagonal value of ([X]^T [X])^{-1} is x_{ii}^{-1} and the corresponding coefficient is a_i; then the standard error of that coefficient is

s(a_i) = \sqrt{x_{ii}^{-1}}\; s_{y/x}

where s_{y/x} is called the standard error of the estimate, and

s_{y/x} = \sqrt{\frac{S}{n - (m + 1)}}

(Here, the subscript y/x indicates that the error measures the spread of the observed y values around the value predicted from the corresponding x.)

These two pieces of information have many applications. For example, we can derive upper and lower confidence bounds for the intercept and slope:

a_i \pm t_{\alpha/2,\, n-(m+1)}\; s(a_i)

where t_{\alpha/2, n-(m+1)} is the appropriate quantile of Student's t-distribution.
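For the straight-line case (two coefficients, so S is divided by n - 2), the standard errors can be sketched as follows; the data set here is made up for illustration:

```python
import math

def line_stats(xs, ys):
    """Straight-line OLS fit plus standard errors of intercept and slope."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    a = (n * sxy - sx * sy) / det          # slope
    b = (sy - a * sx) / n                  # intercept
    s = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    s_yx = math.sqrt(s / (n - 2))          # standard error of the estimate
    # Diagonal of (X^T X)^{-1} for X with columns [1, x]:
    se_b = s_yx * math.sqrt(sxx / det)     # standard error of the intercept
    se_a = s_yx * math.sqrt(n / det)       # standard error of the slope
    return a, b, se_a, se_b, s_yx

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b, se_a, se_b, s_yx = line_stats(xs, ys)
# A confidence interval for the slope is a +/- t * se_a, with t the
# Student-t quantile for n - 2 degrees of freedom.
print(a, b, se_a, se_b)
```

The smaller the standard errors relative to the coefficients, the tighter the confidence bounds on the intercept and slope.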


To better understand the application of the least-squares method, the first question below is solved by applying the LLS equations, and the second by a Matlab program.

Question 1: Linear Least-Squares Example

The following are 8 data points showing the relationship between the number of fishermen and the amount of fish (in thousands of pounds) they catch in a day.

Number of Fishermen   Fish Caught (thousand lb)
18 39
14 9
9 9
10 7
5 8
22 35
14 36
12 22

According to this data set, what is the function relating the number of fishermen to the amount of fish caught? Hint: let the number of fishermen be x and the amount of fish caught be y, and use LLS to find the coefficients.


By simple calculation and basic statistics, we can easily find:

  1. x̄ = 13
  2. ȳ = 20.625, and
  3. the following chart:

x_i   y_i   x_i - x̄   y_i - ȳ   (x_i - x̄)(y_i - ȳ)   (x_i - x̄)²
 18    39       5      18.375          91.875              25
 14     9       1     -11.625         -11.625               1
  9     9      -4     -11.625          46.5                16
 10     7      -3     -13.625          40.875               9
  5     8      -8     -12.625         101                  64
 22    35       9      14.375         129.375              81
 14    36       1      15.375          15.375               1
 12    22      -1       1.375          -1.375               1
Sum                                   412                 198

Thus, we have \sum (x_i - \bar{x})(y_i - \bar{y}) = 412 and \sum (x_i - \bar{x})^2 = 198, so the slope is a = 412/198 ≈ 2.081. (This mean-centered form of the slope is algebraically equivalent to the formula from Lesson 2.)

And last, the intercept is b = \bar{y} - a\bar{x} = 20.625 - 2.081 × 13 ≈ -6.426.

Therefore, the linear least-squares line is y = 2.081x - 6.426.
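The hand calculation above can be checked with a short script; this Python sketch uses the same mean-centered form of the slope formula:

```python
xs = [18, 14, 9, 10, 5, 22, 14, 12]    # number of fishermen
ys = [39, 9, 9, 7, 8, 35, 36, 22]      # fish caught (thousand lb)
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))   # 412
sxx = sum((x - xbar) ** 2 for x in xs)                       # 198
a = sxy / sxx            # slope
b = ybar - a * xbar      # intercept
print(round(a, 4), round(b, 4))   # roughly 2.0808 and -6.4255
```

The printed values agree with the tabulated sums 412 and 198 and with the fitted line y ≈ 2.081x - 6.426.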

Question 2: Nonpolynomial Example

We wish to fit the following data, (x_i, y_i) for i = 1, …, 10, by a function of a given nonpolynomial form.

x 0.23 0.66 0.93 1.25 1.75 2.03 2.24 2.57 2.87 2.98
y 0.25 0.28 0.13 0.26 0.58 1.03

Write a Matlab program that uses the least-squares method to obtain the estimated function. Hint: input the data in matrix form and solve the system to obtain the coefficients.


The blue points are the data; the green points are the estimated nonpolynomial function.
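The intended functional form (and part of the y row above) is not recoverable from this page, so the following Python sketch is purely illustrative: it assumes a hypothetical basis of ln x, cos x, and e^x, generates synthetic data from known coefficients, and checks that general linear least squares recovers them:

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [v] for row, v in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_basis(xs, ys, basis):
    """Least-squares coefficients c_j for f(x) = sum of c_j * basis[j](x)."""
    X = [[g(x) for g in basis] for x in xs]          # design matrix
    m = len(basis)
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(m)] for i in range(m)]
    Xty = [sum(r[i] * y for r, y in zip(X, ys)) for i in range(m)]
    return solve(XtX, Xty)

# Hypothetical basis and synthetic data built from known coefficients
# (1.5, -2.0, 0.5), so the fit should recover them:
basis = [math.log, math.cos, math.exp]
xs = [0.23, 0.66, 0.93, 1.25, 1.75, 2.03, 2.24, 2.57, 2.87, 2.98]
ys = [1.5 * math.log(x) - 2.0 * math.cos(x) + 0.5 * math.exp(x) for x in xs]
c = fit_basis(xs, ys, basis)
print(c)
```

The same design-matrix approach works for any linear combination of basis functions, which is what makes "nonpolynomial" fits still a linear least-squares problem.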


References

[1] Cheney, Ward, and David Kincaid. Numerical Mathematics and Computing, Fifth Edition. Belmont: Thomson Learning, 2004.

[2] Chapra, Steven, and Raymond Canale. Numerical Methods for Engineers: With Software and Programming Applications, Fourth Edition. McGraw-Hill, 2005.

[3] Keller, Gerald. Statistics for Management and Economics, Seventh Edition. Thomson Higher Education, 2005.

External Links

[1] Least Squares Fitting at MathWorld

[2] Least Squares at Wikipedia
