17: Least Squares and Singular Values


    Consider the linear system \(L(x)=v\), where \(L \colon U\stackrel{\textit{linear}}{-\!\!\!-\!\!\!\longrightarrow}W\), and \(v\in W\) is given. As we have seen, this system may have no solutions, a unique solution, or a space of solutions. But if \(v\) is not in the range of \(L\), in pictures:

(Figure: the subspace \(L(U)\subset W\), with \(v\) lying outside it.)

    there will never be any solutions for \(L(x)=v\). However, for many applications we do not need an exact solution of the system; instead, we try to find the best approximation possible.

    "My work always tried to unite the Truth with the Beautiful, but when I had to choose one or the other, I usually chose the Beautiful.''--Hermann Weyl.

    If the vector space \(W\) has a notion of lengths of vectors, we can try to find \(x\) that minimizes \(||L(x)-v||\):

(Figure: choosing \(x\) so that \(L(x)\) is as close as possible to \(v\).)

This method has many applications, such as fitting a (perhaps linear) function to a "noisy" set of observations. For example, suppose we measured the position of a bicycle on a racetrack once every five seconds. Our measurements won't be exact, but so long as they are right on average, we can find the best possible linear function giving the bicycle's position in terms of time.

    Suppose \(M\) is the matrix for \(L\) in some bases for \(U\) and \(W\), and \(v\) and \(x\) are given by column vectors \(V\) and \(X\) in these bases. Then we need to approximate

    \[MX-V\approx 0\, .\]

Note that if \(\dim U=n\) and \(\dim W=m\), then \(M\) is an \(m\times n\) matrix, and \(x\) and \(v\) are represented by vectors in \(\Re^{n}\) and \(\Re^{m}\), respectively. Because \(W\) has a notion of length, and hence of perpendicularity, we can write \(W=L(U)\oplus L(U)^{\perp}\). Then \(v\) decomposes uniquely as \(v=v^{\parallel} + v^{\perp}\), with \(v^{\parallel} \in L(U)\) and \(v^{\perp} \in L(U)^{\perp}\).

Thus we should solve \(L(u)=v^{\parallel}\). In components, once \(MX=v^{\parallel}\), the residual \(V-MX\) is exactly \(v^{\perp}\); its length is the quantity we wish to minimize.
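To see why this gives the best approximation, note that for any \(X\) the vector \(MX-v^{\parallel}\) lies in \(L(U)\) while \(v^{\perp}\) lies in \(L(U)^{\perp}\), so the Pythagorean theorem gives

\[||MX-V||^{2}=||MX-v^{\parallel}||^{2}+||v^{\perp}||^{2}\geq ||v^{\perp}||^{2}\, ,\]

with equality exactly when \(MX=v^{\parallel}\).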

In terms of \(M\), recall that \(L(U)\) is spanned by the columns of \(M\). (In the standard basis, the columns of \(M\) are \(Me_{1}\), \(\ldots\), \(Me_{n}\).) Then \(v^{\perp}=V-MX\) must be perpendicular to the columns of \(M\); \(\textit{i.e.}\), \(M^{T}(V-MX)=0\), or

    \[M^{T}MX = M^{T}V.\]

    Solutions of \(M^{T}MX = M^{T}V\) for \(X\) are called \(\textit{least squares}\) solutions to \(MX=V\). Notice that any solution \(X\) to \(MX=V\) is a least squares solution. However, the converse is often false. In fact, the equation \(MX=V\) may have no solutions at all, but still have least squares solutions to \(M^{T}MX = M^{T}V\).

Observe that since \(M\) is an \(m\times n\) matrix, \(M^{T}\) is an \(n\times m\) matrix, so \(M^{T}M\) is a square \(n\times n\) matrix, and symmetric, since \((M^{T}M)^{T}=M^{T}M\). For any vector \(X\), the number \(X^{T}M^{T}MX\) is particularly nice: it is just the squared length \(|MX|^{2} = (MX)^{T}(MX)=X^{T}M^{T}MX\).

Now suppose that \(\ker L=\{0\}\), so that the only solution to \(MX=0\) is \(X=0\). (This need not mean that \(M\) is invertible; \(M\) is an \(m\times n\) matrix, so not necessarily square.) However, the square matrix \(M^{T}M\) \(\textit{is}\) invertible. To see this, suppose there were a vector \(X\) such that \(M^{T}MX=0\). Then it would follow that \(X^{T}M^{T}MX = |MX|^{2}=0\). In other words, the vector \(MX\) would have zero length, so could only be the zero vector. But we are assuming \(\ker L=\{0\}\), so \(MX=0\) implies \(X=0\). Thus the kernel of \(M^{T}M\) is \(\{0\}\), so this matrix is invertible. In this case, the least squares solution (the \(X\) that solves \(M^{T}MX=M^{T}V\)) is unique, and is equal to

    \[X = (M^{T}M)^{-1}M^{T}V.\]

    In a nutshell, this is the least squares method:

    1. Compute \(M^{T}M\) and \(M^{T}V\).
    2. Solve \((M^{T}M)X=M^{T}V\) by Gaussian elimination.
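In code, these two steps take only a few lines. Here is a minimal sketch in Python using NumPy; the matrix \(M\) and vector \(V\) are made-up illustrations rather than data from this section.

    import numpy as np

    # A made-up overdetermined system MX = V: three equations, two unknowns,
    # with the columns of M linearly independent (so ker L = {0}).
    M = np.array([[1., 0.],
                  [1., 1.],
                  [1., 2.]])
    V = np.array([0., 1., 3.])

    # Step 1: compute M^T M and M^T V.
    MTM = M.T @ M
    MTV = M.T @ V

    # Step 2: solve (M^T M) X = M^T V; np.linalg.solve performs Gaussian
    # elimination (an LU factorization) under the hood.
    X = np.linalg.solve(MTM, MTV)

    # Check: the residual V - MX is perpendicular to the columns of M.
    residual = V - M @ X
    print(X)               # the least squares solution
    print(M.T @ residual)  # numerically zero, as derived above

In practice one often skips forming \(M^{T}M\) explicitly and calls a routine such as np.linalg.lstsq(M, V, rcond=None), which solves the same minimization more stably; the two-step version above simply mirrors the derivation in this section.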

Example 17.1:

    Captain Conundrum falls off of the leaning tower of Pisa and makes three (rather shaky) measurements of his velocity at three different times.

\[\begin{array}{r|r} t\ (\mathrm{s}) & v\ (\mathrm{m/s}) \\ \hline 1 & 11 \\ 2 & 19 \\ 3 & 31\end{array}\]

    Having taken some calculus, he believes that his data are best approximated by a straight line

    \[v = at+b.\]

Then he should find \(a\) and \(b\) that best fit the data:

    \begin{eqnarray*}
    11 &=& a\cdot 1 + b \\
    19 &=& a\cdot 2 + b \\
    31 &=& a\cdot 3 + b.
    \end{eqnarray*}
As a matrix equation, this becomes:

    \[
    \begin{pmatrix}
    1 & 1 \\
    2 & 1 \\
    3 & 1 \\
    \end{pmatrix}
    \begin{pmatrix}a\\b\end{pmatrix} \stackrel{?}{=}
    \begin{pmatrix}11\\19\\31\end{pmatrix}.
    \]

There is likely no straight line passing through all three data points, so instead we solve \(M^{T}MX=M^{T}V\):

    \[
    \begin{pmatrix}
    1 & 2 & 3 \\
    1 & 1 & 1 \\
    \end{pmatrix}
    \begin{pmatrix}
    1 & 1 \\
    2 & 1 \\
    3 & 1 \\
    \end{pmatrix} \begin{pmatrix}a\\b\end{pmatrix}
    =
    \begin{pmatrix}
    1 & 2 & 3 \\
    1 & 1 & 1 \\
    \end{pmatrix}
    \begin{pmatrix}11\\19\\31\end{pmatrix}.
    \]

    This simplifies to the system:

\[\left(\begin{array}{rr|r}14&6&142\\6&3&61\end{array}\right) \sim \left(\begin{array}{rr|r}1&0&10\\0&1&\frac{1}{3}\end{array}\right)\]

    Thus, the least-squares fit is the line

    \[v = 10\ t + \frac{1}{3}\, .\]

    Notice that this equation implies that Captain Conundrum accelerates towards Italian soil at 10 m/s\(^{2}\) (which is an excellent approximation to reality) and that he started at a downward velocity of \(\frac{1}{3}\) m/s (perhaps somebody gave him a shove...)!
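The arithmetic in this example is easy to check numerically. Here is a brief sketch in Python using NumPy that reproduces the fit \(a=10\), \(b=\frac{1}{3}\):

    import numpy as np

    # Captain Conundrum's data: times t (s) and measured velocities v (m/s).
    t = np.array([1., 2., 3.])
    v = np.array([11., 19., 31.])

    # Model v = a*t + b: the columns of M are t and a column of ones.
    M = np.column_stack([t, np.ones_like(t)])

    # Solve the normal equations M^T M X = M^T v.
    a, b = np.linalg.solve(M.T @ M, M.T @ v)
    print(a, b)  # 10.0 0.3333..., i.e. the line v = 10 t + 1/3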


