3. Singular values#
In this section we present some results useful for the design of a method for the practical resolution of the system [éq 1-1]. These results derive from the concept of singular values (section 3.1); they make it possible to construct a basis of the kernel and a basis of the image of the system matrix (section 3.2), and to give a meaning, suited to the computation of the least-squares solution, to the inverse of an arbitrary matrix (section 3.3).
3.1. Singular value decomposition#
Let us start by recalling the definition of singular values. The singular values of a real matrix \(A\) of order \(m\times n\) are the square roots of the eigenvalues of the square matrix \({A}^{T}A\) of order \(n\), which, let us recall, is positive semi-definite.
The concept of diagonalization of square matrices (when they can be diagonalized) is generalized to rectangular matrices (without restriction) through the concept of decomposition (or factorization) into singular values.
For any real matrix \(A\) of order \(m\times n\), there exist two orthogonal square matrices \(Q\) and \(P\), of order \(m\) and \(n\) respectively, such that:
\(A=Q\Sigma {P}^{T}\) eq 3.1-1
where \(\Sigma\) is a matrix of order \(m\times n\) whose structure is shown schematically below:
\(\Sigma =\left[\begin{array}{cc}\text{Diag}({\mu }_{1},{\mu }_{2},\text{...},{\mu }_{m})& 0\end{array}\right]\quad \text{if}\quad m\le n\)

\(\Sigma =\left[\begin{array}{c}\text{Diag}({\mu }_{1},{\mu }_{2},\text{...},{\mu }_{n})\\ 0\end{array}\right]\quad \text{if}\quad m>n\)
The \({\mu }_{i}\) are the singular values of \(A\) that we assume are ordered in descending order:
\({\mu }_{1}\ge {\mu }_{2}\ge \text{...}\ge {\mu }_{n}\)
A proof of this result can be found in [bib1] p.10 for the equi-constrained case (\(m=n\)) and in [bib3] p.73 for the overconstrained case (\(m>n\)); the underconstrained case (\(m<n\)) is then deduced by transposition.
The SVD factorization [éq 3.1-1] of \(A\) gives \({A}^{T}A=P{\Sigma }^{T}\Sigma {P}^{T}\) and \(A{A}^{T}=Q\Sigma {\Sigma }^{T}{Q}^{T}\), so that, since \({\Sigma }^{T}\Sigma\) and \(\Sigma {\Sigma }^{T}\) are diagonal square matrices, the matrix \(P\) consists of the orthonormalized eigenvectors of the matrix \({A}^{T}A\) while the matrix \(Q\) consists of the orthonormalized eigenvectors of the matrix \(A{A}^{T}\).
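As an illustration, here is a minimal numerical sketch of this factorization using NumPy (the matrix \(A\) below is an arbitrary example, not taken from this document); it checks [éq 3.1-1] and the relation between the singular values of \(A\) and the eigenvalues of \({A}^{T}A\):

```python
import numpy as np

# Arbitrary 4x3 example matrix (illustrative values only).
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 2.0]])
m, n = A.shape

# SVD: A = Q Sigma P^T; NumPy returns the singular values in decreasing order.
Q, mu, Pt = np.linalg.svd(A)
Sigma = np.zeros((m, n))
Sigma[:min(m, n), :min(m, n)] = np.diag(mu)

# Check the factorization [eq 3.1-1].
assert np.allclose(A, Q @ Sigma @ Pt)

# The singular values are the square roots of the eigenvalues of A^T A.
eigvals = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
assert np.allclose(mu, np.sqrt(np.clip(eigvals, 0.0, None)))

# The columns of P (resp. Q) are orthonormal eigenvectors of A^T A (resp. A A^T).
assert np.allclose(Pt @ Pt.T, np.eye(n))
assert np.allclose(Q @ Q.T, np.eye(m))
```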
3.2. Rank, image and kernel#
Section [§2] showed the fundamental role played by the rank of the matrix \(A\) and the kernel of the matrix \({A}^{T}A\) in solving a non-regular linear system of the type [éq 1-1]. We will now see how the factorization [éq 3.1-1] can be used to determine this rank as well as a basis of \(\text{Ker}{A}^{T}A\).
Let \(r\) be the index of the smallest non-zero singular value. The factorization [éq 3.1-1] can also be written \({Q}^{T}AP=\Sigma\), where taking the zero singular values into account makes it possible to specify the block structure of \(\Sigma\):

\(\Sigma =\left[\begin{array}{cc}{\Sigma }_{r}& 0\\ 0& 0\end{array}\right]\)
where \({\Sigma }_{r}=\text{Diag}({\mu }_{1},{\mu }_{2},\text{...},{\mu }_{r})\) is the diagonal matrix of order \(r\) formed by the non-zero singular values, in decreasing order.
Since the matrices \(Q\) and \(P\) are regular, the matrices \(A\) and \(\Sigma\) are equivalent, so their ranks coincide and their kernels and images correspond to one another through \(P\) and \(Q\). We therefore deduce that:
The rank of \(A\) coincides with the number of non-zero singular values:
\(\text{rg}A=r\)
The column vectors of \(P\) with indices \(r+1\) to \(n\) form a basis of \(\text{Ker}A\)
The column vectors of \(Q\) corresponding to non-zero singular values form a basis of \(\text{Im}A\)
On the other hand, in section 2.3 we saw that \(\text{Im}{A}^{T}=\text{Im}{A}^{T}A\). The identity \(\text{Im}{M}^{T}={(\text{Ker}M)}^{\perp }\) then gives us \(\text{Ker}A=\text{Ker}{A}^{T}A\) so that the second condition of the definition [éq 2.4-1] is simply fulfilled by any vector that is expressed as a linear combination of the column vectors of \(P\) corresponding to the non-zero singular values.
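Continuing the NumPy sketch (the rank-deficient matrix below and the tolerance are arbitrary illustrative choices), the rank, a basis of \(\text{Ker}A\) and a basis of \(\text{Im}A\) can be read directly from the factorization:

```python
import numpy as np

# Rank-deficient example (row 2 = 2 * row 1); values are illustrative only.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
Q, mu, Pt = np.linalg.svd(A)
P = Pt.T

# r = number of non-zero singular values; 'tol' is an illustrative numerical threshold.
tol = max(A.shape) * np.finfo(float).eps * mu[0]
r = int(np.sum(mu > tol))

ker_basis = P[:, r:]   # columns r+1..n of P: basis of Ker A (= Ker A^T A)
im_basis = Q[:, :r]    # first r columns of Q: basis of Im A

assert r == np.linalg.matrix_rank(A)
assert np.allclose(A @ ker_basis, 0.0)            # kernel vectors are sent to zero
# Every column of A lies in the span of im_basis: projecting onto it changes nothing.
assert np.allclose(im_basis @ (im_basis.T @ A), A)
```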
3.3. Pseudo-inverse and least squares solutions#
Another application of the singular value decomposition is the concept of pseudo-inverse (or Moore-Penrose inverse), which generalizes the usual notion of the inverse of a regular square matrix to rectangular matrices on the one hand, and to singular square matrices on the other.
First, the pseudo-inverse of the matrix \(\Sigma\) of the singular value decomposition [éq 3.1-1] is defined by:

\({\Sigma }^{+}=\left[\begin{array}{cc}{\Sigma }_{r}^{-1}& 0\\ 0& 0\end{array}\right]\)
where \({\Sigma }_{r}^{-1}=\text{Diag}(\frac{1}{{\mu }_{1}},\frac{1}{{\mu }_{2}},\text{...},\frac{1}{{\mu }_{r}})\) is the inverse, in the usual sense, of \({\Sigma }_{r}\).
Then, the decomposition [éq 3.1-1] of the matrix \(A\) is used to define its pseudo-inverse \({A}^{+}\) by:
\({A}^{+}=P{\Sigma }^{+}{Q}^{T}\) eq 3.3-1
Likewise, from the decomposition [éq 3.1-1] of the matrix \(A\) we get \({A}^{T}A=P{\Sigma }^{T}\Sigma {P}^{T}\), so that the pseudo-inverse \({({A}^{T}A)}^{+}\) of the matrix \({A}^{T}A\) is defined by:
\({({A}^{T}A)}^{+}=P{\Sigma }^{+}{({\Sigma }^{T})}^{+}{P}^{T}\) eq 3.3-2
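As a sketch of [éq 3.3-1] and [éq 3.3-2] (again with NumPy and the same arbitrary example matrix), \({\Sigma }^{+}\) is built by inverting the non-zero singular values, and the result can be compared with NumPy's Moore-Penrose pseudo-inverse:

```python
import numpy as np

# Same arbitrary rank-deficient example as above.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
m, n = A.shape
Q, mu, Pt = np.linalg.svd(A)
P = Pt.T

tol = max(m, n) * np.finfo(float).eps * mu[0]
r = int(np.sum(mu > tol))

# Sigma^+ (order n x m): invert the non-zero singular values, keep the zero blocks.
Sigma_plus = np.zeros((n, m))
Sigma_plus[:r, :r] = np.diag(1.0 / mu[:r])

A_plus = P @ Sigma_plus @ Q.T                      # [eq 3.3-1]
AtA_plus = P @ Sigma_plus @ Sigma_plus.T @ P.T     # [eq 3.3-2], using (Sigma^T)^+ = (Sigma^+)^T

# Both coincide with NumPy's Moore-Penrose pseudo-inverses.
assert np.allclose(A_plus, np.linalg.pinv(A))
assert np.allclose(AtA_plus, np.linalg.pinv(A.T @ A))
```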
We are now in a position to provide a simple interpretation of the least squares solution defined by [éq 2.4-1].
The restriction to \({(\text{Ker}{A}^{T}A)}^{\perp }\) of the linear map associated with the matrix \({A}^{T}A\) defines an isomorphism from \({(\text{Ker}{A}^{T}A)}^{\perp }\) onto \(\text{Im}{A}^{T}A\). Since, on the one hand, \({(\text{Ker}{A}^{T}A)}^{\perp }={(\text{Ker}A)}^{\perp }=\text{Im}{A}^{T}\), and, on the other hand, \(\text{Im}{A}^{T}A=\text{Im}{A}^{T}\), this restriction is in fact an automorphism of \({(\text{Ker}{A}^{T}A)}^{\perp }\).
In the basis of \({(\text{Ker}{A}^{T}A)}^{\perp }\) formed by the first \(r\) columns of the matrix \(P\), this automorphism is represented by the matrix \({\Sigma }_{r}^{2}\), and its inverse automorphism is therefore represented by the matrix \({\Sigma }_{r}^{-2}\). The extension to \({ℝ}^{n}\) of this inverse automorphism is then represented, in the basis associated with the matrix \(P\), by the matrix \({({\Sigma }^{T}\Sigma )}^{+}={\Sigma }^{+}{({\Sigma }^{T})}^{+}\), and therefore, in the canonical basis, by the matrix \({({A}^{T}A)}^{+}\).
It follows that:
We recover the fact that, for every element \(b\) of \({ℝ}^{m}\), there exists a unique vector \(x\in {(\text{Ker}{A}^{T}A)}^{\perp }\) solution of \({A}^{T}Ax={A}^{T}b\), i.e. the existence and uniqueness of the solution in the sense of least squares [éq 2.4-1] of the system \(Ax=b\),
This unique solution is given by:
\(x={({A}^{T}A)}^{+}{A}^{T}b\) eq 3.3-3
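A minimal numerical sketch of [éq 3.3-3] with NumPy (the matrix \(A\) and right-hand side \(b\) are arbitrary examples): the vector \({({A}^{T}A)}^{+}{A}^{T}b\) coincides with \({A}^{+}b\) and with the minimum-norm least-squares solution returned by numpy.linalg.lstsq:

```python
import numpy as np

# Arbitrary example system A x = b (same rank-deficient A as above, b chosen arbitrarily).
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
b = np.array([1.0, 0.0, 2.0])

# [eq 3.3-3]: least-squares solution orthogonal to Ker A^T A.
x = np.linalg.pinv(A.T @ A) @ A.T @ b

# It coincides with A^+ b and with the minimum-norm solution returned by lstsq.
assert np.allclose(x, np.linalg.pinv(A) @ b)
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x, x_lstsq)
```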
The pseudo-inverse of a matrix is defined from an SVD decomposition of this matrix. As the SVD decomposition is not unique, neither is the pseudo-inverse matrix obtained in this way. On the other hand, from the point of view of the linear maps associated with matrices, the pseudo-inverse map is unique. The pseudo-inverse matrices associated with the various SVD decompositions of a given matrix are then only particular matrix representatives that express this pseudo-inverse map with respect to the bases induced by the orthogonal matrices of the SVD decompositions. Hence the expression [éq 3.3-3] has a meaning: it defines a vector whose components \(x\) are expressed with respect to the arrival basis (matrix \(P\)) of the SVD decomposition.