
Singular Value Decompositions



Singular Value Decomposition

An m \times n real matrix {\bf A} has a singular value decomposition of the form

{\bf A} = {\bf U \Sigma V}^T

where {\bf U} is an m \times m orthogonal matrix, {\bf V} is an n \times n orthogonal matrix, and {\bf \Sigma} is an m \times n diagonal matrix given by

{\bf \Sigma} = \begin{bmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_s \\ 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{bmatrix} \text{ when } m > n, \qquad \text{and} \qquad {\bf \Sigma} = \begin{bmatrix} \sigma_1 & & & 0 & \cdots & 0 \\ & \ddots & & \vdots & & \vdots \\ & & \sigma_s & 0 & \cdots & 0 \end{bmatrix} \text{ when } m < n,

where s = \min(m,n) and \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_s \ge 0 are the square roots of the eigenvalues of {\bf A}^T {\bf A}. The diagonal entries \sigma_i are called the singular values of {\bf A}.
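As a quick numerical illustration (a minimal sketch using NumPy, not part of the definition above), numpy.linalg.svd returns the full factors, with the singular values as a 1-D array from which the m \times n matrix {\bf \Sigma} can be rebuilt:

```python
import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.],
              [1., 8., 7.],
              [6., 4., 7.]])

# Full SVD: U is 5x5, s holds the singular values, Vt is 3x3
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the 5x3 diagonal matrix Sigma from the singular values
Sigma = np.zeros(A.shape)
np.fill_diagonal(Sigma, s)

print(U.shape, Sigma.shape, Vt.shape)   # (5, 5) (5, 3) (3, 3)
print(np.allclose(A, U @ Sigma @ Vt))   # True
```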

Time Complexity

The time-complexity for computing the SVD factorization of an arbitrary m \times n matrix is proportional to m^2n + n^3, where the constant of proportionality ranges from 4 to 10 (or more) depending on the algorithm.

In general, we can define the cost as:

\mathcal{O}(m^2n + n^3)
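As a rough empirical check (a sketch only; absolute timings depend on the machine and the underlying LAPACK implementation), doubling n at a fixed aspect ratio m = 2n should increase the runtime by roughly a factor of 8:

```python
import time
import numpy as np

for n in (200, 400, 800):
    A = np.random.rand(2 * n, n)           # fixed aspect ratio m = 2n
    t0 = time.perf_counter()
    np.linalg.svd(A, full_matrices=False)   # reduced factors, closer to the m^2 n + n^3 cost
    print(n, time.perf_counter() - t0)
```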

Reduced SVD

The SVD factorization of a non-square matrix {\bf A} of size m \times n can also be represented in a reduced format, in which only the parts of the full factors that contribute to the product are kept.

The following figure depicts the reduced SVD factorization (in red) against the full SVD factorization (in gray).

In general, we will represent the reduced SVD as:

{\bf A} = {\bf U}_R {\bf \Sigma}_R {\bf V}_R^T

where {\bf U}_R is an m \times s matrix, {\bf V}_R is an n \times s matrix, {\bf \Sigma}_R is an s \times s matrix, and s = \min(m,n).
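A minimal NumPy sketch of the reduced SVD (using the 5 \times 3 matrix from the example below); passing full_matrices=False returns {\bf U}_R, the singular values, and {\bf V}_R^T directly:

```python
import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.],
              [1., 8., 7.],
              [6., 4., 7.]])

# Reduced SVD: U_R is m x s, Sigma_R is s x s, V_R is n x s, with s = min(m, n)
U_R, s, Vt_R = np.linalg.svd(A, full_matrices=False)
Sigma_R = np.diag(s)

print(U_R.shape, Sigma_R.shape, Vt_R.shape)   # (5, 3) (3, 3) (3, 3)
print(np.allclose(A, U_R @ Sigma_R @ Vt_R))   # True
```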

Example: Computing the SVD

We begin with the following non-square matrix {\bf A}:

{\bf A} = \left[ \begin{array}{ccc} 3 & 2 & 3 \\ 8 & 8 & 2 \\ 8 & 7 & 4 \\ 1 & 8 & 7 \\ 6 & 4 & 7 \\ \end{array} \right]

and we will compute the reduced form of the SVD (here s = 3):

(1) Compute {\bf A}^T {\bf A}:

{\bf A}^T {\bf A} = \left[ \begin{array}{ccc} 174 & 158 & 106 \\ 158 & 197 & 134 \\ 106 & 134 & 127 \\ \end{array} \right]

(2) Compute the eigenvectors and eigenvalues of {\bf A}^T {\bf A}:

\lambda_1 = 437.479, \quad \lambda_2 = 42.6444, \quad \lambda_3 = 17.8766, \\ \boldsymbol{v}_1 = \begin{bmatrix} 0.585051 \\ 0.652648 \\ 0.481418\end{bmatrix}, \quad \boldsymbol{v}_2 = \begin{bmatrix} -0.710399 \\ 0.126068 \\ 0.692415 \end{bmatrix}, \quad \boldsymbol{v}_3 = \begin{bmatrix} 0.391212 \\ -0.747098 \\ 0.537398 \end{bmatrix}

(3) Construct {\bf V}_R from the eigenvectors of {\bf A}^T {\bf A}:

{\bf V}_R = \left[ \begin{array}{ccc} 0.585051 & -0.710399 & 0.391212 \\ 0.652648 & 0.126068 & -0.747098 \\ 0.481418 & 0.692415 & 0.537398 \\ \end{array} \right].

(4) Construct {\bf \Sigma}_R from the square roots of the eigenvalues of {\bf A}^T {\bf A}:

{\bf \Sigma}_R = \begin{bmatrix} 20.916 & 0 & 0 \\ 0 & 6.53207 & 0 \\ 0 & 0 & 4.22807 \end{bmatrix}

(5) Find {\bf U} by solving {\bf U}{\bf\Sigma} = {\bf A}{\bf V}. For our reduced case, we can find {\bf U}_R = {\bf A}{\bf V}_R {\bf \Sigma}_R^{-1}. You could also find {\bf U} by computing the eigenvectors of {\bf AA}^T.

{\bf U}_R = \overbrace{\left[ \begin{array}{ccc} 3 & 2 & 3 \\ 8 & 8 & 2 \\ 8 & 7 & 4 \\ 1 & 8 & 7 \\ 6 & 4 & 7 \\ \end{array} \right]}^{A} \overbrace{\left[ \begin{array}{ccc} 0.585051 & -0.710399 & 0.391212 \\ 0.652648 & 0.126068 & -0.747098 \\ 0.481418 & 0.692415 & 0.537398 \\ \end{array} \right]}^{V_R} \overbrace{\left[ \begin{array}{ccc} 0.047810 & 0.0 & 0.0 \\ 0.0 & 0.153133 & 0.0 \\ 0.0 & 0.0 & 0.236515 \\ \end{array} \right]}^{\Sigma_R^{-1}} = \left[ \begin{array}{ccc} 0.215371 & 0.030348 & 0.305490 \\ 0.519432 & -0.503779 & -0.419173 \\ 0.534262 & -0.311021 & 0.011730 \\ 0.438715 & 0.787878 & -0.431352\\ 0.453759 & 0.166729 & 0.738082\\ \end{array} \right]

We obtain the following singular value decomposition for {\bf A}:

\overbrace{\left[ \begin{array}{ccc} 3 & 2 & 3 \\ 8 & 8 & 2 \\ 8 & 7 & 4 \\ 1 & 8 & 7 \\ 6 & 4 & 7 \\ \end{array} \right]}^{A} = \overbrace{\left[ \begin{array}{ccc} 0.215371 & 0.030348 & 0.305490 \\ 0.519432 & -0.503779 & -0.419173 \\ 0.534262 & -0.311021 & 0.011730 \\ 0.438715 & 0.787878 & -0.431352\\ 0.453759 & 0.166729 & 0.738082\\ \end{array} \right]}^{U} \overbrace{\left[ \begin{array}{ccc} 20.916 & 0 & 0 \\ 0 & 6.53207 & 0 \\ 0 & 0 & 4.22807 \\ \end{array} \right]}^{\Sigma} \overbrace{\left[ \begin{array}{ccc} 0.585051 & 0.652648 & 0.481418 \\ -0.710399 & 0.126068 & 0.692415\\ 0.391212 & -0.747098 & 0.537398\\ \end{array} \right]}^{V^T}

Recall that we computed the reduced SVD factorization here (i.e. {\bf \Sigma}_R is square and {\bf U}_R is non-square).
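The hand computation can be checked numerically (a sketch; individual singular-vector pairs are only determined up to a sign, so the computed vectors may differ in sign from those shown above):

```python
import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.],
              [1., 8., 7.],
              [6., 4., 7.]])

# Eigenvalues of A^T A are the squares of the singular values
evals, evecs = np.linalg.eigh(A.T @ A)      # ascending order
print(np.sqrt(evals[::-1]))                 # ~ [20.916, 6.53207, 4.22807]

# Reduced SVD computed directly
U_R, s, Vt_R = np.linalg.svd(A, full_matrices=False)
print(s)                                    # same singular values

# U_R can also be recovered as A V_R Sigma_R^{-1}
V_R = Vt_R.T
print(np.allclose(U_R, A @ V_R / s))        # True (column i divided by sigma_i)
```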

Rank, null space and range of a matrix

Suppose {\bf A} is an m \times n matrix where m > n (without loss of generality):

{\bf A}= {\bf U\Sigma V}^{T} = \begin{bmatrix}\vert & & \vert & & \vert \\ \vert & & \vert & & \vert \\ {\bf u}_1 & \cdots & {\bf u}_n & \cdots & {\bf u}_m\\ \vert & & \vert & & \vert \\\vert & & \vert & & \vert \end{bmatrix} \begin{bmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \\ & \vdots & \\ -& 0& -\end{bmatrix} \begin{bmatrix} - & {\bf v}_1^T & - \\ & \vdots & \\ - & {\bf v}_n^T & - \end{bmatrix}

We can re-write the above as:

{\bf A} = \begin{bmatrix}\vert & & \vert \\ \vert & & \vert \\ {\bf u}_1 & \cdots & {\bf u}_n \\ \vert & & \vert \\ \vert & & \vert \end{bmatrix} \begin{bmatrix} - & \sigma_1 {\bf v}_1^T & - \\ & \vdots & \\ - & \sigma_n{\bf v}_n^T & - \end{bmatrix}

Furthermore, the product of two matrices can be written as a sum of outer products:

{\bf A} = \sigma_1 {\bf u}_1 {\bf v}_1^T + \sigma_2 {\bf u}_2 {\bf v}_2^T + ... + \sigma_n {\bf u}_n {\bf v}_n^T

For a general rectangular matrix, we have:

{\bf A} = \sum_{i=1}^{s} \sigma_i {\bf u}_i {\bf v}_i^T

where s = \min(m,n).
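The outer-product expansion is easy to verify numerically (a minimal sketch reusing the 5 \times 3 example matrix from above):

```python
import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.],
              [1., 8., 7.],
              [6., 4., 7.]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Sum of rank-one terms sigma_i * u_i * v_i^T
A_sum = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))
print(np.allclose(A, A_sum))   # True
```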

If {\bf A} has s non-zero singular values, the matrix is full rank, i.e. \text{rank}({\bf A}) = s.

If {\bf A} has r non-zero singular values, and r < s, the matrix is rank deficient, i.e. \text{rank}({\bf A}) = r.

In other words, the rank of {\bf A} equals the number of non-zero singular values which is the same as the number of non-zero diagonal elements in {\bf \Sigma}.

Rounding errors may lead to small but non-zero singular values in a rank deficient matrix. Singular values that are smaller than a given tolerance are assumed to be numerically equivalent to zero, defining what is sometimes called the effective rank.

The right-singular vectors (columns of {\bf V}) corresponding to vanishing singular values of {\bf A} span the null space of {\bf A}, i.e. null({\bf A}) = span{{\bf v}_{r+1}, {\bf v}_{r+2}, …, {\bf v}_{n}}.

The left-singular vectors (columns of {\bf U}) corresponding to the non-zero singular values of {\bf A} span the range of {\bf A}, i.e. range({\bf A}) = span{{\bf u}_{1}, {\bf u}_{2}, …, {\bf u}_{r}}.

Example:

{\bf A} = \left[ \begin{array}{cccc} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0 \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{array} \right] \left[ \begin{array}{ccc} 14 & 0 & 0 \\ 0 & 14 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{array} \right] \left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array} \right]

The rank of {\bf A} is 2.

The vectors \left[ \begin{array}{c} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \\ 0 \end{array} \right] and \left[ \begin{array}{c} -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \\ 0 \\ 0 \end{array} \right] provide an orthonormal basis for the range of {\bf A}.

The vector \left[ \begin{array}{c} 0 \\ 0\\ 1 \end{array} \right] provides an orthonormal basis for the null space of {\bf A}.
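A sketch of reading the effective rank, range, and null space off a computed SVD, applied to the example above (because the two non-zero singular values are equal here, NumPy may return a different, but equally valid, orthonormal basis for the range):

```python
import numpy as np

r2 = 1 / np.sqrt(2)
U = np.array([[r2, -r2, 0., 0.],
              [r2,  r2, 0., 0.],
              [0.,  0., 0., 1.],
              [0.,  0., 1., 0.]])
S = np.array([[14., 0., 0.],
              [0., 14., 0.],
              [0.,  0., 0.],
              [0.,  0., 0.]])
Vt = np.eye(3)
A = U @ S @ Vt                      # the 4x3 example matrix

u, s, vt = np.linalg.svd(A)
tol = 1e-12
r = np.count_nonzero(s > tol)       # effective (numerical) rank

print(r)                            # 2
print(u[:, :r])                     # orthonormal basis for range(A)
print(vt[r:, :].T)                  # orthonormal basis for null(A), ~ [0, 0, 1]^T
```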

(Moore-Penrose) Pseudoinverse

If the matrix {\bf \Sigma} is rank deficient, it is not invertible. Instead, we define the pseudoinverse {\bf \Sigma}^{+}, which inverts only the non-zero singular values:

({\bf \Sigma}^+)_{ii} = \begin{cases} \frac{1}{\sigma_i} & \sigma_i \neq 0\\ 0 & \sigma_i = 0 \end{cases}

For a general non-square matrix {\bf A} with known SVD ({\bf A} = {\bf U\Sigma V}^T), the pseudoinverse is defined as:

{\bf A}^{+} = {\bf V\Sigma}^{+}{\bf U}^T

For example, if we consider an m \times n full-rank matrix where m > n:

{\bf A}^{+}= \begin{bmatrix} \vert & ... & \vert \\ {\bf v}_1 & ... & {\bf v}_n\\ \vert & ... & \vert \end{bmatrix} \begin{bmatrix} 1/\sigma_1 & & & 0 & \dots & 0 \\ & \ddots & & & \ddots &\\ & & 1/\sigma_n & 0 & \dots & 0 \\ \end{bmatrix} \begin{bmatrix}\vert & & \vert & & \vert \\ \vert & & \vert & & \vert \\ {\bf u}_1 & \cdots & {\bf u}_n & \cdots & {\bf u}_m\\ \vert & & \vert & & \vert \\\vert & & \vert & & \vert \end{bmatrix}^T
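A sketch of assembling {\bf A}^{+} = {\bf V \Sigma}^{+} {\bf U}^T from the reduced SVD and comparing it against numpy.linalg.pinv (using the 5 \times 3 example matrix again):

```python
import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.],
              [1., 8., 7.],
              [6., 4., 7.]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Invert only the singular values above a small tolerance
tol = 1e-12
s_plus = np.zeros_like(s)
s_plus[s > tol] = 1.0 / s[s > tol]

A_plus = Vt.T @ np.diag(s_plus) @ U.T
print(np.allclose(A_plus, np.linalg.pinv(A)))   # True
```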

Euclidean norm of matrices

The induced 2-norm of a matrix {\bf A} can be obtained using the SVD of the matrix:

\begin{align} \| {\bf A} \|_2 &= \max_{\|\mathbf{x}\|=1} \|\mathbf{A x}\| = \max_{\|\mathbf{x}\|=1} \|\mathbf{U \Sigma V}^T {\bf x}\| \\ & =\max_{\|\mathbf{x}\|=1} \|\mathbf{ \Sigma V}^T {\bf x}\| = \max_{\|\mathbf{V}^T{\bf x}\|=1} \|\mathbf{ \Sigma V}^T {\bf x}\| =\max_{\|y\|=1} \|\mathbf{ \Sigma} y\| \end{align}

And hence,

\| {\bf A} \|_2= \sigma_1

In the above equations, all the norms \| \cdot \| refer to the Euclidean (p=2) norm, and we used the fact that {\bf U} and {\bf V} are orthogonal matrices, so multiplying a vector by either of them (or their transposes) leaves its Euclidean norm unchanged, e.g. \|{\bf V}^T {\bf x}\|_2 = \|{\bf x}\|_2.

Example:

We begin with the following non-square matrix {\bf A}:

{\bf A} = \left[ \begin{array}{ccc} 3 & 2 & 3 \\ 8 & 8 & 2 \\ 8 & 7 & 4 \\ 1 & 8 & 7 \\ 6 & 4 & 7 \\ \end{array} \right].

The matrix of singular values, {\bf \Sigma}, computed from the SVD factorization is:

\Sigma = \left[ \begin{array}{ccc} 20.916 & 0 & 0 \\ 0 & 6.53207 & 0 \\ 0 & 0 & 4.22807 \\ \end{array} \right].

Consequently the 2-norm of {\bf A} is

\|{\bf A}\|_2 = 20.916.
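This can be confirmed numerically (a sketch; numpy.linalg.norm with ord=2 computes the induced 2-norm, which equals the largest singular value):

```python
import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.],
              [1., 8., 7.],
              [6., 4., 7.]])

sigma = np.linalg.svd(A, compute_uv=False)
print(sigma[0])                    # ~20.916, the largest singular value
print(np.linalg.norm(A, ord=2))    # same value
```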

Euclidean norm of the inverse of matrices

Following the same derivation as above, we can show that for a full rank n \times n matrix we have:

\| {\bf A}^{-1} \|_2= \frac{1}{\sigma_n}

where {\sigma_n} is the smallest singular value.

For non-square matrices, we can use the definition of the pseudoinverse (regardless of the rank):

\| {\bf A}^{+} \|_2= \frac{1}{\sigma_r}

where {\sigma_r} is the smallest non-zero singular value. Note that for a full rank square matrix, we have \| {\bf A}^{+} \|_2 = \| {\bf A}^{-1} \|_2. An exception to the definition above is the zero matrix, for which \| {\bf A}^{+} \|_2 = 0.
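A short sketch checking that \| {\bf A}^{+} \|_2 equals the reciprocal of the smallest non-zero singular value:

```python
import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.],
              [1., 8., 7.],
              [6., 4., 7.]])

s = np.linalg.svd(A, compute_uv=False)            # singular values, descending
sigma_r = s[s > 1e-12][-1]                        # smallest non-zero singular value
print(1.0 / sigma_r)                              # ~1/4.22807
print(np.linalg.norm(np.linalg.pinv(A), ord=2))   # same value
```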

2-Norm Condition Number

The 2-norm condition number of a matrix {\bf A} is given by the ratio of its largest singular value to its smallest singular value:

\text{cond}_2(A) = \|{\bf A}\|_2 \|{\bf A}^{-1}\|_2 = \sigma_{\max}/\sigma_{\min}.

If the matrix {\bf A} is rank deficient, i.e. \text{rank}({\bf A}) < \min(m,n), then \text{cond}_2({\bf A}) = \infty.
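A sketch of computing the 2-norm condition number from the singular values and comparing it with numpy.linalg.cond (which also accepts non-square matrices when p = 2):

```python
import numpy as np

A = np.array([[3., 2., 3.],
              [8., 8., 2.],
              [8., 7., 4.],
              [1., 8., 7.],
              [6., 4., 7.]])

s = np.linalg.svd(A, compute_uv=False)
print(s[0] / s[-1])            # sigma_max / sigma_min
print(np.linalg.cond(A, 2))    # same value
```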

Low-rank Approximation

The best rank-k approximation for a m \times n matrix {\bf A}, where k < s = \min(m,n), for some matrix norm \|.\|, is one that minimizes the following problem:

\begin{aligned} &\min_{ {\bf A}_k } \ \|{\bf A} - {\bf A}_k\| \\ &\textrm{such that} \quad \mathrm{rank}({\bf A}_k) \le k. \end{aligned}

Under the induced 2-norm, the best rank-k approximation is given by the sum of the first k outer products of the left and right singular vectors scaled by the corresponding singular values (where \sigma_1 \ge \dots \ge \sigma_s):

{\bf A}_k = \sigma_1 {\bf u}_1 {\bf v}_1^T + \dots + \sigma_k {\bf u}_k {\bf v}_k^T

Observe that, under the induced 2-norm, the error of the best rank-k approximation is the magnitude of the (k+1)^\text{th} singular value of the matrix:

\|{\bf A} - {\bf A}_k\|_2 = \left\|\sum_{i=k+1}^s \sigma_i {\bf u}_i {\bf v}_i^T\right\|_2 = \sigma_{k+1}

Note that the best rank-{k} approximation to {\bf A} can be stored efficiently by only storing the {k} singular values {\sigma_1,\dots,\sigma_k}, the {k} left singular vectors {\bf u_1,\dots,\bf u_k}, and the {k} right singular vectors {\bf v_1,\dots, \bf v_k}.

The figure below shows best rank-k approximations of an image (you can find the code snippet that generates these images in the IPython notebook):
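A minimal sketch of forming a rank-k approximation and checking that the 2-norm error equals \sigma_{k+1}; the matrix here is random, but the image figure above follows the same idea applied to a grayscale pixel array:

```python
import numpy as np

A = np.random.rand(100, 60)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 10
# Keep only the first k singular values and singular vectors
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print(np.linalg.matrix_rank(A_k))        # k
print(np.linalg.norm(A - A_k, ord=2))    # equals s[k], the (k+1)-th singular value
print(s[k])
```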

Review Questions

  1. For a matrix {\bf A} with SVD decomposition {\bf A} = {\bf U \Sigma V}^T, what are the columns of {\bf U} and how can we find them? What are the columns of {\bf V} and how can we find them? What are the entries of {\bf \Sigma} and how can we find them?
  2. What special properties are true of {\bf U}, {\bf V} and {\bf \Sigma}?
  3. What are the shapes of {\bf U}, {\bf V} and {\bf \Sigma} in the full SVD of an m \times n matrix?
  4. What are the shapes of {\bf U}, {\bf V} and {\bf \Sigma} in the reduced SVD of an m \times n matrix?
  5. What is the cost of computing the SVD?
  6. Given an already computed SVD of a matrix {\bf A}, what is the cost of using the SVD to solve a linear system {\bf A}\bf{x} = \bf{b}? How would you use the SVD to solve this system?
  7. How do you use the SVD to compute a low-rank approximation of a matrix? For a small matrix, you should be able to compute a given low rank approximation (i.e. rank-one, rank-two).
  8. Given the SVD of a matrix {\bf A}, what is the SVD of {\bf A}^+ (the pseudoinverse of {\bf A})?
  9. Given the SVD of a matrix {\bf A}, what is the 2-norm of the matrix? What is the 2-norm condition number of the matrix?
