A vector space is a set $V$ of vectors and a field $F$ (elements of $F$ are called scalars) with the following two operations: vector addition, which maps $u, v \in V$ to $u + v \in V$, and scalar multiplication, which maps $\alpha \in F$ and $v \in V$ to $\alpha v \in V$. These operations must satisfy the usual vector-space axioms: associativity and commutativity of addition, existence of a zero vector and additive inverses, compatibility and distributivity of scalar multiplication, and a multiplicative identity.
If there exists a set of vectors $v_1, v_2, \dots, v_n$ such that any vector $x \in V$ can be written as a linear combination

$$x = c_1 v_1 + c_2 v_2 + \dots + c_n v_n$$

with uniquely determined scalars $c_1, \dots, c_n$, the set $\{v_1, \dots, v_n\}$ is called a basis for $V$. The size of the basis, $n$, is called the dimension of $V$.
The standard example of a vector space is $V = \mathbb{R}^n$ with $F = \mathbb{R}$. Vectors in $\mathbb{R}^n$ are written as an array of numbers:

$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^T$$

The dimension of $\mathbb{R}^n$ is $n$. The standard basis vectors of $\mathbb{R}^n$ are written as
$$e_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \dots, \quad e_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}.$$

A set of vectors $v_1, \dots, v_k$ is called linearly independent if the equation $\alpha_1 v_1 + \alpha_2 v_2 + \dots + \alpha_k v_k = 0$ in the unknowns $\alpha_1, \dots, \alpha_k$ has only the trivial solution $\alpha_1 = \alpha_2 = \dots = \alpha_k = 0$. Otherwise the vectors are linearly dependent, and at least one of the vectors can be written as a linear combination of the others in the set. A basis is always linearly independent.
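As a quick numerical check, we can test linear independence by stacking the vectors as columns of a matrix and comparing its rank to the number of vectors. Below is a minimal numpy sketch, using made-up vectors for illustration:

```python
import numpy as np

# A hypothetical set of vectors in R^3, stacked as columns of a matrix.
v1 = np.array([1.0, 0.0, 2.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = np.array([1.0, 1.0, 3.0])   # v3 = v1 + v2, so the set is dependent

V = np.column_stack([v1, v2, v3])

# The vectors are linearly independent exactly when the rank of V
# equals the number of vectors.
print(np.linalg.matrix_rank(V) == V.shape[1])   # False
```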
Let $V$ be a real vector space. Then, an inner product is a function $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}$ (i.e., it takes two vectors and returns a real number) which satisfies the following four properties, where $u, v, w \in V$ and $\alpha, \beta \in \mathbb{R}$:

1) Symmetry: $\langle u, v \rangle = \langle v, u \rangle$

2) Linearity: $\langle \alpha u + \beta v, w \rangle = \alpha \langle u, w \rangle + \beta \langle v, w \rangle$

3) Non-negativity: $\langle u, u \rangle \geq 0$

4) Definiteness: $\langle u, u \rangle = 0$ if and only if $u = 0$
The inner product intuitively represents the similarity between two vectors. Two vectors u,v∈V are said to be orthogonal if ⟨u,v⟩=0.
The standard inner product on $\mathbb{R}^n$ is the dot product: $\langle x, y \rangle = x^T y = \sum_{i=1}^{n} x_i y_i$.
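For instance, a short numpy sketch (with made-up vectors) computing the dot product and checking orthogonality:

```python
import numpy as np

x = np.array([1.0, 2.0, -1.0])
y = np.array([2.0, 0.0, 2.0])

print(np.dot(x, y))       # 1*2 + 2*0 + (-1)*2 = 0.0
print(np.dot(x, y) == 0)  # True: x and y are orthogonal
```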
To read more: Inner Product Definition.
A function $f: V \to W$ between two vector spaces $V$ and $W$ is called linear if, for all $u, v \in V$ and all scalars $\alpha$:

1) $f(u + v) = f(u) + f(v)$

2) $f(\alpha v) = \alpha f(v)$

$f$ is commonly called a linear transformation.
If $n$ and $m$ are the dimensions of $V$ and $W$, respectively, then $f$ can be represented as an $m \times n$ rectangular array, or matrix,

$$A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix}.$$

The numbers in the matrix $A$ are determined by the basis vectors for the spaces $V$ and $W$. To see how, we first review matrix-vector multiplication.
Let A be an m×n matrix of real numbers. We can also write A∈Rm×n as shorthand. If x is a vector in Rn then the matrix-vector product Ax=b is a vector in Rm defined by:
$$b_i = \sum_{j=1}^{n} a_{ij} x_j \quad \text{for } i = 1, 2, \dots, m.$$

We can interpret matrix-vector multiplication in two ways. Throughout this online textbook reference, we will use the notation $a_i$ to refer to the $i$th column of the matrix $A$ and $a_i^T$ to refer to the $i$th row of the matrix $A$.
1) Writing a matrix-vector multiplication as inner products of the rows of $A$:

$$Ax = \begin{bmatrix} a_1^T \cdot x \\ a_2^T \cdot x \\ \vdots \\ a_m^T \cdot x \end{bmatrix}$$

2) Writing a matrix-vector multiplication as a linear combination of the columns of $A$:

$$Ax = x_1 a_1 + x_2 a_2 + \dots + x_n a_n = x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} + \dots + x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}$$
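The following numpy sketch (with a small made-up matrix) verifies that both interpretations agree with the built-in product `A @ x`:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
x = np.array([1.0, 0.0, -1.0])

# 1) Row interpretation: each entry of b is an inner product of a row with x.
b_rows = np.array([np.dot(A[i, :], x) for i in range(A.shape[0])])

# 2) Column interpretation: b is a linear combination of the columns of A.
b_cols = sum(x[j] * A[:, j] for j in range(A.shape[1]))

print(b_rows)                      # [-2. -2.]
print(b_cols)                      # [-2. -2.]
print(np.allclose(A @ x, b_rows))  # True
```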
It is this representation that allows us to express any linear transformation between finite-dimensional vector spaces with matrices.
Let $e_1, e_2, \dots, e_n$ be the standard basis of $\mathbb{R}^n$. If we define the vector $z_j = A e_j$, then using the interpretation of matrix-vector products as linear combinations of the columns of $A$, we have that:
$$z_j = A e_j = \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{bmatrix} = a_{1j} \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + a_{2j} \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} + \dots + a_{mj} \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} = \sum_{i=1}^{m} a_{ij} \hat{e}_i,$$

where we have written the standard basis of $\mathbb{R}^m$ as $\hat{e}_1, \hat{e}_2, \dots, \hat{e}_m$.

In other words, if $z_j = A e_j$ is written as a linear combination of the basis vectors of $\mathbb{R}^m$, the element $a_{ij}$ is the coefficient corresponding to $\hat{e}_i$.
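Equivalently, multiplying $A$ by the $j$th standard basis vector extracts the $j$th column of $A$, as this small numpy sketch (with a made-up matrix) illustrates:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

e2 = np.array([0.0, 1.0, 0.0])   # second standard basis vector of R^3

# Multiplying by e_j picks out the j-th column of A.
print(A @ e2)    # [2. 5.]
print(A[:, 1])   # [2. 5.]
```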
Let’s try an example. Suppose that V is a vector space with basis v1,v2,v3, and W is a vector space with basis w1,w2. Then V and W have dimension 3 and 2, respectively. Thus any linear transformation f:V→W can be represented by a 2×3 matrix. We can introduce column vector notation, so that vectors v=α1v1+α2v2+α3v3 and w=β1w1+β2w2 can be written as
$$v = \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{bmatrix}, \quad w = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}.$$

We have not specified what the vector spaces $V$ and $W$ are, but it is fine if we treat them like elements of $\mathbb{R}^3$ and $\mathbb{R}^2$.
Suppose that the following facts are known about the linear transformation $f$:

$$f(v_1) = w_1, \qquad f(v_2) = 5w_1 - w_2, \qquad f(v_3) = 2w_1 + 2w_2.$$
This is enough information to completely determine the matrix representation of f. The first equation tells us
$$w_1 = f(v_1) \implies \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix}.$$

So we know $a_{11} = 1$, $a_{21} = 0$. The second equation tells us that
$$5w_1 - w_2 = f(v_2) \implies \begin{bmatrix} 5 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} a_{12} \\ a_{22} \end{bmatrix}.$$

So we know $a_{12} = 5$, $a_{22} = -1$. Finally, the third equation tells us
$$2w_1 + 2w_2 = f(v_3) \implies \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 & 5 & a_{13} \\ 0 & -1 & a_{23} \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} a_{13} \\ a_{23} \end{bmatrix}.$$

Thus, $a_{13} = 2$, $a_{23} = 2$, and the linear transformation $f$ can be represented by the matrix:
$$\begin{bmatrix} 1 & 5 & 2 \\ 0 & -1 & 2 \end{bmatrix}.$$

It is important to note that the matrix representation not only depends on $f$, but also on our choice of basis. If we chose different bases for the vector spaces $V$ and $W$, the matrix representation of $f$ would change as well.
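As a sketch of this construction in numpy: the coordinate vectors of $f(v_1)$, $f(v_2)$, $f(v_3)$ from the example become the columns of the matrix. (The arrays below simply restate the example's data.)

```python
import numpy as np

# Coordinate vectors (in the basis w1, w2) of the images of the basis
# vectors v1, v2, v3, taken from the example above.
f_v1 = np.array([1.0, 0.0])    # f(v1) =   w1
f_v2 = np.array([5.0, -1.0])   # f(v2) = 5*w1 -   w2
f_v3 = np.array([2.0, 2.0])    # f(v3) = 2*w1 + 2*w2

# The j-th column of the matrix is the coordinate vector of f(vj).
F = np.column_stack([f_v1, f_v2, f_v3])
print(F)
# [[ 1.  5.  2.]
#  [ 0. -1.  2.]]
```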
The $m \times n$ zero matrix is denoted by $0_{mn}$ and has all entries equal to zero. For example, the $3 \times 4$ zero matrix is

$$0_{34} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$

The $n \times n$ identity matrix is denoted by $I_n$ and has all entries equal to zero except for the diagonal, which is all 1. For example, the $4 \times 4$ identity matrix is
$$I_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$

An $n \times n$ diagonal matrix has all entries equal to zero except for the diagonal entries. We typically use $D$ for diagonal matrices. For example, $4 \times 4$ diagonal matrices have the form:
$$\begin{bmatrix} d_{11} & 0 & 0 & 0 \\ 0 & d_{22} & 0 & 0 \\ 0 & 0 & d_{33} & 0 \\ 0 & 0 & 0 & d_{44} \end{bmatrix}.$$

A lower-triangular matrix is a square matrix that is entirely zero above the diagonal. We typically use $L$ for lower-triangular matrices. For example, $4 \times 4$ lower-triangular matrices have the form:
$$L = \begin{bmatrix} \ell_{11} & 0 & 0 & 0 \\ \ell_{21} & \ell_{22} & 0 & 0 \\ \ell_{31} & \ell_{32} & \ell_{33} & 0 \\ \ell_{41} & \ell_{42} & \ell_{43} & \ell_{44} \end{bmatrix}.$$

An upper-triangular matrix is a square matrix that is entirely zero below the diagonal. We typically use $U$ for upper-triangular matrices. For example, $4 \times 4$ upper-triangular matrices have the form:
$$U = \begin{bmatrix} u_{11} & u_{12} & u_{13} & u_{14} \\ 0 & u_{22} & u_{23} & u_{24} \\ 0 & 0 & u_{33} & u_{34} \\ 0 & 0 & 0 & u_{44} \end{bmatrix}.$$

Properties of triangular matrices: the product of two lower (upper) triangular matrices is lower (upper) triangular, and the determinant of a triangular matrix is the product of its diagonal entries.
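numpy provides constructors for these special matrices; below is a minimal sketch (the matrix entries are made up) that also checks one of the triangular-matrix properties above:

```python
import numpy as np

Z = np.zeros((3, 4))                 # 3x4 zero matrix
I = np.eye(4)                        # 4x4 identity matrix
D = np.diag([1.0, 2.0, 3.0, 4.0])    # 4x4 diagonal matrix

A = np.arange(1.0, 17.0).reshape(4, 4)
L = np.tril(A)                       # lower-triangular part of A
U = np.triu(A)                       # upper-triangular part of A

# Product of two lower-triangular matrices is lower-triangular.
print(np.allclose(L @ L, np.tril(L @ L)))   # True
```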
A permutation matrix is a square matrix that is all zero, except for a single entry in each row and each column which is 1. We typically use $P$ for permutation matrices. An example of a $4 \times 4$ permutation matrix is
$$P = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}.$$

The properties of a permutation matrix are: multiplying a vector by $P$ permutes the entries of the vector, and $P^{-1} = P^T$ (i.e., permutation matrices are orthogonal).
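A short numpy sketch checking these properties on the example matrix $P$ above:

```python
import numpy as np

# The 4x4 permutation matrix from the example above.
P = np.array([[0, 1, 0, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 0, 1, 0]], dtype=float)

x = np.array([10.0, 20.0, 30.0, 40.0])
print(P @ x)                            # [20. 40. 10. 30.]: entries permuted
print(np.allclose(P.T @ P, np.eye(4)))  # True: P^{-1} = P^T
```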
A matrix in block form is a matrix partitioned into blocks. A block is simply a submatrix. For example, consider
$$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$

where $A$, $B$, $C$, and $D$ are submatrices.
There are special matrices in block form as well. For instance, a block diagonal matrix is a block matrix whose off-diagonal blocks are zero matrices.
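In numpy, block matrices can be assembled with `np.block`; a minimal sketch with made-up blocks (chosen so the result is block diagonal):

```python
import numpy as np

A = np.ones((2, 2))
B = np.zeros((2, 3))
C = np.zeros((3, 2))
D = 2 * np.eye(3)

# Assemble a 5x5 matrix from blocks; here M is block diagonal,
# since the off-diagonal blocks B and C are zero matrices.
M = np.block([[A, B],
              [C, D]])
print(M.shape)   # (5, 5)
```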
The rank of a matrix is the number of linearly independent columns of the matrix. It can be shown that the matrix has the same number of linearly independent rows. If $A$ is an $m \times n$ matrix, then rank$(A) \leq \min(m, n)$; if rank$(A) = \min(m, n)$, we say $A$ has full rank.
A square n×n matrix A is invertible if there exists a square matrix B such that AB=BA=I, where I is the n×n identity matrix. The matrix B is denoted by A−1. A square matrix is invertible if and only if it has full rank. A square matrix that is not invertible is called a singular matrix.
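A minimal numpy sketch (with made-up matrices) illustrating the connection between rank and invertibility:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

print(np.linalg.matrix_rank(A))  # 2, so A has full rank and is invertible
Ainv = np.linalg.inv(A)
print(np.allclose(A @ Ainv, np.eye(2)))  # True

S = np.array([[1.0, 2.0],
              [2.0, 4.0]])       # second row is twice the first
print(np.linalg.matrix_rank(S))  # 1: S is singular (not invertible)
```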
A vector norm is a function $\| \cdot \| : V \to \mathbb{R}_0^+$ (i.e., it takes a vector and returns a nonnegative real number) that satisfies the following properties, where $u, v \in V$ and $\alpha \in \mathbb{R}$:

1) Positivity: $\|u\| \geq 0$, and $\|u\| = 0$ if and only if $u = 0$

2) Scaling: $\|\alpha u\| = |\alpha| \, \|u\|$

3) Triangle inequality: $\|u + v\| \leq \|u\| + \|v\|$
A norm is a generalization of “absolute value” and measures the “magnitude” of the input vector.
The p-norm is defined as

$$\|w\|_p = \left( \sum_{i=1}^{N} |w_i|^p \right)^{\frac{1}{p}}.$$
The definition is a valid norm when $p \geq 1$. If $0 < p < 1$, then it is not a valid norm because it violates the triangle inequality.
When p=2 (2-norm), this is called the Euclidean norm and it corresponds to the length of the vector.
Consider the vector $w = [-3, 5, 0, 1]^T$. We will show how to calculate the 1-, 2-, and $\infty$-norms of $w$.
For the 1-norm:
$$\|w\|_1 = \left( \sum_{i=1}^{N} |w_i|^1 \right)^{\frac{1}{1}} = \sum_{i=1}^{N} |w_i| = |-3| + |5| + |0| + |1| = 3 + 5 + 0 + 1 = 9$$

For the 2-norm:
$$\|w\|_2 = \left( \sum_{i=1}^{N} |w_i|^2 \right)^{\frac{1}{2}} = \sqrt{\sum_{i=1}^{N} w_i^2} = \sqrt{(-3)^2 + 5^2 + 0^2 + 1^2} = \sqrt{9 + 25 + 0 + 1} = \sqrt{35} \approx 5.92$$

For the $\infty$-norm:
$$\|w\|_\infty = \lim_{p \to \infty} \left( \sum_{i=1}^{N} |w_i|^p \right)^{\frac{1}{p}} = \max_{i=1,\dots,N} |w_i| = \max(|-3|, |5|, |0|, |1|) = \max(3, 5, 0, 1) = 5$$
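All three values can be verified directly with `np.linalg.norm`:

```python
import numpy as np

w = np.array([-3.0, 5.0, 0.0, 1.0])

print(np.linalg.norm(w, 1))       # 9.0
print(np.linalg.norm(w, 2))       # 5.916... = sqrt(35)
print(np.linalg.norm(w, np.inf))  # 5.0
```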
A general matrix norm is a real-valued function $\|A\|$ that satisfies the following properties:

1) Positivity: $\|A\| \geq 0$, and $\|A\| = 0$ if and only if $A = 0$

2) Scaling: $\|\alpha A\| = |\alpha| \, \|A\|$

3) Triangle inequality: $\|A + B\| \leq \|A\| + \|B\|$

Induced (or operator) matrix norms are associated with a specific vector norm $\| \cdot \|$ and are defined as:
$$\|A\| := \max_{\|x\| = 1} \|Ax\|.$$

An induced matrix norm is a particular type of a general matrix norm. Induced matrix norms tell us the maximum amplification of the norm of any vector when multiplied by the matrix. Note that the definition above is equivalent to
$$\|A\| = \max_{\|x\| \neq 0} \frac{\|Ax\|}{\|x\|}.$$

In addition to the properties above of general matrix norms, induced matrix norms also satisfy the submultiplicative conditions:
$$\|Ax\| \leq \|A\| \, \|x\|$$

$$\|AB\| \leq \|A\| \, \|B\|$$

The Frobenius norm is the square root of the sum of the squares of every entry of the matrix, which is equivalent to applying the vector 2-norm to the flattened matrix,
$$\|A\|_F = \sqrt{\sum_{i,j} a_{ij}^2}.$$

The Frobenius norm is an example of a general matrix norm that is not an induced norm.
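A quick numpy sketch (with a made-up matrix) confirming that the Frobenius norm equals the 2-norm of the flattened matrix:

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

print(np.linalg.norm(A, 'fro'))        # sqrt(1 + 4 + 9 + 16) = sqrt(30)
print(np.linalg.norm(A.flatten(), 2))  # same value
```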
The matrix p-norm is induced by the p-norm of a vector. It is defined as $\|A\|_p := \max_{\|x\|_p = 1} \|Ax\|_p$. There are three special cases:
For the 1-norm, this reduces to the maximum absolute column sum of the matrix, i.e.,

$$\|A\|_1 = \max_j \sum_{i=1}^{m} |a_{ij}|.$$

For the 2-norm, this reduces to the maximum singular value of the matrix.
$$\|A\|_2 = \max_k \sigma_k$$

For the $\infty$-norm, this reduces to the maximum absolute row sum of the matrix.
$$\|A\|_\infty = \max_i \sum_{j=1}^{n} |a_{ij}|.$$

Now we will go through a few examples with a matrix $C$, defined below.
$$C = \begin{bmatrix} 3 & -2 \\ -1 & 3 \end{bmatrix}$$

For the 1-norm:
$$\|C\|_1 = \max_{\|x\|_1 = 1} \|Cx\|_1 = \max_{1 \leq j \leq 2} \sum_{i=1}^{2} |C_{ij}| = \max(|3| + |-1|, |-2| + |3|) = \max(4, 5) = 5$$

For the 2-norm:
The singular values are the square roots of the eigenvalues of the matrix $C^T C$. You can also find the maximum singular value by calculating the Singular Value Decomposition of the matrix.
$$\|C\|_2 = \max_{\|x\|_2 = 1} \|Cx\|_2$$

$$\det(C^T C - \lambda I) = 0$$

$$\det\left( \begin{bmatrix} 3 & -1 \\ -2 & 3 \end{bmatrix} \begin{bmatrix} 3 & -2 \\ -1 & 3 \end{bmatrix} - \lambda I \right) = 0$$

$$\det\left( \begin{bmatrix} 9+1 & -6-3 \\ -6-3 & 4+9 \end{bmatrix} - \lambda I \right) = 0$$

$$\det\left( \begin{bmatrix} 10 - \lambda & -9 \\ -9 & 13 - \lambda \end{bmatrix} \right) = 0$$

$$(10 - \lambda)(13 - \lambda) - 81 = 0$$

$$\lambda^2 - 23\lambda + 130 - 81 = 0$$

$$\lambda^2 - 23\lambda + 49 = 0$$

$$\left( \lambda - \tfrac{1}{2}(23 + 3\sqrt{37}) \right) \left( \lambda - \tfrac{1}{2}(23 - 3\sqrt{37}) \right) = 0$$

$$\|C\|_2 = \sqrt{\lambda_{\max}} = \sqrt{\tfrac{1}{2}(23 + 3\sqrt{37})} \approx 4.54$$

For the $\infty$-norm:
$$\|C\|_\infty = \max_{\|x\|_\infty = 1} \|Cx\|_\infty = \max_{1 \leq i \leq 2} \sum_{j=1}^{2} |C_{ij}| = \max(|3| + |-2|, |-1| + |3|) = \max(5, 4) = 5$$
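Finally, all three matrix norms of $C$ can be checked with `np.linalg.norm`, and the 2-norm against the singular values:

```python
import numpy as np

C = np.array([[ 3.0, -2.0],
              [-1.0,  3.0]])

print(np.linalg.norm(C, 1))       # 5.0  (max absolute column sum)
print(np.linalg.norm(C, np.inf))  # 5.0  (max absolute row sum)
print(np.linalg.norm(C, 2))       # 4.541...: the largest singular value
print(np.linalg.svd(C, compute_uv=False))  # singular values of C
```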