Home About us Products Services Contact us Bookmark
:: wikimiki.org ::
Diagonal Matrix

Diagonal matrix

In linear algebra, a diagonal matrix is a square matrix in which the entries outside the main diagonal are all zero. The diagonal entries themselves may or may not be zero. Thus, the matrix D = (di,j) with n columns and n rows is diagonal if: :d_ = 0 \mbox i \ne j \qquad \forall i,j \in \ For example, the following matrix is diagonal: :\begin 1 & 0 & 0\\ 0 & 4 & 0\\ 0 & 0 & -3\end The term diagonal matrix may sometimes refer to a rectangular m-by-n matrix in case only the entries of the form ai,i are non-zero. However, in this article we will consider only square matrices. Any diagonal matrix is also a symmetric matrix. Also, if the entries come from the field R or C, then it is a normal matrix as well. Equivalently, we can define a diagonal matrix as a matrix that is both upper- and lower-triangular. The identity matrix In and any square zero matrix are diagonal. A one-dimensional matrix is always diagonal. A diagonal matrix with all its main diagonal entries equal is a scalar matrix, that is, a scalar multiple λI of the identity matrix I. Its effect on a vector is scalar multiplication by λ. The scalar matrices are the center of the algebra of matrices: that is, they are precisely the matrices that commute with all other square matrices of the same size.

Matrix operations

The operations of matrix addition and matrix multiplication are especially simple for diagonal matrices. Write diag(a1,...,an) for a diagonal matrix whose diagonal entries starting in the upper left corner are a1,...,an. Then, for addition, we have :diag(a1,...,an) + diag(b1,...,bn) = diag(a1+b1,...,an+bn) and for matrix multiplication, :diag(a1,...,an) · diag(b1,...,bn) = diag(a1b1,...,anbn). The diagonal matrix diag(a1,...,an) is invertible if and only if the entries a1,...,an are all non-zero. In this case, we have :diag(a1,...,an)-1 = diag(a1-1,...,an-1). In particular, the diagonal matrices form a subring of the ring of all n-by-n matrices. Multiplying the matrix A from the left with diag(a1,...,an) amounts to multiplying the i-th row of A by ai for all i; multiplying the matrix A from the right with diag(a1,...,an) amounts to multiplying the i-th column of A by ai for all i.

Other properties

The eigenvalues of diag(a1, ..., an) are a1, ..., an. The unit vectors e1, ..., en form a basis of eigenvectors. The determinant of diag(a1, ..., an) is the product a1...an. The adjugate of a diagonal matrix is again diagonal. A square matrix is diagonal if and only if it is triangular and normal.

Uses

Diagonal matrices occur in many areas of linear algebra. Because of the simple description of the matrix operation and eigenvalues/eigenvectors given above, it is always desirable to represent a given matrix or linear map by a diagonal matrix. In fact, a given n-by-n matrix A is similar to a diagonal matrix (meaning that there is a matrix X such that XAX-1 is diagonal) if and only if it has n linearly independent eigenvectors. Such matrices are said to be diagonalizable. Over the field of real or complex numbers, more is true. The spectral theorem says that every normal matrix is unitarily similar to a diagonal matrix (if AA
-
= A
-
A then there exists a unitary matrix U such that UAU
-
is diagonal). Furthermore, the singular value decomposition implies that for any matrix A, there exist unitary matrices U and V such that UAV
-
is diagonal with positive entries.

External links


-

References


- Roger A. Horn and Charles R. Johnson, Matrix Analysis, Cambridge University Press, 1985. ISBN 0-521-30586-1 (hardback), ISBN 0-521-38632-2 (paperback). Category:Matrices ja:対角行列



Main diagonal

In linear algebra, the main diagonal of a square matrix is the diagonal which runs from the top left corner to the bottom right corner. For example, the following matrix has 1s down its main diagonal: :\begin 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end A square matrix like the above in which the entries outside the main diagonal are all zero is called a diagonal matrix. The sum of the entries on the main diagonal of a matrix is known as the trace of that matrix. Category:Linear algebra

Field (mathematics)

In abstract algebra, a field is an algebraic structure in which the operations of addition, subtraction, multiplication and division (except division by zero) may be performed, and the same rules hold which are familiar from the arithmetic of ordinary numbers.

Introduction

Fields are important objects of study in algebra since they provide a useful generalization of many number systems, such as the rational numbers, real numbers, and complex numbers. In particular, the usual rules of associativity, commutativity and distributivity hold. Fields also appear in many other areas of mathematics; see the examples below. When abstract algebra was first being developed, the definition of a field usually did not include commutativity of multiplication, and what we today call a field would have been called either a commutative field or a rational domain. In contemporary usage, a field is always commutative. A structure which satisfies all the properties of a field except for commutativity, is today called a division ring or sometimes a skew field, but also non-commutative field is still widely used. Other languages have retained the old usage: for example, in Italian and French, division rings are called corpo and corps, both literally meaning 'body'. Instead, in German and Spanish, Körper (whence the blackboard bold K used to denote a field) and cuerpo mean 'field'. Notice that French language has no single word for field, they are simply called corps commutatif. Italian for field is campo, with the same literal meaning as English. The concept of a field is of use, for example, in defining vectors and matrices, two structures in linear algebra whose components can be elements of an arbitrary field. Galois theory studies the symmetry of equations by investigating the ways in which fields can be contained in each other. See field theory for more information.

Definition

A field is a commutative ring (F, +,
- ) such that 0 does not equal 1 and all elements of F except 0 have a multiplicative inverse. Spelled out, this means that the following hold: ; Closure of F under + and
- : For all a, b belonging to F, both a + b and a
- b belong to F (or more formally, + and
- are binary operations on F). ; Both + and
- are associative : For all a, b, c in F, a + (b + c) = (a + b) + c and a
- (b
- c) = (a
- b)
- c. ; Both + and
- are commutative : For all a, b belonging to F, a + b = b + a and a
- b = b
- a. ; The operation
- is distributive over the operation + : For all a, b, c, belonging to F, a
- (b + c) = (a
- b) + (a
- c). ; Existence of an additive identity : There exists an element 0 in F, such that for all a belonging to F, a + 0 = a. ; Existence of a multiplicative identity : There exists an element 1 in F different from 0, such that for all a belonging to F, a
- 1 = a. ; Existence of additive inverses : For every a belonging to F, there exists an element −a in F, such that a + (−a) = 0. ; Existence of multiplicative inverses : For every a ≠ 0 belonging to F, there exists an element a−1 in F, such that a
- a−1 = 1. The requirement 0 ≠ 1 ensures that the set which only contains a single element is not a field. Directly from the axioms, one may show that (F, +) and (F − ,
- ) are commutative groups (abelian groups) and that therefore (see elementary group theory) the additive inverse −a and the multiplicative inverse a−1 are uniquely determined by a. Furthermore, the multiplicative inverse of a product is equal to the product of the inverses: :(a
- b
)−1 = b−1
- a−1 = a−1
- b−1 provided both a and b are non-zero. Other useful rules include :−a = (−1)
- a and more generally :−(a
- b
) = (−a)
- b = a
- (−b) as well as :a
- 0 = 0, all rules familiar from elementary arithmetic. If the requirement of commutativity of the operation
- is dropped, one distinguishes the above commutative fields from non-commutative fields, usually called division rings or skew fields).

Examples of fields


- The complex numbers C, under the usual operations of addition and multiplication. The field of complex numbers contains the following subfields (a subfield of a field F is a set containing 0 and 1, closed under the operations + and
- of F and with its own operations defined by restriction):
  - The rational numbers Q = where Z is the set of integers. The rational number field contains no proper subfields.
  - An algebraic number field is a finite field extension of the rational numbers Q, that is, a field containing Q which has finite dimension as a vector space over Q. Such fields are very important in number theory.
  - The field of algebraic numbers, the algebraic closure of Q.
  - The real numbers R, under the usual operations of addition and multiplication. When the real numbers are given the usual ordering, they form a complete ordered field which is categorical — it is this structure that provides the foundation for most formal treatments of calculus.
    - The real numbers contain several interesting subfields: the real algebraic numbers, the computable numbers, and the definable numbers.
- If q > 1 is a power of a prime number, then there exists (up to isomorphism) exactly one finite field with q elements, usually denoted Fq, Z/qZ, or GF(q). Every other finite field is isomorphic to one of these fields. Such fields are often called a Galois field, whence the notation GF(q).
  - In particular, for a given prime number p, the set of integers modulo p is a finite field with p elements: Fp = where the operations are defined by performing the operation in Z, dividing by p and taking the remainder; see modular arithmetic.
    - Taking p = 2, we obtain the smallest field, F2, which has only two elements: 0 and 1. It can be defined by the two Cayley tables + 0 1
-
0 1 0 0 1 0 0 0 1 1 0 1 0 1 ::::This field has important uses in computer science, especially in cryptography and coding theory.
- The rational numbers can be extended to the fields of p-adic numbers for every prime number p. These fields are very important in both number theory and mathematical analysis.
- Let E and F be two fields with E a subfield of F. Let x be an element of F not in E. Then E(x) is defined to be the smallest subfield of F containing E and x. We call E(x) a simple extension of E. For instance, Q(i) is the number field of complex numbers C consisting of all numbers of the form a + bi where both a and b are rational numbers. In fact, it can be shown that every number field is a simple extension of Q.
- For a given field F, the set F(X) of rational functions in the variable X with coefficients in F is a field; this is defined as the set of quotients of polynomials with coefficients in F. This is the simplest example of a transcendental extension.
- If F is a field, and p(X) is an irreducible polynomial in the polynomial ring F[X], then the quotient F[X]/<p(X)> is a field with a subfield isomorphic to F. For instance, R[X]/<X2 + 1> is a field (in fact, it is isomorphic to the field of complex numbers). It can be shown that every simple algebraic extension of F is isomorphic to a field of this form.
- When F is a field, the set F((X)) of formal Laurent series over F is a field.
- If V is an algebraic variety over F, then the rational functions VF form a field, the function field of V.
- If S is a Riemann surface, then the meromorphic functions S → C form a field.
- If I is an index set, U is an ultrafilter on I, and Fi is a field for every i in I, the ultraproduct of the Fi (using U) is a field.
- Hyperreal numbers and superreal numbers extend the real numbers with the addition of infinitesimal and infinite numbers. There are also proper classes with field structure, which are sometimes called Fields, with a capital F:
- The surreal numbers form a Field containing the reals, and would be a field except for the fact that they are a proper class, not a set. The set of all surreal numbers with birthday smaller than some inaccessible cardinal form a field.
- The nimbers form a Field. The set of nimbers with birthday smaller than 2^, the nimbers with birthday smaller than any infinite cardinal are all examples of fields.

Some first theorems


- The set of non-zero elements of a field F (typically denoted by F×) is an abelian group under multiplication. Every finite subgroup of F× is cyclic.
- The characteristic of any field is zero or a prime number. (The characteristic is defined as follows: the smallest positive integer n such that n·1 = 0, or zero if no such n exists; here n·1 stands for n summands 1 + 1 + 1 + ... + 1. An equivalent definition is the following: the characteristic of a field F is the unique non-negative generator of the kernel of the unique ring homomorphism Z → F which sends 1 |-> 1.)
- The number of elements of any finite field is a prime power.
- As a ring, a field has no ideals except and itself.
- Assuming the axiom of choice, for every field F, there exists a unique field G (up to isomorphism) which contains F, is algebraic over F, and is algebraically closed. G is called the algebraic closure of F.

See also


- field theory for some history and other information.
- Glossary of field theory for more definitions in field theory. Category:Field theory ja:体 (数学) ko:체 (수학)

Normal matrix

A complex square matrix A is a normal matrix iff :A^A=AA^ where A
-
is the conjugate transpose of A (if A is a real matrix, this is the same as the transpose of A).

Examples

All unitary, hermitian, skew-hermitian and positive definite matrices are normal. If A is unitary A
-
A=AA
-
=I. If A is hermitian, then A
-
=A and so AA
-
=AA=A
-
A. But it is not the case that all normal matrices are either unitary, hermitian, or positive definite, for example, :\begin -i & -i & 0 \\ -i & i & 0 \\ 0 & 0 & 1 \end is normal since :\begin -i & -i & 0 \\ -i & i & 0 \\ 0 & 0 & 1 \end\begin -i & -i & 0 \\ -i & i & 0 \\ 0 & 0 & 1 \end^
- = \begin -i & -i & 0 \\ -i & i & 0 \\ 0 & 0 & 1 \end\begin i & i & 0 \\ i & -i & 0 \\ 0 & 0 & 1 \end := \begin 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end = \begin i & i & 0 \\ i & -i & 0 \\ 0 & 0 & 1 \end\begin -i & -i & 0 \\ -i & i & 0 \\ 0 & 0 & 1 \end = \begin -i & -i & 0 \\ -i & i & 0 \\ 0 & 0 & 1 \end^
- \begin -i & -i & 0 \\ -i & i & 0 \\ 0 & 0 & 1 \end but the matrix is clearly not unitary, nor is it hermitian, nor positive definite.

Consequences

It is useful to think of normal matrices in analogy to complex numbers, invertible normal matrices in analogy to non-zero complex numbers, the conjugate transpose in analogy to the complex conjugate, unitary matrices in analogy to complex numbers of absolute value 1, hermitian matrices in analogy to real numbers and positive definite matrices in analogy to positive real numbers. The concept of normality is mainly important because normal matrices are precisely the ones to which the spectral theorem applies; in other words, normal matrices are precisely those matrices that can be represented by a diagonal matrix with respect to a properly chosen orthonormal basis of Cn. Phrased differently: a matrix is normal if and only if its eigenspaces span Cn and are pairwise orthogonal with respect to the standard inner product of Cn. In general, the sum or product of two normal matrices need not be normal. However, if A and B are normal with AB = BA, then both AB and A + B are also normal and furthermore we can simultaneously diagonalize A and B in the following sense: there exists a unitary matrix U such UAU
-
and UBU
-
are both diagonal matrices. In this case, the columns of U
-
are eigenvectors of both A and B and form an orthonormal basis of Cn. If A is both a triangular matrix and a normal matrix, then A is diagonal. This can be seen by looking at the diagonal entries of A
-
A and AA
-
, where A is a normal, triangular matrix. If A is an invertible normal matrix, then there exists a unitary matrix U and a positive definite matrix R such that A = RU = UR. The matrices R and U are uniquely determined by A. This statement can be seen as an analog (and generalization) of the polar representation of non-zero complex numbers. The concept of normal matrices can be generalized to normal operators on Hilbert spaces and to normal elements in C-star algebras. Category:Matrices

Triangular matrix

In the mathematical discipline of linear algebra, a triangular matrix is a special kind of square matrix where the entries below or above the main diagonal are zero. Because matrix equations with triangular matrices are easy to solve they are very important in numerical analysis. The LU decomposition gives an algorithm to decompose any invertible matrix A into a normed lower triangle matrix L and an upper triangle matrix U.

Definition

A matrix : \mathbf= \begin l_ & & & & 0 \\ l_ & l_ & & & \\ l_ & l_ & \ddots & & \\ \vdots & \vdots & \ddots & \ddots & \\ l_ & l_ & \ldots & l_ & l_ \end is called lower triangular matrix or left triangular matrix, and analogously a matrix of the form : \mathbf = \begin u_ & u_ & u_ & \ldots & u_ \\ & u_ & u_ & \ldots & u_ \\ & & \ddots & \ddots & \vdots \\ & & & \ddots & u_\\ 0 & & & & u_ \end is called upper triangular matrix or right triangular matrix. A triangular matrix with zero entries on the main diagonal is strictly upper or lower triangular. All strictly triangular matrices are nilpotent. If the entries on the main diagonal are 1, the matrix is termed unit upper/lower or normed upper/lower triangular. If, in addition, all the off-diagonal entries are zero except for the entries in one column, then the matrix is atomic upper/lower triangular; such a matrix is also called a Gauss (transformation) matrix. So an atomic lower triangular matrix is of the form : \mathbf_ = \begin 1 & & & & & 0 \\ & \ddots & & & & \\ & & 1 & & & \\ & & l_ & \ddots & & \\ & & \vdots & & \ddots & \\ 0 & & l_ & & & 1 \\ \end. The inverse of an atomic triangular matrix is again atomic triangular. Indeed, we have : \mathbf_^ = \begin 1 & & & & & 0 \\ & \ddots & & & & \\ & & 1 & & & \\ & &-l_ & \ddots & & \\ & & \vdots & & \ddots & \\ 0 & & -l_ & & & 1 \\ \end, i.e. the off-diagonal entries are replaced by their opposites.

Notes

A matrix which is simultaneously upper and lower triangular is diagonal. The identity matrix is the only matrix which is both normed upper and lower triangular. A matrix which is simultaneously triangular and normal, is also diagonal. This can be seen by looking at the diagonal entries of A
-
A and AA
-
, where A is a normal, triangular matrix. The transpose of a upper triangular matrix is a lower triangular matrix and vice versa. The determinant of a triangular matrix equals the product of the diagonal entries, and the eigenvalues of a triangular matrix are the diagonal entries. The variable L is commonly used for lower triangular matrix, standing for lower/left, while the variable U or R is commonly used for upper triangular matrix, standing for upper/right. Generally, operations can be performed on triangular matrices within half of the time that is needed for the same operation on a general matrix.

Generalizations

The product of two upper triangular matrices is upper triangular, so the set of upper triangular matrices forms an algebra. Algebras of upper triangular matrices have a natural generalization in functional analysis which yields nest algebras on Hilbert spaces. The set of invertible triangular matrices form a group, and is a subgroup of all invertible matrices. The set of 2 by 2 triangular matrices is called the parabolic subgroup; 3 by 3 and larger normed triangular matrices form the Heisenberg group. Both are examples of a Borel subgroup.

Examples

The matrix : \begin 1 & 4 & 2 \\ 0 & 3 & 4 \\ 0 & 0 & 1 \\ \end is upper triangular and : \begin 1 & 0 & 0 \\ 2 & 8 & 0 \\ 4 & 9 & 7 \\ \end is lower triangular. The matrix : \begin 1 & 0 & 0 \\ 4 & 1 & 0 \\ 2 & 0 & 1 \\ \end is atomic lower triangular and its inverse is : \begin 1 & 0 & 0 \\ -4 & 1 & 0 \\ -2 & 0 & 1 \\ \end.

Application

A matrix equation in the form :\mathbf\mathbf = \mathbf or :\mathbf \mathbf = \mathbf is very easy to solve. The matrix equation Lx = b can be written as a system of linear equations : \begin l_ x_1 & & & & & = & b_1 \\ l_ x_1 & + & l_ x_2 & & & = & b_2 \\ \vdots & & \vdots & \ddots & & & \vdots \\ l_ x_1 & + & l_ x_2 & + \ldots + & l_ x_m & = & b_m \\ \end which can be solved by the following recursive relation : x_1 = \frac, : x_2 = \frac, :: \vdots : x_m = \frac. A matrix equation with an upper triangular matrix U can be solved in an analogous way.

See also


- Gaussian elimination
- LU decomposition
- QR decomposition
- Hessenberg matrix Category:Numerical linear algebra Category:Matrices ja:三角行列

Identity matrix

In linear algebra, the identity matrix of size n is the n-by-n square matrix with ones on the main diagonal and zeros elsewhere. It is denoted by In, or simply by I if the size is immaterial or can be trivially determined by the context. : I_1 = \begin 1 \end ,\ I_2 = \begin 1 & 0 \\ 0 & 1 \end ,\ I_3 = \begin 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end ,\ \cdots ,\ I_n = \begin 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end The important property of In is that :AIn = A   and   InB = B whenever these matrix multiplications are defined. In particular, the identity matrix serves as the unit of the ring of all n-by-n matrices, and as the identity element of the general linear group GL(n) consisting of all invertible n-by-n matrices. (The identity matrix itself is obviously invertible, being its own inverse.) Where n-by-n matrices are used to represent linear transformations from an n-dimensional vector space to itself, In represents the identity function, regardless of the basis. The ith column of an identity matrix is the unit vector ei. The unit vectors are also the eigenvectors of the identity matrix, all corresponding to the eigenvalue 1, which is therefore the only eigenvalue and has multiplicity n. It follows that the determinant of the identity matrix is 1 and the trace is n. Using the notation that is sometimes used to concisely describe diagonal matrices, we can write: : I_n = \mathrm(1,1,...,1) It can also be written using the Kronecker delta notation: :(I_n)_ = \delta_

External link


- Category:Abstract algebra Category:Linear algebra Category:Matrices ja:単位行列

Zero matrix

In mathematics, particularly linear algebra, a zero matrix is a matrix with all its entries being zero. There is exactly one zero matrix of any given size m×n having entries in a given ring, so when the context is clear one often refers to the zero matrix. The zero matrix represents the linear transformation sending all vectors to the zero vector.

See also


- Identity matrix Category:Matrices

Scalar multiplication

In mathematics, scalar multiplication is one of the basic operations defining a vector space in linear algebra (or more generally, a module in abstract algebra). Note that scalar multiplication is different than scalar product which is an inner product between two vectors. More specifically, if K is a field and V is a vector space over K, then scalar multiplication is a function from K × V to V. The result of applying this function to c in K and v in V is cv. Scalar multiplication obeys the following rules (vector in boldface):
- Left distributivity: (c + d)v = cv + dv;
- Right distributivity: c(v + w) = cv + cw;
- Associativity: (cd)v = c(dv);
- Multiplying by 1 does not change a vector: 1v = v;
- Multiplying by 0 gives the null vector: 0v = 0;
- Multiplying by -1 gives the additive inverse: (-1)v = -v. Here + is addition either in the field or in the vector space, as appropriate; and 0 is the additive identity in either. Juxtaposition indicates either scalar multiplication or the multiplication operation in the field. Scalar multiplication may be viewed as an external binary operation or as an action of the field on the vector space. A geometric interpretation to scalar multiplication is a stretching or shrinking of a vector. As a special case, V may be taken to be K itself and scalar multiplication may then be taken to be simply the multiplication in the field. When V is Kn, then scalar multiplication is defined component-wise. The same idea goes through with no change if K is a commutative ring and V is a module over K. K can even be a rig, but then there is no additive inverse. If K is not commutative, then the only change is that the order of the multiplication may be reversed from what we've written above.

See also


- Statics
- Mechanics
- Product (mathematics) Category:linear algebraCategory:abstract algebra

Matrix multiplication

This article gives an overview of the various ways to multiply matrices.

Ordinary matrix product

By far the most important way to multiply matrices is the usual matrix multiplication. It is defined between two matrices only if the number of columns of the first matrix is the same as the number of rows of the second matrix. If A is an m-by-n matrix and B is an n-by-p matrix, then their product A×B is an m-by-p matrix given by : (A\times B)_ = \sum_^n a_b_ = a_b_ + a_b_ + \cdots + a_b_. for each pair i and j. The algebraic system of "matrix units" summarises the abstract properties of this kind of multiplication. The following picture shows how to calculate the (AB)12 element of A×B if A is a 2×4 matrix, and B is a 4×3 matrix. Elements from each matrix are paired off in the direction of the arrows; each pair is multiplied and the products are added. The location of the resulting number in AB corresponds to the row and column that were considered. :Image:Matrix multiplication diagram.PNG :(A\times B)_ = \sum_^4 a_b_ = a_b_+a_b_+a_b_+a_b_

The proportions-vectors method

There is a simpler concept for multiplying matrices. Suppose you want to mix vectors together in different proportions. The matrix on the left is the list of proportions. The matrix on the right is the list of vectors. Here's how it works : : \begin 1 & 0 & 2 \\ -1 & 3 & 1 \end \times \begin 3 & 1 \\ 2 & 1 \\ 1 & 0 \end Use vector notation : : = \begin \begin 1 & 0 & 2 \\ -1 & 3 & 1 \end \begin (Mix_1) \\ (Mix_2) \end & \times & \begin \begin 3 & 1 \end \\ \begin 2 & 1 \end \\ \begin 1 & 0 \end \end \begin (\vec) \\ (\vec) \\ (\vec) \end \end The first proportion in each mix tells how much of the first vector to use and so on : : = \begin \;\;1 \cdot \begin 3 & 1 \end & + & 0 \cdot \begin 2 & 1 \end & + & 2 \cdot \begin 1 & 0 \end \\ -1 \cdot \begin 3 & 1 \end & + & 3 \cdot \begin 2 & 1 \end & + & 1 \cdot \begin 1 & 0 \end \end \begin (Mix_1) \\ (Mix_2) \end Simplify : : = \begin \begin 5 & 1 \end \\ \begin 4 & 2 \end \end \begin (Mix_1) \\ (Mix_2) \end Remove the vector notation : : = \begin 5 & 1 \\ 4 & 2 \end Matrix multiplication is not commutative (that is, A×BB×A), except in special cases. It's easy to see why: you can't expect to switch the proportions with the vectors and get the same result. It's also easy to see why the number of columns in the proportions matrix has to be the same as the number of rows in the vectors matrix: they have to represent the same number of vectors. This notion of multiplication is important because if A and B are interpreted as linear transformations (which is almost universally done), then the matrix product AB corresponds to the composition of the two linear transformations, with B being applied first. The complexity of matrix multiplication, if carried out naively, is O(n³), but more efficient algorithms do exist. Strassen's algorithm, devised by Volker Strassen in 1969 and often referred to as "fast matrix multiplication", uses a mapping of bilinear combinations to reduce complexity to O(nlog2(7)) (approximately O(n2.807...)). In practice, though, it is rarely used since it is awkward to implement, lacking numerical stability. The constant factor involved is about 4.695 asymptotically; Winograd's method improves on this slightly by reducing it to an asymptotic 4.537. The best algorithm currently known, which was presented by Don Coppersmith and S. Winograd in 1990, has an asymptotic complexity of O(n2.376). It has been shown that the leading exponent must be at least 2.

Scalar multiplication

The scalar multiplication of a matrix A=(aij) and a scalar r gives the product : rA=(r aij). If we are concerned with matrices over a ring, then the above multiplication is sometimes called the left multiplication while the right multiplication is defined to be : Ar=(aij r). When the underlying ring is commutative, for example, the real or complex number field, the two multiplications are the same. However, if the ring is not commutative, such as the quaternions, they may be different. For example : i\begin i & 0 \\ 0 & j \\ \end = \begin -1 & 0 \\ 0 & k \\ \end \ne \begin -1 & 0 \\ 0 & -k \\ \end = \begin i & 0 \\ 0 & j \\ \endi

Hadamard product

For two matrices of the same dimensions, we have the Hadamard product or entrywise product. The Hadamard product of two m-by-n matrices A and B, denoted by A · B, is an m-by-n matrix given by (A·B)[i,j]=A[i,j]B[i,j]. For instance : \begin 1 & 3 & 2 \\ 1 & 0 & 0 \\ 1 & 2 & 2 \end \cdot \begin 0 & 0 & 2 \\ 7 & 5 & 0 \\ 2 & 1 & 1 \end = \begin 1 \cdot 0 & 3 \cdot 0 & 2 \cdot 2 \\ 1 \cdot 7 & 0 \cdot 5 & 0 \cdot 0 \\ 1 \cdot 2 & 2 \cdot 1 & 2 \cdot 1 \end = \begin 0 & 0 & 4 \\ 7 & 0 & 0 \\ 2 & 2 & 2 \end Note that the Hadamard product is a submatrix of the Kronecker product (see below). The Hadamard product is studied by matrix theorists, but it is virtually untouched by linear algebraists. It is discussed in (Horn & Johnson, 1994, Ch. 5).

Kronecker product

Main article: Kronecker product. For any two arbitrary matrices A and B, we have the direct product or Kronecker product A B defined as : \begin a_B & a_B & \cdots & a_B \\ \vdots & \vdots & \ddots & \vdots \\ a_B & a_B & \cdots & a_B \end. Note that if A is m-by-n and B is p-by-r then A B is an mp-by-nr matrix. Again this multiplication is not commutative. For example : \begin 1 & 2 \\ 3 & 1 \\ \end \otimes \begin 0 & 3 \\ 2 & 1 \\ \end = \begin 1\cdot 0 & 1\cdot 3 & 2\cdot 0 & 2\cdot 3 \\ 1\cdot 2 & 1\cdot 1 & 2\cdot 2 & 2\cdot 1 \\ 3\cdot 0 & 3\cdot 3 & 1\cdot 0 & 1\cdot 3 \\ 3\cdot 2 & 3\cdot 1 & 1\cdot 2 & 1\cdot 1 \\ \end = \begin 0 & 3 & 0 & 6 \\ 2 & 1 & 4 & 2 \\ 0 & 9 & 0 & 3 \\ 6 & 3 & 2 & 1 \end . If A and B represent linear transformations V1W1 and V2W2, respectively, then A B represents the tensor product of the two maps, V1 V2W1 W2.

Common properties

All three notions of matrix multiplication are associative :A(BC) = (AB)C and distributive: :A(B + C) = AB + AC and :(A + B)C = AC + BC and compatible with scalar multiplication: :c(AB) = (cA)B = A(cB)

See also


- Matrix inversion
- Coppersmith-Winograd algorithm
- Strassen algorithm
- Matrix chain multiplication

External links


- [http://wims.unice.fr/~wims/en_tool~linear~matmult.html WIMS Online Matrix Multiplier]

References


- Strassen, Volker, Gaussian Elimination is not Optimal, Numer. Math. 13, p. 354-356, 1969
- Coppersmith, D., Winograd S., Matrix multiplication via arithmetic progressions, J. Symbolic Comput. 9, p. 251-280, 1990
- Horn, Roger; Johnson, Charles: "Topics in Matrix Analysis", Cambridge, 1994. Category:Matrix theory

Matrix multiplication

This article gives an overview of the various ways to multiply matrices.

Ordinary matrix product

By far the most important way to multiply matrices is the usual matrix multiplication. It is defined between two matrices only if the number of columns of the first matrix is the same as the number of rows of the second matrix. If A is an m-by-n matrix and B is an n-by-p matrix, then their product A×B is an m-by-p matrix given by : (A\times B)_ = \sum_^n a_b_ = a_b_ + a_b_ + \cdots + a_b_. for each pair i and j. The algebraic system of "matrix units" summarises the abstract properties of this kind of multiplication. The following picture shows how to calculate the (AB)12 element of A×B if A is a 2×4 matrix, and B is a 4×3 matrix. Elements from each matrix are paired off in the direction of the arrows; each pair is multiplied and the products are added. The location of the resulting number in AB corresponds to the row and column that were considered. :Image:Matrix multiplication diagram.PNG :(A\times B)_ = \sum_^4 a_b_ = a_b_+a_b_+a_b_+a_b_

The proportions-vectors method

There is a simpler concept for multiplying matrices. Suppose you want to mix vectors together in different proportions. The matrix on the left is the list of proportions. The matrix on the right is the list of vectors. Here's how it works : : \begin 1 & 0 & 2 \\ -1 & 3 & 1 \end \times \begin 3 & 1 \\ 2 & 1 \\ 1 & 0 \end Use vector notation : : = \begin \begin 1 & 0 & 2 \\ -1 & 3 & 1 \end \begin (Mix_1) \\ (Mix_2) \end & \times & \begin \begin 3 & 1 \end \\ \begin 2 & 1 \end \\ \begin 1 & 0 \end \end \begin (\vec) \\ (\vec) \\ (\vec) \end \end The first proportion in each mix tells how much of the first vector to use and so on : : = \begin \;\;1 \cdot \begin 3 & 1 \end & + & 0 \cdot \begin 2 & 1 \end & + & 2 \cdot \begin 1 & 0 \end \\ -1 \cdot \begin 3 & 1 \end & + & 3 \cdot \begin 2 & 1 \end & + & 1 \cdot \begin 1 & 0 \end \end \begin (Mix_1) \\ (Mix_2) \end Simplify : : = \begin \begin 5 & 1 \end \\ \begin 4 & 2 \end \end \begin (Mix_1) \\ (Mix_2) \end Remove the vector notation : : = \begin 5 & 1 \\ 4 & 2 \end Matrix multiplication is not commutative (that is, A×BB×A), except in special cases. It's easy to see why: you can't expect to switch the proportions with the vectors and get the same result. It's also easy to see why the number of columns in the proportions matrix has to be the same as the number of rows in the vectors matrix: they have to represent the same number of vectors. This notion of multiplication is important because if A and B are interpreted as linear transformations (which is almost universally done), then the matrix product AB corresponds to the composition of the two linear transformations, with B being applied first. The complexity of matrix multiplication, if carried out naively, is O(n³), but more efficient algorithms do exist. Strassen's algorithm, devised by Volker Strassen in 1969 and often referred to as "fast matrix multiplication", uses a mapping of bilinear combinations to reduce complexity to O(nlog2(7)) (approximately O(n2.807...)). In practice, though, it is rarely used since it is awkward to implement, lacking numerical stability. The constant factor involved is about 4.695 asymptotically; Winograd's method improves on this slightly by reducing it to an asymptotic 4.537. The best algorithm currently known, which was presented by Don Coppersmith and S. Winograd in 1990, has an asymptotic complexity of O(n2.376). It has been shown that the leading exponent must be at least 2.

Scalar multiplication

The scalar multiplication of a matrix A=(aij) and a scalar r gives the product : rA=(r aij). If we are concerned with matrices over a ring, then the above multiplication is sometimes called the left multiplication while the right multiplication is defined to be : Ar=(aij r). When the underlying ring is commutative, for example, the real or complex number field, the two multiplications are the same. However, if the ring is not commutative, such as the quaternions, they may be different. For example : i\begin i & 0 \\ 0 & j \\ \end = \begin -1 & 0 \\ 0 & k \\ \end \ne \begin -1 & 0 \\ 0 & -k \\ \end = \begin i & 0 \\ 0 & j \\ \endi

Hadamard product

For two matrices of the same dimensions, we have the Hadamard product or entrywise product. The Hadamard product of two m-by-n matrices A and B, denoted by A · B, is an m-by-n matrix given by (A·B)[i,j]=A[i,j]B[i,j]. For instance : \begin 1 & 3 & 2 \\ 1 & 0 & 0 \\ 1 & 2 & 2 \end \cdot \begin 0 & 0 & 2 \\ 7 & 5 & 0 \\ 2 & 1 & 1 \end = \begin 1 \cdot 0 & 3 \cdot 0 & 2 \cdot 2 \\ 1 \cdot 7 & 0 \cdot 5 & 0 \cdot 0 \\ 1 \cdot 2 & 2 \cdot 1 & 2 \cdot 1 \end = \begin 0 & 0 & 4 \\ 7 & 0 & 0 \\ 2 & 2 & 2 \end Note that the Hadamard product is a submatrix of the Kronecker product (see below). The Hadamard product is studied by matrix theorists, but it is virtually untouched by linear algebraists. It is discussed in (Horn & Johnson, 1994, Ch. 5).

Kronecker product

Main article: Kronecker product. For any two arbitrary matrices A and B, we have the direct product or Kronecker product A B defined as : \begin a_B & a_B & \cdots & a_B \\ \vdots & \vdots & \ddots & \vdots \\ a_B & a_B & \cdots & a_B \end. Note that if A is m-by-n and B is p-by-r then A B is an mp-by-nr matrix. Again this multiplication is not commutative. For example : \begin 1 & 2 \\ 3 & 1 \\ \end \otimes \begin 0 & 3 \\ 2 & 1 \\ \end = \begin 1\cdot 0 & 1\cdot 3 & 2\cdot 0 & 2\cdot 3 \\ 1\cdot 2 & 1\cdot 1 & 2\cdot 2 & 2\cdot 1 \\ 3\cdot 0 & 3\cdot 3 & 1\cdot 0 & 1\cdot 3 \\ 3\cdot 2 & 3\cdot 1 & 1\cdot 2 & 1\cdot 1 \\ \end = \begin 0 & 3 & 0 & 6 \\ 2 & 1 & 4 & 2 \\ 0 & 9 & 0 & 3 \\ 6 & 3 & 2 & 1 \end . If A and B represent linear transformations V1W1 and V2W2, respectively, then A B represents the tensor product of the two maps, V1 V2W1 W2.

Common properties

All three notions of matrix multiplication are associative :A(BC) = (AB)C and distributive: :A(B + C) = AB + AC and :(A + B)C = AC + BC and compatible with scalar multiplication: :c(AB) = (cA)B = A(cB)

See also


- Matrix inversion
- Coppersmith-Winograd algorithm
- Strassen algorithm
- Matrix chain multiplication

External links


- [http://wims.unice.fr/~wims/en_tool~linear~matmult.html WIMS Online Matrix Multiplier]

References


- Strassen, Volker, Gaussian Elimination is not Optimal, Numer. Math. 13, p. 354-356, 1969
- Coppersmith, D., Winograd S., Matrix multiplication via arithmetic progressions, J. Symbolic Comput. 9, p. 251-280, 1990
- Horn, Roger; Johnson, Charles: "Topics in Matrix Analysis", Cambridge, 1994. Category:Matrix theory

Invertible matrix

In mathematics and especially linear algebra, an n-by-n (square) matrix A is called invertible, non-singular, or regular if there exists another n-by-n matrix B such that :AB = BA = In, where In denotes the n-by-n identity matrix and the multiplication used is ordinary matrix multiplication. If this is the case, then the matrix B is uniquely determined by A and is called the inverse of A, denoted by A−1. A square matrix that is not invertible is called singular. While the most common case is that of matrices over the real or complex numbers, all these definitions can be given for matrices over any ring. As a rule of thumb, almost all matrices are invertible. Over the field of real numbers, this can be made precise as follows: the set of singular n-by-n matrices, considered as a subset of Rn×n, is a null set, i.e., has Lebesgue measure zero. Intuitively, this means that if you pick a random square matrix over the reals, the probability that it will be singular is zero. This is true because singular matrices can be thought of as the roots of the polynomial function given by the determinant. Matrix inversion is the process of finding the matrix B that satisfies the prior equation for a given invertible matrix A.

Properties of invertible matrices

If A be a square n by n matrix over a field K (for example the field R of real numbers), the following statements are equivalent:
- A is invertible.
- A is row-equivalent to the n by n identity matrix In.
- A has n pivot positions.
- det A ≠ 0.
- rank A = n.
- The equation Ax = 0 has only the trivial solution x = 0 (i.e. Null A = ).
- The equation Ax = b has exactly one solution for each b in Kn.
- The columns of A are linearly independent.
- The columns of A span Kn (i.e. Col A = Kn).
- The columns of A form a basis of Kn.
- The linear transformation mapping x to Ax is a bijection from Kn to Kn.
- There is an n by n matrix B such that BA = In.
- There is an n by n matrix B such that AB = In.
- The transpose AT is an invertible matrix.
- The matrix times its transpose, ATA, is an invertible matrix.
- The number 0 is not an eigenvalue of A. In general, a square matrix over a commutative ring is invertible if and only if its determinant is a unit in that ring. The inverse of an invertible matrix A is itself invertible, with :(A−1)−1 = A The inverse of an invertible matrix A multiplied by a scalar k yields the product of the inverse of both the matrix and the scalar :(kA)−1 = k−1A−1 The product of two invertible matrices A and B of the same size is again invertible, with the inverse given by :(AB)−1 = B−1A−1 (note that the order of the factors is reversed.) As a consequence, the set of invertible n-by-n matrices forms a group, known as the general linear group Gl(n).

Proof for matrix product rule

If A_1, A_2, ..., A_n are nonsingular square matrices over a field, then :(A_1A_2\cdots A_n)^ = A_n^A_^\cdots A_1^ It becomes evident why this is the case if one attempts to find an inverse for the product of the A_is from first principles, that is, that we wish to determine B such that : (A_1A_2\cdots A_n)B=I where B is some matrix, in terms of the A_is. To remove A_n from the product, we can then write : (A_1A_2\cdots A_n)A_n^B'=I where B' is some matrix, which would reduce the equation to : (A_1A_2\cdots A_)B'=I Likewise, then, from : (A_1A_2\cdots A_n)A_n^B'=I we use the same technique, removing A_ from the equation, yielding : (A_1A_2\cdots A_A_n)A_n^A_^B=I where B' is some matrix, which, when simplified, gives : (A_1A_2\cdots A_)B=I If one repeat the process up to A_1, the above property is established.

Methods of matrix inversion

Gauss-Jordan elimination

Gauss-Jordan elimination is an algorithm that can be used to determine whether a given matrix is invertible and to find the inverse. An alternative is the LU decomposition which generates an upper and a lower triangular matrices which are easier to invert. For special purposes, it may be convenient to invert matrices by treating mn-by-mn matrices as m-by-m matrices of n-by-n matrices, and applying one or another formula recursively (other sized matrices can be padded out with dummy rows and columns). For other purposes, a variant of Newton's method may be convenient (particularly when dealing with families of related matrices, so inverses of earlier matrices can be used to seed generating inverses of later matrices).

Analytic solution

Writing another special matrix of cofactors, known as an adjugate matrix, can also be an efficient way to calculate the inverse of small matrices (since this method is essentially recursive, it becomes inefficient for large matrices). To determine the inverse, we calculate a matrix of cofactors: :A^=\left(C_\right)^= \begin C_ & C_ & \cdots & C_ \\ C_ & \ddots & & C_ \\ \vdots & & \ddots & \vdots \\ C_ & \cdots & \cdots & C_ \\ \end where |A| is the determinant of A, Cij is the matrix cofactor, and AT represents the matrix transpose. In most practical applications, it is in fact not necessary to invert a matrix to solve a system of linear equations. This can instead be done using decomposition techniques like LU decomposition, which are much faster than inversion. Various fast algorithms for special classes of linear systems have also been developed.

Inversion of 2 x 2 matrices

The cofactor equation listed above yields the following result for 2 x 2 matrices. Inversion of these matrices can be done easily as follows: :A^ = \begin a & b \\ c & d \\ \end^ = \frac \begin d & -b \\ -c & a \\ \end

Inversion of 3 x 3 matrices

The cofactor equation listed above yields the following result for 3 x 3 matrices. Inversion of these matrices can be done quite easily as follows: :A^ = \begin a & b & c\\ d & e & f \\ g & h & i \\ \end^ = \frac \begin ei - fh & ch - bi & bf - ce \\ fg - di & ai - cg & cd - af \\ dh - eg & bg - ah & ae - bd \end :|A| = a(ei-fh) - b(di-fg) + c(dh-eg)

Blockwise inversion

Matrices can also be inverted blockwisely by using the following analytic inversion formula: :\begin A & B \\ C & D \end^ = \begin A^+A^B(D-CA^B)^CA^ & -A^B(D-CA^B)^ \\ -(D-CA^B)^CA^ & (D-CA^B)^ \end where A, B, C and D are matrix sub-blocks of arbitrary size. This strategy is particularly advantageous if A is diagonal and (D-CA^B) (the Schur complement of A) is a small matrix, since they are the only matrices requiring to be inverted. This technique was invented by Volker Strassen, who also invented the Strassen algorithm for fast(er) matrix multiplication.

The derivative of the matrix inverse

Suppose that the matrix A depends on a parameter t. Then the derivative of the inverse of A with respect to t is given by : \frac = - A^ \frac A^. This formula can be found by differentiating the identity A−1A = I.

The Moore-Penrose pseudoinverse

Some of the properties of inverse matrices are shared by (Moore-Penrose) pseudoinverses, which can be defined for any m-by-n matrix.

See also


- matrix decomposition
- matrix multiplication
- pseudoinverse (Moore-Penrose inverse)
- singular value decomposition

References


- Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0262032937. Section 28.4: Inverting matrices, pp.755–760.

External links


- [http://www.easycalculation.com/matrix/matrix-inverse.php Inverse Matrix Calculator]
- [http://www.vias.org/tmdatanaleng/cc_matrix_pseudoinv.html Moore Penrose Pseudoinverse]
- Category:Linear algebra Category:Matrices ja:正則行列



EigenValue

, the picture was deformed in such a way that its central vertical axis was not modified. (Note: The corners have been cropped on the right hand picture.) The blue vector, from her chest to her shoulder, has changed direction, but the red one, from her chest to her chin, is unchanged. The red vector is thus an eigenvector of the transformation and the blue vector is not. Since the red vector was neither stretched nor compressed, its eigenvalue is 1. All vectors along the same vertical line are also eigenvectors, with the same eigenvalue. They form the eigenspace for this eigenvalue.]] In mathematics, an eigenvector of a transformation is a vector whose direction is unchanged by that transformation. The factor by which the magnitude is scaled is called the eigenvalue of that vector. A pictorial example is provided in Fig. 1. Often, a transformation is completely described by its eigenvalues and eigenvectors. An eigenspace is a set of eigenvectors with the same eigenvalue. These concepts play a major role in several branches of both pure and applied mathematics — appearing prominently in linear algebra, functional analysis, and even a variety of nonlinear situations. The German word eigen was first used in this context by Hilbert in 1904 (there was an earlier related usage by Helmholtz). "Eigen" can be translated as "own", "peculiar to", "characteristic" or "individual"—emphasizing how important eigenvalues are to defining the unique nature of a specific transformation. In English mathematical jargon, the closest translation would be "characteristic"; and some older references do use expressions like "characteristic value" and "characteristic vector", or even "eigenwert", German for eigenvalue. In the past, the standard translation used to be "proper". Today the more distinctive term "eigenvalue" is standard.

Definitions

Transformations of space—such as translation (or shifting the origin), rotation, reflection, stretching, compression, or any combination of these; other transformations could also be listed—may be visualized by the effect they produce on vectors. Vectors can be visualised as arrows pointing from one point to another.
- Eigenvectors of transformations are vectors which are either left unaffected or simply multiplied by a scale factor after the transformation.
- An eigenvector's eigenvalue is the scale factor that it has been multiplied by.
- An eigenspace is a space consisting of all eigenvectors which have the same eigenvalue.
- The geometric multiplicity of an eigenvalue is the dimension of the associated eigenspace.
- The spectrum of a transformation is the set of all its eigenvalues. For instance, an eigenvector of a rotation in three dimensions is a vector located within the axis around which the rotation is performed. The corresponding eigenvalue is 1 and the corresponding eigenspace contains all the vectors parallel to the axis. As this is a one-dimensional space, its geometric multiplicity is one. This is the only eigenvalue of the spectrum (of this rotation) that is a real number.

Examples

As the Earth rotates, every arrow pointing outward from the center of the Earth also rotates, except those arrows that lie on the axis of rotation. Consider the transformation of the Earth after one hour of rotation: An arrow from the center of the Earth to the Geographic South Pole would be an eigenvector of this transformation, but an arrow from the center of the Earth to anywhere on the equator would not be an eigenvector. Since the arrow pointing at the pole is not stretched by the rotation of the Earth, its eigenvalue is 1. Another example is provided by a thin metal sheet expanding uniformly about a fixed point in such a way that the distances from any point of the sheet to the fixed point are doubled. This expansion is a transformation with eigenvalue 2. Every vector from the fixed point to a point on the sheet is an eigenvector, and the eigenspace is the set of all these vectors. equator is scaled but its shape is not modified. In this case the eigenvalue is time dependent.]] However, three-dimensional geometric space is not the only vector space. For example, consider a stressed rope fixed at both ends, like the vibrating strings of a string instrument (Fig. 2). The distances of atoms of the vibrating rope from their positions when the rope is at rest can be seen as the components of a vector in a space with as many dimensions as there are atoms in the rope. If one considers the transformation of the rope as time passes, its eigenvectors, or eigenfunctions, if one assumes the rope is a continuous medium, are its standing waves—the things that, mediated by the surrounding air, humans can experience as the twang of a bow string or the plink of a guitar. The standing waves correspond to particular oscillations of the rope such that the shape of the rope is scaled by a factor (the eigenvalue) as time passes. Each component of the vector associated with the rope is multiplied by this time-dependent factor. The amplitude (eigenvalues) of the standing waves decrease with time if damping is considered. One can then associate a lifetime with the eigenvector, and relate the concept of an eigenvector to the concept of resonance.

Eigenvalue equation

Mathematically, vλ is an eigenvector and λ the corresponding eigenvalue of a transformation \mathcal if the equation: :\mathcal(\mathbf_\lambda)=\lambda\,\mathbf_\lambda is true, where \mathcal(\mathbf_\lambda) is the vector obtained when applying the transformation \mathcal to vλ. Suppose \mathcal is a linear transformation (which means that \mathcal(a\mathbf+b\mathbf)=a\mathcal(\mathbf)+b\mathcal(\mathbf) for all scalars a, b, and vectors v, w). Consider a basis in that vector space. Then, \mathcal and vλ can be represented relative to that basis by a matrix T—a two-dimensional array—and respectively a column vector vλ—a one-dimensional vertical array. The eigenvalue equation in its matrix representation is written :T\,v_\lambda=\lambda\,v_\lambda where the juxtaposition is matrix multiplication. This is equivalent to a set of n linear equations, where n is the number of basis vectors in the basis set. In this equation both the eigenvalue λ and the n components of vλ are unknowns. However, it is sometimes unnatural or even impossible to write down the eigenvalue equation in a matrix form. This occurs for instance when the vector space is infinite dimensional, for example, in the case of the rope above. Depending on the nature of the transformation \mathcal and the space to which it applies, it can be advantageous to represent the eigenvalue equation as a set of differential equations depending on the nature of the transformation \mathcal and the space to which it applies. If \mathcal is a differential operator, the eigenvectors are commonly called eigenfunctions of the differential operator representing \mathcal. For example, differentiation itself is a linear transformation since (if M and N are differentiable functions, and a and b are constants) : d(aM+bN)/dt = a dM/dt + b dN/dt Consider differentiation with respect to time, t. Its eigenfunctions obey the eigenvalue equation: : = \lambda N, where λ is the eigenvalue associated with the function. Such a function of time is constant if \lambda = 0, grows proportionally to itself if \lambda is positive, and decays proportionally to itself if \lambda is negative. For example, an idealized population of rabbits breeds faster the more rabbits there are, and thus satisfies the equation with a positive lambda. The solution to the eigenvalue equation is N= \exp (\lambda t), the exponential function; thus that function is an eigenfunction of the differential operator d/dt with the eigenvalue λ. If λ is negative, we call the evolution of N an exponential decay; if it is positive, an exponential growth. The value of λ can be any complex number. The spectrum of d/dt is therefore the whole complex plane. In this example the vector space in which the operator d/dt acts is the space of the differentiable functions of one variable. This space has an infinite dimension (because it is not possible to express any differentiable function as a linear combination of a finite number of basis functions). However, the eigenspace associated with any given eigenvalue λ is one dimensional. It is the set of all functions N= N_0 \exp (\lambda t) . N0 is an arbitrary constant, the initial population at t=0.

Spectral theorem

The spectral theorem depicts the whole importance of the eigenvalues and eigenvectors for characterizing a linear transformation in a unique way. In its simplest version, the spectral theorem states that, under precise conditions, a linear transformation of a vector can be expressed as the linear combination of the eigenvectors with coefficients equal to the eigenvalues times the scalar product (or dot product) of the eigenvectors with the vector on which the transformation is applied. Mathematically, it can be written as: :\mathcal(\mathbf)= \lambda_1 (\mathbf_1 \cdot \mathbf) \mathbf_1 + \lambda_2 (\mathbf_2 \cdot \mathbf) \mathbf_2 + \dots where \mathbf_1, \mathbf_2, \dots and \lambda_1, \lambda_2, \dots stand for the eigenvectors and eigenvalues of \mathcal. The simplest case in which the theorem is valid is the case where the linear transformation is given by a real symmetric matrix or complex Hermitian matrix. If one defines the nth power of a transformation as the result of applying it n times in succession, one can also define polynomials of transformations. A more general version of the theorem is that any polynomial P of \mathcal is equal to: :P(\mathcal)(\mathbf)= P(\lambda_1) (\mathbf_1 \cdot \mathbf) \mathbf_1 + P(\lambda_2) (\mathbf_2 \cdot \mathbf) \mathbf_2 + \dots The theorem can be extended to other functions of transformations like analytic functions, the most general case being Borel functions.

Eigenvalues and eigenvectors of matrices

Computing eigenvalues and eigenvectors of matrices

Suppose that we want to compute the eigenvalues of a given matrix. If the matrix is small, we can compute them symbolically using the characteristic polynomial. However, this is often impossible for larger matrices, in which case we must use a numerical method.

Symbolic computations

;Finding eigenvalues An important tool for describing eigenvalues of square matrices is the characteristic polynomial: saying that λ is an eigenvalue of A is equivalent to stating that the system of linear equations (AλI) v = 0 (where I is the identity matrix) has a non-zero solution v (an eigenvector), and so it is equivalent to the determinant: :\det(A - \lambda I) = 0 \!\ The function p(λ) = det(AλI) is a polynomial in λ since determinants are defined as sums of products. This is the characteristic polynomial of A: the eigenvalues of a matrix are the zeros of its characteristic polynomial. All the eigenvalues of a matrix A can be computed by solving the equation p_A(\lambda) = 0 . If A is an n×n matrix, then p_A has degree n and A can therefore have at most n eigenvalues. Conversely, the fundamental theorem of algebra says that this equation has exactly n roots (zeroes), counted with multiplicity. All real polynomials of odd degree have a real number as a root, so for odd n, every real matrix has at least one real eigenvalue. In the case of a real matrix, for even and odd n, the non-real eigenvalues come in conjugate pairs. ;Finding eigenvectors Once the eigenvalues λ are known, the eigenvectors can then be found by solving: : (A - \lambda I) v = 0 \!\ An example of a matrix with no real eigenvalues is the 90-degree clockwise rotation: :\begin0 & 1\\ -1 & 0\end whose characteristic polynomial is \lambda^2+1 and so its eigenvalues are the pair of complex conjugates i, -i. The associated eigenvectors are also not real.

Numerical computations

In practice, eigenvalues of large matrices are not computed using the characteristic polynomial. Computing the polynomial becomes expensive in itself, and exact (symbolic) roots of a high-degree polynomial can be difficult to compute and express: the Abel–Ruffini theorem implies that the roots of high-degree (5 and above) polynomials cannot be expressed simply using nth roots. Effective numerical algorithms for approximating roots of polynomials exist, but small errors in the eigenvalues can lead to large errors in the eigenvectors. Therefore, general algorithms to find eigenvectors and eigenvalues, are iterative. The easiest method is the power method: a random vector v is chosen and computed as Av, A^2v, A^3v, ... This sequence will after normalization almost always converge to an eigenvector corresponding to the dominant eigenvalue. This algorithm is easy, but not very useful by itself. However, popular methods such as the QR algorithm are based on it.

Properties

Algebraic multiplicity

The algebraic multiplicity of an eigenvalue λ of A is the order of λ as a zero of the characteristic polynomial of A; in other words, if t is one root of the polynomial, it is the number of factors (tλ) in the characteristic polynomial after factorization. An n×n matrix has n eigenvalues, counted according to their algebraic multiplicity, because its characteristic polynomial has degree n. An eigenvalue of algebraic multiplicity 1 is called a "simple eigenvalue". In an article on matrix theory, a statement like the one below might be encountered: :"the eigenvalues of a matrix A are 4,4,3,3,3,2,2,1," meaning that the algebraic multiplicity of 4 is two, of 3 is three, of 2 is two and of 1 is one. This style is used because algebraic multiplicity is the key to many mathematical proofs in matrix theory. The algebraic multiplicity can also be thought of as a dimension: it is the dimension of the associated generalized eigenspace (1st sense), which is the nullspace of the matrix (λI − A)k for any sufficiently large k. That is, it is the space of generalized eigenvectors (1st sense), where a generalized eigenvector is any vector which eventually becomes 0 if λI − A is applied to it enough times successively. Any eigenvector is a generalized eigenvector, and so each eigenspace is contained in the associated generalized eigenspace. This provides an easy proof that the geometric multiplicity is always less than or equal to the algebraic multiplicity. The first sense should not to be confused with generalized eigenvalue problem as stated below. For example: : A=\begin 1 & 1 \\ 0 & 1 \end. It has only one eigenvalue, namely λ = 1. The characteristic polynomial is (\lambda-1)^2, so this eigenvalue has algebraic multiplicity 2. However, the associated eigenspace is the axis usually called the x axis, spanned by the unit vector \begin 1 \\ 0 \end , so the geometric multiplicity is only 1.

Decomposition theorem

The decomposition theorem is a version of the spectral theorem in the particular case of matrices. This theorem is usually introduced in terms of coordinate transformation. If U is an invertible matrix, it can be seen as a transformation from one coordinate system to another. In this new system the coordinates of the vector \mathbf are labeled v'. The latter are obtained from the coordinates v in the original coordinate system by the relation v'=Uv and, the other way around, we have v=U^v'. Applying successively v'=Uv, w'=Uw and U^U=I, to the relation Av=w defining the matrix multiplication provides A'v'=w' with A'=UAU^, the representation of A in the new basis. The columns of U being the components of the new basis vectors within the old basis set. The decomposition theorem states that, if one chooses as columns of U n linearly independent eigenvectors of A, the new matrix A'=UAU^ is diagonal and its diagonal elements are the eigenvalues of A. If this is possible the matrix A is diagonalizable. An example of non-diagonalizable matrix is given by the matrix A above. There are several generalizations of this decomposition which can cope with the non-diagonalizable case, suited for different purposes:
- the singular value decomposition, A=U \Sigma V^
- where \Sigma is diagonal but U is not necessarily equal to V;
- the Jordan normal form, where A=U \Lambda U^ where \Lambda is not diagonal but of a simple form with non vanishing entries only along the diagonal and one element above;
- any matrix A can be written uniquely as A = S + N where S is diagonalizable, N is nilpotent (i.e., such that Nq=0 for some q), and S commutes with N (SN=NS).
- any invertible matrix A can be written uniquely as A = SJ where S is diagonalizable and J is unipotent (i.e., such that the characteristic polynomial is a power of (λ-1), and S commutes with J).

Other theorems

The spectrum is invariant under similarity transformations: the matrices A and P-1AP have the same eigenvalues for any matrix A and any invertible matrix P. The spectrum is also invariant under transposition: the matrices A and AT have the same eigenvalues. A matrix is invertible if and only if zero is not an eigenvalue of the matrix. A matrix is diagonalizable if and only if the algebraic and geometric multiplicities coincide for all its eigenvalues. In particular, an n×n matrix which has n different eigenvalues is always diagonalizable. The vector space on which the matrix acts is always the direct sum of the generalized eigenspaces (i.e., is spanned by them and they are independent). This is true of the ordinary (non-generalized) eigenspaces if and only if they are equal to the generalized eigenspaces, i.e., if and only if the matrix is diagonalizable. The location of the spectrum is often restricted if the matrix has a special form:
- All eigenvalues of a Hermitian matrix (A = A
-
) are real. Furthermore, all eigenvalues of a positive-definite matrix (v
-
Av > 0 for all vectors v) are positive.
- All eigenvalues of a skew-Hermitian matrix (A = −A
-
) are purely imaginary.
- All eigenvalues of a unitary matrix (A-1 = A
-
) have absolute value one.
- The eigenvalues of a triangular matrix are the entries on the main diagonal. This holds a fortiori for diagonal matrices. Generally, the trace, or the sum of the elements on the main diagonal of a matrix, equals the sum of the eigenvalues, and the determinant equals the product of the eigenvalues (counted according to algebraic multiplicity). Suppose that A is an m×n matrix, with mn, and that B is an n×m matrix. Then BA has the same eigenvalues as AB plus nm eigenvalues equal to zero.

Conjugate eigenvector

A conjugate eigenvector or coneigenvector is a vector sent after transformation to a scalar multiple of its conjugate, where the scalar is called the conjugate eigenvalue or coneigenvalue of the linear transformation. The coneigenvectors and coneigenvalues represent essentially the same information and meaning as the regular eigenvectors and eigenvalues, but arise when a alternative coordinate system is used. The corresponding equation is : Av = \lambda v^
- .\, For example, in coherent electromagnetic scattering theory, the linear transformation A represents the action performed by the scattering object, and the eigenvectors represent polarization states of the electromagnetic wave. In optics, the coordinate system is defined from the wave's viewpoint, known as the Forward Scattering Alignment (FSA), and gives rise to a regular eigenvalue equation, whereas in radar, the coordinate system is defined from the radar's viewpoint, known as the Back Scattering Alignment (BSA), and gives rise to a coneigenvalue equation.

Generalized eigenvalue problem

A generalized eigenvalue problem (2nd sense) is of the form : Av = \lambda B v \quad \quad where A and B are matrices. The generalized eigenvalues (2nd sense) λ can be obtained by solving the equation :\det(A - \lambda B)=0.\, If B is invertible, then the original problem can be written in the form : B^Av = \lambda v \quad \quad which is a standard eigenvalue problem. However, in most situations it is preferable not to perform the inversion, and solve the generalized eigenvalue problem as stated originally. If A and B are symmetric matrices with real entries, then the eigenvalues are real. This would have not been easy to see from the second equivalent formulation, because the matrix B^A is not necessarily symmetric if A and B are. An example is provided by the molecular orbital application below.

Entries from a ring

In the case of a square matrix A with entries in a ring, λ is called a right eigenvalue if there exists a column vector x such that Axx, or a left eigenvalue if there exists a nonzero row vector y such that yA=yλ. If the ring is commutative, the left eigenvalues are equal to the right eigenvalues and are just called eigenvalues. If not, for instance if the ring is the set of quaternions, they may be different.

Infinite-dimensional spaces

quaternion (cross section) of atomic Chlorine. The sharp lines obtained in theory correspond to the discrete spectrum (Rydberg series) of the Hamiltonian; the broad structure on the right is associated to the continuous spectrum (ionization). The associated experimental results have been obtained by measuring the intensity of X-rays absorbed by a gas of atoms as a function of the incident photon energy in eV.]] If the vector space is infinite dimensional, it may be advantageous to define the concept of spectral values. The spectral values are the set of scalars λ for which the Green's operator, \left(\mathcal-\lambda\right)^, associated to the transformation is not defined, that is such that \mathcal-\lambda is not invertible (i.e., the inverse transformation to \mathcal-\lambda does not exist). If λ is an eigenvalue of \mathcal, λ is also a spectral value of it. However, the reverse relation is not true: any spectral value is not an eigenvalue. There are operators on Hilbert or Banach spaces which have no eigenvectors at all. This can be seen on the following example. The bilateral shift on the Hilbert space \ell^2(\mathbf) (the space of all sequences of scalars \dots a_, a_0, a_1,a_2,\dots such that \dots |a_|^2 + |a_0|^2 + |a_1|^2 + |a_2|^2 +\dots converge) has no eigenvalue but has spectral values. In functional analysis, the spectrum of an operator is defined as the set of all its spectral values. This is a key concept in scattering theory. In infinite-dimensional spaces, the spectrum of an operator can be discrete or continuous. The former case occurs if the spectrum is a countable set of scalars; the latter if not. The exponential growth or decay provides an example of a continuous spectrum and the vibrating string an example above. The hydrogen atom is an example where both type of spectra appear. The bound states of the hydrogen atom correspond to the discrete part of the spectrum while the ionization processes are described by the continuous part. Fig. 3 exemplifies this concept in the case of the Chlorine atom.

Applications

;Schrödinger equation Chlorines of an electron in a hydrogen atom can be seen as the eigenvectors of the hydrogen atom Hamiltonian as well as of the angular momentum operator. They are associated to eigenvalues interpreted as their energies (increasing downward: n=1,2,3,...) and angular momentum (increasing across: s, p, d,...). Here are plotted the square of the absolute value of the wavefunctions. Brighter areas correspond to higher probability density for a position measurement. The center of each figure is the atomic nucleus, a proton]] An example of an eigenvalue equation where the transformation \mathcal is represented in terms of a differential operator is the time-independent Schrödinger equation in quantum mechanics :H\Psi_E = E\Psi_E where H, the Hamiltonian, is a second-order differential operator and \Psi_E, the wavefunction, is one of its eigenfunctions corresponding to the eigenvalue E, interpreted as its energy. However, in the case we only look for the bound state solutions of the Schrödinger equation, as is usually the case in quantum chemistry, we look for \Psi_E within the space of square integrable functions. Since this space is a Hilbert space, with a well-defined scalar product, we can introduce a basis set in which \Psi_E and H can be represented as a one-dimensional array and a matrix respectively. This allows us to represent the Schrödinger equation in a matrix form. (Fig. 4 presents the lowest eigenfunctions of the Hydrogen atom Hamiltonian.) The Dirac notation often used in this context stresses the difference between the vector or state |\Psi_E\rangle and its representation, the function \Psi_E. In this context one writes the Schrödinger equation :H|\Psi_E\rangle = E|\Psi_E\rangle and call |\Psi_E\rangle an eigenstate of H (sometimes written \hat in introductory textbooks) which is seen as a transformation (see Observable) instead of particular representation of it in terms of differential operators. In the equation above H|\Psi_E\rangle is understood as the vector obtained by application of the transformation H to |\Psi_E\rangle. ;Molecular orbitals In quantum mechanics, and in particular in atomic and molecular physics, within the Hartree-Fock theory, the atomic and molecular orbitals can be defined by the eigenvectors of the Fock operator. The corresponding eigenvalues are interpreted as ionization potentials via Koopmans' theorem. In this case, the term eigenvector is used in a somewhat more general meaning, since the Fock operator is explicitly dependent of the orbitals and their eigenvalues. If one wants to underline this aspect one speaks of implicit eigenvalue equation. Such equations are usually solved by an iteration procedure, called in this case self-consistent field method. In quantum chemistry, one often represents the Hartree-Fock equation in a non-orthogonal basis set. This particular representation is a generalized eigenvalue problem called Roothaan equations. ;Factor analysis In factor analysis, the eigenvectors of a covariance matrix correspond to factors, and eigenvalues to factor loadings. Factor analysis is a statistical technique used in the social sciences and in marketing, product management, operations research, and other applied sciences that deal with large quantities of data. The objective is to explain most of the variability among a number of observable random variables in terms of a smaller number of unobservable random variables called factors. The observable random variables are modeled as linear combinations of the factors, plus "error" terms. error ;Eigenfaces In image processing, processed images of faces can be seen as vectors whose components are the brightnesses of each pixel. The dimension of this vector space is the number of pixels. The eigenvectors of the covariance matrix associated to a large set of normalized pictures of faces are called eigenfaces. They are very useful for expressing any face image as a linear combination of some of them. Eigenfaces provide a means of applying data compression to faces for identification purposes. ;Tensor of inertia In mechanics, the eigenvectors of the inertia tensor define the principal axes of a rigid body. The tensor of inertia is a key quantity required in order to determine the rotation of a rigid body around its center of mass. ;Stress tensor In solid mechanics, the stress tensor is symmetric and so can be decomposed into a diagonal tensor with the eigenvalues on the diagonal and eigenvectors as a basis. Because it is diagonal, in this orientation, the stress tensor has no shear components; the components it does have are the principal components. ; Eigenvalues of a graph In spectral graph theory, an eigenvalue of a graph is defined as an eigenvalue of the graph's adjacency matrix A, or (increasingly) of the graph's Laplacian matrix I-T^AT^, where T is a diagonal matrix holding the degree of each vertex, and in T^, 0 is substituted for 0^.

External links


- [http://ocw.mit.edu/OcwWeb/Mathematics/18-06Spring-2005/VideoLectures/index.htm Videos of MIT Linear Algebra Course, spring 2005] - See Lecture Eigenvalues and Eigenvectors
- [http://mathworld.wolfram.com/Eigenvector.html MathWorld: Eigenvector]
- [http://members.aol.com/jeff570/e.html Earliest Known Uses of Some of the Words of Mathematics: E - see eigenvector and related terms]
- [http://www.caam.rice.edu/software/ARPACK/ ARPACK] is a collection of FORTRAN subroutines for solving large scale eigenvalue problems
-
- [http://www.arndt-bruenner.de/mathe/scripts/engl_eigenwert.htm Online calculator for Eigenvalues and Eigenvectors]

References


- Roger A. Horn and Charles R. Johnson, Matrix Analysis, Cambridge University Press (1985). ISBN 0-521-30586-1 (hardback), ISBN 0-521-38632-2 (paperback).
- John B. Fraleigh and Raymond A. Beauregard, Linear Algebra (3rd edition), Addison-Wesley Publishing Company (1995). ISBN 0-201-83999-7 (international edition).
- Claude Cohen-Tannoudji, Quantum Mechanics, Wiley (1977). ISBN 0-471-16432-1. (Chapter II. The mathematical tools of quantum mechanics.)

Notes

# T. W Gorczyca, Auger Decay of the Photoexcited Inner Shell Rydberg Series in Neon, Chlorine, and Argon, Abstracts of the 18th International Conference on X-ray and Inner-Shell Processes, Chicago, August 23-27 (1999). # In this context, only linear transformations from a vector space to itself are considered. # Since all linear transformations leave the zero vector unchanged, it is not considered an eigenvector. Category:Abstract algebra Category:Linear algebra Category:German loanwords ja:固有値

Unit vectors

Unit vector

EigenVectors

, the picture was deformed in such a way that its central vertical axis was not modified. (Note: The corners have been cropped on the right hand picture.) The blue vector, from her chest to her shoulder, has changed direction, but the red one, from her chest to her chin, is unchanged. The red vector is thus an eigenvector of the transformation and the blue vector is not. Since the red vector was neither stretched nor compressed, its eigenvalue is 1. All vectors along the same vertical line are also eigenvectors, with the same eigenvalue. They form the eigenspace for this eigenvalue.]] In mathematics, an eigenvector of a transformation is a vector whose direction is unchanged by that transformation. The factor by which the magnitude is scaled is called the eigenvalue of that vector. A pictorial example is provided in Fig. 1. Often, a transformation is completely described by its eigenvalues and eigenvectors. An eigenspace is a set of eigenvectors with the same eigenvalue. These concepts play a major role in several branches of both pure and applied mathematics — appearing prominently in linear algebra, functional analysis, and even a variety of nonlinear situations. The German word eigen was first used in this context by Hilbert in 1904 (there was an earlier related usage by Helmholtz). "Eigen" can be translated as "own", "peculiar to", "characteristic" or "individual"—emphasizing how important eigenvalues are to defining the unique nature of a specific transformation. In English mathematical jargon, the closest translation would be "characteristic"; and some older references do use expressions like "characteristic value" and "characteristic vector", or even "eigenwert", German for eigenvalue. In the past, the standard translation used to be "proper". Today the more distinctive term "eigenvalue" is standard.

Definitions

Transformations of space—such as