Eigenvalue Problems and Matrix Invariants

Many problems in physics and mathematics can be expressed as a set 
of linear, first-order differential equations in the state variables 
x1,x2,..,xn and the time t as shown below

       ---  =  a11 x1  +  a12 x2 + .. + a1n xn

       ---  =  a21 x1  +  a22 x2 + .. + a2n xn
                         .                                      (1)

       ---  =  an1 x1  +  an2 x2 + .. + ann xn

In matrix notation this system of equations can be written as

                       X  =  AX                                (2)
where X and X are the column vectors
             _  _                _      _
            |    |              |        |
            | x1 |              | dx1/dt |
            |    |              |        |
            | x2 |          .   | dx2/dt |
        X = | .  |          X = |   .    |
            | .  |              |   .    |
            | .  |              |   .    |
            |    |              |        |
            | xn |              | dxn/dt |
            |_  _|              |_      _|

and A is the coefficient matrix
                 _                    _
                |                      |
                |  a11  a12  ...  a1n  |
                |                      |
                |  a21  a22  ...  a2n  |
            A = |                      |
                |            .         |
                |            .         |
                |            .         |
                |  an1  an2  ...  ann  |
                |_                    _|

The general solution of this set of equations consists of an
expression for each variable xi as a function of time, and each 
of these expressions is of the form

  xi(t) = ci1 e^(L1 t) + ci1 e^(L1 t) + ... + cin e^(Ln t)       (3)

where the coefficients cij are determined by the initial conditions.
The values Li, i=1,2,..,n  are called the eigenvalues (as the
"characteristic values") of the coefficient matrix A.

The meaning of the eigenvalues can be expressed in several different
ways.  One way is to observe that each of the n variables in a first-
order system of differential equations is individually the solution 
of the same nth-order differential equation, called the characteristic
equation of the system.  The corresponding characteristic polynomial
has n roots, and these roots are the eigenvalues of the system.  
(The characteristic equation can be found by expressing the set 
of equations (1) entirely in terms of any single variable.)

Another (more common) way of looking at eigenvalues is in terms of
the basis elements of the space of vector solutions to (2).  For
any specific one of the eigenvalues, Lj, one solution of the system
(2) is given by just taking the jth component of each variable
xi(t), giving a solution vector X(t) with components

          xi(t)  =  cij e^(Lk t)   ,   i = 1,2,..,n            (4)

We can safely assume this represents a solution (though certainly
not the most general solution) of (2), because the "c" coefficients
of (3) are arbitrary functions of the initial conditions, and there
are certainly conditions such that all but the jth coefficient of
each xi(t) is zero.

Hereafter we will denote the specific eigenvalue Lk as simply L.
Taking the derivative of (4) gives the components of the derivative
vector X'(t)

        d xi(t)
        -------    =    L cij e^(L t)    =   L xi           (5)

for  i = 1,2,..,n.  As a result, equation (2) becomes

                            AX = LX

In other words, the effect of multiplying the vector X by the matrix
A is to produce a rescaled version of X.  (Another application of this,
aside from differential equations, is in studying spatial rotations,
where the vector X that is left pointing in the same direction after
being multiplied by a rotation matrix A is obviously pointed along
the axis of rotation.)  Substituting this particular solution into 
the original system of equations (1) and combining terms, we have 
the matrix equation

        _                         _  _   _       _   _
       |                           ||     |     |     |
       |(a11-L)  a12   ...   a1n   ||  x1 |     |  0  |
       |                           ||     |     |     |
       |  a21  (a22-L) ...   a2n   ||  x2 |     |  0  |
       |                           ||     |  =  |     |          (6a)
       |              .            ||  .  |     |  .  |
       |              .            ||  .  |     |  .  |
       |                           ||     |     |     |
       |  an1    an2   ... (ann-L) ||  xn |     |  0  |
       |_                         _||_   _|     |_   _|

Notice that the particular eigenvalue L has been subtracted from
each of the diagonal terms of the coefficient matrix.  Thus, we can
express the above symbolically as

                     (A - LI) X  =  0                           (6b)

where I is the nxn identity matrix
                     _               _
                    |  1  0  0 ... 0  |
                    |  0  1  0 ... 0  |
              I =   |  0  0  1 ... 0  |
                    |         .       |
                    |         .       |
                    |_ 0  0  0 ... 1 _|

It might seem at first that the algebraic system of equations (6)
can be solved simply by Gaussian elimination or any other standard
technique for solving n equations in n unknowns, for any arbitrary
value of the parameter L.  For example, we might try to express xj
as the ratio of the determinant of (A-LI) with the right-hand column
vector substituted for the jth column, divided by the determinant of
the unaltered (A-LI).  However, since the right hand column consists
entirely of zeros, the numerator of every xj will be identically
zero, which just corresponds to the trivial solution xj(t) = 0 for
all j.  Does this mean there are no non-trivial solutions?

Fortunately, no, because there is an escape clause.  Every xj(t) is
of the form  0/det(A-LI), but need not be zero IF (and only if) the
determinant of (A-LI) itself is zero.  In that special case, the
system is singular, and each xj(t) is indeterminate (0/0) based on
the full system, and so the system degenerates, and we in general
we have non-trivial solutions.

Of course, this escape clause only works if det(A-LI) = 0, which
will not generally be the case for arbitrary values of L.  In general
if we algebraically expand the determinant of (A-LI) we get a
polynomial of degree n in the parameter L, which is the characteristic
polynomial of the system, and we need L to be one of the n roots of 
this polynomial.  For each one of these roots (the eigenvalues of
the system) we can then determine the corresponding non-trivial
solution X(t).  These are called the eigenVECTORS of the system.
(It is possible for more than one vector to satisfy the system for
a given eigenvalue, but we typically deal with systems in which 
each eigenvalue allows a unique eigenvector.)

We've now seen two ways of arriving at the characteristic polynomial
of the system.  The first was to algebraically eliminate all but one
of the x variables from the set of equations, resulting in a single
nth-order differential equation in the remaining variable, and then
evaluating the characteristic polynomial of this equation.  The second
way is to algebraically write out the formula for the determinant of
the coefficient matrix with the diagonal elements decremented by L.

To expidite this latter approach, it's useful to note that the
coefficients of the characteristic polynomial of an nxn matrix M
are given by the n  "invariants"  of the matrix.  If we write the
characteristic polynomial as

   x^n - g_1 x^(n-1) + g_2 x^(n-2) - ... + (-1)^n g_n  =  0      (7)

then the coefficient g_j is the sum of the determinants of all the 
sub-matrices of M taken j rows and columns at a time (symmetrically).  
Thus, g_1 is the TRACE of M (i.e., the sum of the diagonal elements), 
and g_2 is the sum of the determinants of the n(n-1)/2 sub-matricies 
that can be formed from M by deleting all but two rows and columns 
(symmetrically), and so on.  Continuing in this way, we can find 
g_3, g_4,... up to g_n, which of course is the determinant of the 
entire n x n matrix.

Incidentally, as was first noted by Cayley, the matrix M itself
is automatically a root of its own characteristic polynomial.  It's
also worth noting that since (7) is also the characteristic polynomial
of the single nth order differential equation satisfied by each of
the variables xj(t), the technique described in the note entitled
Root-Matched Recurrences For DiffEQs can immediately be used to 
determine the coefficients qi of the recurrence relation for any 
given discrete time increment Dt, giving the vector X(t) as a linear 
combination of the n prior vectors

     X(t) = q1 X(t-Dt) + q2 X(t-2Dt) + ... + qn X(t-nDt)

Return to MathPages Main Menu