
Section 5.4 Eigenvectors and eigenvalues

For the remaining sections of this chapter we will focus our investigation on linear transformations of the form \(T\colon V\rightarrow V\text{:}\) that is, transformations from a space \(V\) into itself. When \(V\) is finite-dimensional we can get a computational grip on \(T\) by choosing an ordered basis \(B\) and considering the matrix representation \([T]_B\text{.}\) As was illustrated in Example 5.2.12, different matrix representations \([T]_B\) and \([T]_{B'}\) provide different insights into the nature of \(T\text{.}\) Furthermore, we see from this example that if the action of \(T\) on a chosen basis is simple to describe, then so too is the matrix representation of \(T\) with respect to that basis.
A particularly agreeable situation arises when the basis \(B=(\boldv_1, \boldv_2, \dots, \boldv_n)\) satisfies
\begin{equation*} T(\boldv_i)=c_i\boldv_i, \quad c_i\in \R \end{equation*}
for all \(1\leq i\leq n\text{.}\) Using recipe (5.2.1) we see in this case that the corresponding matrix representation
\begin{equation*} [T]_B=\begin{bmatrix} c_1 \amp 0\amp \dots \amp 0\\ 0 \amp c_2\amp \dots \amp 0\\ \vdots \amp \vdots \amp \ddots \amp \vdots \\ 0\amp 0\amp \dots \amp c_n \end{bmatrix} \end{equation*}
is diagonal! Diagonal matrices are about as simple as they come: they wear all of their properties (rank, nullity, invertibility, etc.) on their sleeve. If we hope to find a diagonal matrix representation of \(T\text{,}\) then we should seek nonzero vectors \(\boldv\) satisfying \(T(\boldv)=c\boldv\) for some \(c\in \R\text{:}\) these are called eigenvectors of \(T\text{.}\)

Subsection 5.4.1 Eigenvectors

We further motivate the notion of an eigenvector with an illustrative example.

Example 5.4.1.

Consider \(T_A\colon \R^2\rightarrow \R^2\) where
\begin{equation*} A=\frac{1}{5}\begin{amatrix}[rr]-3\amp 4\\ 4\amp 3 \end{amatrix}\text{.} \end{equation*}
It turns out that \(T=T_A\) has a simple geometric description, though you would not have guessed this from \(A\text{.}\) To reveal the geometry at play, we represent \(T\) with respect to the orthogonal basis \(B'=(\boldv_1=(1,2), \boldv_2=(2,-1))\text{.}\) Since
\begin{align*} T((1,2)) \amp=A\colvec{1 \\ 2 }=\colvec{1 \\ 2}=1(1,2)+0(2,-1) \\ T_A((2,-1)) \amp=A\colvec{2 \\ -1}=-(2,-1)=0(1,2)+(-1)(2,-1) \text{,} \end{align*}
it follows that
\begin{equation*} [T]_{B'}=\begin{amatrix}[rr] 1\amp 0\\ 0\amp -1 \end{amatrix}\text{.} \end{equation*}
The alternative representation given by \(A'=[T]_{B'}\) reveals that \(T\) is none other than reflection through the line \(\ell=\Span\{(1,2)\}\text{!}\) How? Given any vector \(\boldv\in \R^2\text{,}\) we can write
\begin{equation} \boldv=c_1\boldv_1+c_2\boldv_2\text{.}\tag{5.4.1} \end{equation}
Note that since \(\boldv_1\) and \(\boldv_2\) are orthogonal, we have \(c_1\boldv_1\in \ell\) and \(c_2\boldv_2\in \ell^\perp\text{:}\) i.e., (5.4.1) is the orthogonal decomposition of \(\boldv\) with respect to \(\ell\text{.}\) Next, the representation \(A'=[T]_{B'}\) tells us that \(T(\boldv_1)=\boldv_1\) and \(T(\boldv_2)=-\boldv_2\text{.}\) It follows that \(T(\boldv)=c_1\boldv_1-c_2\boldv_2\text{.}\) This is nothing more than a vector description of reflection through the line \(\ell\text{,}\) as Figure 5.4.2 makes clear.
Figure 5.4.2. Reflection through \(\ell=\Span\{(1,2)\}\)
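If you would like to verify such computations with software, here is a minimal sketch, assuming NumPy is available, that checks \(A\boldv_1=\boldv_1\) and \(A\boldv_2=-\boldv_2\text{,}\) and that recovers the diagonal representation \([T]_{B'}\) by conjugating \(A\) with the matrix \(P\) whose columns are \(\boldv_1\) and \(\boldv_2\text{.}\)

    import numpy as np

    A = np.array([[-3.0, 4.0],
                  [4.0, 3.0]]) / 5      # standard matrix of T
    v1 = np.array([1.0, 2.0])           # spans the line ell
    v2 = np.array([2.0, -1.0])          # orthogonal to ell

    print(np.allclose(A @ v1, v1))      # True: T(v1) = v1
    print(np.allclose(A @ v2, -v2))     # True: T(v2) = -v2

    # Conjugating A by P = [v1 v2] produces the representation [T]_{B'}.
    P = np.column_stack([v1, v2])
    print(np.linalg.inv(P) @ A @ P)     # approximately [[1, 0], [0, -1]]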
The success of our analysis in Example 5.4.1 depended on finding the vectors \(\boldv_1\) and \(\boldv_2\) satisfying \(T(\boldv_1)=\boldv_1\) and \(T(\boldv_2)=(-1)\boldv_2\text{.}\) These are examples of eigenvectors, a concept we now officially define. For reasons that will become clear below, it is convenient to give separate definitions for linear transformations and matrices.

Definition 5.4.3. Eigenvectors and eigenvalues.

Eigenvectors of linear transformations.
Let \(T\colon V\rightarrow V\) be a linear transformation. A nonzero vector \(\boldv\in V\) satisfying
\begin{equation} T(\boldv)=\lambda\boldv\tag{5.4.2} \end{equation}
for some \(\lambda\in\R\) is called an eigenvector of \(T\) with eigenvalue \(\lambda\text{.}\)
Eigenvectors of matrices.
Let \(A\) be an \(n\times n\) matrix. A nonzero \(\boldx\in \R^n\) satisfying
\begin{equation} A\boldx=\lambda\boldx\tag{5.4.3} \end{equation}
for some \(\lambda\in \R\) is called an eigenvector of \(A\) with eigenvalue \(\lambda\text{.}\)
In both cases we will call an eigenvector with eigenvalue \(\lambda\) a \(\lambda\)-eigenvector for short.

Remark 5.4.4.

You ask: Why use \(\lambda\) instead of \(c\) or \(k\text{?}\) My answer: tradition!

Remark 5.4.5.

Note well the important condition that an eigenvector must be nonzero: by definition, the zero vector \(\boldzero\) is not an eigenvector. If we allowed \(\boldzero\) as an eigenvector, then the notion of the eigenvalue of an eigenvector would no longer be well-defined. This is because for any linear transformation we have
\begin{equation*} T(\boldzero)=\boldzero, \end{equation*}
which implies that
\begin{equation*} T(\boldzero)=\lambda\boldzero \end{equation*}
for all \(\lambda\in \R\text{.}\)

Remark 5.4.6. Visualizing eigenvectors.

Suppose \(\boldv\ne\boldzero\) is an eigenvector of the linear transformation \(T\colon V\rightarrow V\text{.}\) Letting \(W=\Span\{\boldv\}\text{,}\) this means that \(T(\boldv)=\lambda\boldv\in W\text{:}\) i.e., \(T\) maps an eigenvector to some other element of the one-dimensional subspace it defines. The special case where \(V=\R^2\) is easy to visualize and can help guide your intuition in the general case. (See Figure 5.4.7.) Here the space \(\Span\{\boldv\}=\ell\) is a line passing through the origin. If \(\boldv\) is an eigenvector of a given linear transformation, then it must be mapped to some other vector pointing along \(\ell\text{:}\) e.g., \(\lambda_1\boldv\) or \(\lambda_2\boldv\text{.}\) If it is not an eigenvector, it gets mapped to a vector \(\boldw\) that does not point along \(\ell\text{.}\)
Figure 5.4.7. Visualizing eigenvectors
Given a linear transformation \(T\colon V\rightarrow V\) we wish to (a) determine which values \(\lambda\in \R\) are eigenvalues of \(T\text{,}\) and (b) find all the eigenvectors corresponding to a given eigenvalue \(\lambda\text{.}\) In the next examples we carry out such an investigation in an ad hoc manner.

Example 5.4.8. Zero and identity transformations.

Assume \(V\) is nonzero. Recall that the zero transformation \(0_V\colon V\rightarrow V \) and identity transformation \(\id_V\colon V\rightarrow V\) are defined as \(0_V(\boldv)=\boldzero\) and \(\id_V(\boldv)=\boldv\) for all \(\boldv\in V\text{.}\) Find all eigenvalues and eigenvectors of \(0_V\) and \(\id_V\text{.}\)
Solution.
Since \(0_V(\boldv)=\boldzero=0\boldv\) for all \(\boldv\in V\text{,}\) we see that \(0\) is the only eigenvalue of \(0_V\text{,}\) and that all nonzero vectors of \(V\) are \(0\)-eigenvectors.
Similarly, since \(\id_V(\boldv)=\boldv=1\boldv\) for all \(\boldv\in V\text{,}\) we see that \(1\) is the only eigenvalue of \(\id_V\text{,}\) and that all nonzero vectors of \(V\) are \(1\)-eigenvectors.

Example 5.4.9. Reflection.

Let \(\ell\) be a line in \(\R^2\) passing through the origin, and define \(T\colon \R^2\rightarrow \R^2\) to be reflection through \(\ell\text{.}\) (See Definition 3.2.16.) Find all eigenvectors and eigenvalues of \(T\text{.}\) Use a geometric argument.
Solution.
Since the reflection operator fixes all elements of the line \(\ell\text{,}\) we have \(T(\boldx)=\boldx\) for any \(\boldx\in \ell\text{.}\) This shows that any nonzero element of \(\ell\) is an eigenvector of \(T\) with eigenvalue \(1\text{.}\)
Similarly, since \(\ell^\perp\) is orthogonal to \(\ell\text{,}\) reflection through \(\ell\) takes any element \(\boldx=(x,y)\in \ell^\perp\) and maps it to \(-\boldx=(-x,-y)\text{.}\) Thus any nonzero element \(\boldx\in \ell^\perp\) is an eigenvector of \(T\) with eigenvalue \(-1\text{.}\)
We claim that these two cases exhaust all eigenvectors of \(T\text{.}\) Indeed, in general a nonzero vector \(\boldx\) lies in the line \(\ell'=\Span\{\boldx\}\text{,}\) and its reflection \(T(\boldx)\) lies in the line \(\ell''=\Span\{T(\boldx)\}\text{,}\) which itself is the result of reflecting the line \(\ell'\) through \(\ell\text{.}\) Now assume \(T(\boldx)=\lambda\boldx\text{.}\) We must have \(\lambda\ne 0\text{,}\) since \(T(\boldx)\ne \boldzero\) if \(\boldx\ne \boldzero\text{;}\) but if \(\lambda\ne 0\) it follows that the line \(\ell'=\Span\{\boldx\}\) and its reflection \(\ell''=\Span\{T(\boldx)\}\) are equal. Clearly the only lines that are mapped to themselves by reflection through \(\ell\) are \(\ell\) and \(\ell^\perp\text{.}\) Thus if \(\boldx\) is an eigenvector of \(T\text{,}\) it must lie in \(\ell\) or \(\ell^\perp\text{.}\)

Example 5.4.10. Rotation.

Fix \(\theta\in (0,2\pi)\) and define \(T\colon \R^2\rightarrow \R^2\) to be rotation by \(\theta\text{.}\) (See Definition 3.2.12.) Find all eigenvectors and eigenvalues of \(T\text{.}\) Use a geometric argument. Your answer will depend on the choice of \(\theta\text{.}\)
Solution.
Case: \(\theta=\pi\).
Rotation by \(\pi\) sends every vector \(\boldx\in \R^2\) to \(-\boldx\text{:}\) i.e., \(T(\boldx)=-\boldx=(-1)\boldx\text{.}\) It follows that \(\lambda=-1\) is the only eigenvalue of \(T\) and all nonzero elements of \(\R^2\) are eigenvectors with eigenvalue \(-1\text{.}\)
Case: \(\theta\ne \pi\).
A similar argument to the one in Example 5.4.9 shows that \(T\) has no eigenvalues in this case. In more detail, a nonzero vector \(\boldx\) lies in the line \(\ell=\Span\{\boldx\}\text{,}\) and its rotation \(T(\boldx)\) lies in the line \(\ell'=\Span\{T(\boldx)\}\text{,}\) which is the result of rotating \(\ell\) by the angle \(\theta\text{.}\) Since \(\theta\ne \pi\text{,}\) it is clear that \(\ell\ne \ell'\text{,}\) and thus we cannot have \(T(\boldx)=\lambda\boldx\) for any \(\lambda\in \R\text{.}\)

Example 5.4.11. Transposition.

Consider the linear transformation
\begin{align*} S\colon M_{22} \amp\rightarrow M_{22} \\ A \amp\mapsto A^T \text{.} \end{align*}
Determine all eigenvectors and eigenvalues of \(S\text{.}\)
Solution.
To be an eigenvector of \(S\) a nonzero matrix \(A\) must satisfy \(S(A)=\lambda A\) for some \(\lambda\in \R\text{.}\) Using the definition of \(S\text{,}\) this means
\begin{equation} A^T=\lambda A\text{.}\tag{5.4.4} \end{equation}
We ask: for which scalars \(\lambda\in \R\) does there exist a nonzero matrix \(A\) satisfying (5.4.4)? Let’s consider some specific choices of \(\lambda\text{.}\)
Case: \(\lambda=1\).
In this case (5.4.4) reads \(A^T=A\text{.}\) Thus the eigenvectors of \(S\) with eigenvalue \(1\) are precisely the nonzero symmetric matrices: i.e.,
\begin{equation*} A=\begin{amatrix}[rr]a\amp b\\ b\amp c \end{amatrix}\text{.} \end{equation*}
Case: \(\lambda= -1\).
For this choice of \(\lambda\) we seek nonzero matrices satisfying \(S(A)=A^T=(-1)A=-A\text{.}\) These are precisely the nonzero skew-symmetric matrices: i.e.,
\begin{equation*} A=\begin{amatrix}[rr]0\amp a\\ -a \amp 0 \end{amatrix}\text{.} \end{equation*}
Case: \(\lambda\ne \pm 1\).
Suppose \(A=\begin{amatrix}[cc]a\amp b \\ c\amp d \end{amatrix}\) satisfies \(A^T=\lambda A\text{.}\) Equating the entries of these two matrices yields the system
\begin{align*} a \amp =\lambda a \\ d \amp = \lambda d\\ b \amp =\lambda c \\ c \amp =\lambda b \text{.} \end{align*}
The first two equations imply \(a=d=0\text{,}\) using the fact that \(\lambda\ne 1\text{.}\) The second two equations imply further that \(b=\lambda^2 b\) and \(c=\lambda^2 c\text{.}\) Since \(\lambda\ne \pm 1\text{,}\) \(\lambda^2\ne 1\text{.}\) It follows that \(b=c=0\text{.}\) We conclude that for \(\lambda\ne \pm 1\text{,}\) if \(A^T=\lambda A\text{,}\) then \(A=\boldzero\text{.}\) It follows that \(\lambda\) is not an eigenvalue of \(S\) in this case.
In summary, our analysis shows that the transposition operator \(S\) has exactly two eigenvalues, \(\lambda_1=1\) and \(\lambda_2=-1\text{,}\) that the eigenvectors of \(S\) with eigenvalue \(1\) are the nonzero symmetric matrices, and that the eigenvectors of \(S\) with eigenvalue \(-1\) are the nonzero skew-symmetric matrices.
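As a sanity check, note that the condition \(A^T=\lambda A\) is a homogeneous linear system in the entries of \(A\text{,}\) so we can ask a computer algebra system for which \(\lambda\) it has nonzero solutions. Here is a brief sketch, assuming SymPy is available:

    import sympy as sp

    lam = sp.symbols('lam')

    # Writing A = [[a, b], [c, d]], the condition A^T = lam*A reads
    #   a = lam*a,  c = lam*b,  b = lam*c,  d = lam*d,
    # a homogeneous linear system M(lam) * (a, b, c, d)^T = 0.
    M = sp.Matrix([[1 - lam, 0,    0,    0],
                   [0,      -lam,  1,    0],
                   [0,       1,   -lam,  0],
                   [0,       0,    0,    1 - lam]])

    # Nonzero solutions exist exactly when M(lam) is singular.
    print(sp.solve(M.det(), lam))                      # [-1, 1]

    for val in [1, -1, 2, sp.Rational(1, 2)]:
        print(val, len(M.subs(lam, val).nullspace()))  # dimension of the solution space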

Example 5.4.12. Differentiation.

Let \(T\colon C^\infty(\R)\rightarrow C^\infty(\R)\) be defined as \(T(f)=f'\text{.}\) Find all eigenvalues and eigenvectors of \(T\text{.}\)
Solution.
An eigenvector of \(T\) is a nonzero function \(f\) satisfying \(T(f)=\lambda f\) for some \(\lambda\text{.}\) By definition, this means
\begin{equation} f'=\lambda f\tag{5.4.5} \end{equation}
for some \(\lambda\in\R\text{.}\) Thus \(\lambda\) is an eigenvalue of \(T\) if and only if the differential equation (5.4.5) has a nonzero solution. This is true for all \(\lambda\in \R\text{!}\) Indeed for any \(\lambda\) the exponential function \(f(x)=e^{\lambda x}\) satisfies \(f'(x)=\lambda e^{\lambda x}=\lambda f(x)\) for all \(x\in \R\text{.}\) Furthermore, any solution to (5.4.5) is of the form \(f(x)=Ce^{\lambda x}\) for some \(C\in \R\text{.}\) We conclude that (a) every \(\lambda\in \R\) is an eigenvalue of \(T\text{,}\) and (b) for a given \(\lambda\text{,}\) the \(\lambda\)-eigenvectors of \(T\) are precisely the functions of the form \(f(x)=Ce^{\lambda x}\) for some \(C\ne 0\text{.}\)

Subsection 5.4.2 Finding eigenvalues and eigenvectors systematically

You can imagine that our ad hoc approach to finding eigenvalues and eigenvectors will break down once the linear transformation under consideration becomes complicated enough. As such it is vital to have a systematic method of finding all eigenvalues and eigenvectors of a linear transformation \(T\colon V\rightarrow V\text{.}\) The rest of this section is devoted to describing just such a method in the special case where \(\dim V=n\lt\infty\text{.}\) The first key observation is that we can answer the eigenvalue/eigenvector question for \(T\) by answering the same question about \(A=[T]_B\text{,}\) where \(B\) is an ordered basis of \(V\text{.}\)
We prove statement (1) as a chain of equivalences:
\begin{align*} \boldv \text{ is an eigenvector of } T \amp \iff \boldv\ne \boldzero \text{ and } T\boldv=\lambda \boldv \\ \amp \iff \boldx=[\boldv]_B\ne \boldzero \text{ and } [T\boldv]_B=[\lambda\boldv]_B \amp (\text{Theorem 5.1.12, (2)}) \\ \amp \iff \boldx=[\boldv]_B\ne \boldzero \text{ and } [T\boldv]_B=\lambda[\boldv]_B \amp (\text{Theorem 5.1.12, (1)})\\ \amp \iff \boldx=[\boldv]_B\ne \boldzero \text{ and } [T]_B[\boldv]_B=\lambda[\boldv]_B \amp (\text{Theorem 5.2.6})\\ \amp \iff \boldx\ne \boldzero \text{ and } A\boldx=\lambda\boldx\\ \amp \iff \boldx \text{ is an eigenvector of } A\text{.} \end{align*}
From (1) it follows directly that if \(\lambda\) is an eigenvalue of \(T\text{,}\) then it is an eigenvalue of \(A=[T]_B\text{.}\) Conversely, if \(\lambda\) is an eigenvalue of \(A=[T]_B\text{,}\) then there is a nonzero \(\boldx\in\R^n \) such that \(A\boldx=\lambda \boldx\text{.}\) Since \([\phantom{\boldv}]_B\) is surjective (Theorem 5.1.12, (3)), there is a vector \(\boldv\in V\) such that \([\boldv]_B=\boldx\text{.}\) It follows from (1) that \(\boldv\) is a \(\lambda\)-eigenvector of \(T\text{,}\) and thus that \(\lambda\) is an eigenvalue of \(T\text{.}\)
Thanks to Theorem 5.4.13, we can boil down the eigenvector/eigenvalue question for linear transformations of finite vector spaces to the analogous question about square matrices. The next theorem is the key result.
First observe that
\begin{equation*} A\boldx=\lambda\boldx \iff \lambda\boldx-A\boldx=\boldzero \iff (\lambda I-A)\boldx=\boldzero\text{.} \end{equation*}
From this equivalence it follows that
\begin{equation*} \NS(\lambda I-A)=\{\boldx\in \R^n\colon A\boldx=\lambda \boldx\}\text{.} \end{equation*}
Since an eigenvector must be nonzero, we conclude that the \(\lambda\)-eigenvectors of \(A\) are precisely the nonzero elements of \(\NS(\lambda I-A)\text{.}\) This proves statement (1). As a consequence, we see that \(A\) has \(\lambda\) as an eigenvalue if and only if \(\NS (\lambda I-A)\) contains nonzero elements: i.e., if and only if \(\NS (\lambda I-A)\ne \{\boldzero\}\text{.}\) By the invertibility theorem this is true if and only if \(\lambda I-A\) is not invertible.
According to Theorem 5.4.14, the eigenvectors of \(A\) live in null spaces of matrices of the form \(\lambda I-A\text{.}\) Accordingly, we call these spaces eigenspaces.

Definition 5.4.15. Eigenspaces.

Let \(A\) be an \(n\times n\) matrix. Given \(\lambda\in \R\) the \(\lambda\)-eigenspace of \(A\) is the subspace \(W_\lambda\subseteq \R^n\) defined as
\begin{equation*} W_\lambda=\NS (\lambda I -A)\text{.} \end{equation*}
Similarly, given a finite-dimensional vector space \(V\text{,}\) a linear transformation \(T\colon V\rightarrow V\text{,}\) and \(\lambda\in \R\text{,}\) the \(\lambda\)-eigenspace of \(T\) is the subspace \(W_\lambda\subseteq V\) defined as
\begin{equation*} W_\lambda=\NS(\lambda \id_V-T)\text{.} \end{equation*}
In both cases the nonzero elements of \(W_\lambda\) are precisely the \(\lambda\)-eigenvectors.
We nearly have a complete method for computing the eigenvalues and eigenvectors of a square matrix \(A\text{.}\) The last step is to identify the values of \(\lambda\) for which \(\lambda I-A\) is not invertible. By the invertibility theorem, the matrix \(\lambda I-A\) is not invertible if and only if \(\det (\lambda I-A)=0\text{.}\) Thus the eigenvalues of \(A\) are precisely the zeros of the function \(p(t)=\det(tI-A)\text{.}\) We have proved the following corollary.

Example 5.4.17. Rotation (again).

Fix \(\theta\in (0,2\pi)\) and let \(T\colon \R^2\rightarrow \R^2\) be rotation by \(\theta\text{.}\) Recall that we have \(T=T_A\) where
\begin{equation*} A=\begin{amatrix}[rr]\cos\theta\amp -\sin\theta\\ \sin\theta\amp \cos\theta \end{amatrix}\text{.} \end{equation*}
Now compute
\begin{align*} p(t)=\det(tI-A) \amp= \det \begin{bmatrix} t-\cos\theta\amp \sin\theta\\ -\sin\theta\amp t-\cos\theta \end{bmatrix} \\ \amp=t^2-2(\cos\theta)t+1\text{.} \end{align*}
We can use the quadratic formula to find the roots of \(p(t)\text{:}\)
\begin{equation*} t=\frac{2\cos\theta\pm \sqrt{4\cos^2\theta-4}}{2}=\cos\theta\pm \sqrt{\cos^2\theta-1}\text{.} \end{equation*}
When \(\theta=\pi\text{,}\) this reduces to \(t=\cos\pi=-1\text{,}\) confirming our conclusion in Example 5.4.10 that \(\lambda=-1\) is the only eigenvalue of the rotation by \(\pi\) operator.
When \(\theta\in (0,2\pi)\) and \(\theta\ne \pi\text{,}\) we have \(\cos^2\theta-1\lt 0\text{,}\) and we see that \(p(t)\) has no real roots. This confirms our conclusion in Example 5.4.10 that such rotations have no eigenvalues.
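These calculations are easy to reproduce symbolically; the following sketch, assuming SymPy is available, computes \(p(t)\) for the rotation matrix and solves for its roots in terms of \(\theta\text{:}\)

    import sympy as sp

    t, theta = sp.symbols('t theta', real=True)
    A = sp.Matrix([[sp.cos(theta), -sp.sin(theta)],
                   [sp.sin(theta),  sp.cos(theta)]])

    p = (t * sp.eye(2) - A).det()
    print(sp.trigsimp(sp.expand(p)))    # t**2 - 2*t*cos(theta) + 1

    # The roots are real precisely when cos(theta)**2 >= 1, i.e. when
    # theta = pi on the interval (0, 2*pi).
    print(sp.solve(sp.expand(p), t))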
Clearly the function \(p(t)=\det(tI-A)\) deserves a name; we call it the characteristic polynomial of \(A\text{.}\)

Definition 5.4.18. Characteristic polynomial of a matrix.

Let \(A\) be \(n\times n\text{.}\) The characteristic polynomial of \(A\) is the function
\begin{equation*} p(t)=\det(tI-A)\text{.} \end{equation*}
We will show below that \(p(t)=\det(tI-A)\) is indeed a polynomial (Theorem 5.4.25). We postpone that discussion for now in order to present some examples of systematically computing eigenvalues and eigenvectors of matrices. Below you will find the complete description of this procedure.

Example 5.4.20.

Let \(A=\begin{amatrix}[rr]1\amp 2\\ 1\amp 2 \end{amatrix}\text{.}\)
  1. Find all eigenvalues of \(A\text{.}\)
  2. Compute a basis for the eigenspace \(W_\lambda\) for each eigenvalue \(\lambda\text{.}\)
Solution.
We compute
\begin{equation*} p(t)=\det(tI-A)=\det \begin{amatrix}[rr]t-1\amp -2\\ -1\amp t-2 \end{amatrix}=t^2-3t=t(t-3)\text{.} \end{equation*}
Thus the eigenvalues of \(A\) are \(\lambda=0\) and \(\lambda=3\text{.}\)
Basis for \(W_0\).
We have
\begin{align*} W_0 \amp =\NS(0I-A)\\ \amp = \NS(-A) \\ \amp =\NS\begin{amatrix}[rr]-1\amp -2\\ -1\amp -2 \end{amatrix}\\ \amp =\Span\{(2,-1)\}\text{.} \end{align*}
Thus all \(0\)-eigenvectors of \(A\) are of the form \(c(2,-1)\text{,}\) where \(c\ne 0\text{.}\)
Basis for \(W_3\).
We have
\begin{align*} W_3 \amp =\NS(3I-A)\\ \amp =\NS\begin{amatrix}[rr]2\amp -2\\ -1\amp 1 \end{amatrix}\\ \amp =\Span\{(1,1)\}\text{.} \end{align*}
Thus all \(3\)-eigenvectors of \(A\) are of the form \(c(1,1)\text{,}\) where \(c\ne 0\text{.}\)
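If SymPy is available, the whole computation can be reproduced in a few lines; this sketch double-checks the characteristic polynomial and the eigenspace bases found above:

    import sympy as sp

    t = sp.symbols('t')
    A = sp.Matrix([[1, 2],
                   [1, 2]])

    print(sp.factor((t * sp.eye(2) - A).det()))   # t*(t - 3)
    print((0 * sp.eye(2) - A).nullspace())        # basis of W_0, a multiple of (2, -1)
    print((3 * sp.eye(2) - A).nullspace())        # basis of W_3, a multiple of (1, 1)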

Example 5.4.21.

Let
\begin{equation*} A=\begin{amatrix}[rrr]2\amp -1\amp -1\\ -1\amp 2\amp -1\\ -1\amp -1\amp 2 \end{amatrix}\text{.} \end{equation*}
  1. Find all eigenvalues of \(A\text{.}\)
  2. Compute a basis for the eigenspace \(W_\lambda\) for each eigenvalue \(\lambda\text{.}\)
Solution.
First compute
\begin{align*} p(t) \amp = \det (tI-A)\\ \amp = \det \begin{amatrix}[rrr]t-2\amp 1\amp 1\\ 1\amp t-2\amp 1\\ 1\amp 1\amp t-2 \end{amatrix} \\ \amp = t^3-6t^2+9t\\ \amp = t(t^2-6t+9)\\ \amp = t(t-3)^2\text{.} \end{align*}
We see that the eigenvalues of \(A\) are \(\lambda=0\) and \(\lambda=3\text{.}\) Now compute bases for their corresponding eigenspaces.
Basis of \(W_0\).
We have
\begin{align*} W_0 \amp = \NS(0I-A)\\ \amp = \NS \begin{amatrix}[rrr]-2\amp 1\amp 1\\ 1\amp -2\amp 1\\ 1\amp 1\amp -2 \end{amatrix} \\ \amp = \Span\{(1,1,1)\}\text{.} \end{align*}
(We have skipped the Gaussian elimination steps involved in computing a basis for \(\NS(-A)\text{.}\))
Basis of \(W_3\).
We have
\begin{align*} W_3 \amp = \NS(3I-A)\\ \amp= \NS \begin{amatrix}[rrr]1\amp 1\amp 1\\ 1\amp 1\amp 1\\ 1\amp 1\amp 1 \end{amatrix} \\ \amp = \Span\{ (1,-1,0),(1,0,-1)\}\text{.} \end{align*}
We conclude that the \(0\)-eigenvectors of \(A\) are the nonzero scalar multiples of \((1,1,1)\text{,}\) and that the \(3\)-eigenvectors are all nonzero vectors of the form \(\boldx=c_1(1,-1,0)+c_2(1,0,-1)\text{.}\)
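Alternatively, SymPy's built-in eigenvects method bundles the two parts of the computation together; here is a brief sketch, assuming SymPy is available, applied to the matrix above:

    import sympy as sp

    A = sp.Matrix([[ 2, -1, -1],
                   [-1,  2, -1],
                   [-1, -1,  2]])

    # Each entry is (eigenvalue, algebraic multiplicity, eigenspace basis).
    for val, mult, basis in A.eigenvects():
        print(val, mult, [list(v) for v in basis])
    # Expect a one-dimensional eigenspace for 0 and a two-dimensional one for 3.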
Examples 5.4.20–5.4.21 bring to light a connection between eigenvalues and invertibility that is worth highlighting.
  1. We have
    \begin{equation*} W_0=\NS(0I-A)=\NS(-A). \end{equation*}
    Since \(-A\boldx=\boldzero\) if and only if \(A\boldx=\boldzero\text{,}\) we conclude that \(\NS(-A)=\NS A\text{,}\) and hence \(W_0=\NS A\text{.}\)
  2. Zero is an eigenvalue of \(A\) if and only if \(W_0\) is nontrivial, if and only if \(\NS A\) is nontrivial (by (1)), if and only if \(A\) is not invertible.
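As a quick numerical illustration of this connection, here is a sketch assuming NumPy is available: the matrix \(A\) of Example 5.4.20 has \(0\) as an eigenvalue and is therefore not invertible, while adding the identity shifts its eigenvalues away from zero and makes it invertible.

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [1.0, 2.0]])

    print(np.linalg.eigvals(A))   # 0 and 3 (in some order)
    print(np.linalg.det(A))       # essentially 0: since 0 is an eigenvalue, A is not invertible

    B = A + np.eye(2)             # eigenvalues shift to 1 and 4
    print(np.linalg.det(B))       # approximately 4, nonzero, so B is invertible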
Of course statement (2) of Corollary 5.4.22 gives rise to yet another equivalent formulation of invertibility, and we include this in our ever-expanding invertibility theorem at the end of the section. We end our current discussion with an example illustrating how to compute the eigenvalues and eigenvectors of an arbitrary linear transformation \(T\colon V\rightarrow V\) of a finite-dimensional space. The idea is to first represent \(T\) as a matrix with respect to some basis \(B\text{,}\) apply Procedure 5.4.19 to this matrix, and then “lift” the results back to \(V\text{.}\)

Example 5.4.24. Transposition (again).

Let \(S\colon M_{22}\rightarrow M_{22}\) be defined as \(S(A)=A^T\text{.}\)
  1. Find all eigenvalues of \(S\text{.}\)
  2. For each eigenvalue \(\lambda\) of \(S\) compute a basis for \(W_\lambda\text{.}\)
Solution.
Let \(B=(E_{11}, E_{12}, E_{21}, E_{22})\text{,}\) the standard ordered basis of \(M_{22}\text{.}\) Compute \(A=[S]_B\text{:}\)
\begin{align*} A=[S]_B \amp =\begin{bmatrix} \vert \amp \vert\amp \vert\amp \vert \\ [S(E_{11})]_B\amp [S(E_{12})]_B \amp [S(E_{21})]_B \amp [S(E_{22})]_B\\ \vert \amp \vert \amp \vert\amp \vert \end{bmatrix} \\ \amp =\begin{bmatrix} \vert \amp \vert\amp \vert\amp \vert \\ [E_{11}]_B\amp [E_{21}]_B \amp [E_{12}]_B \amp [E_{22}]_B \\ \vert \amp \vert \amp \vert\amp \vert \end{bmatrix} \amp (E_{11}^T=E_{11}, E_{12}^T=E_{21}, \text{etc.})\\ \amp =\begin{bmatrix} 1\amp 0\amp 0\amp 0\\ 0\amp 0\amp 1\amp 0\\ 0\amp 1\amp 0\amp 0\\ 0\amp 0\amp 0\amp 1 \end{bmatrix} \text{.} \end{align*}
Now apply Procedure 5.4.19 to \(A\text{.}\) From \(p(t)=\det(tI-A)=(t-1)^2(t^2-1)=(t-1)^3(t+1)\) we conclude that \(\lambda=1\) and \(\lambda=-1\) are the only eigenvalues of \(A\) (and hence also \(S\)). Bases for the corresponding eigenspaces of \(A\) are easily computed as
\begin{align*} \NS(I-A) \amp= \Span\{(1,0,0,0), (0,1,1,0), (0,0,0,1)\} \\ \NS(-I-A) \amp =\Span\{(0,1,-1,0)\} \text{.} \end{align*}
Now lift these up to bases of the eigenspaces \(W_1\) and \(W_{-1}\) of \(S\text{:}\)
\begin{align*} W_1 \amp=\Span\left\{ A_1=\begin{bmatrix} 1\amp 0\\ 0 \amp 0 \end{bmatrix}, A_2=\begin{bmatrix} 0\amp 1\\ 1 \amp 0 \end{bmatrix}, A_3=\begin{bmatrix} 0\amp 0\\ 0 \amp 1 \end{bmatrix} \right\} \\ W_{-1} \amp= \Span\left\{A_4= \begin{amatrix}[rr] 0\amp 1\\ -1 \amp 0 \end{amatrix} \right\} \text{.} \end{align*}
It is easy to see that \(W_1\) and \(W_{-1}\) are the subspaces of symmetric and skew-symmetric matrices, respectively. This is consistent with our analysis in Example 5.4.11.
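The computation above is easy to double-check; the following sketch, assuming SymPy is available, recomputes the characteristic polynomial and the eigenspace bases of \(A=[S]_B\text{:}\)

    import sympy as sp

    t = sp.symbols('t')
    A = sp.Matrix([[1, 0, 0, 0],
                   [0, 0, 1, 0],
                   [0, 1, 0, 0],
                   [0, 0, 0, 1]])   # [S]_B for the standard basis (E11, E12, E21, E22)

    print(sp.factor((t * sp.eye(4) - A).det()))   # (t - 1)**3*(t + 1)
    print((sp.eye(4) - A).nullspace())            # three vectors: the coordinates of A1, A2, A3
    print((-sp.eye(4) - A).nullspace())           # one vector spanning the same line as [A4]_B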

Subsection 5.4.3 Properties of the characteristic polynomial

We turn now to some very useful properties of the characteristic polynomial \(p(t)=\det(tI-A)\) of a square matrix \(A\text{.}\)
We show the claim by induction on \(n\geq 1\text{.}\) The claim is easily seen to be true for all \(1\times 1\) and \(2\times 2\) matrices.
Fix \(n\geq 3\) and assume the claim is true for all \((n-1)\times (n-1)\) matrices. Let \(A=[a_{ij}]_{1\leq i,j\leq n}\text{.}\) Expanding \(p(t)=\det (tI-A)\) along the first row yields
\begin{equation} p(t)=(t-a_{11})\det(tI-A)_{11}+a_{12}\det(tI-A)_{12}-a_{13}\det(tI-A)_{13}+\cdots\text{.}\tag{5.4.11} \end{equation}
(Recall that for any matrix \(B\) the notation \(B_{ij}\) denotes the submatrix obtained by removing the \(i\)-th row and \(j\)-th column of \(B\text{.}\)) First observe that \((tI-A)_{11}=tI-A_{11}\text{,}\) and thus \(\det(tI-A)_{11}\) is the characteristic polynomial of the \((n-1)\times (n-1)\) matrix \(A_{11}\text{.}\) By induction this is a monic polynomial of degree \(n-1\text{,}\) and hence the first term of (5.4.11), \((t-a_{11})\det(tI-A)_{11}\text{,}\) is a monic polynomial of degree \(n\text{.}\) Unfortunately, the remaining terms in the expansion (5.4.11) do not lend themselves to a direct application of the induction hypothesis. However, we observe that the \((n-1)\times (n-1)\) submatrices \((tI-A)_{1j}\) for \(j\geq 2\) all satisfy the following property: their first column contains only scalars; the remaining columns contain exactly one entry of the form \(t-c\text{,}\) while the rest of the entries are scalars. An easy (additional) induction argument shows that the determinant of such a matrix is a polynomial of degree at most \(n-2\text{.}\) (We leave this to you!) Since the first term of (5.4.11) is a monic polynomial of degree \(n\) and the rest of the terms are polynomials of degree at most \(n-2\text{,}\) we conclude that \(p(t)\) itself is a monic polynomial of degree \(n\text{,}\) as desired.
Statement (2) is the fundamental theorem of algebra: every polynomial with real coefficients factors completely over the complex numbers. Statement (3) follows from Corollary 5.4.16.
Statement (6) clearly follows from statements (4) and (5). From (1) and (2) we conclude
\begin{equation} t^n+a_{n-1}t^{n-1}+\cdots +a_1t+a_0=(t-\lambda_1)(t-\lambda_2)\cdots (t-\lambda_n)\text{.}\tag{5.4.12} \end{equation}
Expanding out the right-hand side of this polynomial equality yields a polynomial of the form
\begin{equation*} t^n-(\lambda_1+\lambda_2+\cdots +\lambda_n)t^{n-1}+\cdots +b_1t+(-1)^n\lambda_1\lambda_2\cdots \lambda_n\text{.} \end{equation*}
Going back and equating coefficients between the left- and right-hand sides of (5.4.12) yields
\begin{align*} a_{n-1} \amp = -(\lambda_1+\lambda_2+\cdots +\lambda_n)\\ a_0 \amp =(-1)^n\lambda_1\lambda_2\cdots \lambda_n \text{.} \end{align*}
This proves half of statements (4) and (5). The fact that \(a_{n-1}=-\tr A\) can be proved by induction using a modified version of the argument from the proof of (1) above. It remains to show that \(a_0=(-1)^n\det A\text{.}\) We have
\begin{align*} a_0 \amp= p(0) \\ \amp=\det(0I-A) \\ \amp=\det(-A) \\ \amp =(-1)^n\det A\text{.} \end{align*}
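These trace and determinant identities are easy to spot-check symbolically; here is a brief sketch, assuming SymPy is available, that verifies them for a generic \(3\times 3\) matrix:

    import sympy as sp

    t = sp.symbols('t')
    entries = sp.symbols('a1:10')              # nine generic entries a1, ..., a9
    A = sp.Matrix(3, 3, entries)

    p = sp.expand((t * sp.eye(3) - A).det())   # the characteristic polynomial
    poly = sp.Poly(p, t)

    print(sp.expand(poly.coeff_monomial(t**2) + A.trace()))        # 0, so a_{n-1} = -tr(A)
    print(sp.expand(poly.coeff_monomial(1) - (-1)**3 * A.det()))   # 0, so a_0 = (-1)^n det(A)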

Remark 5.4.26. Characteristic polynomial for \(2\times 2\) matrices.

Let \(A\) be a \(2\times 2\) matrix, and let \(p(t)=\det(tI-A)\text{.}\) Using (5.4.6)–(5.4.8) we have
\begin{equation*} p(t)=t^2-(\tr A) t+\det A\text{.} \end{equation*}
This is a useful trick if you want to produce a \(2\times 2\) matrix with a prescribed characteristic polynomial. For example, a matrix with characteristic polynomial \(p(t)=t^2-2\) has trace equal to 0 and determinant equal to \(-2\text{.}\) Such matrices are easy to construct: e.g.,
\begin{equation*} A_1=\begin{amatrix}[rr]0\amp 2\\ 1\amp 0 \end{amatrix}, A_2=\begin{amatrix}[rr]1\amp 1 \\ 1\amp -1 \end{amatrix}, A_3=\begin{amatrix}[rr] 3\amp -7 \\ 1\amp -3 \end{amatrix}\text{.} \end{equation*}
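A quick check, assuming SymPy is available, confirms that all three matrices have characteristic polynomial \(t^2-2\text{:}\)

    import sympy as sp

    t = sp.symbols('t')
    for M in [sp.Matrix([[0, 2], [1, 0]]),
              sp.Matrix([[1, 1], [1, -1]]),
              sp.Matrix([[3, -7], [1, -3]])]:
        print(sp.expand((t * sp.eye(2) - M).det()))   # t**2 - 2 each time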
An important consequence of Theorem 5.4.25 is that an \(n\times n\) matrix \(A\) can have at most \(n\) distinct eigenvalues. Indeed, the eigenvalues of \(A\) are the real roots appearing among the \(\lambda_i\) in the factorization
\begin{equation*} p(t)=\det(tI-A)=(t-\lambda_1)(t-\lambda_2)\cdots (t-\lambda_n)\text{.} \end{equation*}
This proves the following corollary.
Lastly, as promised we end by incorporating statement (2) of Corollary 5.4.22 into our invertibility theorem.

Exercises 5.4.4 Exercises

Exercise Group.

For each matrix below (a) compute the characteristic polynomial \(p(t)\text{,}\) (b) find all eigenvalues of the matrix, and (c) compute bases for eigenspaces of each eigenvalue.
1.
\(A=\begin{bmatrix} 2\amp 1\\ 1\amp 2 \end{bmatrix}\)
2.
\(A=\begin{amatrix}[rr] 2\amp -3\\ 0\amp 2 \end{amatrix}\)
3.
\(A=\begin{amatrix}[rr] 2\amp 0\\ 0\amp 2 \end{amatrix}\)
4.
\(A=\begin{amatrix}[rr] 1\amp 2\\ 2\amp -4 \end{amatrix}\)
5.
\(A=\begin{amatrix}[rrr] 1\amp 0\amp -2\\ 0\amp 0\amp 0\\ -2\amp 0\amp 4 \end{amatrix}\)
6.
\(A=\begin{amatrix}[rrr] 0\amp 1\amp 1\\ 1\amp 0\amp 1\\ 1\amp 1\amp 0 \end{amatrix}\)
7.
\(A=\begin{amatrix}[rrr] 1\amp -3\amp 3\\ 3\amp -5\amp 3\\ 6\amp -6\amp 4 \end{amatrix}\)
8.
\(A=\begin{amatrix}[rrrr] 9\amp -8\amp 6\amp 3\\ 0\amp -1\amp 0\amp 0\\ 0\amp 0\amp 3\amp 0\\ 0\amp 0\amp 0\amp 7 \end{amatrix}\)
9.
Matrices \(A\) and \(B\) below both have characteristic polynomial \(p(t)=t^3-3t+2\text{.}\) For each matrix compute a basis of \(W_\lambda\) for each eigenvalue \(\lambda\text{.}\)
  1. \(\displaystyle A=\begin{amatrix}[rrr] -5\amp 0\amp 3\\ -6\amp 1\amp 3\\ -6\amp 0\amp 4 \end{amatrix}\)
  2. \(\displaystyle B=\begin{amatrix}[rrr] -2\amp -3\amp 3\\ -3\amp -3\amp 4\\ -3\amp -4\amp 5 \end{amatrix}\)
10.
Let \(A\) be an invertible matrix. Prove: \(\lambda\) is an eigenvalue of \(A\) if and only if \(\lambda^{-1}\) is an eigenvalue of \(A^{-1}\text{.}\)
11.
Let \(A\) be a square matrix.
(a)
Prove: \(A\) and \(A^T\) have the same characteristic polynomial.
Hint.
First show that \((tI-A^T)=(tI-A)^T\text{,}\) then use a relevant property of the determinant.
(b)
Prove: \(A\) and \(A^T\) have the same eigenvalues.
(c)
Fix \(\lambda\in \R\) and let \(W_\lambda, W_\lambda'\) be the \(\lambda\)-eigenspaces of \(A\) and \(A^T\text{,}\) respectively. Prove: \(\dim W_\lambda=\dim W_\lambda'\text{.}\)
Hint.
First show that \((\lambda I-A^T)=(\lambda I -A)^T\text{,}\) then use the rank-nullity theorem.
12.
Let \(\lambda\) be an eigenvalue of the matrix \(A\text{.}\)
  1. Prove: \(\lambda^k\) is an eigenvalue of \(A^k\) for any positive integer \(k\text{.}\)
  2. Assume that \(A\) is invertible. Prove: \(\lambda^k\) is an eigenvalue of \(A^k\) for any integer \(k\text{.}\)
13.
Assume \(A\) is a square matrix satisfying \(A^k=\boldzero\) for some positive integer \(k\text{.}\) Show that \(0\) is the only eigenvalue of \(A\text{.}\) Your argument must make clear that \(0\) is in fact an eigenvalue of \(A\text{.}\)
14.
Assume \(A\) is a square matrix satisfying \(A^2=A\text{.}\)
  1. Prove: if \(\lambda\in \R\) is an eigenvalue of \(A\text{,}\) then \(\lambda=0\) or \(\lambda=1\text{.}\)
  2. Prove: if \(A\) does not have \(0\) as an eigenvalue, then \(A=I\text{.}\)
  3. Give an example of