
Section 5.3 Change of basis

Coordinate vectors and matrix representations work in tandem to model vectors in an abstract vector space \(V\) as column vectors in \(\R^n\text{,}\) and linear transformations \(T\colon V\rightarrow W\) as \(m\times n\) matrices. In both cases the model depends on our choice of basis. In this section we investigate how different basis choices affect these various models. Specifically, we consider the two questions below.
  1. Given \(V\) and two ordered bases \(B\) and \(B'\text{,}\) what is the algebraic relation between \([\boldv]_B\) and \([\boldv]_{B'}\text{?}\)
  2. Given \(T\colon V\rightarrow V\) and two ordered bases \(B\) and \(B'\text{,}\) what is the relation between \([T]_{B}\) and \([T]_{B'}\text{?}\)
We will tackle each question in turn. Both answers rely on something called a change of basis matrix \(\underset{B\rightarrow B'}{P}\text{.}\)

Subsection 5.3.1 Change of basis matrix

We define change of basis matrices via a column-by-column formula and motivate the definition retroactively with Theorem 5.3.2.

Definition 5.3.1. Change of basis matrix.

Let \(B=(\boldv_1, \boldv_2, \dots, \boldv_n)\) and \(B'\) be two ordered bases for the vector space \(V\text{.}\) The change of basis matrix from \(B\) to \(B'\) is the \(n\times n\) matrix \(\underset{B\rightarrow B'}{P}\) defined as
\begin{equation*} \underset{B\rightarrow B'}{P}= \begin{bmatrix} \vert \amp \vert \amp \amp \vert \\ \phantom{v}[\boldv_1]_{B'} \amp \phantom{v}[\boldv_2]_{B'}\amp \dots \amp \phantom{v}[\boldv_n]_{B'}\\ \vert \amp \vert \amp \amp \vert \end{bmatrix}\text{.} \end{equation*}
In other words, the \(j\)-th column of \(\underset{B\rightarrow B'}{P}\) is obtained by computing the coordinate vector of the \(j\)-th element of the original basis \(B\) with respect to the new basis \(B'\text{.}\)
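When \(V=\R^n\text{,}\) each column \([\boldv_j]_{B'}\) is the solution of the linear system whose coefficient matrix has the elements of \(B'\) as its columns, so the whole matrix can be computed with one call to a linear solver. Here is a minimal numerical sketch (in Python, assuming numpy is available; not part of the text's toolchain), using the bases of Example 5.3.3 below as a test case.

import numpy as np

# Elements of B and B' placed as the columns of two matrices.
B_cols  = np.column_stack([(1, 1), (1, -1)])   # B  = ((1,1), (1,-1))
Bp_cols = np.column_stack([(1, 2), (2, -1)])   # B' = ((1,2), (2,-1))

# The j-th column of P_{B -> B'} is [v_j]_{B'}, i.e. the solution x of
# Bp_cols @ x = v_j. np.linalg.solve handles all columns of B_cols at once.
P = np.linalg.solve(Bp_cols, B_cols)
print(P)   # [[ 0.6 -0.2]
           #  [ 0.2  0.6]]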
Theorem 5.3.2. Change of basis and coordinate vectors.

Let \(B\) and \(B'\) be two ordered bases of the \(n\)-dimensional vector space \(V\text{.}\)
  1. We have \(\underset{B\rightarrow B'}{P}=[I_V]_B^{B'}\text{,}\) where \(I_V\colon V\rightarrow V\) is the identity transformation.
  2. For all \(\boldv\in V\) we have
    \begin{equation} \underset{B\rightarrow B'}{P}[\boldv]_B=[\boldv]_{B'}\text{.}\tag{5.3.1} \end{equation}
  3. \(\underset{B\rightarrow B'}{P}\) is the unique matrix \(A\) satisfying \(A[\boldv]_B=[\boldv]_{B'}\) for all \(\boldv\in V\text{.}\)
Proof.
  1. Let \(I_V\colon V\rightarrow V\) be the identity transformation: i.e., \(I_V(\boldv)=\boldv\) for all \(\boldv\in V\text{.}\) By Theorem 5.2.6 the matrix \([I_V]_B^{B'}\) is the unique matrix satisfying
    \begin{equation*} [I_V]_B^{B'}[\boldv]_B=[I_V(\boldv)]_{B'}=[\boldv]_{B'}\text{.} \end{equation*}
    Comparing the formulas for \([I_V]_{B}^{B'}\) and \(\underset{B\rightarrow B'}{P}\text{,}\) we see directly that
    \begin{equation*} [I_V]_B^{B'} = \underset{B\rightarrow B'}{P}\text{.} \end{equation*}
  2. This follows from (1) and Theorem 5.2.6:
    \begin{align*} \underset{B\rightarrow B'}{P}[\boldv]_B \amp = [\id_V]_{B}^{B'}[\boldv]_B \\ \amp = [\id_V(\boldv)]_{B'} \amp (\knowl{./knowl/th_matrixrep.html}{\text{Theorem 5.2.6}})\\ \amp =[\boldv]_{B'} \text{.} \end{align*}
  3. By (2) of Theorem 5.2.6 (the uniqueness claim), if \(A\) satisfies \(A[\boldv]_B=[\boldv]_{B'}\) for all \(\boldv\in V\text{,}\) then \(A=[\id_V]_{B}^{B'}\text{.}\) Since \([\id_V]_{B}^{B'}=\underset{B\rightarrow B'}{P}\text{,}\) we conclude \(A=\underset{B\rightarrow B'}{P}\text{.}\)

Example 5.3.3.

Let \(V=\R^2\text{,}\) \(B=(\boldv_1=(1,1),\boldv_2=(1,-1))\text{,}\) \(B'=(\boldw_1=(1,2), \boldw_2=(2,-1))\text{.}\) Observe that \(B\) and \(B'\) are both orthogonal with respect to the dot product. This will simplify your computations below.
  1. Compute \(\underset{B\rightarrow B'}{P}\text{.}\)
  2. Let \(\boldx=(4,-2)\text{.}\) Compute \([\boldx]_{B'}\) using (5.3.1).
Solution.
  1. Using Definition 5.3.1, we have
    \begin{align*} \underset{B\rightarrow B'}{P}\amp = \begin{bmatrix}\vert \amp \vert \\ \hspace{7pt}[\boldv_1]_{B'} \amp \hspace{7pt}[\boldv_2]_{B'}\\ \vert \amp \vert \end{bmatrix}\\ \amp = \begin{bmatrix}\frac{3}{5}\amp -\frac{1}{5}\\ \frac{1}{5}\amp \frac{3}{5} \end{bmatrix} \text{.} \end{align*}
    Here the two coordinate vector computations \([(1,1)]_{B'}=(3/5, 1/5)\) and \([(1,-1)]_{B'}=(-1/5,3/5)\) were done using Theorem 5.1.9.
  2. First we easily compute \([\boldx]_{B}=(1, 3)\text{,}\) again using Theorem 5.1.9. Now use (5.3.1):
    \begin{align*} [\boldx]_{B'} \amp =\underset{B\rightarrow B'}{P}[\boldx]_B \\ \amp = \begin{bmatrix}\frac{3}{5}\amp -\frac{1}{5}\\ \frac{1}{5}\amp \frac{3}{5} \end{bmatrix} \begin{amatrix}[c]1 \\ 3 \end{amatrix}\\ \amp = \begin{amatrix}[r]0\\ 2 \end{amatrix}\text{.} \end{align*}
    This should come as no surprise since
    \begin{equation*} \boldx=(4,-2)=2(2,-1)=0\boldw_1+2\boldw_2\text{.} \end{equation*}

Example 5.3.4.

Let \(V=P_2\text{,}\) \(B=(x^2,x,1)\text{,}\) \(B'=(p_1(x)=(x-2)^2, p_2(x)=x-2, p_3(x)=1)\text{.}\)
  1. Compute \(\underset{B\rightarrow B'}{P}\text{.}\)
  2. Compute \([x^2+x+1]_{B'}\) using (5.3.1).
Solution.
  1. We have
    \begin{align*} \underset{B\rightarrow B'}{P} \amp = \begin{bmatrix} \vert \amp \vert \amp \vert \\ [x^2]_{B'}\amp [x]_{B'}\amp [1]_{B'} \\ \vert \amp \vert \amp \vert \end{bmatrix} \\ \amp = \begin{bmatrix} 1\amp 0\amp 0 \\ 4\amp 1\amp 0\\ 4\amp 2\amp 1 \end{bmatrix} \text{.} \end{align*}
    The first two coordinate vector computations are nontrivial; you can verify for yourself that \(x^2=1(x-2)^2+4(x-2)+4\) and \(x=0(x-2)^2+1(x-2)+2\text{.}\) Alternatively, see Remark 5.3.5 for a neat trick for computing these coordinate vectors.
  2. Since \(B\) is the standard basis, we see easily that \([x^2+x+1]_{B}=(1,1,1)\text{.}\) Using (5.3.1) we have
    \begin{align*} [x^2+x+1]_{B'} \amp = \underset{B\rightarrow B'}{P}\begin{bmatrix} 1\\ 1\\ 1 \end{bmatrix}\\ \amp =\begin{bmatrix} 1\\ 5\\ 7 \end{bmatrix} \text{.} \end{align*}
    Verify for yourself that we do indeed have
    \begin{equation*} x^2+x+1=1(x-2)^2+5(x-2)+7\text{.} \end{equation*}

Remark 5.3.5. Taylor’s formula and change of basis.

Let \(B=(x^n, x^{n-1}, \dots, x, 1)\) be the standard basis of \(P_n\text{.}\) Fix any constant \(a\in \R\text{,}\) and let \(B'=((x-a)^n, (x-a)^{n-1}, \dots, (x-a), 1)\text{.}\) It is easy to see that \(B'\) is also an ordered basis: a simple degree argument shows that the polynomials \(p_k(x)=(x-a)^k\) are linearly independent. It follows from Taylor’s theorem (from single-variable calculus) that given any polynomial \(p\in P_n\) we have
\begin{equation*} p(x)=p(a)+p'(a)(x-a)+\frac{p''(a)}{2}(x-a)^2+\cdots +\frac{p^{(n)}(a)}{n!}(x-a)^n\text{.} \end{equation*}
We call this expression the expansion of \(p(x)\) about \(x=a\text{.}\) In terms of coordinate vectors, this means that
\begin{equation} [p]_{B'}=\left(\frac{p^{(n)}(a)}{n!}, \frac{p^{(n-1)}(a)}{(n-1)!}, \dots, p'(a), p(a)\right)\text{.}\tag{5.3.2} \end{equation}
In other words, Taylor’s theorem provides a simple derivative formula for computing coordinate vectors with respect to the basis \(B'\text{.}\)
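Formula (5.3.2) is easy to implement. The sketch below (Python with numpy, an assumption of ours rather than part of the text) recomputes the coordinate vector \([x^2+x+1]_{B'}=(1,5,7)\) from Example 5.3.4 by differentiating at \(a=2\text{.}\)

import numpy as np
from math import factorial

def taylor_coords(coeffs, a):
    # Coordinates of p with respect to B' = ((x-a)^n, ..., (x-a), 1),
    # per (5.3.2). coeffs lists the coefficients of p, highest degree first.
    n = len(coeffs) - 1
    return [np.polyval(np.polyder(coeffs, k), a) / factorial(k)
            for k in range(n, -1, -1)]

print(np.array(taylor_coords([1, 1, 1], 2)))   # [1. 5. 7.], matching Example 5.3.4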
The following properties are often useful when computing various change of basis matrices.

Theorem 5.3.6. Properties of change of basis matrices.

Let \(B\) and \(B'\) be ordered bases of the finite-dimensional vector space \(V\text{.}\)
  1. We have \(\underset{B\rightarrow B}{P}=I\text{.}\)
  2. The matrix \(\underset{B\rightarrow B'}{P}\) is invertible, and \(\left(\underset{B\rightarrow B'}{P}\right)^{-1}=\underset{B'\rightarrow B}{P}\text{.}\)
Proof.
  1. Let \(B=(\boldv_1, \boldv_2, \dots, \boldv_n)\text{.}\) By definition, the \(j\)-th column of \(\underset{B\rightarrow B}{P}\) is \([\boldv_j]_B\text{.}\) Since \(\boldv_j=0\boldv_1+0\boldv_2+\cdots +1\boldv_j+\cdots\text{,}\) we see that \([\boldv_j]_B=\bolde_j\text{,}\) and hence that \(\underset{B\rightarrow B}{P}=I\text{,}\) as claimed.
  2. Let \(A=\underset{B\rightarrow B'}{P}\) and \(C=\underset{B'\rightarrow B}{P}\text{.}\) For any \(\boldv\in V\) we have
    \begin{align*} CA[\boldv]_B \amp = \underset{B'\rightarrow B}{P}\, \underset{B\rightarrow B'}{P}[\boldv]_B\\ \amp =\underset{B'\rightarrow B}{P}[\boldv]_{B'} \\ \amp = [\boldv]_{B}\text{.} \end{align*}
    It follows from (3) of Theorem 5.3.2 that \(CA=\underset{B\rightarrow B}{P}\text{.}\) Since \(\underset{B\rightarrow B}{P}=I\) by (1), we conclude that \(CA=I\text{,}\) and hence \(C=A^{-1}\text{,}\) as desired.

Example 5.3.7. \(V=\R^n\text{,}\) \(B\) standard basis.

Consider the special situation where \(V=\R^n\text{,}\) \(B\) is the standard basis, and \(B'=(\boldv_1,\dots,\boldv_n)\) is some nonstandard basis. In this case we have
\begin{align*} \underset{B'\rightarrow B}{P}\amp =\begin{bmatrix}\vert\amp\vert\amp \amp \vert \\ [\boldv_1]_B\amp [\boldv_2]_B \amp \cdots\amp [\boldv_n]_B\\ \vert\amp \vert\amp \amp \vert \end{bmatrix}\\ \amp = \begin{bmatrix}\vert\amp\vert\amp \amp \vert \\ \boldv_1\amp \boldv_2\amp\cdots\amp \boldv_n\\ \vert\amp \vert\amp \amp \vert \end{bmatrix} \amp (B \text{ standard basis})\text{.} \end{align*}
In other words, \(\underset{B'\rightarrow B}{P}\) is the matrix whose \(j\)-th column is just the \(j\)-th element of \(B'\text{.}\) Thus, in this situation we can compute \(\underset{B'\rightarrow B}{P}\) by placing the elements of \(B'\) as columns of a matrix, and then use (2) of Theorem 5.3.6 to compute \(\underset{B\rightarrow B'}{P}=\left(\underset{B'\rightarrow B}{P}\right)^{-1}\text{.}\)

Example 5.3.8.

Let \(V=\R^2\text{,}\) \(B=((1,0),(0,1))\text{,}\) \(B'=\left((1,\sqrt{3}),(-\sqrt{3},1)\right)\text{.}\) Compute \(\underset{B\rightarrow B'}{P}\) and \(\underset{B'\rightarrow B}{P}\text{.}\)
Solution.
According to Example 5.3.7 we have
\begin{equation*} \underset{B'\rightarrow B}{P}=\begin{bmatrix}1\amp -\sqrt{3}\\ \sqrt{3}\amp 1 \end{bmatrix}\text{.} \end{equation*}
We then compute
\begin{equation*} \underset{B\rightarrow B'}{P}=(\underset{B'\rightarrow B}{P})^{-1}=\left(\begin{bmatrix}1\amp -\sqrt{3}\\ \sqrt{3}\amp 1 \end{bmatrix} \right)^{-1}=\frac{1}{4}\begin{bmatrix}1\amp \sqrt{3}\\ -\sqrt{3}\amp 1 \end{bmatrix}\text{.} \end{equation*}
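A quick numerical check of this computation (a Python sketch, assuming numpy is available) follows the recipe of Example 5.3.7: stack the elements of \(B'\) as columns, then invert.

import numpy as np

s3 = np.sqrt(3)
P_BpB = np.column_stack([(1, s3), (-s3, 1)])   # P_{B' -> B}: columns of B'
P_BBp = np.linalg.inv(P_BpB)                   # P_{B -> B'}
print(P_BBp * 4)   # [[ 1.     1.732...]
                   #  [-1.732... 1.   ]]  i.e. (1/4)[[1, sqrt(3)], [-sqrt(3), 1]]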

Remark 5.3.9. \(B\) standard basis of \(V\).

The observation from Example 5.3.7 applies more generally when \(B\) is the standard basis of the given vector space \(V\) and \(B'=(\boldv_1, \boldv_2, \dots, \boldv_n)\) is nonstandard. In this case computing \(\underset{B'\rightarrow B}{P}\) will be easy as the coordinate vectors \([\boldv_j]_{B}\) can be produced by inspection. See Example 5.3.10.

Example 5.3.10.

Let \(V=M_{22}\text{,}\) \(B=(E_{11}, E_{12}, E_{21}, E_{22})\) (standard basis) and \(B'=(A_1,A_2,A_3,A_4)\text{,}\) where
\begin{equation*} A_1=\begin{amatrix}[rr] 1\amp 1\\ 1\amp 1 \end{amatrix}, A_2=\begin{amatrix}[rr] 1\amp -1\\ 1\amp -1 \end{amatrix}, A_3=\begin{amatrix}[rr] 1\amp 1\\ -1\amp -1 \end{amatrix}, A_4=\begin{amatrix}[rr] -1\amp 1\\ 1\amp -1 \end{amatrix}\text{.} \end{equation*}
Compute \(\underset{B'\rightarrow B}{P}\text{.}\)
Solution.
We have
\begin{align*} \underset{B'\rightarrow B}{P}\amp = \begin{bmatrix} \vert\amp \vert\amp \vert\amp \vert\\ [A_1]_{B}\amp [A_2]_B\amp [A_3]_B\amp [A_4]_B\\ \vert\amp \vert\amp \vert\amp \vert \end{bmatrix}\\ \amp = \begin{amatrix}[rrrr] 1\amp 1\amp 1\amp -1\\ 1\amp -1\amp 1\amp 1\\ 1\amp 1\amp -1\amp 1\\ 1\amp -1\amp -1\amp -1 \end{amatrix} \text{.} \end{align*}
Here the coordinate vectors \([A_i]_B\) are easily computed by inspection since \(B\) is the standard basis.
It turns out that \(\underset{B\rightarrow B'}{P}=(\underset{B'\rightarrow B}{P})^{-1}\) is not so difficult to compute in this case since the columns \(\boldc_j\) of \(\underset{B'\rightarrow B}{P}\) satisfy
\begin{equation*} \boldc_i\cdot\boldc_j=\begin{cases} 4\amp \text{if } i=j\\ 0\amp \text{if } i\ne j \end{cases}\text{.} \end{equation*}
From this observation and Theorem 4.1.9 it is easy to see that
\begin{equation*} \underset{B\rightarrow B'}{P}=(\underset{B'\rightarrow B}{P})^{-1}=\frac{1}{4} \begin{amatrix}[rrrr] 1\amp 1\amp 1\amp 1\\ 1\amp -1\amp 1\amp -1\\ 1\amp 1\amp -1\amp -1\\ -1\amp 1\amp 1\amp -1 \end{amatrix}\text{.} \end{equation*}
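The dot product bookkeeping above is easy to confirm numerically. Here is a small sketch (Python with numpy assumed) verifying that \(M^TM=4I\) for \(M=\underset{B'\rightarrow B}{P}\text{,}\) so that \(M^{-1}=\frac{1}{4}M^T\text{.}\)

import numpy as np

M = np.array([[1,  1,  1, -1],
              [1, -1,  1,  1],
              [1,  1, -1,  1],
              [1, -1, -1, -1]])                  # P_{B' -> B}
assert np.array_equal(M.T @ M, 4 * np.eye(4, dtype=int))
P_inv = M.T / 4                                  # P_{B -> B'}, no row reduction needed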

Video example: change of basis matrix.

Figure 5.3.11. Video: change of basis matrix

Subsection 5.3.2 Working with orthonormal bases

Let \(B\) and \(B'\) be ordered bases of an \(n\)-dimensional inner product space \((V,\langle\, , \rangle)\text{.}\) Not surprisingly, if one or both of \(B\) and \(B'\) are orthonormal, our change of basis computations are simplified significantly. For example, if \(B'\) is orthonormal (or even orthogonal), the coordinate vector computations necessary to compute \(\underset{B\rightarrow B'}{P}\) can be done using the inner product formula of Theorem 5.1.9. Furthermore, as we see below, if both \(B\) and \(B'\) are orthonormal, then the columns of \(\underset{B\rightarrow B'}{P}\) form an orthonormal basis of \(\R^n\text{,}\) in which case we can compute \(\underset{B'\rightarrow B}{P}\) as
\begin{equation*} \underset{B'\rightarrow B}{P}=\left(\underset{B\rightarrow B'}{P}\right)^{-1}=\left(\underset{B\rightarrow B'}{P}\right)^T\text{.} \end{equation*}
In other words, when \(B\) and \(B'\) are orthonormal bases, the change of basis matrix is what we call an orthogonal matrix.

Definition 5.3.12. Orthogonal matrices.

An invertible \(n\times n\) matrix \(A\) is orthogonal if \(A^{-1}=A^T\text{.}\)

Remark 5.3.13.

Since for an invertible matrix \(A\) we have \((A^T)^{-1}=(A^{-1})^T\text{,}\) it follows immediately from Definition 5.3.12 that
\begin{equation*} A \text{ is orthogonal}\iff A^T \text{ is orthogonal} \iff A^{-1} \text{ is orthogonal}\text{.} \end{equation*}

Example 5.3.14.

The matrix
\begin{equation*} A=\begin{amatrix}[rr]\frac{\sqrt{2}}{2}\amp -\frac{\sqrt{2}}{2}\\ \frac{\sqrt{2}}{2} \amp \frac{\sqrt{2}}{2} \end{amatrix} \end{equation*}
is orthogonal, as one easily checks by computing \(A^TA\text{.}\) Observe that the columns of \(A\) form an orthonormal set with respect to the dot product, as do the rows. This is not a coincidence!
Theorem 5.3.15. Orthogonal matrix equivalences.

Let \(A\) be an \(n\times n\) matrix. The following statements are equivalent.
  1. \(A\) is orthogonal.
  2. The columns of \(A\) form an orthonormal basis of \(\R^n\) with respect to the dot product.
  3. The rows of \(A\) form an orthonormal basis of \(\R^n\) with respect to the dot product.
Proof.
Let \(\boldr_i\) and \(\boldc_i\) be the \(i\)-th row and column of \(A\text{,}\) respectively, for each \(1\leq i\leq n\text{.}\) From Theorem 4.1.9 we see that
\begin{align} A^TA \amp=[\boldc_i\cdot \boldc_j]_{1\leq i,j\leq n} \tag{5.3.3}\\ AA^T \amp=[\boldr_i\cdot \boldr_j]_{1\leq i,j\leq n} \text{.}\tag{5.3.4} \end{align}
We use here that rows of \(A^T\) are the columns of \(A\text{,}\) and the columns of \(A^T\) are the rows of \(A\text{.}\) From (5.3.3)–(5.3.4) it follows easily that
\begin{align*} A^{-1}=A^T \amp \iff A^TA=I \\ \amp \iff \boldc_i\cdot\boldc_j=\begin{cases} 1\amp \text{if } i=j\\ 0\amp \text{if } i\ne j\end{cases} \\ \amp \iff \{\boldc_1,\boldc_2,\dots, \boldc_n\} \text{ is orthonormal}\\ \amp \iff \{\boldc_1,\boldc_2,\dots, \boldc_n\} \text{ is an orthonormal basis}\amp (n=\dim\R^n)\text{,} \end{align*}
and
\begin{align*} A^{-1}=A^T \amp \iff AA^T=I \\ \amp \iff \boldr_i\cdot\boldr_j=\begin{cases} 1\amp \text{if } i=j\\ 0\amp \text{if } i\ne j\end{cases} \\ \amp \iff \{\boldr_1,\boldr_2,\dots, \boldr_n\} \text{ is orthonormal}\\ \amp \iff \{\boldr_1,\boldr_2,\dots, \boldr_n\} \text{ is an orthonormal basis} \amp (n=\dim\R^n)\text{.} \end{align*}
This proves \((1)\iff (2)\) and \((1)\iff (3)\text{.}\) The result follows.
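As a sanity check, the matrix \(A\) of Example 5.3.14 passes both tests. The sketch below (Python with numpy assumed) verifies that its columns and rows are orthonormal.

import numpy as np

r = np.sqrt(2) / 2
A = np.array([[r, -r],
              [r,  r]])
print(np.allclose(A.T @ A, np.eye(2)))   # True: columns orthonormal
print(np.allclose(A @ A.T, np.eye(2)))   # True: rows orthonormal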

Remark 5.3.16.

It is somewhat unfortunate that the property of being an orthogonal matrix is equivalent to your rows or columns forming an orthonormal basis. You ask: Why not simply call such matrices orthonormal matrices? My answer: tradition!
Theorem 5.3.17. Change of basis for orthonormal bases.

Let \(B\) and \(B'\) be orthonormal ordered bases of the \(n\)-dimensional inner product space \((V,\langle\, , \rangle)\text{.}\)
  1. The change of basis matrices \(\underset{B\rightarrow B'}{P}\) and \(\underset{B'\rightarrow B}{P}\) are orthogonal.
  2. We have \(\underset{B'\rightarrow B}{P}=\left(\underset{B\rightarrow B'}{P}\right)^T\text{.}\)
Proof.
  1. Let \(B=(\boldv_1, \boldv_2,\dots, \boldv_n)\text{.}\) By definition, the columns of \(\underset{B\rightarrow B'}{P}\) are the coordinate vectors \([\boldv_i]_{B'}\text{,}\) \(1\leq i\leq n\text{.}\) By Exercise 5.1.3.13, these coordinate vectors form an orthonormal subset of \(\R^n\text{;}\) since there are \(n=\dim\R^n\) of them, they form an orthonormal basis. From Theorem 5.3.15 it follows that \(\underset{B\rightarrow B'}{P}\) is orthogonal. Lastly, from Remark 5.3.13 it follows that \(\underset{B'\rightarrow B}{P}=\left(\underset{B\rightarrow B'}{P}\right)^{-1}\) is also orthogonal.
  2. Since \(\underset{B\rightarrow B'}{P}\) is orthogonal, we have
    \begin{equation*} \underset{B'\rightarrow B}{P}=\left(\underset{B\rightarrow B'}{P}\right)^{-1}=\left(\underset{B\rightarrow B'}{P}\right)^T\text{.} \end{equation*}

Example 5.3.18.

Consider the vector space \(P_1\) with inner product \(\langle p(x), q(x)\rangle=p(-1)q(-1)+p(1)q(1)\text{.}\) The ordered bases
\begin{equation*} B=\left(p_1(x)=\frac{1}{\sqrt{2}}x,p_2(x)=\frac{1}{\sqrt{2}}\right), B'=\left(q_1(x)=\frac{1}{2}(x-1), q_2(x)=\frac{1}{2}(x+1)\right) \end{equation*}
are both orthonormal with respect to this inner product. Compute \(\underset{B\rightarrow B'}{P}\) and \(\underset{B'\rightarrow B}{P}\text{.}\)
Solution.
Since \(B'\) is orthonormal, we use Theorem 5.1.9 to compute
\begin{align*} [p_1(x)]_{B'} \amp = \left(\langle p_1(x),q_1(x)\rangle, \langle p_1(x),q_2(x)\rangle\right)\\ \amp =(p_1(-1)q_1(-1)+p_1(1)q_1(1),p_1(-1)q_2(-1)+p_1(1)q_2(1))=\frac{1}{\sqrt{2}}(1,1)\\ [p_2(x)]_{B'} \amp = \left(\langle p_2(x),q_1(x)\rangle, \langle p_2(x),q_2(x)\rangle\right)\\ \amp =(p_2(-1)q_1(-1)+p_2(1)q_1(1),p_2(-1)q_2(-1)+p_2(1)q_2(1))=\frac{1}{\sqrt{2}}(-1,1)\text{.} \end{align*}
Thus
\begin{equation*} \underset{B\rightarrow B'}{P}=\frac{1}{\sqrt{2}}\begin{amatrix}[rr] 1\amp -1\\ 1\amp 1 \end{amatrix} \end{equation*}
\begin{equation*} \underset{B'\rightarrow B}{P}=(\underset{B\rightarrow B'}{P})^T=\frac{1}{\sqrt{2}}\begin{amatrix}[rr] 1\amp 1\\ -1\amp 1 \end{amatrix}\text{.} \end{equation*}
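Since the inner product here is defined by evaluation at \(x=\pm 1\text{,}\) these computations are easy to script. The sketch below (Python with numpy assumed) rebuilds \(\underset{B\rightarrow B'}{P}\) column by column using the inner product formula of Theorem 5.1.9.

import numpy as np

pts = np.array([-1.0, 1.0])              # <p, q> = p(-1)q(-1) + p(1)q(1)
ip = lambda p, q: np.sum(p(pts) * q(pts))

p1 = lambda x: x / np.sqrt(2)
p2 = lambda x: 1 / np.sqrt(2) + 0 * x    # constant polynomial, broadcast over pts
q1 = lambda x: (x - 1) / 2
q2 = lambda x: (x + 1) / 2

# Column j of P_{B -> B'} is [p_j]_{B'} = (<p_j, q_1>, <p_j, q_2>).
P = np.array([[ip(p1, q1), ip(p2, q1)],
              [ip(p1, q2), ip(p2, q2)]])
print(P * np.sqrt(2))                    # [[ 1. -1.]
                                         #  [ 1.  1.]]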
Before connecting change of basis matrices with matrix representations of linear transformations, it is worth gathering some of the different techniques for computing change of basis matrices we have discussed so far.

Subsection 5.3.3 Change of basis for transformations

We now investigate how our choice of basis affects matrix representations of linear transformations. We will only consider the special case where \(T\colon V\rightarrow V\) and we are comparing matrix representations \([T]_B\) and \([T]_{B'}\) for two different ordered bases of \(V\text{.}\)
Theorem 5.3.20. Change of basis for transformations.

Let \(T\colon V\rightarrow V\) be a linear transformation, and let \(B\) and \(B'\) be ordered bases of \(V\text{.}\) Then
\begin{equation} [T]_{B'}=\underset{B\rightarrow B'}{P}\, [T]_B\, \underset{B'\rightarrow B}{P}\tag{5.3.5} \end{equation}
and
\begin{equation} [T]_{B}=\underset{B'\rightarrow B}{P}\, [T]_{B'}\, \underset{B\rightarrow B'}{P}\text{.}\tag{5.3.6} \end{equation}
Proof.
First observe that (5.3.6) follows from (5.3.5) and (2) of Theorem 5.3.6. Next, to prove (5.3.5), it suffices by (2) of Theorem 5.2.6 to show that the matrix \(A=\underset{B\rightarrow B'}{P}\, [T]_B\, \underset{B'\rightarrow B}{P}\) satisfies
\begin{equation*} A[\boldv]_{B'}=[T(\boldv)]_{B'} \end{equation*}
for all \(\boldv\in V\text{.}\) To this end, given any \(\boldv\in V\text{,}\) we have
\begin{align*} A[\boldv]_{B'}=\underset{B\rightarrow B'}{P}\, [T]_B\, \underset{B'\rightarrow B}{P}[\boldv]_{B'} \amp= \underset{B\rightarrow B'}{P}\, [T]_B [\boldv]_B \amp (\knowl{./knowl/th_change_of_basis_coordinates.html}{\text{Theorem 5.3.2}})\\ \amp= \underset{B\rightarrow B'}{P}[T(\boldv)]_{B} \amp (\knowl{./knowl/th_matrixrep.html}{\text{Theorem 5.2.6}}, (1)) \\ \amp = [T(\boldv)]_{B'} \amp (\knowl{./knowl/th_change_of_basis_coordinates.html}{\text{Theorem 5.3.2}})\text{.} \end{align*}

Remark 5.3.21. Getting change of basis formulas correct.

It is easy to get the various details of the change of basis formula wrong. Here is a potential way to keep things organized in your mind.
  1. We wish to relate \([T]_{B'}\) and \([T]_B\) with an equation of the form \([T]_{B'}=*[T]_B*\text{,}\) where the asterisks are to be replaced with change of basis matrices or their inverses. Think of the three matrices on the right-hand side of this equation as a sequence of three things done to coordinate vectors, reading from right to left.
  2. \([T]_{B'}\) takes as inputs \(B'\)-coordinates of vectors, and outputs \(B'\)-coordinates. Thus the same should be true for \(*[T]_B*\text{.}\)
  3. Since \([T]_B\) takes as inputs \(B\)-coordinates, we must first convert from \(B'\)-coordinates to \(B\)-coordinates. So we should have \([T]_{B'}=*[T]_B\underset{B'\rightarrow B}{P}\text{.}\)
  4. Since \([T]_B\) outputs \(B\)-coordinates, we need to then convert back to \(B'\)-coordinates. Thus \([T]_{B'}=\underset{B\rightarrow B'}{P}[T]_B\underset{B'\rightarrow B}{P}\text{.}\)
  5. If desired you may replace \(\underset{B\rightarrow B'}{P}\) with \(\left(\underset{B'\rightarrow B}{P}\right)^{-1}\text{.}\)

Example 5.3.22.

Let \(T\colon P_2\rightarrow P_2\) be defined as \(T(p(x))=p(x)+2p'(x)+xp''(x)\text{.}\)
  1. Let \(B=(x^2, x, 1)\text{.}\) Compute \([T]_B\text{.}\)
  2. Let \(B'=(x^2+x+1, x^2+1, x+1)\text{.}\) Use the change of basis formula to compute \([T]_{B'}\text{.}\)
Solution.
  1. We easily compute \([T]_B=\begin{bmatrix}1\amp 0\amp 0\\ 6\amp 1\amp 0\\ 0\amp 2\amp 1 \end{bmatrix}\) using our usual recipe.
  2. We need to compute both change of basis matrices. Since \(B\) is standard we compute
    \begin{equation*} \underset{B'\rightarrow B}{P}=\begin{bmatrix}1\amp 1\amp 0\\ 1\amp 0\amp 1\\ 1\amp 1\amp 1 \end{bmatrix} \end{equation*}
    essentially by inspection. It follows that
    \begin{equation*} \underset{B\rightarrow B'}{P}=(\underset{B'\rightarrow B}{P})^{-1}=\begin{amatrix}[rrr] 1\amp 1\amp -1\\ 0\amp -1\amp 1\\ -1\amp 0\amp 1 \end{amatrix}\text{.} \end{equation*}
    Lastly, using (5.3.5) we have
    \begin{align*} [T]_{B'}\amp =\underset{B\rightarrow B'}{P}[T]_B\underset{B'\rightarrow B}{P} \\ \amp = \begin{amatrix}[rrr] 1\amp 1\amp -1\\ 0\amp -1\amp 1\\ -1\amp 0\amp 1 \end{amatrix} \begin{bmatrix}1\amp 0\amp 0\\ 6\amp 1\amp 0\\ 0\amp 2\amp 1 \end{bmatrix} \begin{bmatrix}1\amp 1\amp 0\\ 1\amp 0\amp 1\\ 1\amp 1\amp 1 \end{bmatrix}\\ \amp = \begin{amatrix}[rrr] 5\amp 6\amp -2\\ -4\amp -5\amp 2\\ 2\amp 0\amp 3 \end{amatrix}\text{.} \end{align*}
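The three-matrix product in the change of basis formula is tailor-made for a machine check. Here is a sketch (Python with numpy assumed) reproducing the computation of \([T]_{B'}\) above.

import numpy as np

T_B   = np.array([[1, 0, 0],
                  [6, 1, 0],
                  [0, 2, 1]])                  # [T]_B
P_BpB = np.array([[1, 1, 0],
                  [1, 0, 1],
                  [1, 1, 1]])                  # P_{B' -> B}
T_Bp = np.linalg.inv(P_BpB) @ T_B @ P_BpB      # formula (5.3.5)
print(np.round(T_Bp))   # [[ 5.  6. -2.]
                        #  [-4. -5.  2.]
                        #  [ 2.  0.  3.]]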
Consider the special case where \(T\colon \R^n\rightarrow \R^n\text{:}\) that is, when \(V=\R^n\) is a space of \(n\)-tuples. We know from Corollary 3.6.16 that \(T=T_A\) for a unique \(n\times n\) matrix \(A\text{.}\) Recall that \(A\) is called the standard matrix of \(T\) (3.6.17), and satisfies \(T(\boldx)=A\boldx\) for all \(\boldx\in \R^n\text{.}\) We often wish to compute \(A\text{,}\) as it provides a convenient matrix formula for \(T\text{.}\)
To compute \(A\) directly using the recipe in 3.6.16, we must compute \(T(\bolde_j)\) for each of the standard basis elements \(\bolde_j\text{.}\) For many naturally occurring transformations \(T\) this is not so easy to do. Theorem 5.3.20 provides an indirect method in such cases.
According to Theorem 5.2.3 we have \(A=[T]_B\text{:}\) i.e., the standard matrix of \(T\) is none other than the matrix representing \(T\) with respect to the standard basis. This connection allows us to compute \(A=[T]_B\) by first computing \([T]_{B'}\) for some more convenient basis \(B'\text{,}\) and then using the change of basis formula.

Procedure 5.3.23. Standard matrix via change of basis.

To compute the standard matrix \(A\) of a linear transformation \(T\colon \R^n\rightarrow \R^n\text{:}\)
  1. Choose a convenient ordered basis \(B'=(\boldv_1, \boldv_2, \dots, \boldv_n)\) of \(\R^n\) and compute \([T]_{B'}\text{.}\)
  2. Compute \(\underset{B'\rightarrow B}{P}\) by placing the elements of \(B'\) as columns of a matrix, and compute \(\underset{B\rightarrow B'}{P}=\left(\underset{B'\rightarrow B}{P}\right)^{-1}\text{.}\)
  3. Compute \(A=[T]_B=\underset{B'\rightarrow B}{P}\, [T]_{B'}\, \underset{B\rightarrow B'}{P}\) using (5.3.6).
Procedure 5.3.23 is a powerful technique for computing matrix formulas for many interesting geometric linear transformations of \(\R^n\text{:}\) e.g., rotations, reflections, and orthogonal projections. Often the very definition of such transformations will suggest a more convenient nonstandard basis \(B'\text{:}\) one that reflects the geometry involved. The next example illustrates this nicely.

Example 5.3.24. Orthogonal projection (again).

Consider \(V=\R^3\) together with the dot product. Let’s derive (once again) a matrix formula for orthogonal projection \(\operatorname{proj}_W\colon \R^3\rightarrow \R^3\text{,}\) where \(W=\{(x,y,z)\colon x+y+z=0\}\text{.}\) In other words we want to compute \(A=[\operatorname{proj}_W]_B\text{,}\) where \(B=((1,0,0), (0,1,0), (0,0,1))\) is the standard basis. We will do so indirectly by first computing \([\operatorname{proj}_W]_{B'}\) with respect to a more convenient basis: namely, \(B'=((1,-1,0),(1,1,-2), (1,1,1))\text{.}\) This is the same basis from Example 5.2.12, and was selected deliberately so that the first two vectors form a basis of \(W\text{,}\) and the third vector spans the normal line to \(W\text{.}\) As in Example 5.2.12 we then easily compute
\begin{equation*} [\operatorname{proj}_W]_{B'}=\begin{bmatrix} 1\amp 0\amp 0\\ 0\amp 1\amp 0\\ 0\amp 0\amp 0 \end{bmatrix}\text{.} \end{equation*}
Now use (5.3.6) to compute
\begin{align*} A=[\operatorname{proj}_W]_{B}\amp = \underset{B'\rightarrow B}{P}[\operatorname{proj}_W]_{B'}\underset{B\rightarrow B'}{P} \\ \amp= \begin{amatrix}[rrr] 1\amp 1\amp 1\\ -1\amp 1\amp 1\\ 0\amp -2\amp 1 \end{amatrix} \begin{bmatrix} 1\amp 0\amp 0\\ 0\amp 1\amp 0\\ 0\amp 0\amp 0 \end{bmatrix} \begin{amatrix}[rrr] \frac{1}{2}\amp -\frac{1}{2}\amp 0\\ \frac{1}{6}\amp \frac{1}{6}\amp -\frac{1}{3}\\ \frac{1}{3}\amp \frac{1}{3}\amp \frac{1}{3} \end{amatrix}\\ \amp = \frac{1}{3}\begin{amatrix}[rrr] 2\amp -1\amp -1\\ -1\amp 2\amp -1\\ -1\amp -1\amp 2 \end{amatrix}\text{.} \end{align*}
Lo and behold, we’ve discovered our matrix formula for projection onto \(W\) once again! (Compare with Example 4.3.17 and Example 5.2.12.)
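Numerically the same computation is a one-liner. This sketch (Python with numpy assumed) recovers the projection matrix from \([\operatorname{proj}_W]_{B'}\) and the basis \(B'\text{.}\)

import numpy as np

P_BpB = np.column_stack([(1, -1, 0), (1, 1, -2), (1, 1, 1)]).astype(float)
D = np.diag([1.0, 1.0, 0.0])                   # [proj_W]_{B'}
A = P_BpB @ D @ np.linalg.inv(P_BpB)           # formula (5.3.6)
print(np.round(3 * A))   # [[ 2. -1. -1.]
                         #  [-1.  2. -1.]
                         #  [-1. -1.  2.]]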

Video example: change of basis for transformations.

Figure 5.3.25. Video: change of basis for transformations

Video example: change of basis and reflection.

Figure 5.3.26. Video: computing reflection via change of basis

Subsection 5.3.4 Similarity and the holy commutative tent of linear algebra

Theorem 5.3.20 supplies an algebraic answer to the question: What is the relation between two matrix representations \(A=[T]_B\) and \(A'=[T]_{B'}\text{?}\) Letting \(P=\underset{B'\rightarrow B}{P}\text{,}\) equation (5.3.6) becomes \(A'=P^{-1}AP\text{.}\) Matrices satisfying such a relation are said to be similar.

Definition 5.3.27.

Matrices \(A, A'\in M_{nn}\) are similar if there is an invertible matrix \(P\) such that \(A'=P^{-1}AP\text{.}\)
So any two matrix representations of a linear transformation \(T\colon V\rightarrow V\) are similar in the technical sense of Definition 5.3.27. In fact, a converse of sorts is also true, as articulated in the theorem below.
Theorem 5.3.28. Similarity and matrix representations.

Matrices \(A, A'\in M_{nn}\) are similar if and only if there is a linear transformation \(T\colon \R^n\rightarrow \R^n\) and ordered bases \(B\text{,}\) \(B'\) of \(\R^n\) such that \(A=[T]_B\) and \(A'=[T]_{B'}\text{.}\)
Proof.
The discussion above shows that if \(A=[T]_B\) and \(A'=[T]_{B'}\text{,}\) then \(A'=P^{-1}AP\text{,}\) where \(P=\underset{B'\rightarrow B}{P}\text{;}\) thus \(A\) and \(A'\) are similar in this case.
Now assume that \(A\) and \(A'\) are similar. By definition this means there is an invertible matrix \(P\) such that \(A'=P^{-1}AP\text{.}\) Define \(T\colon \R^n\rightarrow \R^n\) as the matrix transformation \(T=T_A\text{.}\) According to Theorem 5.2.3 we have \(A=[T]_B\) where \(B\) is the standard basis of \(\R^n\text{.}\) Next, letting \(B'\) be the ordered basis whose \(j\)-th element is the \(j\)-th column of \(P\text{,}\) we have \(P=\underset{B'\rightarrow B}{P}\) (Example 5.3.7), and hence
\begin{equation*} A'=P^{-1}AP=\underset{B\rightarrow B'}{P}\, [T]_B\, \underset{B'\rightarrow B}{P}=[T]_{B'}\text{,} \end{equation*}
as desired.
We will see in Section 5.4 that similar matrices are indeed similar algebraically speaking: i.e., they share many of the same properties. Theorem 5.3.28 provides the theoretical foundation to understand why this should be so: if \(A\) and \(A'\) are similar, then they are two matrix representations of a common linear transformation \(T\text{;}\) their many shared properties are simply inherited from the single overlying linear transformation that they both represent! This circle of ideas is neatly encompassed by Figure 5.3.29.
Figure 5.3.29. The holy commutative tent of linear algebra. Here we have \(P=\underset{B'\rightarrow B}{P}\) and \(A'=P^{-1}AP\text{.}\)
Perhaps a little exegesis is in order here. Think of the map \(T\colon V\rightarrow V\) as a linear transformation up in abstract heaven; and think of the two matrices \(A=[T]_B\) and \(A'=[T]_{B'}\) as two earthly shadows of \(T\text{.}\) OK, this gets at the holy bit somewhat, but why commutative? Each face of the tent is a commutative diagram, as we now explain.

Slanted sides of the tent.

The commutativity of the two slanted sides of the tent is a consequence of Theorem 5.2.9:
\begin{align*} [T]_B[\boldv]_B \amp = [T(\boldv)]_B \amp [T]_{B'}[\boldv]_{B'}\amp =[T(\boldv)]_{B'}\text{.} \end{align*}

Triangular ends of the tent.

Let \(P=\underset{B'\rightarrow B}{P}\text{,}\) so that \(P^{-1}=\underset{B\rightarrow B'}{P}\text{.}\) The commutativity of the two triangular ends of the tent are consequences of Theorem 5.3.2:
\begin{align*} P[\boldv]_{B'} \amp=[\boldv]_B \amp P^{-1}[\boldv]_B\amp=[\boldv]_{B'} \text{.} \end{align*}

Base of tent.

Lastly the commutativity of the base of the tent is a consequence of Theorem 5.3.20:
\begin{equation*} [T]_{B'}=\underset{B\rightarrow B'}{P}[T]_B\underset{B'\rightarrow B}{P}, \end{equation*}
or equivalently,
\begin{equation*} A'=P^{-1}AP\text{.} \end{equation*}
In summary, the holy commutative tent conveys a close connection between the three maps
\begin{equation*} \R^n\xrightarrow{A}\R^n, \R^n\xrightarrow{A'}\R^n, V\xrightarrow{T}V\text{.} \end{equation*}
Since the base of the tent is commutative, and since the maps given by \(P\) and \(P^{-1}\) are invertible, we can translate back and forth between the matrices \(A\) and \(A'\text{.}\) Furthermore, since the two slanted sides of the tent are commutative, and since the coordinate vector transformations are invertible, we can translate up and down between our two matrix representations \(A\) and \(A'\) and the overlying linear transformation \(T\text{.}\) There is one true \(T\text{!}\)

Exercises 5.3.5 Exercises

Change of basis matrix.

In each exercise a vector space \(V\) is given along with two ordered bases \(B\) and \(B'\text{.}\) Compute \(\underset{B\rightarrow B'}{P}\) and \(\underset{B'\rightarrow B}{P}\text{.}\)
1.
\(V=\R^2\text{,}\) \(B=(\bolde_1, \bolde_2)\text{,}\) \(B'=\left((3,4),(1,1)\right)\)
2.
\(V=\R^2\text{,}\) \(B=\left((1,2), (2,1)\right)\text{,}\) \(B'=\left((-1,1),(1,1)\right)\)
3.
\(V=\R^3\text{,}\) \(B=(\bolde_1, \bolde_2, \bolde_3)\text{,}\) \(B'=\left( (1,0,0),(1,1,0),(1,1,1) \right)\)
4.
\(V=P_2\text{,}\) \(B=(x^2, x, 1)\text{,}\) \(B'=((x-2)^2, (x-2), 1)\)

5.

Let \(V=\R^3\text{,}\) \(B=(\bolde_1, \bolde_2, \bolde_3)\text{,}\) \(B'=\left( (1,0,0),(1,1,0),(1,1,1) \right)\text{,}\) as in Exercise 5.3.5.3.
  1. Compute \([(1,2,3)]_{B'}\) directly using Definition 5.1.4.
  2. Compute \([(1,2,3)]_{B'}\) using the change of basis matrix \(\underset{B\rightarrow B'}{P}\) and (5.3.1).

6.

Let \(V=P_2\text{,}\) \(B=(x^2, x, 1)\text{,}\) \(B'=((x-2)^2, (x-2), 1)\text{,}\) as in Exercise 5.3.5.4.
  1. Compute \([x^2-x+4]_{B'}\) directly using Definition 5.1.4.
  2. Compute \([x^2-x+4]_{B'}\) using the change of basis matrix \(\underset{B\rightarrow B'}{P}\) and (5.3.1).

7.

Let \(B\) be the standard basis of \(\R^2\text{.}\) Find the ordered basis \(B'\) for which the change of basis matrix \(\underset{B\rightarrow B'}{P}\) is given by
\begin{equation*} \underset{B\rightarrow B'}{P}=\begin{amatrix}[rr]5\amp 1\\ -3\amp 2 \end{amatrix}\text{.} \end{equation*}

8.

Let \(B\) be the standard basis of \(P_2\text{.}\) Find the ordered basis \(B'\) for which the change of basis matrix \(\underset{B'\rightarrow B}{P}\) is given by
\begin{equation*} \underset{B'\rightarrow B}{P}=\begin{amatrix}[rrr]1\amp 1\amp 2\\ -3\amp 2\amp 0\\ 0\amp -1\amp 1 \end{amatrix}\text{.} \end{equation*}

Orthonormal change of basis.

In each exercise an inner product space \((V,\langle\, , \rangle)\) is given along with two orthonormal ordered bases \(B\) and \(B'\text{.}\) Compute \(\underset{B\rightarrow B'}{P}\) and \(\underset{B'\rightarrow B}{P}\text{.}\) Use Theorem 5.3.17 to reduce your workload.
9.
\(V=\R^2\) with the dot product, \(B=\left((\sqrt{2}/2, \sqrt{2}/2), (-\sqrt{2}/2, \sqrt{2}/2)\right)\text{,}\) \(B'=\left( (\sqrt{3}/2, -1/2), (1/2,\sqrt{3}/2)\right)\)
10.
\(V=\R^4\) with the dot product, \(B=(\bolde_1, \bolde_2, \bolde_3, \bolde_4)\text{,}\) \(B'=\left(\frac{1}{2}(1,1,1,1), \frac{1}{2}(1,-1,1,-1), \frac{1}{2}(1,1,-1,-1), \frac{1}{2}(1,-1,-1,1)\right)\)

11.

Suppose \(B=(\boldv_1, \boldv_2)\) and \(B'=(\boldw_1, \boldw_2)\) are two bases for the space \(V\) related by the change of basis matrix
\begin{equation*} \underset{B\rightarrow B'}{P}=\begin{amatrix}[rr]1\amp -2 \\ 3\amp 1 \end{amatrix}\text{.} \end{equation*}
  1. Let \(\boldv=-3\boldv_1+2\boldv_2\text{.}\) Compute \([\boldv]_B\) and \([\boldv]_{B'}\text{.}\)
  2. Let \(\boldw=\boldw_1+2\boldw_2\text{.}\) Compute \([\boldw]_B\) and \([\boldw]_{B'}\text{.}\)

12.

Let \(B\text{,}\) \(B'\text{,}\) and \(B''\) be three ordered bases of the vector space \(V\text{.}\)
  1. Show that
    \begin{equation} \underset{B\rightarrow B''}{P}=\underset{B'\rightarrow B''}{P}\underset{B\rightarrow B'}{P}\text{.}\tag{5.3.7} \end{equation}
    To do so, set \(A=\underset{B'\rightarrow B''}{P}\) and \(B=\underset{B\rightarrow B'}{P}\) and show that the matrix \(AB\) satisfies the defining property of \(\underset{B\rightarrow B''}{P}\text{:}\) i.e.,
    \begin{equation*} AB[\boldv]_{B}=[\boldv]_{B''} \end{equation*}
    for all \(\boldv\in V\text{.}\)
  2. Using (a), show that
    \begin{equation*} \underset{B'\rightarrow B''}{P}=\underset{B\rightarrow B''}{P}\underset{B'\rightarrow B}{P}\text{.} \end{equation*}

Change of basis methods.

In each exercise a vector space \(V\) is given along with two ordered bases \(B'\) and \(B''\text{.}\)
  1. Compute \(\underset{B'\rightarrow B''}{P}\) directly using Definition 5.3.1.
  2. Let \(B\) be the standard basis for \(V\text{.}\) Compute \(\underset{B'\rightarrow B''}{P}\) using formula (5.3.7) from Exercise 5.3.5.12.
13.
\(V=\R^2\text{,}\) \(B'=\left((1,1),(1,-1)\right)\text{,}\) \(B''=\left((2,1),(-1,2)\right)\)
14.
\(V=P_2\text{,}\) \(B'=(x^2+1, x+1, 1)\text{,}\) \(B''=(x^2+x+1, x^2+x, x^2)\)

15. Determinant of orthogonal matrices.

Prove: if \(Q\in M_{nn}\) is orthogonal, then \(\det Q=\pm 1\text{.}\)

16.

Let \(Q\) be an orthogonal \(2\times 2\) matrix. Prove the following statements.
  1. If \(\det Q=1\text{,}\) then \(Q\) is a rotation matrix: i.e., there is a \(\theta\in [0,2\pi]\) such that
    \begin{equation*} Q=\begin{amatrix}[rr] \cos\theta\amp -\sin\theta\\ \sin\theta\amp \cos\theta \end{amatrix}\text{.} \end{equation*}
  2. If \(\det Q=-1\text{,}\) then \(Q\) is a reflection matrix: i.e., there is a \(\theta\in [0,\pi]\) such that
    \begin{equation*} Q=\begin{amatrix}[rr] \cos 2\theta\amp \sin 2\theta\\ \sin 2\theta \amp -\cos 2\theta \end{amatrix}\text{.} \end{equation*}
It is easy to verify that rotation and reflection matrices are orthogonal. We now know that conversely any \(2\times 2\) orthogonal matrix is either a rotation or a reflection matrix.

17.

Let \(T\colon \R^3\rightarrow \R^3\) be the linear transformation defined as \(T(x,y,z)=(x+2y+z, -y, x+7z)\text{.}\) Let \(B\) be the standard basis of \(\R^3\text{,}\) and let \(B'=\left((1,0,0), (1,1,0), (1,1,1)\right)\text{.}\)
  1. Compute \([T]_B\text{.}\)
  2. Compute \([T]_{B'}\) using Theorem 5.3.20.

18.

Let \(T\colon P_1\rightarrow P_1\) be the linear transformation defined as \(T(p(x))=(x+1)p'(x)\text{.}\) Let \(B\) be the standard basis of \(P_1\text{,}\) and let \(B'=\left(2x+1, x-1\right)\text{.}\)
  1. Compute \([T]_B\text{.}\)
  2. Compute \([T]_{B'}\) using Theorem 5.3.20.

19. Reflection in \(\R^2\).

Let \(\boldv=(a,b)\in \R^2\) be nonzero and define \(\ell=\Span\{\boldv\}\text{,}\) the line passing through the origin with direction vector \(\boldv\text{.}\) Let \(T\colon \R^2\rightarrow \R^2\) be reflection through \(\ell\text{.}\) (See Definition 3.2.16.) In this exercise we will use a change of basis argument to find a formula for the standard matrix of \(T\text{:}\) i.e., the matrix \(A\) satisfying \(T(\boldx)=A\boldx\) for all \(\boldx\in \R^2\text{.}\) Our answer will be expressed in terms of \(a\) and \(b\text{.}\)
  1. Pick an orthonormal basis \(B'\) with one vector pointing along \(\ell\) and compute \([T]_{B'}\text{.}\)
  2. Let \(B\) be the standard basis of \(\R^2\text{.}\) Use Theorem 5.3.20 to compute \(A=[T]_{B}\text{.}\) (Use Theorem 5.3.17 to reduce your workload.)
  3. How do we know that \(A\) is the standard matrix of \(T\text{?}\)
  4. Explain why your matrix \(A\) for \(T\text{,}\) expressed in terms of \(a\) and \(b\text{,}\) agrees with the matrix formula provided in Theorem 3.2.17, which is expressed in terms of the angle \(\alpha\) that \(\ell\) makes with the \(x\)-axis.