
Section 2.2 Matrix algebra

The last section was devoted to what might be called the arithmetic of matrices. We learned the basic operations of adding, multiplying, scaling, and transposing matrices. In this section we tackle the algebra of matrices. We will investigate the properties enjoyed (and not enjoyed) by our matrix operations, and will show how to use these operations to solve matrix equations.
As you learn about matrix algebra, always keep in mind your old friend, real number algebra. For the most part these two algebraic systems closely resemble one another, as Theorem 2.2.1 below makes clear. However, there are two crucial points where they differ (see Theorem 2.2.8): two important properties of real number algebra that do not hold for matrices. The consequences of these two simple aberrations are far-reaching and imbue matrix algebra with a fascinating richness in comparison to real number algebra.
How does one actually prove one of these properties? These are all matrix equalities of the form \(X=Y\text{,}\) so according to the matrix equality definition we must show (1) that the matrices \(X\) and \(Y\) have the same dimension, and (2) that \((X)_{ij}=(Y)_{ij}\) for all \((i,j)\text{.}\) The proof below illustrates this technique for the multiplication associative law of Theorem 2.2.1.
We prove only the multiplication associative law. Let \(A=[a_{ij}]_{m\times r}\text{,}\) \(B=[b_{ij}]_{r\times s}\text{,}\) \(C=[c_{ij}]_{s\times n}\text{.}\) To show
\begin{equation*} A(BC)=(AB)C\text{,} \end{equation*}
we must show (1) that \(A(BC)\) and \((AB)C\) have the same dimension, and (2) that
\begin{equation*} [A(BC)]_{ij}=[(AB)C]_{ij} \end{equation*}
for all possible \((i,j)\text{.}\)
(1) The usual observation about “inner” and “outer” dimensions shows that both \(A(BC)\) and \((AB)C\) have dimension \(m\times n\text{.}\)
(2) Given any \((i,j)\) with \(1\leq i\leq m\) and \(1\leq j\leq n\text{,}\) we have:
\begin{align*} [A(BC)]_{ij}\amp =\sum_{\ell=1}^ra_{i\ell}[BC]_{\ell j} \amp (\knowl{./knowl/d_matrix_mult.html}{\text{Definition 2.1.17}})\\ \amp =\sum_{\ell=1}^ra_{i\ell}\left(\sum_{k=1}^sb_{\ell k}c_{kj}\right) \amp (\knowl{./knowl/d_matrix_mult.html}{\text{Definition 2.1.17}})\\ \amp =\sum_{\ell=1}^r\sum_{k=1}^sa_{i\ell}(b_{\ell k}c_{kj}) \amp \text{(real dist.)}\\ \amp =\sum_{k=1}^s\sum_{\ell=1}^r(a_{i\ell}b_{\ell k})c_{kj} \amp \text{(real comm., real assoc.)}\\ \amp =\sum_{k=1}^s\left(\sum_{\ell=1}^ra_{i\ell}b_{\ell k}\right)c_{kj} \amp \text{(real dist.)}\\ \amp =\sum_{k=1}^s[AB]_{ik}c_{kj}=\bigl[(AB)C\bigr]_{ij} \amp (\knowl{./knowl/d_matrix_mult.html}{\text{Definition 2.1.17}})\text{.} \end{align*}
This proves that all entries of the two matrices are equal, and hence \(A(BC)=(AB)C\text{.}\)
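If you like to experiment, identities of this kind are easy to illustrate numerically. Here is a minimal sketch, assuming NumPy is available; a computation on random matrices can only illustrate the associative law, not prove it, but it is a useful sanity check.

```python
# A numerical sanity check of the associative law, assuming NumPy is
# available; this illustrates the theorem, it does not prove it.
import numpy as np

rng = np.random.default_rng(0)

# Dimensions chosen to match the proof: A is m x r, B is r x s, C is s x n.
m, r, s, n = 2, 3, 4, 5
A = rng.integers(-5, 5, size=(m, r))
B = rng.integers(-5, 5, size=(r, s))
C = rng.integers(-5, 5, size=(s, n))

# Both products are m x n, and their entries agree.
print(np.array_equal(A @ (B @ C), (A @ B) @ C))  # True
```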
As in real number algebra, we can identify some special matrices that act as additive identities and multiplicative identities; and every matrix has an additive inverse. What we mean here is spelled out in detail in Theorem 2.2.4.

Definition 2.2.2. Additive inverse of a matrix.

Given an \(m\times n\) matrix \(A=[a_{ij}]\text{,}\) its additive inverse \(-A\) is defined as
\begin{equation*} -A=(-1)A=[-a_{ij}]\text{.} \end{equation*}

Definition 2.2.3. Identity matrix.

The \(n\times n\) identity matrix is the square \(n\times n\) matrix
\begin{equation*} I_n=\begin{bmatrix}1\amp 0\amp \dots\amp 0\\ 0\amp 1 \amp \dots \amp 0\\ \vdots \amp \vdots \amp \ddots \amp \vdots\\ 0\amp 0\amp \dots \amp 1\end{bmatrix} \end{equation*}
with ones along the diagonal and zeros everywhere else. In other words, for all \(1\leq i\leq n\) and \(1\leq j\leq n\text{,}\) we have
\begin{equation*} (I_n)_{ij}=\begin{cases} 1 \amp \text{if } i=j \\ 0\amp \text{if } i\ne j \end{cases}\text{.} \end{equation*}
When the size \(n\) of the identity matrix is not important, we will often denote it simply as \(I\text{.}\)
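The identity matrix earns its name by acting as a multiplicative identity, as Theorem 2.2.4 records. A minimal sketch, assuming NumPy is available:

```python
# A small illustration of the identity matrix, assuming NumPy is available.
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # a 2 x 3 matrix

I2 = np.eye(2, dtype=int)   # the 2 x 2 identity matrix
I3 = np.eye(3, dtype=int)   # the 3 x 3 identity matrix

# I_2 A = A and A I_3 = A: multiplying by I on either side changes nothing.
print(np.array_equal(I2 @ A, A))  # True
print(np.array_equal(A @ I3, A))  # True
```

Note that the size of the identity must match the side on which it multiplies: for a \(2\times 3\) matrix \(A\) we have \(I_2A=A\) and \(AI_3=A\text{.}\)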
Left as an exercise.
As simple as this claim might seem, remember that we are dealing with a completely new algebraic system here. We will prove both implications of the “if and only if” statement separately.

Proof: \(A+B=A+C\implies B=C\).

We prove this via a chain of implications:
\begin{align*} A+B=A+C\amp \implies -A+(A+B)=-A+(A+C) \\ \amp\implies (-A+A)+B=(-A+A)+C\amp (\knowl{./knowl/th_matrixadd_assoc.html}{\text{Addition associative law}}) \\ \amp\implies \boldzero_{m\times n}+B=\boldzero_{m\times n}+C \amp (\knowl{./knowl/th_matrix_add_inverse.html}{\text{Item 2}}) \\ \amp \implies B=C \amp (\knowl{./knowl/th_matrix_add_ident.html}{\text{Item 1}}) \text{.} \end{align*}

Proof: \(B=C\implies A+B=A+C\).

This direction is obvious: if \(B\) and \(C\) are equal matrices, then they remain equal when we add \(A\) to each of them.

Remark 2.2.6.

The algebraic importance of Corollary 2.2.5 is that we can perform additive cancellation in matrix equations just as we do in real number algebra. For example, we can solve the matrix equation \(A+B=3A\) for \(B\) as follows:
\begin{align*} A+B\amp = 3A\\ A+B\amp = (1+2)A\\ A+B\amp = A+2A\\ B\amp = 2A \amp (\knowl{./knowl/c_matrix_additive_canc.html}{\text{Corollary 2.2.5}})\text{.} \end{align*}
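The same bookkeeping is easy to check numerically. A minimal sketch, assuming NumPy is available:

```python
# A numerical check of the worked example above, assuming NumPy:
# if A + B = 3A, then B should equal 2A.
import numpy as np

rng = np.random.default_rng(1)
A = rng.integers(-5, 5, size=(2, 2))

B = 3 * A - A                    # solve A + B = 3A for B by cancelling A
print(np.array_equal(B, 2 * A))  # True
```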

Warning 2.2.7.

Though we can perform additive cancellation in matrix algebra, we cannot always perform multiplicative cancellation. For example, consider the matrices
\begin{equation*} A=\begin{bmatrix}1\amp 1\\ 1\amp 1\end{bmatrix}, B=\begin{bmatrix}1\amp 0\\ 0\amp 0 \end{bmatrix}, C=\begin{bmatrix}0\amp 0\\ 1\amp 0 \end{bmatrix}\text{.} \end{equation*}
Check for yourself that \(AB=AC\text{,}\) and yet \(B\ne C\text{.}\) In other words, we cannot always “cancel” \(A\) from the matrix equation \(AB=AC\text{.}\)
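If you would rather have the computer check for you, here is a minimal sketch, assuming NumPy is available:

```python
# Verifying the warning's counterexample, assuming NumPy: AB = AC even
# though B != C, so "cancelling" A would be invalid.
import numpy as np

A = np.array([[1, 1], [1, 1]])
B = np.array([[1, 0], [0, 0]])
C = np.array([[0, 0], [1, 0]])

print(np.array_equal(A @ B, A @ C))  # True: the products agree
print(np.array_equal(B, C))          # False: yet B and C differ
```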
The example in our warning above is but one instance of the general failure of the principle of multiplicative cancellation in matrix algebra. This in turn is a consequence of the following theorem, which identifies the two crucial places where matrix algebra differs significantly from real number algebra.
This is a good place to point out that to prove an identity does not hold, it suffices to provide a single counterexample to that effect. We do so for each failed identity of Theorem 2.2.8 in turn. There is no significance to the particular counterexamples chosen here, and indeed there are infinitely many counterexamples to choose from in both cases.
  1. We have
    \begin{align*} \begin{bmatrix}1\amp 0\\ 0\amp 0 \end{bmatrix} \begin{bmatrix}0\amp 1\\ 0\amp 0 \end{bmatrix}\amp = \begin{bmatrix} 0\amp 1 \\ 0 \amp 0\end{bmatrix}\\ \begin{bmatrix}0\amp 1\\ 0\amp 0 \end{bmatrix} \begin{bmatrix}1\amp 0\\ 0\amp 0 \end{bmatrix} \amp = \begin{bmatrix} 0\amp 0\\ 0\amp 0 \end{bmatrix} \end{align*}
    and thus
    \begin{equation*} \begin{bmatrix}1\amp 0\\ 0\amp 0 \end{bmatrix} \begin{bmatrix}0\amp 1\\ 0\amp 0 \end{bmatrix} \ne \begin{bmatrix}0\amp 1\\ 0\amp 0 \end{bmatrix} \begin{bmatrix}1\amp 0\\ 0\amp 0 \end{bmatrix}\text{.} \end{equation*}
  2. Observe that
    \begin{equation*} \begin{bmatrix}0\amp 1\\ 0\amp 0 \end{bmatrix} \begin{bmatrix}1\amp 0\\ 0\amp 0 \end{bmatrix} =\begin{bmatrix}0\amp 0\\ 0\amp 0 \end{bmatrix} \text{.} \end{equation*}
    This is an example of two nonzero matrices whose product is the zero matrix.
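Both counterexamples are quick to verify by hand, or with a short computation. A minimal sketch, assuming NumPy is available:

```python
# Checking both counterexamples at once, assuming NumPy: these two
# matrices do not commute, and one of their products is the zero matrix.
import numpy as np

E = np.array([[1, 0], [0, 0]])
F = np.array([[0, 1], [0, 0]])

print(E @ F)                         # [[0 1], [0 0]]
print(F @ E)                         # [[0 0], [0 0]]: nonzero times nonzero is zero
print(np.array_equal(E @ F, F @ E))  # False: EF != FE
```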
An important consequence of the abnormality (2.2.1) is that matrix algebra does not enjoy the property of multiplicative cancellation.
Again, we need only provide explicit counterexamples for each statement.
  1. Let \(A=\begin{bmatrix}1\amp 1\amp 0\\ 0\amp 0\amp 0 \end{bmatrix}\text{,}\) \(B=\begin{bmatrix}1\amp 0\\ 0\amp 0\\ 0\amp 0 \end{bmatrix}\text{,}\) \(C=\begin{bmatrix}0\amp 0\\ 1\amp 0\\ 0\amp 0 \end{bmatrix}\text{.}\) Verify for yourself that
\begin{equation*} \begin{bmatrix}1\amp 1\amp 0\\ 0\amp 0\amp 0 \end{bmatrix} \begin{bmatrix}1\amp 0\\ 0\amp 0\\ 0\amp 0 \end{bmatrix} = \begin{bmatrix}1\amp 1\amp 0\\ 0\amp 0\amp 0 \end{bmatrix} \begin{bmatrix}0\amp 0\\ 1\amp 0\\ 0\amp 0 \end{bmatrix}= \begin{bmatrix} 1\amp 0\\ 0\amp 0 \end{bmatrix}\text{.} \end{equation*}
    Thus \(AB=AC\text{,}\) but clearly \(B\ne C\text{.}\)
  2. Let \(B=\begin{bmatrix}2\amp 0\\ 0\amp 0 \end{bmatrix}\text{,}\) \(C=\begin{bmatrix}1\amp 1\\ 0\amp 0 \end{bmatrix}\text{,}\) \(D=\begin{bmatrix}1\amp 1\amp 1\\ 1\amp 1\amp 1 \end{bmatrix}\text{.}\) We have
\begin{equation*} \begin{bmatrix}2\amp 0\\ 0\amp 0 \end{bmatrix} \begin{bmatrix}1\amp 1\amp 1\\ 1\amp 1\amp 1 \end{bmatrix} = \begin{bmatrix}1\amp 1\\ 0\amp 0 \end{bmatrix} \begin{bmatrix}1\amp 1\amp 1\\ 1\amp 1\amp 1 \end{bmatrix}= \begin{bmatrix} 2\amp 2\amp 2\\ 0\amp 0\amp 0 \end{bmatrix}\text{.} \end{equation*}
    Thus \(BD=CD\text{,}\) but \(B\ne C\text{.}\)
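Both failures of cancellation can be confirmed with a short computation. A minimal sketch, assuming NumPy is available:

```python
# Verifying both counterexamples above, assuming NumPy.
import numpy as np

# (1) Left cancellation fails: AB = AC but B != C.
A = np.array([[1, 1, 0], [0, 0, 0]])
B = np.array([[1, 0], [0, 0], [0, 0]])
C = np.array([[0, 0], [1, 0], [0, 0]])
print(np.array_equal(A @ B, A @ C), np.array_equal(B, C))  # True False

# (2) Right cancellation fails: BD = CD but B != C.
B = np.array([[2, 0], [0, 0]])
C = np.array([[1, 1], [0, 0]])
D = np.array([[1, 1, 1], [1, 1, 1]])
print(np.array_equal(B @ D, C @ D), np.array_equal(B, C))  # True False
```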

Remark 2.2.10.

Mark well this important abnormality of matrix algebra. Confronted with a real number equation of the form \(ab=ac\text{,}\) we have a deeply ingrained impulse to declare that either \(a=0\) or \(b=c\text{.}\) (If we’re sloppy we may forget about that first possibility.) The corresponding maneuver for the matrix equation \(AB=AC\) is simply not available to us, unless we know something more about \(A\text{.}\)
We end our foray into matrix algebra with some properties articulating how matrix transposition interacts with matrix addition, multiplication and scalar multiplication.
We prove only the first statement. First observe that if \(A\) is \(m\times n\text{,}\) then so are \(B\) and \(A+B\text{.}\) Then \((A+B)^T\) is \(n\times m\) by Definition 2.1.29. Similarly, we see that \(A^T+B^T\) is \(n\times m\text{.}\)
Next, given any \((i,j)\) with \(1\leq i\leq n\text{,}\) \(1\leq j\leq m\text{,}\) we have
\begin{align*} \left((A+B)^T\right)_{ij}\amp =(A+B)_{ji} \amp (\knowl{./knowl/d_transpose.html}{\text{Definition 2.1.29}})\\ \amp = A_{ji}+B_{ji} \amp (\knowl{./knowl/d_matrix_add_subtract.html}{\text{Definition 2.1.9}})\\ \amp =(A^T)_{ij}+(B^T)_{ij} \amp (\knowl{./knowl/d_transpose.html}{\text{Definition 2.1.29}})\\ \amp =(A^T+B^T)_{ij} \amp (\knowl{./knowl/d_matrix_add_subtract.html}{\text{Definition 2.1.9}})\text{.} \end{align*}
Since the \(ij\)-entries of both matrices are equal for each \((i,j)\text{,}\) it follows that \((A+B)^T=A^T+B^T\text{.}\)
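The remaining transpose properties can be illustrated the same way. A minimal sketch, assuming NumPy is available; note in particular how the order of the factors reverses when transposing a product.

```python
# A numerical illustration of the transpose properties, assuming NumPy.
import numpy as np

rng = np.random.default_rng(2)
A = rng.integers(-5, 5, size=(2, 3))
B = rng.integers(-5, 5, size=(2, 3))
C = rng.integers(-5, 5, size=(3, 4))

print(np.array_equal((A + B).T, A.T + B.T))  # True: (A+B)^T = A^T + B^T
print(np.array_equal((A @ C).T, C.T @ A.T))  # True: (AC)^T = C^T A^T, order reversed
```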

Video examples: proving matrix equalities.

Figure 2.2.12. Video: matrix multiplication is associative
Figure 2.2.13. Video: transpose property

Exercises

WeBWorK Exercises

1.
Determine which of the following statements are true and which are false.
  1. If \(\displaystyle A\) and \(\displaystyle B\) are matrices with sizes such that \(\displaystyle AB\) is square, then \(\displaystyle A\) and \(\displaystyle B\) must be square.
  2. If \(\displaystyle A\) and \(\displaystyle B\) are square matrices of the same size, then \(\displaystyle (AB)^T = B^TA^T\text{.}\)
  3. If \(\displaystyle A \in \mathcal{M}_{n,m}\) and \(\displaystyle \mathbf{b} \in \mathbb{R}^m\) is a column vector, then \(\displaystyle A\mathbf{b}\) is a linear combination of the columns of \(\displaystyle A\text{.}\)
  4. If \(\displaystyle A\) is a square matrix for which \(\displaystyle A^2 = I\text{,}\) then \(\displaystyle A = I\) or \(\displaystyle A = -I\text{.}\)
  5. If \(\displaystyle A\) and \(\displaystyle B\) are matrices such that \(\displaystyle AB = O\) and \(\displaystyle A \neq O\text{,}\) then \(\displaystyle B = O\text{.}\)
3.
Let
\begin{equation*} A=\left(\begin{array}{cc} 2 \amp 3 \cr 3 \amp 9 \end{array}\right) , B=\left(\begin{array}{cc} -2 \amp 7 \cr -4 \amp 9 \end{array}\right) , C=\left(\begin{array}{cc} 5 \amp 4 \cr 7 \amp 6 \end{array}\right). \end{equation*}
Then
\begin{equation*} A + B^T=\left(\begin{array}{cc} a_{11} \amp a_{12} \cr a_{21} \amp a_{22} \end{array}\right) \end{equation*}
where \(a_{11}=\) , \(a_{12}=\) , \(a_{21}=\) , \(a_{22}=\) ,
and
\begin{equation*} A - B + C^T=\left(\begin{array}{cc} c_{11} \amp c_{12} \cr c_{21} \amp c_{22} \end{array}\right) \end{equation*}
where \(c_{11}=\) , \(c_{12}=\) , \(c_{21}=\) , \(c_{22}=\) .
Answer.
\(a_{11}=0\text{,}\) \(a_{12}=-1\text{,}\) \(a_{21}=10\text{,}\) \(a_{22}=18\text{;}\) \(c_{11}=9\text{,}\) \(c_{12}=3\text{,}\) \(c_{21}=11\text{,}\) \(c_{22}=6\text{.}\)

Written Exercises

4.
In this exercise you will complete the proof of Theorem 2.2.1.
  1. Prove Item i.
  2. Prove Item ii.
  3. Prove Item iv.
  4. Prove Item v.
  5. Prove Item vi.
  6. Prove Item vii.
  7. Prove Item viii.
  8. Prove Item ix.
Solution.
  1. We need both \(A\) and \(B\) to be \(m\times n \) for the expression to make sense. It is easy to see that \(A+B\) and \(B+A\) are both \(m\times n\) matrices. We must show \((A+B)_{ij}=(B+A)_{ij}\) for all \(1\leq i\leq m\text{,}\) \(1\leq j\leq n\text{.}\) We have
\begin{align*} (A+B)_{ij} \amp =(A)_{ij}+(B)_{ij}\amp (\knowl{./knowl/d_matrix_add_subtract.html}{\text{Definition 2.1.9}}) \\ \amp =(B)_{ij}+(A)_{ij}\amp \text{(real addition is comm.)} \\ \amp =(B+A)_{ij} \amp (\knowl{./knowl/d_matrix_add_subtract.html}{\text{Definition 2.1.9}})\text{.} \end{align*}
7.
Let \(A\) be an \(n\times n\) matrix. We define its square \(A^2\) as \(A^2=AA\text{.}\)
  1. In real number algebra we know that \(a^2=0\implies a=0\text{.}\) By contrast, show that there are infinitely many \(2\times 2\) matrices \(A\) satisfying \(A^2=\boldzero_{2\times 2}\text{.}\)
    Optional: can you describe in a parametric manner the set of all matrices \(A\) satisfying \(A^2=\boldzero_{2\times 2}\text{?}\)
  2. In real number algebra we know that \(a^2=a\implies a=0 \text{ or } a=1\text{.}\) By contrast, show that there are infinitely many \(2\times 2\) matrices \(A\) satisfying \(A^2=A\text{.}\)
  3. In real number algebra we have the identity \((x+y)^2=x^2+2xy+y^2\text{.}\) Show that two \(n\times n\) matrices \(A\text{,}\) \(B\) satisfy
    \begin{equation*} (A+B)^2=A^2+2AB+B^2 \end{equation*}
    if and only if \(AB=BA\text{.}\)
Hint.
For (a) set \(A=\abcdmatrix{a}{b}{c}{d}\text{,}\) compute \(A^2\text{,}\) set this matrix equal to \(\boldzero_{2\times 2}\text{,}\) and try to find some solutions to the corresponding (nonlinear) system of four equations in the unknowns \(a,b,c,d\text{.}\)
Similar hint for (b), only now set \(A^2=A\text{.}\)
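Once you have a candidate family in hand, a quick computation can confirm it. Here is a minimal sketch, assuming NumPy is available, that tests one hypothetical family of the form \(\begin{bmatrix} 0\amp t\\ 0\amp 0\end{bmatrix}\text{;}\) finding and describing all such families is still left to you.

```python
# A sketch of how one might test candidate matrices from the hint,
# assuming NumPy; we check the (hypothetical) family A = [[0, t], [0, 0]].
import numpy as np

def squares_to_zero(A):
    """Return True if A @ A is the zero matrix."""
    return np.array_equal(A @ A, np.zeros_like(A))

for t in [1, 2, 5, -3]:  # every value of t works, giving infinitely many examples
    A = np.array([[0, t], [0, 0]])
    print(t, squares_to_zero(A))  # True for each t
```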
8.
Consider the matrix equation \(A\, \begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix}=\begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix}\text{.}\)
(a)
The following chain of implications is invalid.
\begin{align*} A\, \begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix}=\begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix}\amp\implies A\, \begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix}=I_2\, \begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix} \\ \amp\implies A=I_2 \text{.} \end{align*}
For each implication in the chain, explain why it is valid or invalid.
(b)
Find all \(A\) satisfying
\begin{equation*} A\, \begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix}=\begin{bmatrix} 1\amp 1\\ 1\amp 1 \end{bmatrix}\text{.} \end{equation*}
Hint.
Write \(A=\begin{bmatrix} a\amp b\\ c\amp d \end{bmatrix}\) and set up a system of linear equations in the unknowns \(a,b,c,d\text{.}\)
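If you would like the computer to do the bookkeeping, here is a sketch of the hint's method using SymPy (an assumption on our part; any computer algebra system would do): build \(A\) symbolically, form the matrix equation, and solve the resulting linear system entrywise.

```python
# A sketch of the hint's method using SymPy, assuming it is installed:
# write A symbolically, form the matrix equation, and solve entrywise.
import sympy as sp

a, b, c, d = sp.symbols('a b c d')
A = sp.Matrix([[a, b], [c, d]])
J = sp.Matrix([[1, 1], [1, 1]])

# The entries of A*J - J give four linear equations in a, b, c, d.
print(sp.solve(A * J - J, [a, b, c, d]))  # the general solution
```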