The invertibility theorem

Section 2.4 The invertibility theorem

We saw in Example 2.3.4 that verifying directly whether a matrix is invertible, using only Definition 2.3.1, can be quite an involved task. The goal of this section is to make this less onerous by developing some equivalent methods of testing invertibility. Our work culminates in Theorem 2.4.5 and Theorem 2.4.10, which draw connections between invertibility, solutions to linear systems, and row echelon forms of a square matrix. Not surprisingly, our old friend Gaussian elimination emerges as the fundamental computational tool.

Subsection 2.4.1 Elementary matrices

We begin with a treatment of elementary matrices, which serve as the basic building blocks for invertible matrices, and provide a crucial link between row reduction and matrix multiplication.

Definition 2.4.1. Elementary matrices.

An \(m\times m\) matrix \(E\) is elementary if multiplying any \(m\times n\) matrix \(A\) on the left by \(E\) performs one of our row operations on \(A\text{.}\)

We have different types of elementary matrices depending on the type of row operation they perform, and we denote these with an elaboration of our earlier row operation notation:

A scaling elementary matrix\(\underset{c\,r_i}{E}\) is a matrix such that multiplying a matrix \(A\) on the left by \(\underset{cr_i}{E}\) scales the \(i\)-th row of \(A\) by \(c\text{.}\)
A row swap elementary matrix\(\underset{r_i\leftrightarrow r_j}{E}\) is a matrix such that multiplying a matrix \(A\) on the left by \(\underset{r_i\leftrightarrow r_j}{E}\) swaps the \(i\)-th and \(j\)-th rows of \(A\text{.}\)
A row addition elementary matrix\(\underset{r_i+c\,r_j}{E}\) is a matrix such that multiplying a matrix \(A\) on the left by \(\underset{r_i+c\,r_j}{E}\) replaces the \(i\)-th row of \(A\) with \(r_i+c\,r_j\text{.}\)

Naturally, the row method of multiplication is the key to connecting a given row operation with a particular elementary matrix. Theorem 2.4.2 shows that once you fix the dimension, an elementary matrix is uniquely defined by the row operation it performs.

Theorem 2.4.2. Elementary matrix formulas.

Fix an integer \(n\geq 2\text{.}\) The three types of \(n\times n\) elementary matrices can be described as follows:

The \(n\times n \) scaling matrix \(\underset{c\,r_i}{E}\) is the result of scaling the \(i\)-th row of \(I_m\) by \(c\text{.}\)
The \(n\times n \) row swap matrix \(\underset{r_i\leftrightarrow r_j}{E}\) is the result of swapping the \(i\)-th and \(j\)-th rows of \(I_m\text{.}\)
The \(n\times n \) row addition matrix \(\underset{r_i+c\,r_j}{E}\) is the result of replacing the \(i\)-th row of \(I_m\) with the sum of its \(i\)-th row and \(c\) times its \(j\)-th row.

Proof.

First we show that if \(E\) is one of the \(m\times m\) elementary matrices, then it must assume one of the forms described above. Indeed, since multiplying on the left by \(E\) performs a certain row operation, and since \(E=E\,I_m\text{,}\) we see that \(E\) itself is the result of performing this particular row operation on the \(m\times m\) identity matrix. Thus \(E\) is one of the three types of matrices described above, obtained by performing an elementary row operation on \(I_m\text{.}\)

Next, we must show that any of the \(m\times m\) matrices \(E\) described above is indeed elementary in the sense of Definition 2.4.1: that is, we must show that multiplying any \(m\times n\) matrix \(A\) on the left by \(E\) performs the relevant row operation on \(A\text{.}\) This is now a direct consequence of Theorem 2.1.26.

For example, take \(E=\underset{c\,r_i}{E}\text{.}\) For \(j\ne i\text{,}\) the \(j\)-th row of \(E\,A\) is given by the \(j\)-th row of \(E\) times \(A\text{.}\) Since the \(j\)-th row of \(E\) in this case has a one in the \(j\)-th entry and zeros elsewhere, the product of this row and \(A\) is just the \(j\)-th row of \(A\text{.}\) Similarly, the \(i\)-th row of \(E\, A\) in this case is \(c\) times the \(i\)-th row of \(A\text{.}\) Thus \(E\) leaves all the rows of \(A\) except for the \(i\)-th one, which is scaled by \(c\text{.}\)

Elementary matrices provide us a way of understanding row reduction as a series of matrix multiplications (on the left). Recall that row operations on linear systems are useful in so far as they preserve the set of solutions, and that this is the result of each operation being in some sense “reversible”. (See Exercise 1.1.3.7.) In terms of matrix multiplication, this reversible attribute is reflected in the fact that elementary matrices are invertible.

Theorem 2.4.3. Inverses of elementary matrices.

Fix \(n\text{.}\) All elementary matrices are invertible, and their inverses are elementary matrices. In fact, we have the following formulas:

\begin{align*} \underset{cr_i}{E}^{-1}\amp = \underset{\frac{1}{c}r_i}{E}\\ \underset{r_i\leftrightarrow r_j}{E}^{-1}\amp = \underset{r_i\leftrightarrow r_j}{E}\\ \underset{r_i+c\,r_j}E^{-1}\amp = \underset{r_i-cr_j}E \end{align*}

Proof.

These formulas all follow easily from Theorem 2.1.26, and the fact that the proposed inverse elementary matrix performs the “reverse”, or inverse, of the row operation corresponding to the given elementary matrix.

Example 2.4.4.

Fix \(n=3\text{.}\) Verify that the following pairs of \(3\times 3\) elementary matrices are indeed inverses of one another.

\begin{equation*} \underset{2r_3}{E},\ \underset{\frac{1}{2}r_3}{E} \end{equation*}
\begin{equation*} \underset{r_1\leftrightarrow r_3}{E}, \ \underset{r_1\leftrightarrow r_3}{E} \end{equation*}
\begin{equation*} \underset{r_1+3r_3}{E},\ \underset{r_1-3r_3}{E} \end{equation*}

Solution.

We have

\begin{equation*} \underset{2r_3}{E}=\begin{bmatrix}1\amp 0\amp 0\\0\amp 1\amp 0\\ 0\amp 0\amp 2 \end{bmatrix} \end{equation*}

and

\begin{equation*} \underset{\frac{1}{2}r_3}{E}=\begin{bmatrix}1\amp 0\amp 0\\0\amp 1\amp 0\\ 0\amp 0\amp 1/2 \end{bmatrix}\text{.} \end{equation*}

You can verify for yourself that

\begin{equation*} \underset{2r_3}{E}\ \underset{\frac{1}{2}r_3}{E}=\underset{\frac{1}{2}r_3}{E}\ \underset{2r_3}{E}=I_3\text{.} \end{equation*}
We have

\begin{equation*} \underset{r_1\leftrightarrow r_3}{E}=\begin{bmatrix}0\amp 0\amp 1\\0\amp 1\amp 0\\ 1\amp 0\amp 0 \end{bmatrix}\text{.} \end{equation*}

You can verify for yourself that

\begin{equation*} \underset{r_1\leftrightarrow r_3}{E}\,\underset{r_1\leftrightarrow r_3}{E}=I_3\text{.} \end{equation*}
We have

\begin{equation*} \underset{r_1+3r_3}{E}=\begin{bmatrix}1\amp 0\amp 3\\0\amp 1\amp 0\\ 0\amp 0\amp 1 \end{bmatrix} \end{equation*}

and

\begin{equation*} \underset{r_1-3r_3}{E}=\begin{amatrix}[rrr]1\amp 0\amp -3\\0\amp 1\amp 0\\ 0\amp 0\amp 1 \end{amatrix}\text{.} \end{equation*}

You can verify for yourself that

\begin{equation*} \underset{r_1+3r_3}{E}\,\underset{r_1-3r_3}{E}=\underset{r_1-3r_3}{E}\,\underset{r_1+3r_3}{E}=I_3\text{.} \end{equation*}

Subsection 2.4.2 Interlude on matrix equations

We take a moment to make the following simple, somewhat overdue observation. Namely, we can represent a system of linear equations

\begin{equation} \eqsys\tag{2.4.1} \end{equation}

as a single matrix equation

\begin{equation} \genmatrix\begin{bmatrix}x_1\\x_2\\ \vdots \\ x_n \end{bmatrix} =\begin{bmatrix}b_1\\ b_2 \\ \vdots \\ b_m \end{bmatrix}\text{,}\tag{2.4.2} \end{equation}

\begin{equation*} A\boldx=\boldb\text{,} \end{equation*}

where \(A=[a_{ij}]\text{,}\) \(\boldx=[x_i]\text{,}\) \(\boldb=[b_i]\text{.}\)

Indeed if you expand out the left-hand side of (2.4.2) into an \(m\times 1\) column vector, using the definition of matrix multiplication, and then invoke the definition of matrix equality, then you obtain the linear system (2.4.1).

By the same token, an \(n\)-tuple \((s_1,s_2,\dots, s_n)\) is a solution to the system of equations \((*)\) if and only if its corresponding column vector \(\underset{n\times 1}{\bolds}=[s_i]\) is a solution to the matrix equation \(A\boldx=\boldb\text{.}\)

We have thus recast the problem of solving linear systems to the problem of solving a certain matrix equation of the form

\begin{equation*} A\underset{n\times 1}{\boldx}=\underset{m\times 1}{\boldb} \end{equation*}

for the unknown column vector \(\boldx\text{.}\) In particular, a homogeneous linear system

\begin{equation*} \homsys \end{equation*}

can be represented as a matrix equation of the form

\begin{equation*} A\underset{n\times 1}{\boldx}=\underset{m\times 1}{\boldzero}\text{.} \end{equation*}

Lastly, the use of Gaussian elimination to solve a linear system can now be understood in an algebraic way using matrix multiplication.

In more detail, suppose our given linear system has augmented matrix \(\begin{amatrix}[c|c]A\amp \boldb \end{amatrix}\) that row reduces to the row echelon matrix \(\begin{amatrix}[c|c] U \amp \boldb' \end{amatrix}\) after performing a sequence of row operations. Denote the \(i\)-th row operation \(\rho_i\text{,}\) and denote by \(\rho_i(B)\) the result of applying \(\rho_i\) to a matrix \(B\text{.}\)

Our sequence of operations \(\rho_i\) translates to the following sequence of matrix equations:

\begin{equation*} \begin{array}{c} A\boldx=\boldb\\ \rho_1(A)\boldx=\rho_1(\boldb)\\ \rho_2(\rho_1(A))\boldx=\rho_2(\rho_1(\boldb))\\ \vdots \\ \rho_r(\rho_{r-1}(\dots \rho_2(\rho_1(A))))\boldx=\rho_r(\rho_{r-1}(\dots \rho_2(\rho_1(\boldb)))) \\ U\boldx=\boldb'\end{array}\text{.} \end{equation*}

Let \(\rho_i\) correspond to the elementary matrix \(E_i\text{.}\) Then we represent this same sequence using matrix multiplication:

\begin{equation} \begin{array}{c} A\boldx=\boldb\\ E_1A\boldx=E_1\boldb\\ E_2E_1A\boldx=E_2E_1\boldb\\ \vdots \\ E_rE_{r-1}\cdots E_2E_1A\boldx=E_rE_{r-1}\cdots E_2E_1\boldb \\ U\boldx=\boldb' \end{array}\text{.}\tag{2.4.3} \end{equation}

This depiction of row reduction in terms of successive left-multiplication by elementary matrices will be useful to us in many ways. In particular, it follows from this discussion that two matrices \(A\) and \(B\) are row equivalent if and only if we have

\begin{equation} B=E_rE_{r-1}\cdots E_2E_1A\tag{2.4.4} \end{equation}

for some collection of elementary matrices \(E_i\text{.}\)

Subsection 2.4.3 The invertibility theorem

We are now in position to prove our first big theorem.

Theorem 2.4.5. Invertibility theorem.

Let \(A=[a_{ij}]\) be an \(n\times n\) matrix. The following statements are equivalent.

\(A\) is invertible.
The matrix equation

\begin{equation*} A\underset{n\times 1}{\boldx}=\underset{n\times 1}{\boldb} \end{equation*}

has a unique solution for any column vector \(\boldb\text{.}\)
The matrix equation

\begin{equation*} A\underset{n\times 1}{\boldx}=\underset{n\times 1}{\boldzero} \end{equation*}

has a unique solution: namely, \(\boldx=\boldzero_{n\times 1}\text{.}\)
\(A\) is row equivalent to \(I_n\text{,}\) the \(n\times n\) identity matrix.
\(A\) is a product of elementary matrices.

Proof.

Recall that to show two statements \(\mathcal{P}\) and \(\mathcal{Q}\) are equivalent, we must show two implications: \(\mathcal{P}\implies \mathcal{Q}\text{,}\) and \(\mathcal{Q}\implies \mathcal{P}\text{.}\) Instead of doing this for each possible pair of sentences above, we ease our work load by instead showing the following cycle of implications:

\begin{equation*} (1)\implies(2)\implies(3)\implies(4)\implies(5)\implies (1)\text{.} \end{equation*}

Since implication is transitive, starting at any point in our cycle and making our way around the chain of implications, we see that any one of the propositions implies any other proposition.

\((1)\implies (2)\).

Suppose \(A^{-1}\) exists. Given any column vector \(\boldb\text{,}\) we have

\begin{align*} A\boldx=\boldb \iff \boldx=A^{-1}\boldb \amp (\knowl{./knowl/th_inverse_cancel.html}{\text{Theorem 2.3.3}})\text{,} \end{align*}

which shows that \(\boldx=A^{-1}\boldb\) is the unique solution to \(A\boldx=\boldb\text{.}\)

\((2)\implies (3)\).

Clearly, if \(A\boldx=\boldb\) has a unique solution for any choice of \(\boldb\text{,}\) then it has a unique solution for the particular choice \(\boldb=\boldzero\text{.}\) Since \(\boldx=\boldzero_{n\times 1}\) is clearly a solution to the equation, it must be the only solution.

\((3)\implies (4)\).

Row reduce \(A\) to a matrix \(U\) in reduced row echelon form using Gauss-Jordan elimination. (See Theorem 1.2.10.) Since the set of solutions to \(A\boldx=\boldzero\) is identical to the set of solutions to \(U\boldx=\boldzero\) (apply Theorem 1.1.13 to their corresponding linear systems), we see that \(\boldzero\) is the only solution to \(U\boldx=\boldzero\text{.}\) Procedure 1.3.5 now implies \(U\) has a leading one in each column. Since \(U\) is \(n\times n\) and in reduced row echelon form, it follows that \(U\) must be the identity matrix. (Convince yourself of this.) Thus \(A\) is row equivalent to \(U=I\text{,}\) the identity matrix.

\((4)\implies (5)\).

If \(A\) is row equivalent to \(I\text{,}\) then according to our discussion after (2.4.3), we have \(I=E_rE_{r-1}\cdots E_1A\) for some collection of elementary matrices \(E_i\text{.}\) Since elementary matrices are invertible we can multiply both sides of this equation by

\begin{equation*} (E_rE_{r-1}\cdots E_1)^{-1}=E_1^{-1}E_{2}^{-1}\cdots E_r^{-1} \end{equation*}

to conclude

\begin{equation*} A=E_1^{-1}E_{2}^{-1}\cdots E_r^{-1}\text{.} \end{equation*}

Since inverses of elementary matrices are elementary (Theorem 2.4.3), we conclude that \(A\) is a product of elementary matrices.

\((5)\implies (1)\).

If \(A\) is a product of elementary matrices, then it is a product of invertible matrices. Since products of invertible matrices are invertible, we conclude that \(A\) is invertible.

Remark 2.4.6.

The invertibility theorem has an immediate application to linear systems where the number of equations is equal to the number of unknowns. In this special situation, the system is equivalent to a matrix equation of the form

\begin{equation*} A\boldx=\boldb\text{,} \end{equation*}

where \(A\) is a square matrix. According to the theorem, if we know \(A\) is invertible, then the matrix equation, and hence the linear system, has a unique solution: namely, \(\boldx=A^{-1}\boldb\text{.}\)

What if \(A\) is not invertible? Then the theorem only tells us that there is some column vector \(\boldc\text{,}\) not necessarily the given \(\boldb\text{,}\) such that the equation

\begin{equation*} A\boldx=\boldc \end{equation*}

does not have a unique solution. In other words, the theorem alone doesn’t allow us to conclude whether the given \(A\boldx=\boldb\) has a solution, and we must resort to our usual Gaussian elimination procedure to answer this question.

The family of triangular matrices (upper, lower, and diagonal) defined below provides an easy testing ground for our new invertibility theorem.

Definition 2.4.7. Diagonal and triangular matrices.

Let \(A=[a_{ij}]\) be \(n\times n\text{.}\)

For each \(1\leq i\leq n\) the entry \(a_{ii}\) is called the \(i\)-th diagonal entry of \(A\text{,}\) and subarray of \(A\) consisting of \(a_{11}, a_{22}, \dots, a_{nn}\) is called the diagonal. An off-diagonal entry of \(A\) is any entry not among the diagonal entries.
The matrix \(A\) is diagonal if all off-diagonal entries are zero: i.e., if \(a_{ij}=0\) for all \(1\leq i,j\leq n\) with \(i\ne j\text{.}\)
The matrix \(A\) is upper triangular if all entries below the diagonal are zero: i.e., if \(a_{ij}=0\) for all \(1 leq j\lt i\text{.}\)
The matrix \(A\) is lower triangular if all entries above the diagonal are zero: i.e., if \(a_{ij}=0\) for all \(1\leq i\lt j \text{.}\)
The matrix \(A\) is triangular if \(A\) is upper triangular or lower triangular.

Example 2.4.8. Triangular \(3\times 3\) matrices.

The set \(W_1\) of all diagonal \(3\times 3\) matrices can be described as

\begin{equation*} W_1=\left\{ \begin{amatrix}[ccc]r\amp 0\amp 0\\ 0\amp s\amp 0 \\ 0\amp 0\amp t \end{amatrix}\colon r,s,t\in \R\right\}\text{.} \end{equation*}

The set \(W_2\) of all upper triangular \(3\times 3\) matrices can be described as

\begin{equation*} W_2=\left\{ \begin{amatrix}[ccc]r\amp s\amp t\\ 0\amp u\amp v \\ 0\amp 0\amp w \end{amatrix}\colon r,s,t,u,v,w\in \R\right\}\text{.} \end{equation*}

The set \(W_3\) of all lower triangular \(3\times 3\) matrices can be described as

\begin{equation*} W_3=\left\{ \begin{amatrix}[ccc]r\amp 0\amp 0\\ s\amp t\amp 0 \\ u\amp v\amp w \end{amatrix}\colon r,s,t,u,v,w\in \R\right\}\text{.} \end{equation*}

Theorem 2.4.9. Invertibility of triangular matrices.

Let \(A=[a_{ij}]\) be a triangular \(n\times n\) matrix. Then \(A\) is invertible if and only if \(a_{ii}\ne 0\) for all \(1\leq i\leq n\text{.}\)

In other words, \(A\) is invertible if and only if the diagonal entries of \(A\) are all nonzero.

Proof.

In this proof we will make use of the following easy consequence of Theorem 2.3.14: namely, \(A\) is invertible if and only if \(A^T\) is invertible. (The forward implication follows directly from the theorem; the reverse direction follows from the theorem applied to \(B=A^T\text{.}\))

Case: \(A\) is upper triangular.

If \(a_{ii}\ne 0\) for all \(i\text{,}\) then it is easy to see that we can row reduce \(A\) first to a row echelon matrix with leading ones in every diagonal entry, and then further to the identity matrix. Thus \(A\) is row equivalent to \(I\) in this case, and we conclude from statement (4) of Theorem 2.4.5 that \(A\) is invertible.

For the other implication, we show that if it is not the case that \(a_{ii}\ne 0\) for all \(i\text{,}\) then there is a nonzero solution \(\boldx\ne \boldzero\) to the matrix equation \(A\boldx=\boldzero\text{.}\) If this is the case, then since we have two distinct solutions to \(A\boldx=\boldzero\text{,}\) \(A\) is not invertible by Statement (3) of Theorem 2.4.5.

To this end, assume it is not the case that \(a_{ii}\ne 0\) for all \(i\text{.}\) Then we can find a smallest index \(k\) such that \(a_{kk}=0\) and \(a_{ii}\ne 0\) for any \(i <k\text{.}\) It is easy to see that \(A\) is row equivalent to a matrix \(A'=[a'_{ij}]\text{,}\) satisfying \(a_{jj}=1\) for \(1\leq j\leq k-1\) and \(a_{ij}=0\) for all \(1\leq i < j\leq k-1\text{:}\) i.e., \(A'\) is “diagonal up until the \(k\)-th column”.

We now provide a nonzero solution \(\boldx=[x_i]_{n\times 1}\)to \(A'\boldx=\boldzero\text{:}\) namely, set \(x_k=1\text{,}\) \(x_{i}=0\) for all \(i>k\text{,}\) and \(x_{i}=-a'_{ik}\) for all \(i < k\text{.}\) (Verify this for yourself, using the description above of \(a'_{ij}\) for \(j\leq k-1\text{.}\)) Since \(A'\) is row equivalent to \(A\text{,}\) the linear systems corresponding to \(\begin{amatrix}[cc] A\amp \boldzero\end{amatrix}\) and \(\begin{amatrix}[cc] A' \amp \boldzero\end{amatrix}\) have the same solutions. Hence \(\boldx\) is also a nonzero solution to \(A\boldx=\boldzero\text{.}\) We conclude that \(A\) is not invertible by Statement (3) of Theorem 2.4.5. This concludes the proof of this implication.

Case: \(A\) is lower triangular.

Set \(B=A^T\text{.}\) Then \(B\) is upper triangular, and \((B)_{ii}=(A)_{ii}=a_{ii}\) for all \(i\text{.}\) Then

\begin{align*} A \text{ invertible} \amp \iff B=A^T \text{ invertible} \amp \text{(see statement at top )}\\ \amp \iff a_{ii}=0 \text{ for all } i \amp \text{(by previous case)} \text{.} \end{align*}

Subsection 2.4.4 Invertibility algorithms

The proof of the implication \((4)\implies (5)\) of Theorem 2.4.5 can be expanded into an algorithm that (1) decides whether a given matrix \(A\) is invertible, and (2) computes \(A^{-1}\) if \(A\) is invertible.

Theorem 2.4.10. Inverse algorithm.

Let \(A\) be an \(n\times n\) matrix. To test for invertibility of \(A\) proceed as follows.

Step 1: Build the \(n\times 2n\) augmented matrix \(\begin{amatrix}[c|c]A\amp I_n\end{amatrix}\) and use Gaussian elimination to row reduce to the form \(\begin{amatrix}[c|c]U\amp B\end{amatrix} \text{,}\) where \(U\) is in row echelon form.

The matrix \(A\) is invertible if and only if \(U\) has \(n\) leading ones.
Step 2: If \(U\) has \(n\) leading ones, row reduce further to a matrix of the form \(\begin{amatrix}[c|c] I_n\amp C\end{amatrix}\text{.}\) Then \(A^{-1}=C\text{.}\)

Proof.

From the proof of Theorem 2.4.5 we know \(A\) is invertible if and only if \(U\) has \(n\) leading ones. The question remains as to why reducing the augmented matrix \(\begin{amatrix}[c|c]A\amp I\end{amatrix}\) to \(\begin{amatrix}[c|c]I\amp C\end{amatrix}\) tells us that \(A^{-1}=C\text{.}\) Let \(E_1, E_2, \dots, E_r\) be the elementary matrices representing the row operations involved in this process. Then we have

\begin{equation*} E_rE_{r-1}\cdots E_1A=I_n\text{.} \end{equation*}

After a little algebra, we see that

\begin{equation*} A^{-1}=E_rE_{r-1}\cdots E_1\text{.} \end{equation*}

Since \(C\) is the result of applying same row operations to \(I_n\) we have

\begin{equation*} C=E_rE_{r-1}\cdots E_1I=E_rE_{r-1}\cdots E_1=A^{-1}\text{,} \end{equation*}

as claimed.

From the proof of Theorem 2.4.5 we also derive an algorithm for writing an invertible matrix as a product of elementary matrices. You should think of this as a first example of the great versatility of Gaussian elimination, as conveyed by our Mantra 1.2.7.

Theorem 2.4.11. Product of elementary matrices algorithm.

Let \(A\) be an \(n\times n\) matrix. To (potentially) write \(A\) as a product of elementary matrices, proceed as follows.

Step 1: Attempt to row reduce \(A\) to the identity, keeping track of your sequence of row operations in the form of elementary matrices.
Step 2: If you are able to row reduce \(A\) to \(I_n\) with a sequence of row operatons corresponding to the elementary matrices \(E_1, E_2, \dots, E_r\text{,}\) then

\begin{equation*} A=E_1^{-1}E_2^{-1}\cdots E_{r}^{-1}. \end{equation*}

Since the inverse of an elementary matrix is elementary, we have written \(A\) as a product of elementary matrices.

Proof.

See the proof of the implication \((4)\implies (5)\) in Theorem 2.4.5.

Subsection 2.4.5 In-depth example

Let

\begin{equation*} A=\begin{amatrix}[rrr]1\amp 2\amp 0\\ 1\amp 2\amp 1\\ -1\amp -1\amp 3 \end{amatrix} \text{.} \end{equation*}

Combining both algorithms we can decide whether \(A\) is invertible, and if so, compute \(A^{-1}\) and write \(A\) as a product of elementary matrices.

\begin{align*} \begin{amatrix}[rrr|rrr]1\amp 2\amp 0\amp 1\amp 0\amp 0\\ 1\amp 2\amp 1\amp 0\amp 1\amp 0\\ -1\amp -1\amp 3\amp 0\amp 0\amp 1 \end{amatrix} \amp \xrightarrow{r_2-r_1} \begin{amatrix}[rrr|rrr]1\amp 2\amp 0\amp 1\amp 0\amp 0\\ 0\amp 0\amp 1\amp -1\amp 1\amp 0\\ -1\amp -1\amp 3\amp 0\amp 0\amp 1 \end{amatrix}\\ \amp \xrightarrow{r_3+r_1} \begin{amatrix}[rrr|rrr]1\amp 2\amp 0\amp 1\amp 0\amp 0\\ 0\amp 0\amp 1\amp -1\amp 1\amp 0\\ 0\amp 1\amp 3\amp 1\amp 0\amp 1 \end{amatrix}\\ \text{ (three leading ones, thus invertible) }\amp \xrightarrow{r_2\leftrightarrow r_3} \begin{amatrix}[rrr|rrr]\boxed{1}\amp 2\amp 0\amp 1\amp 0\amp 0\\ 0\amp \boxed{1}\amp 3\amp 1\amp 0\amp 1\\ 0\amp 0\amp \boxed{1}\amp -1\amp 1\amp 0 \end{amatrix}\\ \amp \xrightarrow{r_2-3r_3} \begin{amatrix}[rrr|rrr]1\amp 2\amp 0\amp 1\amp 0\amp 0\\ 0\amp 1\amp 0\amp 4\amp -3\amp 1\\ 0\amp 0\amp 1\amp -1\amp 1\amp 0 \end{amatrix}\\ \amp \xrightarrow{r_1-2r_2} \begin{amatrix}[rrr|rrr]1\amp 0\amp 0\amp -7\amp 6\amp -2\\ 0\amp 1\amp 0\amp 4\amp -3\amp 1\\ 0\amp 0\amp 1\amp -1\amp 1\amp 0 \end{amatrix} \end{align*}

According to Theorem 2.4.10, the computation shows that \(A\) is invertible and

\begin{equation*} A^{-1}=\begin{amatrix}[rrr]-7\amp 6\amp -2\\ 4\amp -3\amp 1\\ -1\amp 1\amp 0 \end{amatrix}\text{.} \end{equation*}

Next, representing our row operations as elementary matrices, we see that

\begin{equation*} E_5E_4E_3E_2E_1A=I\text{,} \end{equation*}

where

\begin{align*} E_1\amp =\begin{amatrix}[rrr] 1\amp 0\amp 0\\ -1\amp 1\amp 0\\ 0\amp 0\amp 1\end{amatrix} \amp E_2\amp =\begin{amatrix}[rrr]1\amp 0\amp 0\\ 0\amp 1\amp 0\\ 1\amp 0\amp 1\end{amatrix}\\ E_3\amp =\begin{amatrix}[rrr]1\amp 0\amp 0\\ 0\amp 0\amp 1\\ 0\amp 1\amp 0 \end{amatrix} \amp E_4\amp =\begin{amatrix}[rrr] 1\amp 0\amp 0\\ 0\amp 1\amp -3\\ 0\amp 0\amp 1\end{amatrix}\\ E_5\amp =\begin{amatrix}[rrr]1\amp -2\amp 0\\ 0\amp 1\amp 0\\ 0\amp 0\amp 1\end{amatrix}\text{.} \end{align*}

We conclude that

\begin{align*} A \amp =E_1^{-1}E_2^{-1}E_3^{-1}E_4^{-1}E_5^{-1}\\ \amp= \begin{amatrix}[rrr] 1\amp 0\amp 0\\ 1\amp 1\amp 0\\ 0\amp 0\amp 1\end{amatrix} \begin{amatrix}[rrr]1\amp 0\amp 0\\ 0\amp 1\amp 0\\ -1\amp 0\amp 1\end{amatrix} \begin{amatrix}[rrr]1\amp 0\amp 0\\ 0\amp 0\amp 1\\ 0\amp 1\amp 0 \end{amatrix} \begin{amatrix}[rrr] 1\amp 0\amp 0\\ 0\amp 1\amp 3\\ 0\amp 0\amp 1\end{amatrix} \begin{amatrix}[rrr]1\amp 2\amp 0\\ 0\amp 1\amp 0\\ 0\amp 0\amp 1\end{amatrix}\text{.} \end{align*}

Video example: inverse algorithm.

Figure 2.4.12. Video: inverse algorithm

Subsection 2.4.6 Some theoretical loose ends

The two inveribility algorithms above are nice examples of how a theoretical result like our invertibility theorem can pay some serious computational dividends. Namely, thanks to the theory we have discovered a method of computing the inverse of a matrix that essentially boils down to row reduction.

We finish this section with a number of theoretical implications that tie up some loose ends. The results below are all consequences in some way of Theorem 2.4.5. The fist shows that in fact one one of the defining equalities \(AB=I\) or \(BA=I\) suffices to define the inverse of a matrix.

Corollary 2.4.13. Left-inverse if and only if right-inverse.

Let \(A\) and \(B\) be \(n\times n\text{.}\) Then

\(BA=I_n\implies AB=I_n\text{.}\)
\(AB=I_n\implies BA=I_n\text{.}\)

In plain English: a matrix \(B\) is a left-inverse of \(A\) if and only if it is a right-inverse of \(A\text{.}\)

Proof.

It is enough to prove the first implication: the second then follows by exchanging the roles of \(A\) and \(B\text{.}\)

Suppose \(BA=I_n\text{.}\) We first show that \(A\) is invertible. We have

\begin{align*} A\boldx=\boldzero\amp\implies BA\boldx=\boldzero \\ \amp\implies I\boldx=\boldzero \\ \amp \implies \boldx=\boldzero \text{.} \end{align*}

By Statement (3) of Theorem 2.4.5, we conclude that \(A\) is invertible.

Now that we know \(A^{-1}\) exists we have

\begin{align*} BA=I_n\amp \implies BAA^{-1}=IA^{-1} \\ \amp \implies B=A^{-1}\\ \amp \implies AB=AA^{-1}=I_n \text{.} \end{align*}

As a further consequence of Theorem 2.4.5, we can at last strengthen the implication

\begin{equation*} A, B \text{ invertible }\implies AB \text{ invertibe} \end{equation*}

to an equivalence.

Corollary 2.4.14. Invertibility of product equivalence.

Let \(A\) and \(B\) be \(n\times n\text{.}\) Then \(AB\) is invertible if and only if \(A\) and \(B\) are both invertible: i.e.,

\begin{equation*} A, B \text{ invertible }\iff AB \text{ invertible} \end{equation*}

Proof.

Implication: \(A \text{ and } B \text{ invertible }\implies AB \text{ invertibe}\).

We know from Theorem 2.3.6 that if \(A\) and \(B\) are invertible, then so is \(AB\text{.}\)

Implication: \(AB\) invertible \(\implies\) \(A\) and \(B\) invertible.

Assume \(AB\) is invertible and let \(C\) be its inverse. Thus \(C(AB)=(AB)C=I_n\text{.}\) We first prove \(B\) is invertible. We have

\begin{align*} B\boldx=\boldzero\amp \implies AB\boldx=\boldzero \\ \amp \implies \boldx=\boldzero\text{.} \end{align*}

The last implication uses Statement (3) of Theorem 2.4.5 and the fact that \(AB\) is invertible. We have shown that

\begin{equation*} B\boldx=\boldzero\implies \boldx=\boldzero\text{,} \end{equation*}

and hence that \(B\) is invertible, using once again Statement (3) of Theorem 2.4.5.

Next we prove directly that \(A\) is invertible. Namely, we claim that \(A^{-1}=BC\text{.}\) Indeed, since \(C\) is the inverse of \(AB\text{,}\) we have \(A(BC)=(AB)C=I\text{.}\) Thus \(BC\) is a right-inverse of \(A\text{.}\) Corollary 2.4.13 now implies \((BC)A=I\text{,}\) and hence that \(A^{-1}=BC\text{,}\) as claimed.

This completes the proof that if \(AB\) is invertible, then \(A\) and \(B\) are invertible.

We conclude with two results related to row reduction. The first provides an equivalent formulation of row equivalence in terms of matrix arithmetic: an elaboration of our discussion in Subsection 2.4.2.

Corollary 2.4.15. Row equivalence and invertible matrices.

Let \(A\) and \(B\) be \(m\times n\) matrices. The following statements are equivalent.

The matrices \(A\) and \(B\) are row equivalent.
There are elementary matrices \(E_1, E_2, \dots, E_r\) such that \(E_rE_{r-1}\cdots E_1A=B \text{.}\)
There is an invertible \(m\times m\) matrix \(Q\) such that \(QA=B\text{.}\)

Proof.

The equivalence of (1) and (2) was shown in the course of our discussion in Subsection 2.4.2. The equivalence of (2) and (3) is a direct consequence of (5) of Theorem 2.4.5 since a matrix \(Q\) is invertible if and only if it can be written as

\begin{equation*} Q=E_rE_{r-1}\cdots E_1 \end{equation*}

for some elementary matrices \(E_i\text{.}\)

Remark 2.4.16. Properties of row equivalence.

With the help of Corollary 2.4.15, we can easily show that the row equivalence relation is reflexive, symmetric, and transitive. In other words, letting \(A\sim B\) denote that \(A\) is row equivalent to \(B\text{,}\) the following properties hold.

Reflexivity.
For any matrix \(A\text{,}\) we have \(A\sim A\text{:}\) i.e., every matrix is row equivalent to itself.
Symmetry.
For all matrices \(A\) and \(B\text{,}\) if \(A\sim B\text{,}\) then \(B\sim A\text{.}\)
Transitivity.
For all matrices \(A, B, C\text{,}\) if \(A\sim B\) and \(B\sim C\text{,}\) then \(A\sim C\text{.}\)

The proof of these facts is left as an exercise (Exercise 2.4.7.22).

Lastly, we provide at last the proof of the third statement of Theorem 1.2.10 promised back in Section 1.2. This proof is due to Thomas Yuster¹.

Corollary 2.4.17. Uniqueness of reduced row echelon form.

Any matrix \(A\) is row equivalent to a unique matrix in reduced row echelon form.

Proof.

Let \(A\) be an \(m\times n\) matrix. Using Gauss-Jordan elimination, we can row reduce \(A\) to matrix \(U\) in reduced row echelon form. Suppose \(A\) is also row equivalent to the matrix \(U'\) in reduced row echelon form. Then \(U\) and \(U'\) are row equivalent, since the row equivalence relation is symmetric and transitive (Remark 2.4.16). Thus it suffices to show that if \(U\) and \(U'\) are row equivalent matrices in reduced row echelon form, then \(U=U'\text{.}\) We do so by induction on \(n\text{.}\) The base step is trivial, since there is only one \(m\times 1\) matrix in reduced row echelon form.

For the induction step we assume that any two row equivalent \(m\times n\) matrices in reduced row echelon form are equal. Suppose by contradiction that \(U=[u_{ij}]\) and \(U'=[u'_{ij}]\) are row equivalent \(m\times (n+1)\) matrices in reduced row echelon form, and that \(U\ne U'\text{.}\) By Corollary 2.4.15 there is an invertible matrix \(Q\) such that \(QU=U'\text{.}\) The first \(n\) columns of \(U\) and \(U'\) form matrices \(V\) and \(V'\text{,}\) respectively, that are in reduced row echelon form, as one easily checks. Furthermore, \(V\) and \(V'\) are row equivalent: indeed, using Theorem 2.1.24 we see that \(QU=U'\) implies \(QV=V'\text{.}\) By the induction hypothesis we must have \(V=V'\text{,}\) and thus \(U\) and \(U'\) can only differ in their last column.

We claim that \(U\) and \(U'\) must both have a leading one in the last column. To see why, consider the matrix equation \(U\boldx=\boldzero\text{,}\) where \(\boldx=[x_j]\) is a \((n+1)\times 1 \) column vector. Since \(QU=U'\text{,}\) we have

\begin{align*} U\boldx=\boldzero \amp\implies QU\boldx=Q\boldzero \\ \amp\implies U'\boldx=\boldzero \amp (U'=QU) \text{,} \end{align*}

and thus

\begin{equation*} U\boldx=\boldzero\implies U\boldx-U'\boldx=\boldzero\implies (U-U')\boldx=\boldzero\text{.} \end{equation*}

Because \(U\) and \(U'\) differ in at most their last column, the first \(n\) columns of \(U-U'\) are zero columns, and thus we have

\begin{equation} (U-U')\boldx=[x_{n+1}(u_{i(n+1)}-u'_{i(n+1)})]=\boldzero\text{.}\tag{2.4.5} \end{equation}

Since \(U\ne U'\text{,}\) there is some \(1\leq i\leq m\) such that \(u_{i(n+1)}\ne u'_{i(n+1)}\text{.}\) Since \(x_{n+1}(u_{i(n+1)}-u'_{i(n+1)})=0\) by (2.4.5), we must have \(x_{n+1}=0\text{.}\) We have shown that for any \(\boldx=[x_j]\) satisfying \(U\boldx=\boldzero\text{,}\) we must have \(x_{n+1}=0\text{.}\) It follows from Procedure 1.3.5 that \(U\) must have a leading in its last column, since otherwise the variable \(x_{n+1}\) in the system \(U\boldx=0\) would be free, and could assume any value. To see that \(U'\) also has a leading one in the last column, we use the same argument, starting with the equation \(Q^{-1}U'=U\text{.}\)

To summarize, starting with row equivalent matrices \(U\) and \(U'\) in reduced row echelon form, and assuming by contradiction that \(U\ne U'\text{,}\) we conclude that (a) the first \(n\) columns of \(U\) and \(U'\) are equal and form matrices in reduced row echelon form, and (b) the last columns of \(U\) and \(U'\) have leading ones. Since \(U\) and \(U'\) are in reduced row echelon form, and since the first \(n\) columns of \(U\) and \(U'\) are equal, we see that the leading ones in the last columns of \(U\) and \(U'\) must occur in the same row: namely, the first zero row of \(V=V'\text{.}\) It follows that \(U=U'\text{,}\) a contradiction.

Exercises 2.4.7 Exercises

WeBWork Exercises

1.

True or false: Suppose that \(E_1\) and \(E_2\) are two elementary matrices. Then \(E_1E_2 = E_2E_1\)

.
True
False

Answer.

\(\text{False}\)

Solution.

SOLUTION: False. If

\begin{equation*} E_1 = \left[\begin{array}{ccc} 0 \amp 1 \amp 0\cr 1 \amp 0 \amp 0\cr 0 \amp 0 \amp 1 \end{array}\right] \quad \mbox{and} \quad E_2 = \left[\begin{array}{ccc} 1 \amp 0 \amp 0\cr 0 \amp 0 \amp 1\cr 0 \amp 1 \amp 0 \end{array}\right], \end{equation*}

then

\begin{equation*} E_1 E_2 = \left[\begin{array}{ccc} 0 \amp 1 \amp 0\cr 1 \amp 0 \amp 0\cr 0 \amp 0 \amp 1 \end{array}\right] \left[\begin{array}{ccc} 1 \amp 0 \amp 0\cr 0 \amp 0 \amp 1\cr 0 \amp 1 \amp 0 \end{array}\right] = \left[\begin{array}{ccc} 0 \amp 0 \amp 1\cr 1 \amp 0 \amp 0\cr 0 \amp 1 \amp 0 \end{array}\right], \end{equation*}

but

\begin{equation*} E_2 E_1 = \left[\begin{array}{ccc} 1 \amp 0 \amp 0\cr 0 \amp 0 \amp 1\cr 0 \amp 1 \amp 0 \end{array}\right] \left[\begin{array}{ccc} 0 \amp 1 \amp 0\cr 1 \amp 0 \amp 0\cr 0 \amp 0 \amp 1 \end{array}\right] = \left[\begin{array}{ccc} 0 \amp 1 \amp 0\cr 0 \amp 0 \amp 1\cr 1 \amp 0 \amp 0 \end{array}\right] \neq E_1 E_2. \end{equation*}

2.

Let

\begin{equation*} A=\left(\begin{array}{ccc} 1 \amp 8 \amp 0 \cr 1 \amp 8 \amp 8 \cr 1 \amp -2 \amp 9 \end{array}\right) \end{equation*}

Then

\begin{equation*} A^{-1}=\left(\begin{array}{ccc} a_{11} \amp a_{12} \amp a_{13} \cr a_{21} \amp a_{22} \amp a_{23} \cr a_{31} \amp a_{32} \amp a_{33} \end{array}\right) \end{equation*}

where \(a_{11}=\) , \(a_{12}=\) , \(a_{13}=\) , \(a_{21}=\) , \(a_{22}=\) , \(a_{23}=\) , \(a_{31}=\) , \(a_{32}=\) , \(a_{33}=\) .

\(1.1\)

\(-0.9\)

\(0.8\)

\(-0.0125\)

\(0.1125\)

\(-0.1\)

\(-0.125\)

\(0.125\)

\(0\)

3.

Let

\begin{equation*} A=\left(\begin{array}{cc} 5 \amp -3 \cr 4 \amp 4 \end{array}\right) \end{equation*}

Find \(A^{-1}\) and use it to solve

\begin{equation*} AX=B \end{equation*}

where

\begin{equation*} B=\left(\begin{array}{c} 3 \cr 3 \end{array}\right) \end{equation*}

\begin{equation*} X=\left(\begin{array}{c} x_{1} \cr x_{2} \end{array}\right) \end{equation*}

where \(x_{1}=\) and \(x_{2}=\) .

Answer 1.

\(0.65625\)

Answer 2.

\(0.09375\)

4.

Solve the system of equations

\begin{equation*} \begin{array}{l} 5x+3y = -29 \\ 3x+2y = -18 \\ \end{array} \end{equation*}

by converting to a matrix equation and using the inverse of the coefficient matrix.

\(x=\)

\(y=\)

Answer 1.

\(-4\)

Answer 2.

\(-3\)

5.

Solve the system of equations

\begin{equation*} \begin{array}{r} 2 x - 4 y + z = 5 \\ -x + y -z= 0 \\ x - 2 y = 3 \\ \end{array} \end{equation*}

by converting to a matrix equation and using the inverse of the coefficient matrix.

\(x=\)

\(y=\)

\(z=\)

Answer 1.

\(-1\)

Answer 2.

\(-2\)

Answer 3.

\(-1\)

6.

a. The linear transformation \(T_1: R^2 \rightarrow R^2\) is given by:

\(T_1(x, y) = (2 x + 5 y, 6 x + 16 y)\text{.}\)

Find \(T_1^{-1}(x, y)\text{.}\)

\(T_1^{-1}(x, y) = (\)x + y, x + y\()\)

b. The linear transformation \(T_2: R^3 \rightarrow R^3\) is given by:

\(T_2(x, y, z) = (x + 2 z, 2 x + y, 1 y + z)\text{.}\)

Find \(T_2^{-1}(x, y, z)\text{.}\)

\(T_2^{-1}(x, y, z) = (\)x + y + z, x + y + z, x + y + z\()\)

c. Using \(T_1\) from part a, it is given that:

\(T_1(x, y) = (4, -2)\)

Find x and y.

\(x =\) \(y =\)

d. Using \(T_2\) from part b, it is given that:

\(T_2(x, y, z) = (6, -5, 2)\)

Find x, y, and z.

\(x =\) \(y =\) \(z =\)

\(8\)

\(-2.5\)

\(-3\)

\(1\)

\(0.2\)

\(0.4\)

\(-0.4\)

\(-0.4\)

\(0.2\)

\(0.8\)

\(0.4\)

\(-0.2\)

\(0.2\)

\(37\)

\(-14\)

\(-1.6\)

\(-1.8\)

\(3.8\)

Written Exercises

Linear systems and matrix equations.

Find a matrix \(A\) and column vectors \(\boldx\) and \(\boldb\) such that the given linear system is equivalent to the matrix equation \(A\boldx=\boldb\text{.}\)

7.

\begin{equation*} \begin{linsys}{3} x\amp +\amp 2y \amp -\amp z\amp =\amp 3 \\ -3x \amp \amp \amp +\amp 2z\amp = \amp 0 \end{linsys} \end{equation*}

8.

\begin{equation*} \begin{linsys}{2} 2x\amp +\amp 3y\amp =\amp 5z+10 \\ y \amp -\amp z \amp =\amp -x+2 \\ z\amp +\amp x\amp = \amp -2y-4 \end{linsys} \end{equation*}

Solving linear systems with matrix inverses.

Find an invertible matrix \(A\) and column vectors \(\boldx\) and \(\boldb\) such that the given linear system is equivalent to the matrix equation \(A\boldx=\boldb\text{.}\) Compute \(A^{-1}\) and use this to find the unique solution to the linear system. (See Remark 2.4.6.)

9.

\begin{equation*} \begin{linsys}{2} 7x\amp +\amp 2y\amp =\amp 3 \\ 2x\amp +\amp y \amp =\amp -5 \end{linsys} \end{equation*}

10.

\begin{equation*} \begin{linsys}{3} \amp \amp 2y\amp +\amp 3z \amp =\amp -2 \\ x\amp -\amp y \amp +\amp 2z \amp =\amp 1 \\ x \amp +\amp y \amp +\amp 6z \amp =\amp 1 \end{linsys} \end{equation*}

Inverse algorithm.

Use the inverse algorithm to determine whether each matrix is invertible, and to compute its inverse if possible.

You are not required to follow Gaussian elimination to the letter, and you may perform multiple operations at the same time, as long as they are independent of one another. For example, do not do \(r_2\rightarrow r_2-r+1\) and \(r_3\rightarrow r_3+2r_2\) in the same step.

11.

\(A=\begin{amatrix}[rrr]1/5\amp 1/5\amp -2/5\\ 1/5\amp 1/5\amp 1/10\\ 1/5\amp -4/5\amp 1/10 \end{amatrix}\)

Solution.

We use the inverse algorithm:

\begin{align*} \begin{amatrix}[rrr|rrr]1/5\amp 1/5\amp -2/5\amp 1\amp 0\amp 0\\ 1/5\amp 1/5\amp 1/10\amp 0\amp 1\amp 0\\ 1/5\amp -4/5\amp 1/10\amp 0\amp 0\amp 1 \end{amatrix} \amp \xrightarrow[]{r_2 - r_1} \begin{amatrix}[rrr|rrr]1/5\amp 1/5\amp -2/5\amp 1\amp 0\amp 0\\ 0\amp 0\amp 1/2\amp -1\amp 1\amp 0\\ 1/5\amp -4/5\amp 1/10\amp 0\amp 0\amp 1 \end{amatrix}\\ \amp \xrightarrow[]{r_3 - r_1} \begin{amatrix}[rrr|rrr]1/5\amp 1/5\amp -2/5\amp 1\amp 0\amp 0\\ 0\amp 0\amp 1/2\amp -1\amp 1\amp 0\\ 0\amp -1\amp 1/2\amp -1\amp 0\amp 1 \end{amatrix}\\ \amp \xrightarrow[]{5r_1} \begin{amatrix}[rrr|rrr]1\amp 1\amp -2\amp 5\amp 0\amp 0\\0\amp 0\amp 1/2\amp -1\amp 1\amp 0\\ 0\amp -1\amp 1/2\amp -1\amp 0\amp 1 \end{amatrix}\\ \amp \xrightarrow[]{r_2 \leftrightarrow r_3} \begin{amatrix}[rrr|rrr]1\amp 1\amp -2\amp 5\amp 0\amp 0\\ 0\amp -1\amp 1/2\amp -1\amp 0\amp 1\\ 0\amp 0\amp 1/2\amp -1\amp 1\amp 0 \end{amatrix}\\ \amp \xrightarrow[]{(-1)r_2} \begin{amatrix}[rrr|rrr]1\amp 1\amp -2\amp 5\amp 0\amp 0\\ 0\amp 1\amp -1/2\amp 1\amp 0\amp -1\\ 0\amp 0\amp 1/2\amp -1\amp 1\amp 0 \end{amatrix}\\ \amp \xrightarrow[]{r_1- r_2} \begin{amatrix}[rrr|rrr]1\amp 0\amp -3/2\amp 4\amp 0\amp 1\\ 0\amp 1\amp -1/2\amp 1\amp 0\amp -1\\ 0\amp 0\amp 1/2\amp -1\amp 1\amp 0 \end{amatrix}\\ \amp \xrightarrow[]{2r_3} \begin{amatrix}[rrr|rrr]1\amp 0\amp -3/2\amp 4\amp 0\amp 1\\ 0\amp 1\amp -1/2\amp 1\amp 0\amp -1\\ 0\amp 0\amp 1\amp -2\amp 2\amp 0 \end{amatrix}\\ \amp \xrightarrow[]{r_2 + \frac{1}{2}r_3} \begin{amatrix}[rrr|rrr]1\amp 0\amp -3/2\amp 4\amp 0\amp 1\\ 0\amp 1\amp 0\amp 0\amp 1\amp -1\\ 0\amp 0\amp 1\amp -2\amp 2\amp 0 \end{amatrix}\\ \amp \xrightarrow[]{r_1 + \frac{3}{2}r_3} \begin{amatrix}[rrr|rrr]1\amp 0\amp 0\amp 1\amp 3\amp 1\\ 0\amp 1\amp 0\amp 0\amp 1\amp -1\\ 0\amp 0\amp 1\amp -2\amp 2\amp 0 \end{amatrix}\text{.} \end{align*}

We conclude that

\begin{equation*} A^{-1}=\begin{amatrix}[rrr] 1\amp 3\amp 1\\ 0\amp 1\amp -1\\ -2\amp 2\amp 0\end{amatrix}\text{.} \end{equation*}

12.

\(A=\begin{amatrix}[rrr]1\amp 1\amp -2\\ 2\amp -3\amp -3\\ 1\amp -4\amp 1 \end{amatrix}\)

13.

\(A=\begin{amatrix}[rrrr]0\amp 0\amp 2\amp 0\\ 1\amp 0\amp 0\amp 1\\ 0\amp -1\amp 3\amp 0\\ 2\amp 1\amp 5\amp -3 \end{amatrix}\)

14.

\(A=\begin{amatrix}[rrrr]1\amp -1\amp -1\amp 1\\ 2\amp 1\amp -3\amp 0 \\ 3\amp -1\amp -1\amp -1 \\ 5\amp 1\amp -4\amp -2\end{amatrix}\)

15.

\(A=\begin{bmatrix}0\amp 0\amp 0\amp c_1\\ 0\amp 0\amp c_2\amp 0\\ 0\amp c_3\amp 0\amp 0\\ c_4\amp 0\amp 0\amp 0 \end{bmatrix}\text{,}\) \(c_i\ne 0\)

16.

\(A=\begin{bmatrix}c\amp 0\amp 0\amp 0\\ 1\amp c\amp 0\amp 0\\ 0\amp 1\amp c\amp 0\\ 0\amp 0\amp 1\amp c \end{bmatrix}\text{,}\) \(c\ne 0\)

Product of elementary matrices algorithm.

Each matrix below is invertible. Use the product of elementary matrices algorithm to write \(A\) as a product of elementary matrices.

Here you should perform Gaussian elimination to the letter, one row operation at a time.

17.

\(A=\begin{amatrix}[rr]0\amp 1\\2\amp -5 \end{amatrix}\)

Solution.

Row reduce \(A\) to the identity matrix:

\begin{align*} \begin{amatrix}[rr]0\amp 1\\2\amp -5 \end{amatrix} \amp \xrightarrow[]{r_1\leftrightarrow r_1} \begin{amatrix}[rr]2\amp -5 \\0\amp 1 \end{amatrix}\\ \amp \xrightarrow[]{\frac{1}{2}r_1} \begin{amatrix}[rr]1\amp -5/2\\ 0\amp 1 \end{amatrix}\\ \amp \xrightarrow[]{r_1+\frac{5}{2}r_2} \begin{amatrix}[rr]1\amp 0\\ 0\amp 1 \end{amatrix} \end{align*}

The corresponding elementary matrices are

\begin{equation*} E_1 = \begin{amatrix}[rr]0\amp 1\\ 1\amp 0 \end{amatrix} , E_2 = \begin{amatrix}[rr]1/2\amp 0\\ 0\amp 1 \end{amatrix}, E_3=\begin{amatrix}[rr]1\amp 5/2\\ 0\amp 1 \end{amatrix}\text{.} \end{equation*}

It follows that \(E_3E_2E_1A=I\text{,}\) and hence that

\begin{equation*} A=E_1^{-1}E_{2}^{-1}E_3^{-1}=\begin{amatrix}[rr]0\amp 1\\ 1\amp 0 \end{amatrix} \begin{amatrix}[rr]2\amp 0\\ 0\amp 1 \end{amatrix} \begin{amatrix}[rr]1\amp -5/2\\ 0\amp 1 \end{amatrix}\text{.} \end{equation*}

18.

\(A=\begin{amatrix}[rr]7\amp 1\\ 5\amp 1 \end{amatrix}\text{.}\)

19.

\(A=\begin{amatrix}[rrr]1\amp 1\amp 0\\ 1\amp 1\amp 1\\ 0\amp 1\amp 1 \end{amatrix}\)

20.

\(A=\begin{amatrix}[rrr]1\amp 1\amp 1\\ 1\amp -1\amp 0\\ 1\amp 0\amp -1 \end{amatrix}\text{.}\)

21.

According to Statement (2) of the invertibility theorem, a matrix \(A\) is invertible if and only if for all column vectors \(\boldb\) the matrix equation \(A\boldx=\boldb\) has a unique solution. Show that we can add the following, weaker-looking version of (2) to our list of equivalent statements:

(2’) The matrix equation \(A\boldx=\boldb\) has a solution for any column vector \(\boldb\text{.}\)

Hint.

Try to logically weave Statement (2’) into our original list of equivalent statements by (a) finding a statement from our original list that implies (2’), and (b) find a statement in our original list that is implied by (2’).

You may make use of Corollary 2.4.13 in your argument.

22. Properties of row equivalence.

Let \(A\sim B\) denote that matrix \(A\) is row equivalent to \(B\text{.}\) Use Corollary 2.4.15 to show that the relation \(A\sim B\) is reflexive, symmetric, and transitive, as described in Remark 2.4.16 .

23.

Let \(A\) be an \(m\times n\) matrix, and let \(Q\) be an invertible \(m\times m\) matrix.

Show that the two matrix equations

\begin{align*} A\underset{n\times 1}{\boldx} \amp=\underset{m\times 1}{\boldzero} \\ (QA)\underset{n\times 1}{\boldx} \amp=\underset{m\times 1}{\boldzero} \end{align*}

have the same set of solutions. In other words show that

\begin{equation*} A\boldx=\boldzero \iff (QA)\boldx=\boldzero\text{.} \end{equation*}

24.

Suppose \(A\) and \(B\) are row equivalent square matrices. Prove: \(A\) is invertible if and only if \(B\) is invertible.

25.

Use the provided information to determine whether the given square matrix \(A\) is invertible. Justify your answer using the inveribility theorem or one of its corollaries.

(a)

There are column vectors \(\boldx_1\ne \boldx_2\) such that \(A\boldx_1=A\boldx_2\text{.}\)

(b)

\(A^2\) is invertible.

(c)

\(A^r=\boldzero\) for some \(r\geq 1\text{.}\)

(d)

The sum of the columns of \(A\) is equal to the zero column vector.

26.

Answer true or false. If true, provide a proof; if false, exhibit an explicit counterexample.

(a)

The product of two \(n\times n\) elementary matrices is elementary.

(b)

The product of two \(n\times n\) elementary matrices is invertible.

(c)

The sum of two invertible \(n\times n\) matrices is invertible.

(d)

If \(A\) is a singular \(n\times n\) matrix, then the linear system \(A\boldx=\boldzero\) has infinitely many solutions.

(e)

If \(B\) is obtained from the invertible matrix \(A\) by replacing its second row with the sum of its first and second rows, then \(B\) is invertible.

(f)

If \(A\) is square matrix, and \(\boldb\) is a column vector such that the matrix equation \(A\boldx=\boldb\) has a unique solution, then \(A\) is invertible.

(g)

If \(A\) and \(B\) are row equivalent, then the matrix equations \(Ax=\boldzero\) and \(Bx=\boldzero\) have the same solution set.

(h)

If \(A\) or \(B\) is singular, then \(AB\) is singular.

Thomas Yuster. The reduced row echelon form of a matrix is unique: a simple proof, Mathematics Magazine 57 (1984), no. 2, 93-94.