Matrix arithmetic

Section 2.1 Matrix arithmetic

Matrices played a small supporting role in our discussion of linear systems in Chapter 1. In this chapter we bring them to center stage and give them a full-blown treatment as independent mathematical objects in their own right.

Like any mathematical entity worth its salt, matrices can be employed in a vast multitude of ways. As such, it is important to allow matrices to transcend their humble beginnings in this course as boiled-down systems of linear equations. We record this observation as another principle.

Mantra 2.1.1. Matrix mantra.

A matrix is a matrix is a matrix.

Not every matrix should be thought of as an augmented matrix associated to a linear system.

Subsection 2.1.1 The basics

Definition 2.1.2. Matrix.

A (real) matrix is a rectangular array of real numbers

\begin{equation} A=\genmatrix\text{.}\tag{2.1.1} \end{equation}

The number \(a_{ij}\) located in the \(i\)-th row and \(j\)-th column of \(A\) is called the \((i,j)\)-entry (or \(ij\)-th entry) of \(A\text{.}\)

A matrix with \(m\) rows and \(n\) columns is said to have size (or dimension) \(m\times n\text{.}\)

We will typically use capital letters near the beginning of the alphabet (e.g. \(A, B,C, D\text{,}\) etc.) to denote matrices.

The displayed matrix in (2.1.1) is costly both in the space it takes up on the page and in the time it takes to write down or typeset. Accordingly, we introduce two somewhat complementary forms of notation to help describe matrices.

Definition 2.1.3.

Matrix-building notation: The notation \([a_{ij}]_{m\times n}\) denotes the \(m\times n\) matrix whose \(ij\)-th entry (\(i\)-th row, \(j\)-th column) is \(a_{ij}\text{.}\) When there is no danger of confusion, this notation is often shortened to \([a_{ij}]\text{.}\)
Matrix entry notation: Given a matrix \(A\text{,}\) the notation \([A]_{ij}\) denotes the \(ij\)-th entry of \(A\text{.}\)

Thus if \(A=[a_{ij}]_{m\times n}\text{,}\) then \([A]_{ij}=a_{ij}\) for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\)

Remark 2.1.4.

The matrix-building notation is often used simply to give names to the entries of an arbitrary matrix. However, it can also be used to describe a matrix whose \(ij\)-th entry is given by a specified rule or formula.

For example, let \(A=[a_{ij}]_{2\times 3}\text{,}\) where \(a_{ij}=(i-j)j\text{.}\) This is the \(2\times 3\) matrix whose \(ij\)-th entry is \((i-j)j\text{.}\) Thus

\begin{equation*} A=\begin{bmatrix}(1-1)1 \amp (1-2)2 \amp (1-3)3\\ (2-1)1 \amp (2-2)2 \amp (2-3)3 \end{bmatrix}=\begin{bmatrix}0 \amp -2 \amp -6\\ 1 \amp 0 \amp -3 \end{bmatrix}\text{.} \end{equation*}

In this example we have \([A]_{23}=-3\) and \([A]_{ii}=0\) for \(i=1,2\text{.}\)
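The rule-based use of matrix-building notation translates directly into code. The sketch below is a minimal illustration in plain Python (not Sage), storing a matrix as a list of rows; the helper name `build_matrix` is our own and purely illustrative. Note the index shift: the mathematical indices \(i,j\) start at 1, whereas Python lists are indexed from 0.

```python
def build_matrix(m, n, rule):
    """Return the m x n matrix [a_ij] with a_ij = rule(i, j), as a list of rows.

    The rule receives the 1-based indices i and j used in the text.
    """
    return [[rule(i, j) for j in range(1, n + 1)] for i in range(1, m + 1)]


# The 2 x 3 matrix with a_ij = (i - j) * j.
A = build_matrix(2, 3, lambda i, j: (i - j) * j)
print(A)  # [[0, -2, -6], [1, 0, -3]]

# The (2,3)-entry lives at A[1][2] because Python counts from 0.
print(A[1][2])  # -3
```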

In everyday language the notion of equality is taken as self-evident. Two things are equal if they are the same. What more is there to say? In mathematics, each time we introduce a new type of mathematical object (e.g., sets, functions, \(n\)-tuples, etc.) we need to spell out exactly what we mean for two things to be considered equal. We do so now with matrices.

Definition 2.1.5. Matrix equality.

Let \(A\) and \(B\) be matrices of dimension \(m\times n\) and \(m'\times n'\text{,}\) respectively. The two matrices are equal if

\(m=m'\) and \(n=n'\text{;}\)
\([A]_{ij}=[B]_{ij}\) for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\)

In other words, we have \(A=B\) if and only if \(A\) and \(B\) have the same shape, and each entry of \(A\) is equal to the corresponding entry of \(B\text{.}\)

Example 2.1.6.

The matrices

\begin{align*} A \amp =\begin{bmatrix}1\amp 2\amp 3\amp 4 \end{bmatrix} \amp B\amp = \begin{bmatrix} 1\\ 2\\ 3\\ 4\end{bmatrix} \end{align*}

are not equal to one another, despite their having the same entries that appear roughly in the same order. In this case equality does not hold as \(A\) and \(B\) have different shapes: \(A\) is \(1\times 4\text{,}\) and \(B\) is \(4\times 1\text{.}\)

The matrices \(A=\begin{bmatrix}1\amp 2 \\3\amp 4 \end{bmatrix}\) and \(B=\begin{bmatrix}1\amp 2\\ 5\amp 4\end{bmatrix}\) have the same dimension, but are not equal since \([A]_{21}=3\ne 5=[B]_{21}\text{.}\)
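The two conditions in the definition of matrix equality can be checked mechanically: first compare sizes, then compare entries. The following is a minimal sketch in plain Python, assuming a matrix is stored as a nonempty list of equal-length rows; the names `shape` and `matrices_equal` are our own.

```python
def shape(M):
    """Return (rows, columns) of a matrix stored as a nonempty list of rows."""
    return (len(M), len(M[0]))


def matrices_equal(A, B):
    """Check matrix equality: same size first, then equal corresponding entries."""
    if shape(A) != shape(B):
        return False
    m, n = shape(A)
    return all(A[i][j] == B[i][j] for i in range(m) for j in range(n))


row = [[1, 2, 3, 4]]        # a 1 x 4 matrix
col = [[1], [2], [3], [4]]  # a 4 x 1 matrix
print(matrices_equal(row, col))  # False: the shapes differ

A = [[1, 2], [3, 4]]
B = [[1, 2], [5, 4]]
print(matrices_equal(A, B))  # False: the (2,1)-entries differ
print(matrices_equal(A, [[1, 2], [3, 4]]))  # True
```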

Definition 2.1.7. Square matrices, row vectors, column vectors, zero matrices.

A matrix \(A\) is square if its dimension is \(n\times n\text{.}\) The diagonal of a square matrix \(A=[a_{ij}]_{n\times n}\) consists of the entries \(a_{ii}\) for \(1\leq i\leq n\text{.}\)

A \(1\times n\) matrix

\begin{equation*} \bolda=\begin{bmatrix}a_1\amp a_2\amp \cdots \amp a_n \end{bmatrix} \end{equation*}

is called a row vector. The \(j\)-th entry of a row vector \(\bolda\) is denoted \([\bolda]_j\text{.}\)

An \(n\times 1\) matrix

\begin{equation*} \boldb=\begin{bmatrix}b_1\\ b_2\\\vdots \\ b_n \end{bmatrix} \end{equation*}

is called a column vector. The \(i\)-th entry of a column vector \(\boldb\) is denoted \([\boldb]_i\text{.}\)

The \(m\times n\) zero matrix, denoted \(\boldzero_{m\times n}\text{,}\) is the matrix of that dimension, all of whose entries are zero: i.e., \([\boldzero_{m\times n}]_{ij}=0\) for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\)

When the actual dimension is not significant, we will often drop the subscript and write simply \(\boldzero\) for a zero matrix of suitable dimension.
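As a quick illustration of these definitions, here is a small sketch in plain Python, again treating a matrix as a list of rows. The helper names `zero_matrix`, `is_square`, and `diagonal` are our own choices and not part of any library.

```python
def zero_matrix(m, n):
    """Return the m x n zero matrix as a list of rows."""
    return [[0 for _ in range(n)] for _ in range(m)]


def is_square(M):
    """A matrix is square when it has as many rows as columns."""
    return len(M) == len(M[0])


def diagonal(M):
    """Return the diagonal entries a_11, a_22, ..., a_nn of a square matrix."""
    if not is_square(M):
        raise ValueError("the diagonal is only defined here for square matrices")
    return [M[i][i] for i in range(len(M))]


print(zero_matrix(2, 3))             # [[0, 0, 0], [0, 0, 0]]
print(is_square([[1, 2, 3]]))        # False: a 1 x 3 row vector
print(diagonal([[1, 2], [3, 4]]))    # [1, 4]
```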

Remark 2.1.8. Matrices as collections of columns/rows.

Let \(A\) be an \(m\times n\) matrix. We will often think of \(A\) as a collection of columns, in which case we write

\begin{equation} A=\begin{bmatrix}\vert \amp \vert \amp \amp \vert \\ \boldc_1 \amp \boldc_2\amp \cdots \amp \boldc_n \\ \vert \amp \vert \amp \amp \vert \end{bmatrix}\text{,}\tag{2.1.2} \end{equation}

where \(\boldc_j\) is the column vector consisting of the entries of the \(j\)-th column of \(A\text{:}\) i.e.,

\begin{equation*} \boldc_j=\begin{bmatrix}a_{1j}\\ a_{2j}\\ \vdots \\ a_{mj}\end{bmatrix}\text{.} \end{equation*}

Similarly, when we think of \(A\) as a collection of rows, we write

\begin{equation} A=\begin{bmatrix}\ -\boldr_{1}- \ \\ \ -\boldr_{2}- \ \\ \vdots \\ \ -\boldr_{m}- \ \\ \end{bmatrix}\text{,}\tag{2.1.3} \end{equation}

where \(\boldr_i\) is the row vector consisting of the entries of the \(i\)-th row of \(A\text{:}\) i.e.,

\begin{equation*} \boldr_i= \begin{bmatrix}a_{i1}\amp a_{i2}\amp\cdots\amp a_{in} \end{bmatrix}\text{.} \end{equation*}

The vertical and horizontal lines in (2.1.2) and (2.1.3) are used to emphasize that the \(\boldc_j\) are column vectors and the \(\boldr_i\) are row vectors.

Sage example 3. Matrix entries, rows, and columns.

Sage syntax for accessing specific entries of a matrix is similar in spirit to our matrix entry notation. However, as with all things Python, we always count from 0. Thus if A is assigned to a matrix in Sage, A[i,j] is its \((i+1),(j+1)\)-th entry.

Prescribed subsets of matrix entries are obtained via slicing methods: for example, A[a:b, c:d] returns the collection of entries \([A]_{ij}\) with \(a+1\leq i\lt b+1\) and \(c+1\leq j\lt d+1\text{,}\) arranged as a matrix.

Leaving the left or right side of : blank in this notation removes the corresponding bound (left or right) on the index in question. Thus A[2, :] returns the third row of \(A\text{,}\) and A[1:, 3] returns the portion of the fourth column of \(A\) beginning with its second entry.

Alternatively, we can obtain a list of all rows or columns of \(A\) using the methods rows() and columns().
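For readers without Sage at hand, the same zero-based conventions can be imitated in plain Python. The sketch below is not Sage code: it stores a matrix as a list of rows, so the entry Sage would call A[i,j] is written A[i][j], and the helpers `submatrix`, `rows`, and `columns` are our own stand-ins for the Sage slicing syntax and methods described above. The entries are chosen so that the \((i,j)\)-entry is \(10i+j\), which makes the index shift easy to see.

```python
A = [
    [11, 12, 13, 14],
    [21, 22, 23, 24],
    [31, 32, 33, 34],
]


def submatrix(A, a, b, c, d):
    """Imitate the slice A[a:b, c:d]: rows a..b-1 and columns c..d-1 (0-based)."""
    return [row[c:d] for row in A[a:b]]


def rows(A):
    """Return the list of rows of A."""
    return [list(row) for row in A]


def columns(A):
    """Return the list of columns of A."""
    return [[row[j] for row in A] for j in range(len(A[0]))]


print(A[0][1])                      # 12, the (1,2)-entry
print(submatrix(A, 0, 2, 1, 3))     # [[12, 13], [22, 23]]
print(A[2])                         # [31, 32, 33, 34], the third row
print([row[3] for row in A[1:]])    # [24, 34], fourth column from its second entry
print(columns(A)[0])                # [11, 21, 31], the first column
```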

Use the empty cell below to try out some of these commands.

Subsection 2.1.2 Addition, subtraction and scalar multiplication

We now lay out the various algebraic operations we will use to combine and transform matrices; we refer to the use of these operations loosely as matrix arithmetic. Some of these operations resemble familiar operations from real arithmetic in terms of their notation and definition. Do not be lulled into complacency! These are new operations defined for a new class of mathematical objects, and must be treated carefully. In particular, pay close attention to (a) exactly what type of mathematical objects serve as inputs for each operation (the ingredients of the operation), and (b) what type of mathematical object is outputted.

Definition 2.1.9. Matrix addition and subtraction.

Matrix addition is the operation defined as follows: given two \(m\times n\) matrices \(A=[a_{ij}]_{m\times n}\) and \(B=[b_{ij}]_{m\times n}\text{,}\) we define their sum to be the matrix

\begin{equation*} A+B\colon =[a_{ij}+b_{ij}]_{m\times n}\text{.} \end{equation*}

In other words \(A+B\) is the \(m\times n\) matrix satisfying

\begin{equation*} [A+B]_{ij}=[A]_{ij}+[B]_{ij}=a_{ij}+b_{ij} \end{equation*}

for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\)

Matrix subtraction is the operation defined as follows: given two \(m\times n\) matrices \(A=[a_{ij}]_{m\times n}\) and \(B=[b_{ij}]_{m\times n}\text{,}\) we define their difference to be the matrix

\begin{equation*} A-B\colon =[a_{ij}-b_{ij}]_{m\times n}\text{.} \end{equation*}

In other words \(A-B\) is the \(m\times n\) matrix satisfying

\begin{equation*} [A-B]_{ij}=[A]_{ij}-[B]_{ij}=a_{ij}-b_{ij} \end{equation*}

for all \(1\leq i\leq m\) and \(1\leq j\leq n\text{.}\)

Remark 2.1.10.

Observe that matrix addition/subtraction is not defined for every pair of matrices. The ingredients of matrix addition (or subtraction) are two matrices of the same dimension, and the output is a third matrix of this common dimension.
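The requirement that both ingredients have the same dimension is easy to make explicit in code. Here is a minimal sketch in plain Python, with matrices stored as lists of rows; the function names `mat_add` and `mat_sub` are our own, and the sample matrices are chosen only for illustration.

```python
def shape(M):
    """Return (rows, columns) of a matrix stored as a nonempty list of rows."""
    return (len(M), len(M[0]))


def mat_add(A, B):
    """Entrywise sum; only defined when A and B have the same dimension."""
    if shape(A) != shape(B):
        raise ValueError("A + B requires matrices of the same dimension")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_sub(A, B):
    """Entrywise difference; only defined when A and B have the same dimension."""
    if shape(A) != shape(B):
        raise ValueError("A - B requires matrices of the same dimension")
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, -1, 2], [0, 0, 1]]
B = [[0, 1, 1], [-1, -1, 1]]
print(mat_add(A, B))  # [[1, 0, 3], [-1, -1, 2]]
print(mat_sub(A, B))  # [[1, -2, 1], [1, 1, 0]]
```

Calling `mat_add` on a \(2\times 3\) matrix and a \(3\times 2\) matrix raises an error, mirroring the fact that the sum is simply not defined in that case.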

Definition 2.1.11. Scalar multiplication of matrices.

Given any matrix \(A=[a_{ij}]_{m\times n}\) and any constant \(c\in \R\text{,}\) we define

\begin{equation*} cA=[ca_{ij}]\text{.} \end{equation*}

In other words, \(cA\) is the \(m\times n\) matrix obtained by “scaling” each of the entries of \(A\) by the constant \(c\text{.}\)

We call \(cA\) a scalar multiple of \(A\text{.}\) Furthermore, to help distinguish between matrices and real numbers, we will refer to elements of \(\R\) as scalars.
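In code, scalar multiplication takes two inputs of different types, a number and a matrix, and returns a matrix of the same dimension as the input matrix. A minimal sketch in plain Python follows; the name `scalar_mul` is our own.

```python
def scalar_mul(c, A):
    """Return the scalar multiple cA: every entry of A is scaled by c."""
    return [[c * a for a in row] for row in A]


A = [[1, -1, 2], [0, 0, 1]]
print(scalar_mul(2, A))   # [[2, -2, 4], [0, 0, 2]]
print(scalar_mul(0, A))   # [[0, 0, 0], [0, 0, 0]], the 2 x 3 zero matrix
```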

Remark 2.1.12.

Whereas matrix addition and subtraction closely resemble corresponding operations involving real numbers, there is no obvious real arithmetic analogue to matrix scalar multiplication. In particular, notice how matrix scalar multiplication is a sort of hybrid operation that combines mathematical objects of two very different natures: a real number (or scalar) on the one hand, and a matrix on the other.

We call the result of applying a sequence of matrix additions and scalar multiplications a linear combination of matrices.

Definition 2.1.13. Linear combination of matrices.

Given matrices \(A_1,A_2,\dots, A_r\) of the same dimension, and scalars \(c_1,c_2, \dots ,c_r\text{,}\) the expression

\begin{equation*} c_1A_1+c_2A_2+\cdots +c_rA_r \end{equation*}

is called a linear combination of matrices. The scalars \(c_i\) are called the coefficients of the linear combination.

Example 2.1.14.

Let \(A=\begin{amatrix}[rrr]1\amp -1\amp 2\\ 0\amp 0\amp 1\end{amatrix}\) and \(B=\begin{amatrix}[rrr]0\amp 1\amp 1\\ -1\amp -1\amp 1\end{amatrix}\text{.}\) Compute \(2A+(-3)B\text{.}\)

Solution.

\begin{align*} 2A+(-3)B\amp= \begin{amatrix}[rrr]2\amp -2\amp 4\\ 0\amp 0\amp 2\end{amatrix}+\begin{amatrix}[rrr]0\amp -3\amp -3\\ 3\amp 3\amp -3\end{amatrix}\\ \amp=\begin{amatrix}[rrr]2\amp -5\amp 1\\ 3\amp 3\amp -1\end{amatrix} \text{.} \end{align*}

Example 2.1.15.

Show that \(B=\begin{amatrix}3\amp -3\amp 3 \end{amatrix}\) can be expressed as a linear combination of the matrices

\begin{equation*} A_1=\begin{amatrix}[rrr]1\amp 1\amp 1\end{amatrix}, \ A_2=\begin{amatrix}[rrr]1\amp -1\amp 0\end{amatrix}, \ A_3=\begin{amatrix}[rrr]1\amp 1\amp -2\end{amatrix}\text{.} \end{equation*}

Solution.

We must solve the matrix (or row vector) equation

\begin{equation*} aA_1+bA_2+cA_3=B \end{equation*}

for the scalars \(a,b,c\text{.}\) Computing the linear combination on the left yields the matrix equation

\begin{equation*} \begin{amatrix}[rrr]a+b+c\amp a-b+c\amp a-2c\end{amatrix}=\begin{amatrix}[rrr]3\amp -3\amp 3\end{amatrix}\text{.} \end{equation*}

Using the definition of matrix equality (Definition 2.1.5), we get the system of equations

\begin{equation*} \begin{linsys}{3} a \amp +\amp b \amp + \amp c \amp = \amp 3\\ a \amp-\amp b\amp +\amp c\amp =\amp -3\\ a \amp \amp \amp -\amp 2c\amp =\amp 3 \end{linsys}\text{.} \end{equation*}

Using Gaussian elimination we find that there is a unique solution to this system: namely, \((a,b,c)=(1,3,-1)\text{.}\) We conclude that \(B=A_1+3A_2+(-1)A_3=A_1+3A_2-A_3\text{.}\)
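It is good practice to check a claimed solution by substituting it back in. The sketch below, in plain Python with the row vectors stored as \(1\times 3\) matrices, evaluates the linear combination with coefficients \((1,3,-1)\) and compares the result with \(B\text{.}\) The helper `linear_combination` is our own; it computes each entry of the result as the corresponding combination of entries of the input matrices.

```python
def linear_combination(coeffs, matrices):
    """Return c_1*A_1 + ... + c_r*A_r for matrices of a common dimension."""
    if not matrices or len(coeffs) != len(matrices):
        raise ValueError("need one coefficient per matrix")
    m, n = len(matrices[0]), len(matrices[0][0])
    if any((len(M), len(M[0])) != (m, n) for M in matrices):
        raise ValueError("all matrices must have the same dimension")
    return [
        [sum(c * M[i][j] for c, M in zip(coeffs, matrices)) for j in range(n)]
        for i in range(m)
    ]


A1 = [[1, 1, 1]]
A2 = [[1, -1, 0]]
A3 = [[1, 1, -2]]
B = [[3, -3, 3]]

result = linear_combination([1, 3, -1], [A1, A2, A3])
print(result)        # [[3, -3, 3]]
print(result == B)   # True
```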

Remark 2.1.16.

Let \(A_1, A_2,\dots, A_r\) be \(m\times n\) matrices. An easy induction argument on \(r\) shows that for any scalars \(c_1,c_2,\dots, c_r\) we have

\begin{equation*} [c_1A_1+c_2A_2+\cdots +c_rA_r]_{ij} =c_1[A_1]_{ij}+c_2[A_2]_{ij}+\cdots +c_r[A_r]_{ij} \end{equation*}

for all \(1\leq i\leq m\text{,}\) \(1\leq j\leq n\text{.}\) (See Exercise 2.1.6.11.)

Subsection 2.1.3 Matrix multiplication

So how do we define the product of two matrices? Looking at the previous operations, you might have guessed that we should define the product of two \(m\times n\) matrices by taking the product of their corresponding entries. Not so!

Definition 2.1.17. Matrix multiplication.

Matrix multiplication is the operation defined as follows: given an \(m\times n\) matrix \(A=[a_{ij}]_{m\times n}\) and an \(n\times r\) matrix \(B=[b_{ij}]_{n\times r}\text{,}\) we define their product to be the \(m\times r\) matrix \(AB\) whose \(ij\)-th entry is given by the formula

\begin{equation*} [AB]_{ij}=a_{i1}b_{1j}+a_{i2}b_{2j}+\cdots +a_{in}b_{nj} =\sum_{\ell=1}^na_{i\ell}b_{\ell j} \end{equation*}

for all \(1\leq i\leq m\) and \(1\leq j\leq r\text{.}\)

Figure 2.1.18. In \(C=AB\text{,}\) the \(ij\)-th entry \(c_{ij}=\sum_{k=1}^na_{ik}b_{kj}\) is computed by moving across the \(i\)-th row of \(A\) and down the \(j\)-th column of \(B\text{.}\)

Remark 2.1.19. Size and matrix multiplication.

Observe how, like addition, matrix multiplication is not defined for every pair of matrices: there must be a certain agreement in their dimensions.

In more detail, for the product of an \(m\times n\) matrix \(A\) and a \(p\times r\) matrix \(B\) to be defined, we need \(n=p\text{.}\) In other words we need the “inner” dimensions of \(A\) and \(B\) to be equal:

\begin{equation*} \underset{m\times \boxed{n}}{A}\hspace{5pt} \underset{\boxed{n}\times r}{B}\text{.} \end{equation*}

If this condition is met, the dimension of the resulting matrix \(AB\) is determined by the “outer” dimensions of \(A\) and \(B\text{.}\) Schematically, you can think of the inner dimensions as being “canceled out”:

\begin{equation*} \underset{\boxed{m}\times\cancel{n}}{A}\hspace{5pt}\underset{\cancel{n}\times\boxed{r}}{B}=\underset{m\times r}{AB}. \end{equation*}
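The dimension bookkeeping can be captured in a few lines. The following plain Python sketch uses a hypothetical helper `product_shape` of our own: it takes the sizes of \(A\) and \(B\) as pairs and returns the size of \(AB\text{,}\) or `None` when the inner dimensions disagree and the product is undefined.

```python
def product_shape(shape_a, shape_b):
    """Return the size (m, r) of AB for A of size (m, n) and B of size (p, r).

    Returns None when n != p, i.e. when the product AB is not defined.
    """
    m, n = shape_a
    p, r = shape_b
    if n != p:
        return None
    return (m, r)


print(product_shape((2, 3), (3, 2)))  # (2, 2)
print(product_shape((3, 2), (2, 3)))  # (3, 3)
print(product_shape((2, 3), (2, 3)))  # None: inner dimensions 3 and 2 differ
```

The first two lines of output also show that \(AB\) and \(BA\) can both be defined and yet have different sizes.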

Example 2.1.20. Matrix multiplication.

Consider the matrices

\begin{align*} A\amp =\begin{amatrix}[rrr] 1\amp 0\amp -3 \\ -2\amp 1\amp 1 \end{amatrix} \amp B\amp =\begin{amatrix}[rr] 0\amp -1 \\ -1\amp 2 \\ 3\amp 1 \end{amatrix}\text{.} \end{align*}

Since the “inner dimensions” of \(A\) and \(B\) agree, we can form the product matrix \(C=AB\text{,}\) which has dimension \(2\times 2\text{.}\) Let \(c_{ij}=[C]_{ij}\) for all \(1\leq i,j\leq 2\text{.}\) Using Definition 2.1.17, we compute

\begin{align*} c_{11}\amp =1\cdot 0+0\cdot(-1)+(-3)\cdot 3=-9 \\ c_{12}\amp =1\cdot(-1)+0\cdot 2+(-3)\cdot 1=-4 \\ c_{21} \amp =(-2)\cdot 0+1\cdot (-1)+1\cdot 3=2 \\ c_{22} \amp =(-2)\cdot( -1)+1\cdot 2+1\cdot 1=5\text{.} \end{align*}

We conclude that

\begin{equation*} C=\begin{bmatrix}c_{11} \amp c_{12} \\ c_{21} \amp c_{22}\end{bmatrix}= \begin{amatrix}[rr]-9\amp -4 \\ 2\amp 5 \end{amatrix} \text{.} \end{equation*}
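The entry-by-entry computation above is mechanical enough to hand to a short program. Here is a minimal sketch in plain Python that implements the definition of the product directly, summing over the \(n\) inner indices; the function name `mat_mul` is our own, and matrices are stored as lists of rows.

```python
def mat_mul(A, B):
    """Return AB, computing each entry as sum_k A[i][k] * B[k][j]."""
    m, n = len(A), len(A[0])
    p, r = len(B), len(B[0])
    if n != p:
        raise ValueError("inner dimensions must agree")
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(r)]
        for i in range(m)
    ]


A = [[1, 0, -3], [-2, 1, 1]]
B = [[0, -1], [-1, 2], [3, 1]]
print(mat_mul(A, B))  # [[-9, -4], [2, 5]]
```

Swapping the factors gives `mat_mul(B, A)`, a \(3\times 3\) matrix, so the order of the factors matters even for determining the size of the product.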

The formula for the \(ij\)-th entry of a matrix product \(AB\) can be succinctly described as the dot product of the \(i\)-th row of \(A\) with the \(j\)-th column of \(B\text{.}\) You may have already met the dot product in the special case of \(2\)- and \(3\)-tuples; the definition generalizes easily to \(n\)-tuples for any positive integer \(n\text{.}\) We will have a lot more to say about the dot product and related operations in Chapter 4. For now we will provide an official definition so that we can conveniently describe matrix multiplication in terms of dot products.

Definition 2.1.21. Dot product.

Given \(n\)-tuples \(\boldx=(x_1,x_2,\dots, x_n)\) and \(\boldy=(y_1,y_2,\dots, y_n)\text{,}\) their dot product, denoted \(\boldx\cdot \boldy\text{,}\) is defined as

\begin{equation*} \boldx\cdot \boldy=x_1y_1+x_2y_2+\cdots +x_ny_n=\sum_{i=1}^nx_iy_i\text{.} \end{equation*}

Theorem 2.1.22.

Let \(A\) be an \(m\times n\) matrix, and let \(B\) be an \(n\times r\) matrix. For all \(1\leq i\leq m\text{,}\) let \(\boldr_i\) be the \(i\)-th row of \(A\text{;}\) and for all \(1\leq j\leq r\text{,}\) let \(\boldc_j\) be the \(j\)-th column of \(B\text{.}\) For all \(1\leq i\leq m\) and \(1\leq j\leq r\text{,}\) we have

\begin{equation*} [AB]_{ij}=\boldr_i\cdot \boldc_j\text{,} \end{equation*}

where \(\boldr_i\) and \(\boldc_j\) are treated as \(n\)-tuples. In other words, the \(ij\)-th entry of \(AB\) is the dot product of the \(i\)-th row of \(A\) and the \(j\)-th column of \(B\text{.}\)

Proof.

Fix a pair \((i,j)\) with \(1\leq i\leq m\) and \(1\leq j\leq r\text{.}\) Considered as \(n\)-tuples, the \(i\)-th row of \(A\) and \(j\)-th column of \(B\) are given as

\begin{align*} \boldr_i \amp =(a_{i1}, a_{i2},\dots, a_{in})\\ \boldc_j \amp =(b_{1j},b_{2j},\dots, b_{nj})\text{.} \end{align*}

We have

\begin{align*} [AB]_{ij} \amp = \sum_{k=1}^na_{ik}b_{kj} \amp (\text{Definition 2.1.17}) \\ \amp =\boldr_i\cdot \boldc_j \amp (\text{Definition 2.1.21}) \text{,} \end{align*}

as claimed.

Example 2.1.23. Matrix multiplication via dot product.

Consider the matrices

\begin{align*} A \amp =\begin{amatrix}[rrrr]1\amp 1\amp 1\amp 1\\ 1\amp 2\amp 1\amp 2\end{amatrix} \amp B\amp =\begin{amatrix}[rr] 1\amp -1 \\ 0\amp 1\\ 1\amp 0 \\ 0\amp 0 \end{amatrix}\text{.} \end{align*}

The two rows of \(A\) (in tuple form) are

\begin{align*} \boldr_1 \amp =(1,1,1,1) \amp \boldr_2=(1,2,1,2)\text{.} \end{align*}

The two columns of \(B\) (in tuple form) are

\begin{align*} \boldc_1 \amp =(1,0,1,0) \amp \boldc_2=(-1,1,0,0)\text{.} \end{align*}

Using the dot product description of matrix multiplication, we compute

\begin{align*} AB \amp =\begin{bmatrix}\boldr_1\cdot \boldc_1 \amp \boldr_1\cdot \boldc_2 \\ \boldr_2\cdot \boldc_1 \amp \boldr_2\cdot \boldc_2 \end{bmatrix}\\ \amp = \begin{amatrix}[rr] 2 \amp 0 \\ 2\amp 1\end{amatrix}\text{.} \end{align*}
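The dot product description lends itself to a compact implementation. The sketch below, in plain Python, first defines the dot product of two tuples and then builds the product matrix from the dot products of rows of \(A\) with columns of \(B\text{.}\) The names `dot` and `mat_mul_by_dots` are our own.

```python
def dot(x, y):
    """Dot product of two tuples of the same length."""
    if len(x) != len(y):
        raise ValueError("dot product needs tuples of the same length")
    return sum(xi * yi for xi, yi in zip(x, y))


def mat_mul_by_dots(A, B):
    """Return AB, whose (i,j)-entry is (row i of A) . (column j of B)."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions must agree")
    rows_of_A = [tuple(row) for row in A]
    cols_of_B = [tuple(row[j] for row in B) for j in range(len(B[0]))]
    return [[dot(r, c) for c in cols_of_B] for r in rows_of_A]


A = [[1, 1, 1, 1], [1, 2, 1, 2]]
B = [[1, -1], [0, 1], [1, 0], [0, 0]]
print(mat_mul_by_dots(A, B))  # [[2, 0], [2, 1]]
```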

The definition of a matrix product \(AB\) is undoubtedly more complicated than you expected, and seems to come completely out of the blue. All of this will make more sense once we begin thinking of matrices \(A\) as defining certain functions \(T_A\text{.}\) Our formula for the entries of \(AB\) is chosen precisely so that this new matrix corresponds to the composition of the functions \(T_A\) and \(T_B\text{:}\) i.e., so that

\begin{equation*} T_{AB}=T_A\circ T_B\text{.} \end{equation*}

(See Theorem 3.2.24.) Under this interpretation, the ponderous restriction on the dimensions of the ingredient matrices ensures that the two functions \(T_A\) and \(T_B\) can be composed.
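We can already test this claim numerically on a single input. The sketch below makes an assumption that the text only spells out later: we take \(T_A\) to be the function sending an \(n\)-tuple \(\boldx\) to the product \(A\boldx\text{,}\) with \(\boldx\) treated as a column vector. Under that assumption, applying \(T_B\) and then \(T_A\) should agree with applying \(T_{AB}\text{.}\) All helper names here (`mat_vec`, `mat_mul`, `T`) are our own.

```python
def mat_vec(A, x):
    """Return the product Ax, with x a tuple treated as a column vector."""
    if len(A[0]) != len(x):
        raise ValueError("A must have as many columns as x has entries")
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def mat_mul(A, B):
    """Return AB using the definition of matrix multiplication."""
    n = len(B)
    if len(A[0]) != n:
        raise ValueError("inner dimensions must agree")
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def T(A):
    """Assumed meaning of T_A: the function x -> Ax."""
    return lambda x: mat_vec(A, x)


A = [[1, 0, -3], [-2, 1, 1]]     # 2 x 3
B = [[0, -1], [-1, 2], [3, 1]]   # 3 x 2
x = [1, 2]

print(T(A)(T(B)(x)))       # [-17, 12]
print(T(mat_mul(A, B))(x))  # [-17, 12]
```

A single matching input is of course not a proof; it only illustrates what the identity asserts. Note also that the composition only makes sense because the output of \(T_B\) is a 3-tuple, which is exactly the kind of input \(T_A\) expects.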

Sage example 4. Matrix arithmetic.

We use + and * for matrix addition and multiplication.

As evidence of Sage’s flexibility, the same symbol * is also used for scalar multiplication.

Edit the cell below to practice these operations.

Subsection 2.1.4 Alternative methods of multiplication

In addition to the given definition of matrix multiplication, we will make heavy use of two further ways of computing matrix products, called the column and row methods of matrix multiplication.

Theorem 2.1.24. Column method of matrix multiplication.

Let \(A=[a_{ij}]_{m\times n}\) and \(B=[b_{ij}]_{n\times r}\text{.}\) The column method of matrix multiplication computes \(AB\) using the two steps below.

Step 1: Let \(\boldb_j\) be the \(j\)-th column of \(B\text{,}\) considered as a column vector. Then

\begin{align*} AB \amp =A\begin{bmatrix}\vert \amp \vert \amp \amp \vert \\ \boldb_1\amp \boldb_2\amp \cdots\amp \boldb_r\\ \vert \amp \vert \amp \amp \vert \end{bmatrix}\\ \amp=\begin{bmatrix}\vert \amp \vert \amp \amp \vert \\ A\boldb_1\amp A\boldb_2\amp \cdots\amp A\boldb_r\\ \vert \amp \vert \amp \amp \vert \end{bmatrix}\text{.} \end{align*}
Step 2: Let \(\bolda_j\) be the \(j\)-th column of \(A\text{,}\) considered as a column vector. Given any column vector \(\boldb=[b_{i}]_{n\times 1}\) we have

\begin{equation*} A\,\boldb=A\,\begin{bmatrix}b_1\\b_2\\ \vdots \\b_n \end{bmatrix} = b_1\bolda_1+b_2\bolda_2+\cdots +b_n\bolda_n\text{.} \end{equation*}

Proof.

We prove the equalities in both steps separately.

Proof of Step 1.

We must show \(AB=C\text{,}\) where

\begin{equation*} C=\begin{bmatrix}\vert \amp \vert \amp \amp \vert \\ A\boldb_1\amp A\boldb_2\amp \cdots\amp A\boldb_r\\ \vert \amp \vert \amp \amp \vert \end{bmatrix}\text{.} \end{equation*}

First we show \(AB\) and \(C\) have the same size. By definition of matrix multiplication, \(AB\) is \(m\times r\text{.}\) By construction \(C\) has \(r\) columns and its \(j\)-th column is \(A\boldb_j\text{.}\) Since \(A\) and \(\boldb_j\) have size \(m\times n\) and \(n\times 1\text{,}\) respectively, \(A\boldb_j\) has size \(m\times 1\text{.}\) Thus each of the \(r\) columns of \(C\) is an \(m\times 1\) column vector. It follows that \(C\) is \(m\times r\text{,}\) as desired.

Next we show that \([AB]_{ij}=[C]_{ij}\) for all \(1\leq i\leq m\text{,}\) \(1\leq j\leq r\text{.}\) Since the \(ij\)-th entry of \(C\) is the \(i\)-th entry of the \(j\)-th column of \(C\text{,}\) we have

\begin{align*} [C]_{ij} \amp= [A\boldb_j]_{i} \\ \amp=\sum_{k=1}^n a_{ik}b_{kj} \\ \amp =[AB]_{ij}\text{.} \end{align*}

Proof of Step 2.

We must show that \(A\boldb=\boldc\text{,}\) where

\begin{equation*} \boldc=b_1\bolda_1+b_2\bolda_2+\cdots +b_n\bolda_n\text{.} \end{equation*}

The usual argument shows that both \(A\boldb\) and \(\boldc\) are \(m\times 1\) column vectors. It remains only to show that the \(i\)-th entry \([A\boldb]_i\) of the column \(A\boldb\) is equal to the \(i\)-th entry \([\boldc]_i\) of \(\boldc\) for all \(1\leq i\leq m\text{.}\) For any such \(i\) we have

\begin{align*} [\boldc]_i \amp = [b_1\bolda_1+b_2\bolda_2+\cdots +b_n\bolda_n]_i\\ \amp= b_1[\bolda_1]_i+b_2[\bolda_2]_i+\cdots +b_n[\bolda_n]_i \amp (\text{Remark 2.1.16})\\ \amp= b_1a_{i1}+b_2a_{i2}+\cdots +b_na_{in}\amp (\text{def. of } \bolda_j) \\ \amp= a_{i1}b_1+a_{i2}b_2+\cdots+a_{in}b_n \\ \amp =[A\boldb]_i \amp (\text{Definition 2.1.17})\text{.} \end{align*}

Remark 2.1.25.

Theorem 2.1.24 amounts to a two-step process for computing an arbitrary matrix product \(AB\text{.}\)

The first statement (Step 1) tells us that the \(j\)-th column of the matrix \(AB\) can be obtained by computing the product \(A\,\boldb_j\) of \(A\) with the \(j\)-th column of \(B\text{.}\)

The second statement (Step 2) tells us that each product \(A\,\boldb_j\) can itself be computed as a certain linear combination of the columns of \(A\) with coefficients drawn from \(\boldb_j\text{.}\)

A similar remark applies to computing matrix products using the row method, as described below in Theorem 2.1.26.
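The two-step column method can be sketched in plain Python as follows. This is an illustration under our own naming (`column`, `mat_times_column`, `mat_mul_by_columns`), with column vectors stored as flat lists for convenience: Step 2 builds \(A\boldb\) as a linear combination of the columns of \(A\text{,}\) and Step 1 assembles \(AB\) from the columns \(A\boldb_j\text{.}\)

```python
def column(M, j):
    """Return column j (0-based) of M as a flat list."""
    return [row[j] for row in M]


def mat_times_column(A, b):
    """Step 2: A b = b_1 a_1 + ... + b_n a_n, where a_j are the columns of A."""
    m, n = len(A), len(A[0])
    if len(b) != n:
        raise ValueError("b must have one entry per column of A")
    result = [0] * m
    for j in range(n):
        a_j = column(A, j)
        result = [acc + b[j] * entry for acc, entry in zip(result, a_j)]
    return result


def mat_mul_by_columns(A, B):
    """Step 1: the j-th column of AB is A times the j-th column of B."""
    r = len(B[0])
    product_columns = [mat_times_column(A, column(B, j)) for j in range(r)]
    # Reassemble the list of columns into a list of rows.
    return [[product_columns[j][i] for j in range(r)] for i in range(len(A))]


A = [[1, 0, -3], [-2, 1, 1]]
B = [[0, -1], [-1, 2], [3, 1]]
print(mat_times_column(A, column(B, 0)))  # [-9, 2], the first column of AB
print(mat_mul_by_columns(A, B))           # [[-9, -4], [2, 5]]
```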

Theorem 2.1.26. Row method of matrix multiplication.

Let \(A=[a_{ij}]_{m\times n}\) and \(B=[b_{ij}]_{n\times r}\text{.}\) The row method of matrix multiplication computes \(AB\) using the two steps below.

Step 1: Let \(\bolda_i\) be the \(i\)-th row of \(A\text{.}\) Then

\begin{align*} AB\amp =\begin{bmatrix} \ - \bolda_1- \ \\ \ - \bolda_2- \ \\ \vdots \\ \ - \bolda_m- \ \end{bmatrix} B\\ \amp = \begin{bmatrix}\ - \bolda_1B- \ \\ \ -\bolda_2B- \ \\ \vdots \\ \ -\bolda_mB - \ \end{bmatrix}\text{.} \end{align*}
Step 2: Let \(\boldb_i\) be the \(i\)-th row of \(B\text{.}\) Given any row vector \(\bolda=[a_{j}]_{1\times n}\) we have

\begin{equation*} \bolda\, B=\begin{bmatrix}a_1\amp a_2\amp \cdots \amp a_n \end{bmatrix} B= a_1\boldb_1+a_2\boldb_2+\cdots +a_n\boldb_n\text{.} \end{equation*}

Proof.

The proof is very similar to that of Theorem 2.1.24 and is left to the reader.

Example 2.1.27. Column and row methods.

Let \(A=\begin{amatrix}[rrr] 1\amp 1 \amp -2 \\ 1\amp 3\amp 2\end{amatrix}\) and \(B=\begin{amatrix}[rc]1\amp 1\\ 0\amp 1 \\ -2\amp 1 \end{amatrix}\text{.}\)

Compute \(AB\) using (a) the definition of matrix multiplication, (b) the column method, (c) the row method.

Solution.

Using the definition, we see easily that

\begin{equation*} AB=\begin{amatrix}[rr]5\amp 0\\ -3\amp 6 \end{amatrix} \end{equation*}
Let \(\bolda_1, \bolda_2, \bolda_3\) be the columns of \(A\text{,}\) and let \(\boldb_1, \boldb_2\) be the columns of \(B\text{.}\) We have

\begin{align*} AB \amp= \begin{amatrix}[cc]\vert \amp \vert \\ A\boldb_1\amp A\boldb_2 \\ \vert\amp \vert\end{amatrix} \amp \text{(Step 1)} \\ \amp= \begin{amatrix}[cc]\vert \amp \vert \\ (1\bolda_1+0\bolda_2-2\bolda_3)\amp (\bolda_1+\bolda_2+\bolda_3) \\ \vert\amp \vert\end{amatrix} \amp \text{(Step 2)}\\ \amp= \begin{amatrix}[rr]5\amp 0\\ -3\amp 6 \end{amatrix} \amp \text{(arithmetic)} \end{align*}
Now let \(\bolda_1, \bolda_2\) be the rows of \(A\text{,}\) and let \(\boldb_1, \boldb_2, \boldb_3\) be the rows of \(B\text{.}\) We have

\begin{align*} AB \amp= \begin{amatrix}[c]--\bolda_1\, B--\\ --\bolda_2\, B-- \end{amatrix}\amp \text{(Step 1)}\\ \amp= \begin{amatrix}[c]--(1\boldb_1+1\boldb_2-2\boldb_3)-- \\ --(1\boldb_1+3\boldb_2+2\boldb_3)-- \end{amatrix} \amp \text{(Step 2)} \\ \amp=\begin{amatrix}[rr]5\amp 0\\ -3\amp 6 \end{amatrix} \amp \text{(arithmetic)} \end{align*}
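The row method computation can be mirrored in a short plain Python sketch. The helper names `row_times_mat` and `mat_mul_by_rows` are our own: the first forms \(\bolda\,B\) as a linear combination of the rows of \(B\) (Step 2), and the second stacks the rows \(\bolda_iB\) (Step 1).

```python
def row_times_mat(a, B):
    """Step 2: a B = a_1 b_1 + ... + a_n b_n, where b_i are the rows of B."""
    n, r = len(B), len(B[0])
    if len(a) != n:
        raise ValueError("a must have one entry per row of B")
    result = [0] * r
    for i in range(n):
        result = [acc + a[i] * entry for acc, entry in zip(result, B[i])]
    return result


def mat_mul_by_rows(A, B):
    """Step 1: the i-th row of AB is (the i-th row of A) times B."""
    return [row_times_mat(row, B) for row in A]


A = [[1, 1, -2], [1, 3, 2]]
B = [[1, 1], [0, 1], [-2, 1]]
print(row_times_mat(A[0], B))  # [5, 0], the first row of AB
print(mat_mul_by_rows(A, B))   # [[5, 0], [-3, 6]]
```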

Sage example 5. Column and row methods.

Let’s verify the validity of the column and row methods using Sage in some specific examples. Below we generate random integer matrices \(A\) and \(B\) of dimension \(3\times 5\) and \(5\times 4\text{,}\) respectively, and compute their product \(C=AB\text{.}\)

Let’s check that the \(j\)-th column of \(C\) is equal to the product of \(A\) with the \(j\)-th column of \(B\text{.}\)

Alternatively, we can visually confirm these equalities using the display of \(C\) in the first cell above. Observe that the result of A*colsB[i] is displayed by Sage as a tuple, though technically for us this is a column vector.

Next, let’s verify that the result of multiplying \(A\) and the \(j\)-th column of \(B\) is the corresponding linear combination of the columns of \(A\) given by the coefficients of this column.

Now use the Sage cells below to demonstrate the validity of the row method for the product \(C=AB\text{.}\) Simply modify the code in the two cells above to reflect the row method, as opposed to the column method.
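If Sage is not available, the same experiment can be run in plain Python. The sketch below is not the Sage code referred to above; it is a stand-in under our own naming. It generates random integer matrices of dimension \(3\times 5\) and \(5\times 4\text{,}\) computes \(C=AB\) from the definition, and then checks both the column method and the row method. The seed and the range of the random entries are arbitrary choices.

```python
import random


def random_matrix(m, n, low=-5, high=5):
    """Return an m x n matrix of random integers between low and high."""
    return [[random.randint(low, high) for _ in range(n)] for _ in range(m)]


def mat_mul(A, B):
    """Return AB using the definition of matrix multiplication."""
    n = len(B)
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def column(M, j):
    """Return column j (0-based) of M as a flat list."""
    return [row[j] for row in M]


def combine(coeffs, vectors):
    """Return the linear combination sum_k coeffs[k] * vectors[k]."""
    length = len(vectors[0])
    return [sum(c * v[t] for c, v in zip(coeffs, vectors)) for t in range(length)]


random.seed(0)  # fixed seed so the run is reproducible
A = random_matrix(3, 5)
B = random_matrix(5, 4)
C = mat_mul(A, B)

# Column method: column j of C is the combination of the columns of A
# with coefficients taken from column j of B.
cols_A = [column(A, k) for k in range(5)]
column_method_ok = all(
    column(C, j) == combine(column(B, j), cols_A) for j in range(4)
)

# Row method: row i of C is the combination of the rows of B
# with coefficients taken from row i of A.
row_method_ok = all(C[i] == combine(A[i], B) for i in range(3))

print(column_method_ok, row_method_ok)  # True True
```

Since all entries are integers, the comparisons are exact; with floating point entries one would compare up to a small tolerance instead.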

Video example of matrix multiplication.

Figure 2.1.28. Video: three methods of matrix multiplication

Subsection 2.1.5 Transpose of a matrix

We end this section with one last operation, matrix transposition. We will not make much use of this operation until later, but this is as good a place as any to introduce it.

Definition 2.1.29. Matrix transposition.

Given an \(m\times n\) matrix \(A=[a_{ij}]\text{,}\) its transpose \(A^T\) is the matrix whose \(ij\)-th entry is the \(ji\)-th entry of \(A\text{.}\) In other words, \(A^T\) is the \(n\times m\) matrix satisfying \([A^T]_{ij}=[A]_{ji}\) for all \(1\leq i\leq n\) and \(1\leq j\leq m\text{.}\)

Remark 2.1.30.

Given a matrix \(A\) we can give a column- or row-based description of \(A^T\) as follows:

\(A^T\) is the matrix whose \(i\)-th row is the \(i\)-th column of \(A\text{.}\)
\(A^T\) is the matrix whose \(j\)-th column is the \(j\)-th row of \(A\text{.}\)

Example 2.1.31.

Let \(A=\begin{bmatrix}1\amp 2\amp 3\\4\amp 5\amp 6 \end{bmatrix}\text{;}\) then \(A^T=\begin{bmatrix}1\amp 4\\2\amp 5\\3\amp 6 \end{bmatrix}\text{.}\)

Let \(B=\begin{bmatrix}1\\0\\3 \end{bmatrix}\text{;}\) then \(B^T=\begin{bmatrix}1\amp 0\amp 3 \end{bmatrix}\text{.}\)
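Transposition is a one-liner once a matrix is stored as a list of rows. The plain Python sketch below, with our own function name `transpose`, builds the matrix whose \((i,j)\)-entry is the \((j,i)\)-entry of the input, and reproduces the two examples above.

```python
def transpose(A):
    """Return the transpose: the n x m matrix T with T[j][i] = A[i][j]."""
    m, n = len(A), len(A[0])
    return [[A[i][j] for i in range(m)] for j in range(n)]


A = [[1, 2, 3], [4, 5, 6]]
B = [[1], [0], [3]]
print(transpose(A))  # [[1, 4], [2, 5], [3, 6]]
print(transpose(B))  # [[1, 0, 3]]
print(transpose(transpose(A)) == A)  # True
```

The last line illustrates that transposing twice returns the original matrix, as the definition suggests.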

Sage example 6. Matrix transposition.

Matrix transposition is implemented in Sage as the transpose() method. In the cell below we (a) choose random integers \(1\leq m,n\leq 6\text{,}\) (b) choose a random \(m\times n\) matrix \(A\) with integer entries, and (c) compute the transpose of \(A\text{.}\)

As usual, experiment with the Sage cell below.

Exercises 2.1.6 Exercises

WeBWork Exercises

1.

Enter T or F depending on whether the statement is true or false. (You must enter T or F -- True and False will not work.)

If A has dimensions \(m \times n\) and B has dimensions \(n \times r\text{,}\) then AB has dimensions \(m \times r\text{.}\)
If A has dimensions \(5 \times 4\) and B has dimensions \(4 \times 3\text{,}\) then the 3rd row, 4th column entry of AB is obtained by multiplying the 3rd column of A by the 4th row of B.

2.

Matrix Products: Consider the matrices

\begin{equation*} A = \begin{pmatrix}5\amp 4\amp 7\\6\amp 5\amp 4\end{pmatrix}, B = \begin{pmatrix}2\amp 1\amp 3\amp 3\\2\amp 6\amp 9\amp 6\\ 9\amp 7\amp 2\amp 8\end{pmatrix},\textrm{ and } C = \begin{pmatrix}7\amp 5\\1\amp 8\\2\amp 5\\2\amp 3\end{pmatrix} \end{equation*}

Of the possible matrix products \(ABC, ACB, BAC, BCA, CAB, CBA\text{,}\)

which make sense?

Answer.

\(ABC, BCA, CAB\)

3.

If A is \(4 \times 9,\) B is \(2 \times 4\) and C is \(9 \times 2\) then:

a) BA has dimension \(\times\)

b) CB has dimension \(\times\)

c) AC has dimension \(\times\)

Answer.

a) \(2\times 9\)

b) \(9\times 4\)

c) \(4\times 2\)

4.

Determine \(x\) and \(y\) such that

\begin{equation*} \left[\begin{array}{ccc} 1\amp 0\amp 0 \cr -2\amp 4\amp -1 \end{array}\right] + \left[\begin{array}{ccc} x-y \amp -4 \amp -2 \\ 3 \amp x \amp 0 \end{array} \right] = \left[\begin{array}{ccc} 2 \amp -4 \amp -2 \\ 1 \amp 2 x +y \amp -1 \end{array}\right] \end{equation*}

\(x =\)

\(y =\)

Answer 1.

\(2.5\)

Answer 2.

\(1.5\)

5.

Determine the value(s) of \(x\) such that

\(\left[\begin{array}{ccc} x \amp 2 \amp 1\cr \end{array}\right] \left[\begin{array}{ccc} -3 \amp -6 \amp 3\cr -6 \amp 2 \amp 0\cr 3 \amp -18 \amp 2 \end{array}\right] \left[\begin{array}{c} x\cr -1\cr 2\cr \end{array}\right] = [0]\)

\(x\) =

Note: If there is more than one value separate them by commas.

Answer.

\(-2, 3\)

Written Exercises

6.

For each part below write down the most general \(3\times 3\) matrix \(A=[a_{ij}]\) satisfying the given condition (use letter names \(a,b,c\text{,}\) etc. for entries).

\(a_{ij}=a_{ji}\) for all \(i,j\text{.}\)
\(a_{ij}=-a_{ji}\) for all \(i,j\text{.}\)
\(a_{ij}=0\) for \(i\ne j\text{.}\)

7.

Let

\begin{equation*} A = \begin{bmatrix}3\amp 0\\ -1\amp 2\\ 1\amp 1 \end{bmatrix} , \hspace{5pt} B = \begin{bmatrix}4\amp -1\\ 0\amp 2 \end{bmatrix} , \hspace{5pt} C = \begin{bmatrix}1\amp 4\amp 2\\ 3\amp 1\amp 5 \end{bmatrix} \end{equation*}

\begin{equation*} D = \begin{bmatrix}1\amp 5\amp 2\\ -1\amp 0\amp 1\\ 3\amp 2\amp 4 \end{bmatrix} , \hspace{5pt} E = \begin{bmatrix}6\amp 1\amp 3\\ -1\amp 1\amp 2\\ 4\amp 1\amp 3 \end{bmatrix}\text{.} \end{equation*}

Compute the following matrices, or else explain why the given expression is not well defined.

\(\displaystyle (2D^T-E)A\)
\(\displaystyle (4B)C+2B\)
\(\displaystyle B^T(CC^T-A^TA)\)

8.

Let

\begin{equation*} A = \begin{bmatrix}3\amp -2\amp 7\\ 6\amp 5\amp 4\\ 0\amp 4\amp 9 \end{bmatrix} , \hspace{5pt} B = \begin{bmatrix}6\amp -2\amp 4\\ 0\amp 1\amp 3\\ 7\amp 7\amp 5 \end{bmatrix}\text{.} \end{equation*}

Compute the following using either the row or column method of matrix multiplication. Make sure to show how you are using the relevant method.

the first column of \(AB\text{;}\)
the second row of \(BB\text{;}\)
the third column of \(AA\text{.}\)

Solution.

Using the column method, the first column of \(AB\) is given by \(A\) times the first column of \(B\text{.}\) We compute

\begin{equation*} \begin{bmatrix}3\amp -2\amp 7\\ 6\amp 5\amp 4\\ 0\amp 4\amp 9 \end{bmatrix} \begin{bmatrix}6\\ 0\\ 7 \end{bmatrix} = 6 \begin{amatrix}[r]3 \\ 6 \\ 0 \end{amatrix}+0 \begin{amatrix}[r]-2 \\ 5 \\ 4 \end{amatrix}+7\begin{amatrix}[r]7 \\ 4 \\ 9 \end{amatrix}= \begin{bmatrix}67\\ 64\\ 63 \end{bmatrix} \end{equation*}

9.

Use the row or column method to quickly compute the following product:

\begin{equation*} \begin{amatrix}[rrrrr]1\amp -1\amp 1\amp -1\amp 1\\ 1\amp -1\amp 1\amp -1\amp 1\\ 1\amp -1\amp 1\amp -1\amp 1\\ 1\amp -1\amp 1\amp -1\amp 1\\ 1\amp -1\amp 1\amp -1\amp 1 \end{amatrix} \begin{amatrix}[rrrr]1\amp 1\amp 1\amp 1\\ -1\amp 0\amp 0\amp 0\\ 0\amp 1\amp 0\amp 0\\ 0\amp 0\amp 2\amp 0\\ 0\amp 0\amp 0\amp 3 \end{amatrix} \end{equation*}

Solution.

I’ll just describe the row method here.

Note that the rows of \(A\) are all identical, and equal to \(\begin{bmatrix}1 \amp -1 \amp 1 \amp -1 \amp 1 \end{bmatrix}\text{.}\) From the row method it follows that each row of \(AB\) is given by

\begin{equation*} \begin{bmatrix}1 \amp -1 \amp 1 \amp -1 \amp 1 \end{bmatrix} B\text{.} \end{equation*}

Thus the rows of \(AB\) are all identical, and the row method computes the product above by taking the corresponding alternating sum of the rows of \(B\text{:}\)

\begin{equation*} \begin{bmatrix}1 \amp -1 \amp 1 \amp -1 \amp 1 \end{bmatrix} B=\begin{bmatrix}2\amp 2\amp -1\amp 4 \end{bmatrix}\text{.} \end{equation*}

Thus \(AB\) is the \(5\times 4\) matrix, all of whose rows are \(\begin{bmatrix}2\amp 2\amp -1\amp 4 \end{bmatrix}\text{.}\)

10.

Each of the \(3\times 3\) matrices \(B_i\) below performs a specific row operation when multiplying a \(3\times n\) matrix \(A=\begin{bmatrix}-\boldr_1-\\ -\boldr_2-\\ -\boldr_3- \end{bmatrix}\) on the left; i.e., the matrix \(B_iA\) is the result of performing a certain row operation on the matrix \(A\text{.}\) Use the row method of matrix multiplication to decide what row operation each \(B_i\) performs.

\begin{equation*} B_1=\begin{bmatrix}1\amp 0\amp 0\\ 0\amp 1\amp 0\\ -2\amp 0\amp 1 \end{bmatrix} , B_2=\begin{bmatrix}1\amp 0\amp 0\\ 0\amp \frac{1}{2}\amp 0\\ 0\amp 0\amp 1 \end{bmatrix} , B_3=\begin{bmatrix}0\amp 0\amp 1\\ 0\amp 1\amp 0\\ 1\amp 0\amp 0 \end{bmatrix}\text{.} \end{equation*}

11.

Let \(r\geq 2\) be an integer. Prove, by induction on \(r\text{,}\) that for any \(m\times n\) matrices \(A_1, A_2,\dots, A_r\) and scalars \(c_1,c_2,\dots, c_r\text{,}\) we have

\begin{equation*} [c_1A_1+c_2A_2+\cdots +c_rA_r]_{ij} =c_1[A_1]_{ij}+c_2[A_2]_{ij}+\cdots +c_r[A_r]_{ij} \end{equation*}

for all \(1\leq i\leq m\text{,}\) \(1\leq j\leq n\text{.}\)