
Section 3.3 Subspaces

The definition of a subspace of a vector space \(V\) is very much in the same spirit as our definition of linear transformations. It is a subset of \(V\) that in some sense respects the vector space structure: in the language of Definition 3.3.1, it is a subset that is closed under addition and closed under scalar multiplication.
In fact the connection between linear transformations and subspaces goes deeper than this. As we will see in Definition 3.4.1, a linear transformation \(T\colon V\rightarrow W\) naturally gives rise to two important subspaces: the null space of \(T\) and the image of \(T\).

Subsection 3.3.1 Definition of subspace

Definition 3.3.1. Subspace.

Let \(V\) be a vector space. A subset \(W\subseteq V\) is a subspace of \(V\) if the following conditions hold:
  (i) \(W\) contains the zero vector.
    We have \(\boldzero\in W\text{.}\)
  (ii) \(W\) is closed under addition.
    For all \(\boldv_1,\boldv_2\in V\text{,}\) if \(\boldv_1,\boldv_2\in W\text{,}\) then \(\boldv_1+\boldv_2\in W\text{.}\) In logical notation:
    \begin{equation*} \boldv_1,\boldv_2\in W\implies \boldv_1+\boldv_2\in W\text{.} \end{equation*}
  (iii) \(W\) is closed under scalar multiplication.
    For all \(c\in \R\) and \(\boldv\in V\text{,}\) if \(\boldv\in W\text{,}\) then \(c\boldv\in W\text{.}\) In logical notation:
    \begin{equation*} \boldv\in W\implies c\boldv\in W\text{.} \end{equation*}

Example 3.3.2.

Let \(V=\R^2\) and let
\begin{equation*} W=\{(t,t)\in\R^2 \colon t\in\R\}\text{.} \end{equation*}
Prove that \(W\) is a subspace.
Solution.
We must show properties (i)-(iii) hold for \(W\text{.}\)
  1. The zero element of \(V\) is \(\boldzero=(0,0)\text{,}\) which is certainly of the form \((t,t)\text{.}\) Thus \(\boldzero\in W\text{.}\)
  2. We must prove the implication \(\boldv_1, \boldv_2\in W\Rightarrow \boldv_1+\boldv_2\in W\text{.}\)
    \begin{align*} \boldv_1,\boldv_2\in W\amp \Rightarrow\amp \boldv_1=(t,t), \boldv_2=(s,s) \text{ for some \(t,s\in\R\) }\\ \amp \Rightarrow\amp \boldv_1+\boldv_2=(t+s,t+s)\\ \amp \Rightarrow\amp \boldv_1+\boldv_2\in W\text{.} \end{align*}
  3. We must prove the implication \(\boldv\in W\Rightarrow c\boldv\in W\text{,}\) for any \(c\in \R\text{.}\) We have
    \begin{align*} \boldv\in W\amp \Rightarrow\amp \boldv=(t,t)\\ \amp \Rightarrow\amp c\boldv=(ct,ct)\\ \amp \Rightarrow\amp c\boldv\in W\text{.} \end{align*}
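For readers who like to experiment, the computation above is easy to spot-check numerically. The following Python sketch (using NumPy; the tooling is our editorial choice, not part of the text) tests the three subspace properties on sample vectors. Such checks illustrate the proof but do not replace it.

```python
# Spot-check of Example 3.3.2 (illustration, not proof).
import numpy as np

def in_W(v):
    """Membership test for W = {(t, t) : t in R}."""
    return np.isclose(v[0], v[1])

v1, v2 = np.array([2.0, 2.0]), np.array([-5.0, -5.0])
assert in_W(np.zeros(2))    # (i)   W contains the zero vector
assert in_W(v1 + v2)        # (ii)  closed under addition
assert in_W(3.7 * v1)       # (iii) closed under scalar multiplication
```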

Example 3.3.3.

Let \(V=\R^2\) and let
\begin{equation*} W=\{(x,y)\in \R^2\colon x, y\geq 0\}\text{.} \end{equation*}
Is \(W\) a subspace of \(V\text{?}\) Decide which of the properties (i)-(iii) in Definition 3.3.1 (if any) are satisfied by \(W\text{.}\)
Solution.
  1. Clearly \(\boldzero=(0,0)\in W\text{.}\)
  2. Suppose \(\boldv_1=(x_1,y_1), \boldv_2=(x_2,y_2)\in W\text{.}\) Then \(x_1, x_2, y_1, y_2\geq 0\text{,}\) in which case \(x_1+x_2, y_1+y_2\geq 0\text{,}\) and hence \(\boldv_1+\boldv_2\in W\text{.}\) Thus \(W\) is closed under addition.
  3. The set \(W\) is not closed under scalar multiplication. Indeed, let \(\boldv=(1,1)\in W\text{.}\) Then \((-2)\boldv=(-2,-2)\notin W\text{.}\) We conclude that \(W\) is not a subspace of \(V\text{.}\)
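The counterexample in part 3 can be replicated in code; as before, this Python sketch is an editorial illustration.

```python
# Example 3.3.3: the first quadrant contains zero and is closed under
# addition, but scaling by -2 escapes it.
import numpy as np

def in_W(v):
    """Membership test for W = {(x, y) : x >= 0 and y >= 0}."""
    return v[0] >= 0 and v[1] >= 0

v = np.array([1.0, 1.0])
assert in_W(v)
assert not in_W(-2 * v)    # (-2, -2) lies outside W: not closed under scaling
```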

Video example: deciding if \(W\subseteq V\) is a subspace.

Figure 3.3.5. Video: deciding if \(W\subseteq V\) is a subspace
Figure 3.3.6. Video: deciding if \(W\subseteq V\) is a subspace

Remark 3.3.7. Subspaces are vector spaces.

If \(W\) is a subspace of a vector space \(V\text{,}\) then it inherits a vector space structure from \(V\) by simply restricting the vector operations defined on \(V\) to the subset \(W\text{.}\)
It is important to understand how conditions (ii)-(iii) of Definition 3.3.1 come into play here. Without them we would not be able to say that restricting the vector operations of \(V\) to elements of \(W\) actually gives rise to well-defined operations on \(W\text{.}\) To be well-defined the operations must output elements that lie not just in \(V\text{,}\) but in \(W\) itself. This is precisely what being closed under addition and scalar multiplication guarantees.
Once we know restriction gives rise to well-defined operations on \(W\text{,}\) verifying the axioms of Definition 3.1.1 mostly amounts to observing that if a condition is true for all \(\boldv\) in \(V\text{,}\) it is certainly true for all \(\boldv\) in the subset \(W\text{.}\)
The “existential axioms” (iii) and (iv) of Definition 3.1.1, however, require special consideration. By definition, a subspace \(W\) contains the zero vector of \(V\text{,}\) and clearly this still acts as the zero vector when we restrict the vector operations to \(W\text{.}\) What about vector inverses? We know that for any \(\boldv\in W\) there is a vector inverse \(-\boldv\) lying somewhere in \(V\text{.}\) We must show that in fact \(-\boldv\) lies in \(W\text{:}\) i.e. we need to show that the operation of taking the vector inverse is well-defined on \(W\text{.}\) We prove this as follows:
\begin{align*} \boldv\in W \amp\implies (-1)\boldv\in W \amp (\text{Definition 3.3.1, (iii)})\\ \amp\implies -\boldv\in W \amp (\text{Theorem 3.1.13, (iii)})\text{.} \end{align*}
We now know how to determine whether a given subset of a vector space is in fact a subspace. We are also interested in means of constructing subspaces from some given ingredients. The result below tells us that taking the intersection of a given collection of subspaces results in a subspace. In Subsection 3.4.1 we see how a linear transformation automatically gives rise to two subspaces.
Theorem 3.3.8. Intersections of subspaces.

Let \(V\) be a vector space, and let \(\{W_i\}_{i\in I}\) be a collection of subspaces of \(V\text{.}\) Then the intersection
\begin{equation*} W=\bigcap_{i\in I}W_i \end{equation*}
is a subspace of \(V\text{.}\)
Proof.
Exercise.

Remark 3.3.9. Unions of subspaces.

While the intersection of subspaces is again a subspace, the same is not true for unions of subspaces.
For example, take \(V=\R^2\text{,}\) \(W_1=\{(t,t)\colon t\in\R\}\) and \(W_2=\{(t,-t)\colon t\in\R\}\text{.}\) Then each \(W_i\) is a subspace, but their union \(W_1\cup W_2\) is not.
Indeed, observe that \(\boldw_1=(1,1)\in W_1\subset W_1\cup W_2\) and \(\boldw_2=(1,-1)\in W_2\subset W_1\cup W_2\text{,}\) but \(\boldw_1+\boldw_2=(2,0)\notin W_1\cup W_2\text{.}\) Thus \(W_1\cup W_2\) is not closed under addition. (Interestingly, it is closed under scalar multiplication.)
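A quick Python check (again illustrative only) confirms both the failure of closure under addition and the parenthetical claim about scalar multiplication.

```python
# Remark 3.3.9: the union of the lines y = x and y = -x is not a subspace.
import numpy as np

in_W1 = lambda v: np.isclose(v[0], v[1])      # W1 = {(t, t)}
in_W2 = lambda v: np.isclose(v[0], -v[1])     # W2 = {(t, -t)}
in_union = lambda v: in_W1(v) or in_W2(v)

w1, w2 = np.array([1.0, 1.0]), np.array([1.0, -1.0])
assert in_union(w1) and in_union(w2)
assert not in_union(w1 + w2)                  # (2, 0) lies on neither line
assert in_union(5.0 * w1) and in_union(-3.0 * w2)  # yet closed under scaling
```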

Subsection 3.3.2 Subspaces of \(\R^n\)

The following theorem gives a convenient method of producing a subspace \(W\) of \(\R^n\text{:}\) namely, given any \(m\times n\) matrix \(A\text{,}\) the subset \(W\) defined as
\begin{equation*} W=\{\boldx\in \R^n\colon A\boldx=\boldzero\} \end{equation*}
is guaranteed to be a subspace of \(\R^n\text{.}\) We will see in Section 3.4 that this construction is just one example of a more general subspace-building operation (see 3.4.1 and 3.4.8). We introduce the special case here for two reasons: (a) the construction allows us to easily provide examples of subspaces of \(\R^n\text{,}\) and (b) the proof of Theorem 3.3.10 is a nice example of the two-step technique.
Theorem 3.3.10.

Let \(A\) be an \(m\times n\) matrix. Then the set
\begin{equation*} W=\{\boldx\in \R^n\colon A\boldx=\boldzero\} \end{equation*}
is a subspace of \(\R^n\text{.}\)
Proof.
Following the two-step technique, we first show that \(\boldzero_n \in W\text{.}\) This is clear, since \(A\boldzero_n=\boldzero_m\text{.}\) (We introduce the subscripts to distinguish between the zero vectors of the domain \(\R^n\) and the codomain \(\R^m\text{.}\))
Next, we show that for any \(\boldx_1, \boldx_2\in \R^n\) and any \(c_1, c_2\in \R\) we have
\begin{equation*} \boldx_1, \boldx_2\in W\implies c_1\boldx_1+c_2\boldx_2\in W\text{.} \end{equation*}
If \(\boldx_1, \boldx_2\in W\text{,}\) then we have \(A\boldx_1=A\boldx_2=\boldzero_m\text{,}\) by definition of \(W\text{.}\) It then follows that the vector \(c_1\boldx_1+c_2\boldx_2\) satisfies
\begin{align*} A(c_1\boldx_1+c_2\boldx_2) \amp= c_1A\boldx_1+c_2A\boldx_2 \amp (\text{Theorem 2.2.1}) \\ \amp = c_1\boldzero_m+c_2\boldzero_m \amp (\boldx_1, \boldx_2\in W) \\ \amp = \boldzero_m\text{.} \end{align*}
Thus \(c_1\boldx_1+c_2\boldx_2\in W\text{,}\) as desired.
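As an informal check of the theorem, the following Python sketch builds a random wide matrix, extracts two null-space vectors from its singular value decomposition, and verifies that a linear combination of them remains a solution. The use of NumPy and the SVD here is our choice, not a method from the text.

```python
# Randomized spot-check of Theorem 3.3.10.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))     # wide matrix, so {x : Ax = 0} != {0}

# Rows of Vt beyond rank(A) = 3 span the null space of A.
_, _, Vt = np.linalg.svd(A)
x1, x2 = Vt[3], Vt[4]
assert np.allclose(A @ x1, 0) and np.allclose(A @ x2, 0)

combo = 2.0 * x1 - 7.0 * x2
assert np.allclose(A @ combo, 0)    # c1*x1 + c2*x2 is again in W
```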

Remark 3.3.11. Solutions to homogeneous linear systems form a subspace.

Recall from the Interlude on matrix equations that the set of solutions to a matrix equation \(A\boldx=\boldb\) is the same thing as the set of solutions to the system of linear equations with augmented matrix \(\begin{amatrix}[c|c] A\amp \boldb \end{amatrix}\text{.}\) Thus, Theorem 3.3.10 implies that the set of solutions to a homogeneous system of linear equations forms a subspace.

Remark 3.3.12. Alternative subspace method.

Theorem 3.3.10 provides an alternative way of showing that a subset \(W\subseteq \R^n\) is a subspace: namely, find an \(m\times n\) matrix \(A\) for which we have \(W=\{\boldx\in \R^n\colon A\boldx=\boldzero\}\text{.}\) This is often much faster than using the two-step technique.

Example 3.3.13.

Define the subset \(W\) of \(\R^3\) as
\begin{equation*} W=\{(x,y,z)\in \R^3\colon x+2y+3z=x-y-z=0\}\text{.} \end{equation*}
  1. Prove that \(W\) is a subspace by identifying it as the set of solutions to a homogeneous matrix equation.
  2. Use (a) and Gaussian elimination to compute a parametric description of \(W\text{.}\)
Solution.
  1. It is easy to see that
    \begin{equation*} W=\{\boldx\in \R^3\colon A\boldx=\boldzero\} \end{equation*}
    where
    \begin{equation*} A=\begin{amatrix}[rrr]1\amp 2\amp 3\\ 1\amp -1\amp -1 \end{amatrix}\text{.} \end{equation*}
    We conclude \(W\) is a subspace.
  2. The augmented matrix \(\begin{amatrix}[c|c]A\amp\boldzero \end{amatrix}\) row reduces to
    \begin{equation*} U=\begin{amatrix}[rrr|r]\boxed{1}\amp 2\amp 3\amp 0\\ 0 \amp \boxed{1}\amp 4/3\amp 0 \end{amatrix}\text{.} \end{equation*}
    Following Procedure 1.3.5 we conclude that
    \begin{equation*} W=\{(-t,-4t,3t)\colon t\in \R\}\text{.} \end{equation*}
    Geometrically this is the line in \(\R^3\) passing through \((0,0,0)\) with direction vector \((-1,-4,3)\text{.}\)
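The computation in part (b) can be reproduced with a computer algebra system. The sketch below assumes SymPy; its nullspace routine returns the direction vector \((-1/3,-4/3,1)\text{,}\) a scalar multiple of \((-1,-4,3)\text{.}\)

```python
# Checking Example 3.3.13 with SymPy.
import sympy as sp

A = sp.Matrix([[1, 2, 3],
               [1, -1, -1]])

basis = A.nullspace()               # basis for {x : Ax = 0}
print(basis)                        # [Matrix([[-1/3], [-4/3], [1]])]

v = sp.Matrix([-1, -4, 3])          # scaled direction vector
assert A * v == sp.zeros(2, 1)      # solves both equations
```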

Example 3.3.14. Lines and planes.

Recall that a line \(\ell\) in \(\R^2\) that passes through the origin can be expressed as the set of solutions \((x_1,x_2)\in\R^2\) to an equation of the form
\begin{equation*} \ell\colon ax_1+bx_2=0\text{.} \end{equation*}
Similarly, a plane \(\mathcal{P}\) in \(\R^3\) that passes through the origin can be expressed as the set of solutions \((x_1,x_2,x_3)\) to an equation of the form
\begin{equation*} \mathcal{P}\colon ax_1+bx_2+cx_3=0\text{.} \end{equation*}
We see immediately that both objects can be described as null spaces of a certain matrix:
\begin{equation*} \ell=\NS \begin{bmatrix} a \amp b\end{bmatrix}, \mathcal{P}=\NS \begin{bmatrix}a\amp b\amp c\end{bmatrix}\text{.} \end{equation*}
We conclude from Theorem 3.3.10 that lines in \(\R^2\text{,}\) and planes in \(\R^3\text{,}\) are subspaces, as long as they pass through the origin.
On the other hand, a line or plane that does not pass through the origin is not a subspace, since it does not contain the zero vector.
Question: How do we classify all subspaces of \(\R^2\) and of \(\R^3\text{?}\) We will be able to answer this easily with dimension theory. (See Section 3.7.)
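As a concrete instance of this description, here is a SymPy sketch (our choice of tool) realizing a plane through the origin as a null space; the routine returns two independent direction vectors spanning the plane.

```python
# The plane x1 + 2*x2 + 3*x3 = 0 as NS([1 2 3]).
import sympy as sp

P = sp.Matrix([[1, 2, 3]])
print(P.nullspace())                # [Matrix([[-2],[1],[0]]), Matrix([[-3],[0],[1]])]
for v in P.nullspace():
    assert P * v == sp.zeros(1, 1)  # each direction vector lies on the plane
```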

Subsection 3.3.3 Important subspaces of \(M_{nn}\)

In The invertibility theorem we met three families of square matrices: namely, the diagonal, upper triangular, and lower triangular matrices. (See Definition 2.4.7.) We now introduce three more naturally occurring families. Before doing so, we give an official definition of the trace function. (See Exercise 3.2.6.11.)

Definition 3.3.15. Trace of a matrix.

Let \(A=[a_{ij}]\) be an \(n\times n\) matrix. The trace of \(A\text{,}\) denoted \(\tr A\text{,}\) is defined as the sum of the diagonal entries of \(A\text{:}\) i.e.,
\begin{equation*} \tr A=a_{11}+a_{22}+\cdots +a_{nn}\text{.} \end{equation*}
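In code the trace is a one-liner. The NumPy sketch below (an editorial illustration) also exhibits a trace-zero matrix, anticipating the next definition.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
assert np.trace(A) == 1 + 4         # tr A = a11 + a22

B = np.array([[1, 2],
              [3, -1]])
assert np.trace(B) == 0             # B is a trace-zero matrix
```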

Definition 3.3.16. Trace-zero, symmetric, and skew-symmetric.

Fix an integer \(n\geq 1\text{.}\)
  1. A matrix \(A\in M_{nn}\) is said to be a trace-zero matrix if \(\tr A=0\text{.}\)
  2. A matrix \(A\in M_{nn}\) is symmetric if \(A^T=A\text{:}\) equivalently, if \([A]_{ij}=[A]_{ji}\) for all \(1\leq i,j\leq n\text{.}\)
  3. A matrix \(A\in M_{nn}\) is skew-symmetric if \(A^T=-A\text{:}\) equivalently, if \([A]_{ij}=-[A]_{ji}\) for all \(1\leq i,j\leq n\text{.}\)

Example 3.3.17. Trace-zero, symmetric, and skew-symmetric \(2\times 2\) matrices.

The set \(W_1\) of all trace-zero \(2\times 2\) matrices can be described as
\begin{equation*} W_1=\left\{ \begin{amatrix}[rr]a\amp b\\ c\amp -a \end{amatrix}\colon a,b,c\in \R\right\}\text{.} \end{equation*}
The set \(W_2\) of all symmetric \(2\times 2\) matrices can be described as
\begin{equation*} W_2=\left\{ \begin{amatrix}[rr]a\amp b\\ b\amp c \end{amatrix}\colon a,b,c\in \R\right\}\text{.} \end{equation*}
The set \(W_3\) of all skew-symmetric \(2\times 2\) matrices can be described as
\begin{equation*} W_3=\left\{ \begin{amatrix}[rr]0\amp a\\ -a\amp 0 \end{amatrix}\colon a\in \R\right\}\text{.} \end{equation*}

Remark 3.3.18.

Assume \(A\) is a skew-symmetric \(n\times n\) matrix. By definition, for all \(1\leq i\leq n\) we must have \([A]_{ii}=-[A]_{ii}\text{.}\) It follows that \([A]_{ii}=0\) for all \(1\leq i\leq n\text{.}\) Thus the diagonal entries of a skew-symmetric matrix are always equal to 0.
It will come as no surprise that all of the aforementioned matrix families are in fact subspaces of \(M_{nn}\text{.}\)
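The following Python sketch builds a symmetric and a skew-symmetric matrix from an arbitrary \(M\) and verifies the zero-diagonal observation; the construction via \((M\pm M^T)/2\) is a standard device we introduce here for illustration, not one from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))

S = (M + M.T) / 2                   # symmetric:      S.T == S
K = (M - M.T) / 2                   # skew-symmetric: K.T == -K
assert np.allclose(S.T, S)
assert np.allclose(K.T, -K)
assert np.allclose(np.diag(K), 0)   # diagonal entries of K are all 0
assert np.isclose(np.trace(K), 0)   # hence K is also trace-zero
```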

Subsection 3.3.4 Important subspaces of \(F(X,\R)\)

Let \(X\) be a nondegenerate interval of \(\R\text{:}\) i.e., an interval containing at least two elements. Recall that \(F(X,\R)\) is the set of all functions from \(X\) to \(\R\text{.}\) This is a pretty unwieldy vector space, containing some pathological characters, and when studying functions on an interval we will often restrict our attention to certain more well-behaved subsets: e.g., continuous, differentiable, or infinitely differentiable functions. Not surprisingly, these subsets turn out to be subspaces of \(F(X,\R)\text{.}\)

Definition 3.3.20.

Let \(X\subseteq \R\) be a nondegenerate interval.
  1. We denote by \(C(X)\) the set of all continuous functions on \(X\text{:}\) i.e.,
    \begin{equation*} C(X)=\{f\in F(X,\R)\colon f \text{ is continuous on } X\}\text{.} \end{equation*}
  2. Fix \(n\geq 1\text{.}\) A function \(f\in F(X,\R)\) is \(C^n\) on \(X\) if \(f\) is \(n\)-times differentiable on \(X\) and its \(n\)-th derivative \(f^{(n)}(x)\) is continuous. The set of all \(C^n\) functions on \(X\) is denoted \(C^n(X)\text{.}\)
  3. A function \(f\in F(X,\R)\) is \(C^\infty\) on \(X\) if \(f\) is infinitely differentiable on \(X\text{.}\) The set of all \(C^\infty\) functions on \(X\) is denoted \(C^\infty(X)\text{.}\)
  4. A polynomial on \(X\) of degree at most \(n\) is a function of the form \(f(x)=\anpoly\text{,}\) where \(a_i\in \R\text{.}\) (See Section 0.7, and in particular Definition 0.7.1, for more details about polynomials.) Recall that if \(a_n\ne 0\text{,}\) we call \(n\) the degree of \(f\text{,}\) denoted \(\deg f\text{.}\)
    The set of polynomials of degree at most \(n\) on \(X\) is denoted \(P_n(X)\text{;}\) the set of all polynomials on \(X\) is denoted \(P(X)\text{.}\) When \(X=\R\text{,}\) we shorten the notation to \(P_n\) and \(P\text{.}\)
Theorem 3.3.21. Subspaces of \(F(X,\R)\).

Let \(X\subseteq \R\) be a nondegenerate interval, and fix \(n\geq 1\text{.}\) We have the following chain of subspaces of \(F(X,\R)\text{:}\)
\begin{equation*} P_n(X)\subseteq P(X)\subseteq C^\infty(X)\subseteq C^n(X)\subseteq C(X)\subseteq F(X,\R)\text{.} \end{equation*}
In particular, each of these sets is a subspace of \(F(X,\R)\text{.}\)
The proof amounts to the following observations:
  • The zero function \(0_X\colon X\rightarrow \R\) is an element of all of these sets: i.e., the zero function is continuous, \(C^n\text{,}\) \(C^\infty\text{,}\) a polynomial, etc.
  • If \(f\) and \(g\) both satisfy one of these properties (continuous, \(C^n\text{,}\) \(C^\infty\text{,}\) polynomial, etc.), then so does \(cf+dg\) for any \(c,d\in \R\text{.}\)
The second, “closed under linear combinations” observation is easily seen for \(P(X)\) and \(P_n(X)\) (the sum of two polynomials of degree at most \(n\) is clearly a polynomial of degree at most \(n\)); for the other spaces, it follows from results of calculus to the effect that adding and scaling functions preserves continuity and differentiability.
Lastly, that each subset relation holds in the given chain follows from similar observations: polynomials are infinitely differentiable, differentiable functions are continuous, etc.
When working within the polynomial spaces \(P_n(X)\) or \(P(X)\text{,}\) we will constantly make use of the fact that a polynomial \(f(x)=\anpoly\) is completely determined by its coefficients \(a_i\text{,}\) and that equality between polynomials can be decided by comparing their coefficients. This is the content of Corollary 0.7.4. We restate the result here in a more convenient form.

Corollary 3.3.22. Polynomial equality.

Let \(f(x)=\anpoly\) and \(g(x)=b_nx^n+b_{n-1}x^{n-1}+\cdots +b_1x+b_0\text{.}\) Then \(f=g\) as elements of \(P(X)\) if and only if \(a_i=b_i\) for all \(0\leq i\leq n\text{.}\)
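Coefficient comparison is easy to carry out with a computer algebra system; the following SymPy sketch (our choice of tool) illustrates the corollary on a small example.

```python
# Two polynomial expressions are equal iff their coefficient lists agree.
import sympy as sp

x = sp.symbols('x')
f = (x + 1)**2
g = x**2 + 2*x + 1
assert sp.Poly(f, x).all_coeffs() == sp.Poly(g, x).all_coeffs()  # [1, 2, 1]
```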

Remark 3.3.23. Differential operators.

Let \(X\subseteq \R\) be an interval. Define \(T_1\colon C^1(X)\rightarrow C(X)\) as \(T_1(f)=f'\text{:}\) i.e., \(T_1\) takes as input a \(C^1\) function on the interval \(X\text{,}\) and returns its (first) derivative. Note that the definition of \(C^1\) ensures that \(f'\) exists and is continuous on \(X\text{:}\) hence \(f' \in C(X)\text{,}\) as claimed.
The operator \(T_1\) is a linear transformation. Indeed, given \(c,d\in \R\) and \(f,g\in C^1(X)\text{,}\) we have
\begin{align*} T_1(cf+dg) \amp = (cf+dg)' \amp \text{(by def.)} \\ \amp =(cf)'+(dg)' \amp \text{(derivative prop.)} \\ \amp =cf'+dg' \amp \text{(derivative prop.)}\\ \amp = cT_1(f)+dT_1(g)\text{.} \end{align*}
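This computation is easy to spot-check symbolically; the SymPy sketch below (with sample functions of our choosing) verifies the linearity identity for particular \(f\) and \(g\text{.}\)

```python
# d/dx (c*f + d*g) == c*f' + d*g' for sample f, g.
import sympy as sp

x, c, d = sp.symbols('x c d')
f, g = sp.sin(x), sp.exp(x**2)
lhs = sp.diff(c*f + d*g, x)
rhs = c*sp.diff(f, x) + d*sp.diff(g, x)
assert sp.simplify(lhs - rhs) == 0
```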
Since taking \(n\)-th derivatives amounts to composing the derivative operator \(T_1\) with itself \(n\) times, it follows from Theorem 3.2.24 that for any \(n\geq 1\) the map
\begin{align*} T_n\colon C^n(X) \amp \rightarrow C(X) \\ f \amp\mapsto f^{(n)} \text{,} \end{align*}
which takes a function \(f\) to its \(n\)-th derivative, is also linear. (Note that we are careful to pick the domain \(C^n(X)\) to guarantee this operation is well-defined!)
Lastly, by Exercise 3.2.6.17, we can add and scale these various operators to obtain more general linear transformations of the form
\begin{equation*} T(f)=c_nf^{(n)}+c_{n-1}f^{(n-1)}+\cdots +c_1f'+c_0f\text{.} \end{equation*}
We call such a transformation a linear differential operator. Understanding the linear algebraic properties of these operators is crucial to the theory of linear differential equations, as Example 3.4.13 illustrates.
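As a concrete illustration, the sketch below implements one such operator, \(T(f)=f''+3f'+2f\text{,}\) in SymPy. The coefficients are our own choice (not from the text), picked so that the final null-space check works out; it previews the connection to differential equations.

```python
import sympy as sp

x, c, d = sp.symbols('x c d')

def T(f):
    """The linear differential operator T(f) = f'' + 3 f' + 2 f."""
    return sp.diff(f, x, 2) + 3*sp.diff(f, x) + 2*f

# Linearity: T(c*f + d*g) == c*T(f) + d*T(g)
f, g = sp.cos(x), x**3
assert sp.simplify(T(c*f + d*g) - (c*T(f) + d*T(g))) == 0

# exp(-x) solves T(f) = 0, i.e. lies in the null space of T.
assert sp.simplify(T(sp.exp(-x))) == 0
```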

Exercises 3.3.5 Exercises

WeBWorK Exercises

1.
Determine if each of the following sets is a subspace of \({\mathbb P}_{n}\text{,}\) for an appropriate value of \(n\text{.}\) Type "yes" or "no" for each answer.
Let \(W_{1}\) be the set of all polynomials of the form \(p(t)= at^{2}\text{,}\) where \(a\) is in \({\mathbb R}\text{.}\)
Let \(W_{2}\) be the set of all polynomials of the form \(p(t)= t^{2} + a\text{,}\) where \(a\) is in \({\mathbb R}\text{.}\)
Let \(W_{3}\) be the set of all polynomials of the form \(p(t)= at^{2} + at\text{,}\) where \(a\) is in \({\mathbb R}\text{.}\)

Exercise Group.

For each subset \(W\) of \(\R^2\) described below: (a) sketch \(W\) as a region of \(\R^2\text{,}\) and (b) determine whether \(W\) is a subspace. Justify your answer either with a proof or explicit counterexample.
2.
\(W=\{(x,y)\in \R^2\colon 2x+3y=0\}\)
3.
\(W=\{(x,y)\in \R^2\colon \val{x}\geq \val{y}\}\)
4.
\(W=\{(x,y)\in \R^2\colon x^2+2y^2\leq 1\}\)

Exercise Group.

Determine whether the subset \(W\) of \(M_{nn}\) described is a subspace of \(M_{nn}\text{.}\) Justify your answer either with a proof or explicit counterexample.
5.
\(W=\{A\in M_{nn}\colon \det A=0\}\)
6.
\(W=\{A\in M_{nn}\colon A_{11}=A_{nn}\}\)
7.
Fix a matrix \(B\in M_{nn}\) and define \(W=\{A\in M_{nn}\colon AB=BA\}\text{,}\) the set of matrices that commute with \(B\text{.}\)

Exercise Group.

Determine whether the subset \(W\) of \(P_2\) is a subspace. Justify your answer either with a proof or explicit counterexample.
8.
\(W=\{f(x)=ax^2+bx+c \colon c=0\}\)
9.
\(W=\{f(x)=ax^2+bx+c \colon abc=0\}\)
10.
\(W=\{f(x)\in P_2 \colon xf'(x)=f(x)\}\)

Exercise Group.

Determine whether the subset \(W\) of \(C(\R)\) described is a subspace. Justify your answer either with a proof or explicit counterexample.
11.
\(W=\{f\in C(\R)\colon f(4)=0\} \)
12.
\(W=\{f\in C(\R)\colon f(0)=4\} \)
13.
\(W=\{f\in C(\R)\colon f(x)=f(-x)\} \)
14.
\(W=\{f\in C(\R)\colon f(x+\pi)=f(x)\} \)
15.
\(W=\{f\in C(\R)\colon f\in P \text{ and } \deg f=5\}. \)

Exercise Group.

For each given subset \(W\subseteq \R^n\text{:}\) (a) show that \(W\) is a subspace by identifying it with the set of solutions to a matrix equation, and (b) give a parametric description of \(W\text{.}\)
16.
\(W=\{(x,y,z)\colon 2x+y-z=0\}\)
17.
\(W=\{(x_1,x_2,x_3,x_4)\colon x_1-x_4=x_2+x_3=x_1+x_2+x_3=0\}\subseteq \R^4\)
18.
\(W=\{(x,y)\colon x+3y=-x+2y=0\}\subseteq \R^2\)