
Section 4.1 Inner product spaces

An inner product is an additional layer of structure we can define on a vector space \(V\text{.}\) It takes a pair of elements \(\boldv, \boldw\in V\) and returns a scalar \(\langle \boldv,\boldw \rangle\in \R\text{.}\) As with the vector addition and scalar multiplication, we define inner products axiomatically, taking as our model the dot product on \(\R^2\) and \(\R^3\text{.}\) Our definition (4.1.1) simply promulgates a few important properties enjoyed by the dot product that may be familiar to you from studying calculus.
The addition of an inner product enriches the structure of a vector space considerably, and gives rise to a number of additional useful analytic tools. We highlight a few below.
Distance and angle
A notion of distance and angle between two vectors can be defined relative to a given inner product. These provide a numeric measurement of how “close” (distance) or “closely oriented” (angle) two vectors in our space are.
Orthogonality
Two vectors \(\boldv, \boldw\in V\) are orthogonal, relative to a given inner product, if \(\langle \boldv, \boldw\rangle=0\text{.}\) Orthogonality leads further to a general notion of orthogonal projection onto a subspace \(W\subseteq V\text{.}\)
Orthogonal bases
An orthogonal basis of a vector space \(V\text{,}\) relative to a given inner product, is one whose elements are pairwise orthogonal. As we will see there are many computational advantages of working with an orthogonal basis.

Subsection 4.1.1 Inner products

Definition 4.1.1. Inner product.

Let \(V\) be a vector space. An inner product on \(V\) is an operation that takes as input a pair of vectors \(\boldv, \boldw\in V\) and outputs a scalar \(\langle \boldv, \boldw \rangle \in \R\text{.}\) Using function notation:
\begin{align*} \langle \ , \rangle \colon \amp V\times V\rightarrow \R\\ (\boldv_1,\boldv_2)\amp \mapsto \langle \boldv_1,\boldv_2\rangle\text{.} \end{align*}
Furthermore, this operation must satisfy the following axioms.
  1. Symmetry.
    For all \(\boldv, \boldw\in V\) we have
    \begin{equation*} \langle \boldv, \boldw \rangle =\langle \boldw, \boldv \rangle\text{.} \end{equation*}
  2. Linearity.
    For all \(\boldv, \boldw, \boldu\in V\) and \(c, d\in \R\) we have:
    \begin{equation*} \langle c\boldv+d\boldw, \boldu \rangle =c \langle \boldv, \boldu \rangle +d \langle \boldw, \boldu \rangle\text{.} \end{equation*}
    It follows by (i) (symmetry) that
    \begin{equation*} \langle \boldu, c\boldv+d\boldw \rangle =c \langle \boldu, \boldv \rangle +d \langle \boldu, \boldw \rangle\text{.} \end{equation*}
  3. Positive definiteness.
    For all \(\boldv\in V\) we have
    \begin{align*} \langle \boldv, \boldv \rangle \amp\geq 0,\text{ and} \amp (\text{positivity})\\ \langle \boldv, \boldv \rangle \amp=0 \text{ if and only if } \boldv=\boldzero \amp (\text{definiteness}) \text{.} \end{align*}
An inner product space is a pair \((V, \langle , \rangle )\text{,}\) where \(V\) is a vector space, and \(\langle , \rangle \) is a choice of inner product on \(V\text{.}\)

Remark 4.1.2. Inner products of linear combinations.

We will have many opportunities to “expand out” an inner product of two linear combinations of vectors. Using axioms (i) and (ii) in series, this process resembles the procedure for multiplying two polynomials. For example, we have
\begin{align*} \langle c\boldv+d\boldw, e\boldv+f\boldw\rangle \amp = c \langle \boldv, e\boldv+f\boldw \rangle +d \langle \boldw, e\boldv+f\boldw \rangle \amp (\knowl{./knowl/d_innerproduct.html}{\text{4.1.1}}, \text{(ii)})\\ \amp=ce \langle \boldv, \boldv\rangle+cf\langle \boldv,\boldw \rangle +de \langle \boldw, \boldv\rangle+df\langle \boldw, \boldw\rangle \amp (\knowl{./knowl/d_innerproduct.html}{\text{4.1.1}}, \text{(ii)}) \\ \amp = ce\langle \boldv, \boldv\rangle +(cf+de)\langle \boldv, \boldw\rangle +df\langle \boldw, \boldw\rangle \amp (\knowl{./knowl/d_innerproduct.html}{\text{4.1.1}}, \text{(i)})\text{.} \end{align*}
Note how in the last step we are able to group the “cross terms” using the symmetry axiom: \(\langle \boldv, \boldw\rangle=\langle \boldw, \boldv\rangle\text{.}\)
More generally, given linear combinations
\begin{align*} \boldv \amp = c_1\boldv_1+c_2\boldv_2+\cdots +c_n\boldv_n=\sum_{i=1}^n c_i\boldv_i \\ \boldw \amp =d_1\boldv_1+d_2\boldv_2+\cdots +d_n\boldv_n=\sum_{i=1}^nd_i\boldv_i \text{,} \end{align*}
the same reasoning shows that
\begin{align*} \langle \boldv, \boldw \rangle \amp= c_1d_1 \langle \boldv_1, \boldv_1 \rangle+c_2d_2 \langle \boldv_2, \boldv_2 \rangle +\cdots+c_nd_n \langle \boldv_n, \boldv_n \rangle +\underset{\text{cross terms}}{\underbrace{(c_1d_2+c_2d_1)\langle \boldv_1,\boldv_2 \rangle +\cdots}} \\ \amp= \sum_{i=1}^nc_id_i \langle \boldv_i, \boldv_i \rangle +\underset{\text{cross terms}}{\underbrace{\sum_{1\leq i\lt j\leq n}(c_{i}d_j+c_jd_i)\langle \boldv_i, \boldv_j \rangle}} \text{.} \end{align*}
In particular, we have
\begin{equation*} \langle \boldv, \boldv\rangle =\sum_{i=1}^nc_i^2 \langle \boldv_i, \boldv_i \rangle +\sum_{1\leq i\lt j\leq n}2c_ic_j \langle \boldv_i, \boldv_j \rangle\text{.} \end{equation*}
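As an optional aside (not part of the text proper), the following short Python sketch, assuming NumPy is available, checks the two-vector expansion above numerically, using the standard dot product on \(\R^3\) and an arbitrary choice of scalars.
import numpy as np

# Arbitrary vectors and scalars; the inner product here is the standard dot product.
v = np.array([1.0, -2.0, 3.0])
w = np.array([0.5, 4.0, -1.0])
c, d, e, f = 2.0, -1.0, 3.0, 0.5

lhs = np.dot(c*v + d*w, e*v + f*w)
rhs = c*e*np.dot(v, v) + (c*f + d*e)*np.dot(v, w) + d*f*np.dot(w, w)
print(np.isclose(lhs, rhs))  # True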
We now present a series of important examples of inner products defined on our various vector spaces. Each is presented as a theorem, as we must prove that the proposed operation satisfies the axioms of an inner product. The first example, the weighted dot product, is itself a vast generalization of the familiar dot product operations defined on \(\R^2\) and \(\R^3\text{.}\)

Theorem 4.1.3. Weighted dot product.

Fix constants \(k_1, k_2, \dots, k_n\in \R\text{.}\) The operation on \(\R^n\) defined by
\begin{equation*} \langle \boldx, \boldy \rangle =k_1x_1y_1+k_2x_2y_2+\cdots +k_nx_ny_n \end{equation*}
for \(\boldx=(x_1,x_2,\dots, x_n)\) and \(\boldy=(y_1,y_2,\dots, y_n)\) is an inner product if and only if \(k_i>0\) for all \(1\leq i\leq n\text{.}\) When the \(k_i\) are positive, we call this operation the dot product with weights \(k_1, k_2, \dots, k_n\text{,}\) or simply a weighted dot product.

First we show that axioms (i) and (ii) are satisfied for any choice of \(k_i\text{.}\) Let
\begin{equation*} K=\begin{amatrix}[rrrr]k_1\amp 0\amp \dots \amp 0 \\ 0\amp k_2\amp \dots\amp 0 \\ \vdots\amp \vdots\amp \ddots\amp \vdots \\ 0\amp 0\amp \dots\amp k_n \end{amatrix}\text{,} \end{equation*}
the diagonal matrix whose \(i\)-th diagonal entry is \(k_i\text{.}\) Then for all \(\boldx=(x_1,x_2,\dots, x_n), \boldy=(y_1,y_2,\dots, y_n)\in \R^n\) we have
\begin{equation*} \langle \boldx, \boldy \rangle=\boldx^TK\boldy=\begin{amatrix}[cccc]x_1\amp x_2\amp \dots\amp x_n \end{amatrix} K\begin{bmatrix}y_1\\ y_2\\ \vdots \\ y_n\end{bmatrix}\text{.} \end{equation*}
Here we treat \(\boldx, \boldy\) as column vectors, and we treat the resulting \(1\times 1\) matrix \(\boldx^T K\boldy\) as a scalar. Axioms (i)-(ii) now follow from various matrix properties. For linearity, for example, we have
\begin{align*} \langle c\boldx_1+d\boldx_2, \boldy \rangle \amp = (c\boldx_1+d\boldx_2)^TK\boldy\\ \amp =(c\boldx_1^T+d\boldx_2^T)K\boldy \amp (\knowl{./knowl/th_trans_props.html}{\text{2.2.11}})\\ \amp =c\boldx_1^TK\boldy+d\boldx_2^TK\boldy \\ \amp =c \langle \boldx_1, \boldy \rangle +d \langle \boldx_2,\boldy \rangle \text{.} \end{align*}
Symmetry requires a little more trickery:
\begin{align*} \langle \boldy, \boldx \rangle \amp = \boldy^TK\boldx \\ \amp = \boldy^TK^T\boldx \amp (K^T=K) \\ \amp = (\boldx^T K\boldy)^T \amp (\knowl{./knowl/th_trans_props.html}{\text{Theorem 2.2.11}})\\ \amp =\boldx^T K \boldy \amp \\ \amp = \langle \boldx, \boldy \rangle \text{.} \end{align*}
Note that \((\boldx^T K\boldy)^T=\boldx^T K\boldy\) as \(\boldx^T K\boldy\) is just a \(1\times 1\) matrix.
Lastly, we show that axiom (iii) is satisfied if and only if \(k_i>0\) for all \(i\text{.}\) To this end consider the formula
\begin{equation*} \langle \boldx, \boldx \rangle=k_1x_1^2+k_2x_2^2+\cdots k_nx_n^2\text{.} \end{equation*}
If \(k_i>0\text{,}\) then since \(x_i^2\geq 0\) for all \(i\text{,}\) we have \(\langle \boldx, \boldx \rangle\geq 0\) for any \(\boldx\text{,}\) and \(\langle \boldx, \boldx \rangle=0\) if and only if \(x_i=0\) for all \(i\) if and only if \(\boldx=\boldzero\text{.}\)
For the other direction suppose \(k_i\leq 0\) for some \(i\text{.}\) Let \(\boldx=\bolde_i\text{,}\) the \(i\)-th element of the standard basis of \(\R^n\text{.}\) Then \(\langle \boldx, \boldx \rangle=k_i\leq 0\text{:}\) a counterexample to the definiteness property of axiom (iii).

Definition 4.1.4. (Weighted) Euclidean space.

Fix a positive integer \(n\text{.}\) Euclidean \(n\)-space is the inner product space with underlying vector space \(\R^n\) and inner product given by the dot product. More generally, a weighted Euclidean space is an inner product space of the form \((\R^n,\langle\, , \rangle)\text{,}\) where \(\langle\, , \rangle\) is a weighted dot product.

Example 4.1.5. Dot product on \(\R^4\).

Let \(\boldx=(-1,2,0,1), \boldy=(1,2,1,1)\text{.}\) Then
\begin{equation*} \boldx\cdot \boldy=-1+4+0+1=4\text{,} \end{equation*}
and
\begin{equation*} \boldx\cdot\boldx=1+4+0+1=6\text{.} \end{equation*}

Example 4.1.6. Weighted dot product.

The dot product with weights \(2, 1, 3\) on \(\R^3\) is defined as
\begin{equation*} \langle \boldx, \boldy \rangle= 2x_1y_1+x_2y_2+3x_3y_3\text{.} \end{equation*}
Let \(\boldx=(-1,-1,-1)\) and \(\boldy=(1,0,1)\text{.}\) We have
\begin{equation*} \langle \boldx, \boldy \rangle =2(-1)+0-3=-5\text{,} \end{equation*}
and
\begin{equation*} \langle \boldx, \boldx \rangle =2(-1)^2+1(-1)^2+3(-1)^2=2+1+3=6\text{.} \end{equation*}
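As a quick check of the arithmetic in this example, here is a short Python sketch (assuming NumPy) that evaluates the weighted dot product as the matrix product \(\boldx^TK\boldy\) with \(K=\operatorname{diag}(2,1,3)\text{;}\) it is illustrative only.
import numpy as np

K = np.diag([2.0, 1.0, 3.0])        # weights 2, 1, 3
x = np.array([-1.0, -1.0, -1.0])
y = np.array([1.0, 0.0, 1.0])

print(x @ K @ y)   # -5.0
print(x @ K @ x)   #  6.0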

Example 4.1.7. Why the weights must be positive.

Consider the operation on \(\R^2\) defined as
\begin{equation*} \langle \boldx, \boldy \rangle =(-1)x_1y_1+2x_2y_2 \end{equation*}
where \(\boldx=(x_1,x_2), \boldy=(y_1,y_2)\text{.}\) This operation satisfies axioms (i) and (ii) of Definition 4.1.1. (See proof of Theorem 4.1.3.) However, it fails both the positivity and definiteness properties of axiom (iii):
\begin{align*} \langle (3,1), (3,1) \rangle \amp =-9+2=-7\lt 0\\ \langle (1,1/\sqrt{2}), (1,1/\sqrt{2})\rangle \amp=-1+2/2=0 \text{.} \end{align*}

Remark 4.1.8.

It is worth highlighting the observation in the proof of Theorem 4.1.3 that a dot product with weights \(k_1, k_2, \dots, k_n\) can be expressed as a matrix product:
\begin{equation*} \langle \boldx, \boldy \rangle=\sum_{i=1}^nk_ix_iy_i=\boldx^TK\boldy\text{,} \end{equation*}
where \(K\) is the diagonal \(n\times n\) matrix whose \(i\)-th diagonal entry is \(k_i\text{.}\) Here \(\boldx, \boldy\) are treated as column vectors, and we identify the resulting \(1\times 1\) matrix \(\boldx^T K \boldy\) with a scalar.
In particular for the standard dot product this matrix formula reduces to
\begin{equation*} \boldx\cdot \boldy=\boldx^T I \boldy=\boldx^T\boldy\text{.} \end{equation*}
Conversely, the dot product gives another way of formulating general matrix multiplication: the \((i,j)\)-th entry of a matrix product \(AB\) is the dot product of the \(i\)-th row of \(A\) with the \(j\)-th column of \(B\text{.}\) (See Remark 4.1.8 and the proof of Theorem 4.1.3.)
Let \(A=[a_{ij}]_{m\times n}\) and \(B=[b_{ij}]_{n\times r}\text{.}\) Then
\begin{equation*} (AB)_{ij}=\sum_{k=1}^na_{ik}b_{kj}=\boldr_i\cdot \boldc_j\text{,} \end{equation*}
since the \(i\)-th row of \(A\) is \(\boldr_i=(a_{i1}, a_{i2}, \dots, a_{in})\) and the \(j\)-th column of \(B\) is \(\boldc_j=(b_{1j}, b_{2j},\dots, b_{nj})\text{.}\)
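The identity \((AB)_{ij}=\boldr_i\cdot\boldc_j\) is easy to spot-check numerically; the Python sketch below (assuming NumPy, with arbitrarily chosen matrices) compares one entry of \(AB\) with the corresponding row-column dot product.
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [-1.0, 3.0, 4.0]])    # 2 x 3
B = np.array([[2.0, 1.0],
              [0.0, -1.0],
              [5.0, 2.0]])          # 3 x 2

AB = A @ B
i, j = 1, 0
print(AB[i, j], np.dot(A[i, :], B[:, j]))   # both 18.0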
Next we introduce an important family of inner products defined on polynomial spaces, called evaluation inner products. These are useful when we wish to compare polynomials by how they behave at a specified list of inputs.

Theorem 4.1.10. Evaluation inner product.

Fix distinct real numbers \(c_0, c_1, \dots, c_n\text{.}\) The operation on \(P_n\) defined by
\begin{equation*} \langle p(x), q(x) \rangle =p(c_0)q(c_0)+p(c_1)q(c_1)+\cdots +p(c_n)q(c_n) \end{equation*}
is an inner product, called the evaluation at \(c_0, c_1, \dots, c_n\) inner product.

That axioms (i)-(ii) are satisfied is left as an exercise. For axiom (iii), note that
\begin{equation*} \langle p(x), p(x) \rangle =p(c_0)^2+p(c_1)^2+\cdots +p(c_n)^2\geq 0\text{,} \end{equation*}
and we have equality if and only if \(p(c_0)=p(c_1)=\cdots =p(c_n)=0\text{.}\) In this case \(p(x)\) has at least \(n+1\) distinct roots; since a nonzero polynomial of degree \(n\) or less has at most \(n\) distinct roots, we conclude that \(p(x)=\boldzero\text{,}\) the zero polynomial.

Example 4.1.11. Evaluation at \(-1, 0, 1\).

Let \(V=P_2\text{,}\) and let \(\langle p(x),q(x) \rangle \) be the evaluation at \(-1, 0, 1\) inner product. Compute \(\langle x^2-1,x^2+2x+1 \rangle \) and \(\langle x^2-1, x^2-1 \rangle. \)
Solution.
Let \(p(x)=x^2-1\text{,}\) \(q(x)=x^2+2x+1\text{.}\) We have
\begin{equation*} \langle p(x), q(x) \rangle=p(-1)q(-1)+p(0)q(0)+p(1)q(1)=0+(-1)1+0=-1 \end{equation*}
and
\begin{equation*} \langle p(x), p(x) \rangle =p(-1)^2+p(0)^2+p(1)^2=0+(-1)^2+0=1\text{.} \end{equation*}
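The two values above are easy to confirm with a few lines of Python; the sketch below (plain Python, no libraries) is illustrative only.
# Evaluation at -1, 0, 1 inner product on P_2.
p = lambda x: x**2 - 1
q = lambda x: x**2 + 2*x + 1
points = [-1, 0, 1]

print(sum(p(c)*q(c) for c in points))   # -1
print(sum(p(c)**2 for c in points))     #  1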
Our last example defines an integral inner product on the space \(C([a,b])\) of continuous functions on an interval \([a,b]\text{.}\) This inner product plays an important role in Fourier analysis, which studies the approximation of arbitrary continuous functions with linear combinations of certain trigonometric functions.

Theorem 4.1.12. Integral inner product.

Let \(V=C([a,b])\text{.}\) The operation defined by
\begin{equation*} \langle f, g \rangle =\int_a^b f(x)g(x)\ dx \end{equation*}
is an inner product on \(V\text{,}\) called the integral inner product.

First observe that the integral defining the inner product always exists, since the product \(fg\) is a continuous function on the closed interval \([a,b]\text{.}\)
Axioms (i)-(ii) follow directly from the definition and various properties of the integral. This is left as an exercise. As for (iii), we have
\begin{equation*} \langle f, f \rangle=\int_{a}^b f^2(x) \ dx \geq 0\text{,} \end{equation*}
since \(f^2(x)\geq 0\) for all \(x\in [a,b]\text{.}\) (This is a property of integration.) Furthermore, since \(f^2\) is continuous and \(f^2(x)\geq 0\text{,}\) we have
\begin{equation*} \langle f, f \rangle=\int_a^b f^2(x) \ dx=0 \end{equation*}
if and only if \(f^2(x)=0\) for all \(x\in [a,b]\) (a property of integrals of continuous functions) if and only if \(f(x)=0\) for all \(x\in [a,b]\) if and only if \(f=\boldzero\text{,}\) the zero function.

Example 4.1.13. Integral inner product.

Let \(V=C([0,1])\text{,}\) equipped with integral inner product. Let \(f(x)=x\text{,}\) \(g(x)=e^x\text{.}\) Compute \(\langle f,g \rangle \) and \(\langle f,f \rangle \text{.}\)
Solution.
We have
\begin{equation*} \langle f,g \rangle=\int_0^1xe^x\ dx=(xe^x\Bigr\vert_0^1-\int_0^1 e^x\ dx)=e-(e-1)=1 \end{equation*}
and
\begin{equation*} \langle f, f \rangle=\int_0^1 x^2 \ dx=\frac{1}{3}\text{.} \end{equation*}
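For readers who like to double-check integrals by machine, the following Python sketch (assuming SymPy is installed) reproduces both values symbolically; it is an optional aside.
import sympy as sp

x = sp.symbols('x')
f = x
g = sp.exp(x)

print(sp.integrate(f*g, (x, 0, 1)))   # 1
print(sp.integrate(f*f, (x, 0, 1)))   # 1/3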

Subsection 4.1.2 Norm and distance

As mentioned above, once an inner product is established, we can define further notions like norm (or length), distance, and angle in terms of the given inner product. When the inner product in question is the standard dot product on \(\R^2\) or \(\R^3\text{,}\) then these are precisely the familiar notions you may have met in multivariable calculus. Things get really interesting when we treat a more exotic inner product space. For example, consider \(V=C([a,b])\text{:}\) the integral inner product on \(V\) (Theorem 4.1.12) gives rise to useful notions of the length of a function \(f\in C([a,b])\text{,}\) as well as the distance or angle between two functions \(f,g\in C([a,b])\text{.}\)

Definition 4.1.14. Norm (or length) of a vector.

Let \((V, \langle , \rangle )\) be an inner product space. Given \(\boldv\in V\) we define its norm (or length), denoted \(\norm{\boldv}, \) as
\begin{equation*} \norm{\boldv}=\sqrt{\langle \boldv, \boldv \rangle }\text{.} \end{equation*}
A unit vector is a vector \(\boldv\) of length one: i.e., a vector \(\boldv\) satisfying \(\norm{\boldv}=1\text{.}\)

Example 4.1.15. Norm with respect to dot product.

Consider \(V=\R^4\) with the standard dot product. Compute \(\norm{(1,-1,-2,1)}\text{.}\)
Solution.
We have
\begin{align*} \norm{(1,-1,-2,1)}\amp =\sqrt{(1,-1,-2,1)\cdot (1,-1,-2,1)} \\ \amp=\sqrt{1+1+4+1} \\ \amp = \sqrt{7}\text{.} \end{align*}

Example 4.1.16. Norm with respect to weighted dot product.

Consider \(V=\R^3\) equipped with the dot product with weights \(1,2,3\text{.}\) Compute \(\norm{(3,1,-2)}\text{.}\)
Solution.
We have
\begin{align*} \norm{(3,1,-2)}\amp =\sqrt{\left\langle (3,1,-2), (3,1,-2)\right\rangle} \\ \amp=\sqrt{1(3^2)+2(1^2)+3((-2)^2)} \\ \amp = \sqrt{23}\text{.} \end{align*}

Example 4.1.17. Norm with respect to integral inner product.

Consider \(V=C([0,1])\) equipped with the integral inner product. Compute \(\norm{f}\text{,}\) where \(f(x)=e^x\text{.}\)
Solution.
We have
\begin{align*} \norm{f}\amp =\sqrt{\langle f,f\rangle} \\ \amp=\sqrt{\int_0^1(e^x)^2\, dx} \\ \amp = \sqrt{\int_0^1e^{2x}\, dx}\\ \amp =\sqrt{(e^2-1)/2} \text{.} \end{align*}

Remark 4.1.18. Unit vectors.

Given any \(\boldv\ne \boldzero\in V\text{,}\) the vector \(\boldu=\frac{1}{\norm{\boldv}}\boldv\) is a unit vector. To verify this, let \(c=\norm{\boldv}\) and compute
\begin{align*} \norm{\boldu} \amp =\sqrt{\left\langle \frac{1}{c}\boldv,\frac{1}{c}\boldv \right\rangle }\\ \amp =\sqrt{\frac{1}{c^2}\langle \boldv,\boldv \rangle} \amp (\knowl{./knowl/d_innerproduct.html}{\text{Definition 4.1.1}}, \text{(ii)}) \\ \amp =\frac{1}{\val{c}}\sqrt{\langle \boldv,\boldv \rangle } \\ \amp =\frac{1}{c}\sqrt{\langle \boldv,\boldv \rangle } \amp (c=\norm{\boldv}\geq 0)\\ \amp =\frac{\norm{\boldv}}{\norm{\boldv}}=1\text{.} \end{align*}

Example 4.1.19. Unit vectors.

For each inner product space \((V, \langle\,, \rangle)\) and vector \(\boldv\in V\) below, compute the associated unit vector \(\boldu=\frac{1}{\norm{\boldv}}\boldv\text{.}\)
  1. \(V=\R^4\) with dot product, \(\boldv=(1,-1,-2,1)\)
  2. \(V=\R^3\) with dot product with weights \(1,2,3\text{,}\) \(\boldv=(3,1,-2)\)
  3. \(V=C([0,1])\) with integral inner product, \(\boldv=e^x\)
Solution.
The norms of the vectors in each case were computed in Examples 4.1.15–4.1.17. We simply scale to compute the corresponding unit vectors.
  1. \(\displaystyle \boldu=\frac{1}{\sqrt{7}}(1,-1,-2,1)=(\sqrt{7}/7,-\sqrt{7}/7,-2\sqrt{7}/7,\sqrt{7}/7)\)
  2. \(\displaystyle \boldu=\frac{1}{\sqrt{23}}(3,1,-2)=(3\sqrt{23}/23,\sqrt{23}/23,-2\sqrt{23}/23)\)
  3. \(\displaystyle \boldu=\frac{1}{\sqrt{(e^2-1)/2}}e^x=\frac{\sqrt{2(e^2-1)}}{e^2-1}e^x\)
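A small Python sketch (assuming NumPy) illustrating part 2: we normalize \(\boldv=(3,1,-2)\) with respect to the weighted dot product and confirm the result has norm one. The helper function norm below is ad hoc, introduced only for this check.
import numpy as np

def norm(x, K):
    """Norm induced by the weighted dot product <x, y> = x^T K y."""
    return np.sqrt(x @ K @ x)

K = np.diag([1.0, 2.0, 3.0])        # weights 1, 2, 3
v = np.array([3.0, 1.0, -2.0])
u = v / norm(v, K)

print(norm(v, K))   # sqrt(23) ~ 4.7958
print(norm(u, K))   # 1.0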
Next, we define the distance between two vectors in an inner product space as the length of their vector difference.

Definition 4.1.20. Distance between vectors.

Let \((V,\langle , \rangle )\) be an inner product space. The distance between \(\boldv, \boldw\in V\), denoted \(d(\boldv, \boldw)\text{,}\) is defined as
\begin{equation*} d(\boldv, \boldw)=\norm{\boldv-\boldw}=\sqrt{\langle \boldv-\boldw,\boldv-\boldw \rangle }\text{.} \end{equation*}

Example 4.1.21.

For each inner product space \(V\text{,}\) compute the distance between the given vectors.
  1. \(V=\R^3\) with the dot product, \(\boldx=(x_1,x_2,x_3)\text{,}\) \(\boldy=(y_1,y_2,y_3)\)
  2. \(V=P_2\) with the evaluation at \(1,-1,0\) inner product, \(p_1(x)=x^2+1\text{,}\) \(p_2(x)=x+2\)
  3. \(V=C([-\pi,\pi])\) with the integral inner product, \(f(x)=\sin x+x\text{,}\) \(g(x)=x\)
Solution.
  1. We have
    \begin{align*} d(\boldx,\boldy) \amp = \norm{\boldx-\boldy} \\ \amp = \sqrt{\langle \boldx-\boldy,\boldx-\boldy\rangle} \\ \amp =\sqrt{(x_1-y_1,x_2-y_2,x_3-y_3)\cdot(x_1-y_1,x_2-y_2,x_3-y_3)} \\ \amp = \sqrt{(x_1-y_1)^2+(x_2-y_2)^2+(x_3-y_3)^2}\text{.} \end{align*}
  2. We have
    \begin{align*} d(p_1,p_2) \amp = \norm{p_1-p_2} \\ \amp = \sqrt{\langle p_1-p_2, p_1-p_2\rangle}\\ \amp = \sqrt{(p_1(1)-p_2(1))^2+(p_1(-1)-p_2(-1))^2+(p_1(0)-p_2(0))^2}\\ \amp = \sqrt{(-1)^2+(1)^2+(-1)^2}=\sqrt{3} \end{align*}
  3. We have
    \begin{align*} d(f,g) \amp = \norm{f-g} \\ \amp =\sqrt{\langle f-g, f-g \rangle} \\ \amp = \sqrt{\langle \sin x, \sin x\rangle }\\ \amp = \sqrt{\int_{-\pi}^{\pi}\sin^2 x\, dx}\\ \amp = \sqrt{\pi}\text{.} \end{align*}
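Parts 2 and 3 can be spot-checked numerically; the sketch below assumes NumPy and SciPy are available and is illustrative only.
import numpy as np
from scipy.integrate import quad

# Part 2: evaluation at 1, -1, 0 inner product on P_2.
p1 = lambda x: x**2 + 1
p2 = lambda x: x + 2
d_poly = np.sqrt(sum((p1(c) - p2(c))**2 for c in (1, -1, 0)))
print(d_poly, np.sqrt(3))            # both ~ 1.7321

# Part 3: integral inner product on C([-pi, pi]); here f - g = sin x.
val, _ = quad(lambda x: np.sin(x)**2, -np.pi, np.pi)
print(np.sqrt(val), np.sqrt(np.pi))  # both ~ 1.7725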
We prove (2), which states that \(\norm{c\boldv}=\val{c}\,\norm{\boldv}\) for all \(c\in\R\) and \(\boldv\in V\text{,}\) and leave the rest as an exercise (Exercise 4.1.5.16).
Given \(c\in \R\) and \(\boldv\in V\) we have
\begin{align*} \norm{c\boldv} \amp =\sqrt{\langle c\boldv, c\boldv\rangle}\\ \amp=\sqrt{c^2\langle \boldv, \boldv\rangle} \amp (\knowl{./knowl/d_innerproduct.html}{\text{Definition 4.1.1}})\\ \amp =\val{c}\sqrt{\langle \boldv, \boldv\rangle} \amp (\sqrt{c^2}=\val{c})\\ \amp =\val{c}\,\norm{\boldv}\text{.} \end{align*}

Subsection 4.1.3 Cauchy-Schwarz inequality, triangle inequalities, and angles between vectors

The famous Cauchy-Schwarz inequality has a knack of cropping up all over the world of science: from properties of covariance in statistics, to the Heisenberg uncertainty principle of quantum mechanics. More directly pertinent to our discussion, the Cauchy-Schwarz inequality implies the triangle inequalities (4.1.24) and ensures that our notion of the angle between two nonzero vectors (Definition 4.1.25) is well-defined.
Theorem 4.1.23. Cauchy-Schwarz inequality.

Let \((V, \langle\, , \rangle)\) be an inner product space. For all \(\boldv, \boldw\in V\) we have
\begin{equation*} \val{\langle \boldv, \boldw \rangle}\leq \norm{\boldv}\,\norm{\boldw}\text{,} \end{equation*}
and equality holds if and only if one of \(\boldv, \boldw\) is a scalar multiple of the other.

Fix vectors \(\boldv\) and \(\boldw\text{.}\) For any \(t\in\R\) we have by positivity
\begin{equation*} 0\leq \langle t\boldv-\boldw,t\boldv-\boldw\rangle=\langle\boldv,\boldv\rangle t^2-2\langle\boldv,\boldw\rangle t+\langle\boldw,\boldw\rangle=at^2-2bt+c\text{,} \end{equation*}
where
\begin{equation} a=\langle\boldv,\boldv\rangle=\norm{\boldv}^2, \ b=\langle\boldv,\boldw\rangle, \ c=\langle\boldw,\boldw\rangle=\norm{\boldw}^2\text{.}\tag{4.1.1} \end{equation}
If \(\boldv=\boldzero\text{,}\) then both sides of the inequality are zero and there is nothing to prove; so assume \(\boldv\ne\boldzero\text{,}\) in which case \(a>0\text{.}\) Since \(at^2-2b\,t+c\geq 0\) for all \(t\in \R\text{,}\) the quadratic polynomial \(p(t)=at^2-2b\,t+c\) has at most one root. Using the quadratic formula we conclude that we must have \(4b^2-4ac\leq 0\text{,}\) since otherwise \(p(t)\) would have two distinct roots. It follows that
\begin{equation*} 4\langle \boldv, \boldw\rangle^2-4\norm{\boldv}^2\norm{\boldw}^2\leq 0\text{,} \end{equation*}
or equivalently
\begin{equation*} \langle\boldv,\boldw\rangle^2\leq \norm{\boldv}^2\norm{\boldw}^2\text{.} \end{equation*}
Taking square-roots yields the desired inequality.
The same reasoning shows that the Cauchy-Schwarz inequality is an actual equality if and only if \(p(t)=0\) for some \(t\text{,}\) if and only if \(0=\langle t\boldv-\boldw,t\boldv-\boldw\rangle\) for some \(t\text{,}\) if and only if \(\boldw=t\boldv\) for some \(t\) (by definiteness).
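The inequality is also easy to probe experimentally. The Python sketch below (assuming NumPy; the weights, dimension, and sample size are arbitrary choices) checks it for many random vectors under a weighted dot product.
import numpy as np

rng = np.random.default_rng(0)
K = np.diag(rng.uniform(0.5, 3.0, size=5))     # random positive weights
for _ in range(1000):
    v = rng.normal(size=5)
    w = rng.normal(size=5)
    lhs = abs(v @ K @ w)
    rhs = np.sqrt(v @ K @ v) * np.sqrt(w @ K @ w)
    assert lhs <= rhs + 1e-12                  # Cauchy-Schwarz
print("Cauchy-Schwarz held in all trials")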
The following triangle inequalities are more or less direct consequences of the Cauchy-Schwarz inequality.
This is an elementary exercise of unpacking the definitions of norm and distance in terms of the inner product, and then applying the Cauchy-Schwarz inequality appropriately. The proof is left as an exercise.
Let \((V, \langle , \rangle )\) be an inner product space. For any nonzero vectors \(\boldv, \boldw\text{,}\) the Cauchy-Schwarz inequality tells us that
\begin{equation*} \val{\langle \boldv, \boldw \rangle }\leq \norm{\boldv}\, \norm{\boldw}\text{,} \end{equation*}
or equivalently,
\begin{equation*} -1\leq \frac{\langle \boldv, \boldw \rangle}{\norm{\boldv}\, \norm{\boldw}} \leq 1\text{.} \end{equation*}
It follows that there is a unique real number \(\theta\in [0,\pi]\) satisfying
\begin{equation*} \cos\theta=\frac{\langle \boldv, \boldw \rangle}{\norm{\boldv}\, \norm{\boldw}}\text{.} \end{equation*}
We call \(\theta\) the angle between \(\boldv\) and \(\boldw\).

Definition 4.1.25. Angle between vectors.

Let \((V,\langle , \rangle )\) be an inner product space. Given nonzero vectors \(\boldv, \boldw\in V\text{,}\) the angle between \(\boldv\) and \(\boldw\) is defined to be the unique \(\theta\in [0,\pi]\) satisfying
\begin{equation*} \cos\theta=\frac{\langle \boldv, \boldw \rangle}{\norm{\boldv}\, \norm{\boldw}}\text{.} \end{equation*}
Equivalently, we have
\begin{equation*} \theta=\arccos\left( \frac{\langle \boldv, \boldw \rangle}{\norm{\boldv}\, \norm{\boldw}} \right)\text{.} \end{equation*}

Remark 4.1.26.

Our definition of the angle between two vectors may remind you of the dot product angle formula for vectors in \(\R^3\text{:}\)
\begin{equation} \cos\theta=\frac{\boldx\cdot\boldy}{\norm{\boldx}\norm{\boldy}}\text{.}\tag{4.1.2} \end{equation}
Interestingly, whereas (4.1.2) is typically treated as a theorem, derived from properties of the dot product and the law of cosines, in a general inner product space the equation
\begin{equation*} \cos\theta =\frac{\langle \boldv, \boldw\rangle}{\norm{\boldv}\norm{\boldw}} \end{equation*}
is understood as the definition of the angle between two vectors.

Example 4.1.27.

Consider \(\R^2\) along with the dot product. Verify that our definition of the angle \(\theta\) between \((1,1)\) and \((1,0)\) is consistent with our planar geometry notion of angle.
Solution.
According to Definition 4.1.25, \(\theta\) is the unique element of \([0,\pi]\) satisfying
\begin{equation*} \cos \theta=\frac{(1,1)\cdot (1,0)}{\norm{(1,1)}\norm{(1,0)}}=\frac{1}{\sqrt{2}}=\frac{\sqrt{2}}{2}\text{.} \end{equation*}
We recognize \(\theta\) as the familiar angle \(\pi/4\text{,}\) as expected.

Example 4.1.28.

Consider \(\R^2\) with the weighted dot product
\begin{equation*} \langle (x_1,x_2), (y_1,y_2)\rangle=2x_1y_1+x_2y_2\text{.} \end{equation*}
Compute the angle \(\theta\) between \((1,1)\) and \((1,0)\) with respect to this inner product.
Solution.
First compute
\begin{align*} \langle (1,1), (1,0)\rangle \amp= 2(1)+1(0)=2 \\ \norm{(1,1)}\amp =\sqrt{\langle (1,1),(1,1)\rangle} \\ \amp =\sqrt{2+1}=\sqrt{3} \\ \norm{(1,0)}\amp =\sqrt{\langle (1,0),(1,0)\rangle} \\ \amp =\sqrt{2} \end{align*}
By definition \(\theta\) is the unique value in \([0,\pi]\) satisfying
\begin{equation*} \cos\theta=\frac{\langle (1,1), (1,0)\rangle}{\norm{(1,1)}\norm{(1,0)}}=\frac{2}{\sqrt{3}\sqrt{2}}=\frac{\sqrt{6}}{3}\text{.} \end{equation*}
We see that \(\theta\) is not one of our familiar angles from the unit circle (e.g., \(\pi/6, \pi/4\text{,}\) etc.) and so express \(\theta\) in terms of the \(\arccos\) function:
\begin{equation*} \theta=\arccos(\sqrt{6}/3)\approx 35.3^\circ\text{.} \end{equation*}
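The computation in this example translates directly into a few lines of Python (assuming NumPy); the sketch below recovers the same angle and is illustrative only.
import numpy as np

K = np.diag([2.0, 1.0])             # weights 2, 1
v = np.array([1.0, 1.0])
w = np.array([1.0, 0.0])

cos_theta = (v @ K @ w) / (np.sqrt(v @ K @ v) * np.sqrt(w @ K @ w))
print(np.degrees(np.arccos(cos_theta)))   # ~ 35.26 degrees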

Example 4.1.29.

Consider \(V=C([-\pi, \pi])\) with the integral inner product. Compute the angle \(\theta\) between \(f(x)=\sin x\) and \(g(x)=\sin 2x\) with respect to this inner product.
Solution.
First compute
\begin{align*} \langle f, g\rangle \amp=\int_{-\pi}^\pi \sin x\sin 2x\, dx \\ \amp=\frac{1}{2}\int_{-\pi}^\pi\cos(x-2x)-\cos(x+2x)\, dx \amp (\text{trig. identity})\\ \amp =\frac{1}{2}\int_{-\pi}^\pi \cos x-\cos 3x\, dx \amp (\cos(-x)=\cos x)\\ \amp = 0 \amp \left(\textstyle\int_{-\pi}^{\pi}\cos x\, dx=\int_{-\pi}^{\pi}\cos 3x\, dx=0\right) \end{align*}
It follows that
\begin{equation*} \cos \theta=\frac{\langle f, g\rangle}{\norm{f}\norm{g}}=0\text{,} \end{equation*}
and hence that \(\theta=\pi/2\text{.}\)
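A numerical confirmation of this orthogonality (assuming NumPy and SciPy; an optional aside only):
import numpy as np
from scipy.integrate import quad

val, _ = quad(lambda x: np.sin(x) * np.sin(2*x), -np.pi, np.pi)
print(abs(val) < 1e-10)   # True: sin(x) and sin(2x) are orthogonal on [-pi, pi]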

Subsection 4.1.4 Choosing your inner product

Why, given a fixed vector space \(V\text{,}\) would we prefer one inner product definition to another? One way of understanding a particular choice of inner product is to ask what its corresponding notion of distance measures.

Example 4.1.30. Weighted dot product distance.

Consider \(\R^n\) with a choice of weighted dot product
\begin{equation*} \langle (x_1,x_2,\dots, x_n), (y_1,y_2,\dots, y_n)\rangle=k_1x_1y_1+k_2x_2y_2+\cdots +k_nx_ny_n, \end{equation*}
where \(k_1,k_2,\dots, k_n\) are fixed positive constants. With respect to this inner product the distance between two vectors \(\boldx=(x_1,x_2,\dots, x_n)\) and \(\boldy=(y_1,y_2,\dots, y_n)\) is
\begin{equation*} d(\boldx,\boldy)=\norm{\boldx-\boldy}=\sqrt{k_1(x_1-y_1)^2+k_2(x_2-y_2)^2+\cdots +k_n(x_n-y_n)^2}\text{.} \end{equation*}
Thus \(d(\boldx, \boldy)\) is an aggregate measure of the difference between the corresponding entries of \(\boldx\) and \(\boldy\text{,}\) as weighted by our choice of the constants \(k_i\text{.}\)
Imagine that each entry of \(\boldx\in \R^n\) is a data point collected by measuring \(n\) different properties of a sample \(s\text{:}\) i.e., \(x_i\) is the measured value of property \(P_i\) on \(s\) for all \(1\leq i\leq n\text{.}\) Given samples \(s\) and \(s'\) with corresponding measurement vectors \(\boldx\) and \(\boldy\text{,}\) the weighted distance \(d(\boldx,\boldy)\) is then a quantitative measure of how “close” the two samples are to one another. The choice of weights \(k_i\) allows us to adjust the relative influence of a given property \(P_i\) in determining this closeness. For example, the standard dot product (\(k_i=1\) for all \(i\)) yields a notion of distance that gives each property equal standing.

Example 4.1.31. Evaluation inner product distance.

Consider \(P_n\) with the evaluation inner product at a fixed choice of inputs \(x=c_0, c_1,\dots, c_n\text{.}\) Given two polynomials \(p(x), q(x)\text{,}\) the distance between them with respect to this inner product is
\begin{equation*} \norm{p(x)-q(x)}=\sqrt{(p(c_0)-q(c_0))^2+(p(c_1)-q(c_1))^2+\cdots +(p(c_n)-q(c_n))^2}\text{.} \end{equation*}
We see that with respect to this inner product, the distance between two polynomials is a measure of how much their values at the inputs \(x=c_0,c_1,\dots ,c_n\) differ. This inner product may be useful if you are interested in how a polynomial behaves at this finite list of inputs.

Example 4.1.32. Integral inner product and distance.

Take \(C[a,b]\) with the standard inner product \(\langle f, g \rangle=\int_a^b f(x)g(x) \ dx\text{.}\) Here the distance between two functions is defined as
\begin{equation*} \norm{f-g}=\sqrt{\int_a^b (f(x)-g(x))^2 \ dx}\text{,} \end{equation*}
which we can think of as an aggregate measure of the difference of values \(f(x)-g(x)\) for all \(x\in [a,b]\text{.}\) Thus \(d(f,g)\) is a global measure of the similarity between \(f\) and \(g\) that takes into account their values over the entire interval \([a,b]\text{.}\)

Exercises 4.1.5 Exercises

Webwork Exercises

1.
Find the norm of \(\vec{x}\) and the unit vector \(\vec{u}\) in the direction of \(\vec{x}\) if
\begin{equation*} \vec{x} = \left[\begin{array}{c} -5\cr -2\cr -1\cr 1 \end{array}\right]. \end{equation*}
\(\| \vec{x} \| =\) ,
\(\vec{u} =\) (4 × 1 array)
Answer.
\(\sqrt{31}\)
2.
Find the angle \(\alpha\) between the vectors
\begin{equation*} \left[\begin{array}{c} -3\cr 4\cr 1 \end{array}\right] \ \mbox{ and } \ \left[\begin{array}{c} 5\cr -3\cr -4 \end{array}\right]. \end{equation*}
\(\alpha =\) .
Answer.
\(\cos^{-1}\!\left(-0.859785\right)\)
3.
If \(f(x)\) and \(g(x)\) are arbitrary polynomials of degree at most 2, then the mapping
\begin{equation*} \langle f,g\rangle = f(-3)g(-3) + f(0)g(0) + f(2)g(2) \end{equation*}
defines an inner product in \(P_2\text{.}\) Use this inner product to find \(\langle f,g\rangle\text{,}\) \(\| f \|\text{,}\) \(\| g \|\text{,}\) and the angle \(\alpha_{f,g}\) between \(f(x)\) and \(g(x)\) for
\begin{equation*} f(x) = 2 x^2 + 6 x + 1 \ \mbox{ and } \ g(x) = 3 x^2 - 4 x + 8. \end{equation*}
\(\langle f,g\rangle =\) ,
\(\| f \| =\) ,
\(\| g \| =\) ,
\(\alpha_{f,g} =\) .
Answer 1.
\(307\)
Answer 2.
\(21.0476\)
Answer 3.
\(49.163\)
Answer 4.
\(1.26957\)
4.
If \(A\) and \(B\) are arbitrary real \(m\times n\) matrices, then the mapping
\begin{equation*} \langle A,B \rangle = {\rm trace}(A^T B) \end{equation*}
defines an inner product in \({\mathbb R}^{m\times n}\text{.}\) Use this inner product to find \(\langle A,B \rangle\text{,}\) the norms \(\|A\|\) and \(\|B\|\text{,}\) and the angle \(\alpha_{A,B}\) between \(A\) and \(B\) for
\begin{equation*} A = \left[\begin{array}{cc} -1 \amp -1\cr 1 \amp 1\cr -3 \amp -1 \end{array}\right] \ \mbox{ and } \ B = \left[\begin{array}{cc} -2 \amp 2\cr 3 \amp 3\cr 3 \amp -3 \end{array}\right]. \end{equation*}
\(\langle A,B \rangle =\) ,
\(\|A\|=\) ,
\(\|B\|=\) ,
\(\alpha_{A,B}=\) .
Answer 1.
\(0\)
Answer 2.
\(3.74166\)
Answer 3.
\(6.63325\)
Answer 4.
\(1.5708\)
5.
Use the inner product
\begin{equation*} \langle f,g \rangle = \int_0^1 f(x)g(x) \, dx \end{equation*}
in the vector space \(C^0 \lbrack 0,1\rbrack\) of continuous functions on the domain \(\lbrack 0, 1 \rbrack\) to find \(\langle f,g\rangle\text{,}\) \(\| f \|\text{,}\) \(\| g\|\text{,}\) and the angle \(\alpha_{f,g}\) between \(f(x)\) and \(g(x)\) for
\begin{equation*} f(x) = 10 x^2 - 3 \ \mbox{ and } \ g(x) = -3 x + 1. \end{equation*}
\(\langle f,g\rangle =\) ,
\(\|f\| =\) ,
\(\|g\| =\) ,
\(\alpha_{f,g}\) .
Answer 1.
\(-2.66667\)
Answer 2.
\(3\)
Answer 3.
\(1\)
Answer 4.
\(2.66571\)

6.

For each of the following operations on \(\R^2\text{,}\) determine whether it defines an inner product on \(\R^2\text{.}\) If it fails to be an inner product, identify which of the three inner product axioms (if any) it does satisfy, and provide explicit counterexamples for any axiom that fails.
  1. \(\angvec{(x_1,x_2),\ (y_1,y_2)}=x_1y_2+x_2y_1\text{.}\)
  2. \(\angvec{(x_1,x_2),\ (y_1,y_2)}=2x_1y_1+x_1y_2+x_2y_1+3x_2y_2\text{.}\)
  3. \(\angvec{(x_1,x_2), \ (y_1,y_2)}=x_1^2y_1^2+x_2^2y_2^2\text{.}\)
Hint.
The operation in (b) is an inner product. Use the fact that
\begin{equation*} \angvec{\boldx,\ \boldy}=\boldx^T \begin{amatrix}[cc]2\amp 1 \\ 1 \amp 3 \end{amatrix}\boldy\text{,} \end{equation*}
where we treat \(\boldx, \boldy\) as column vectors. This helps to prove axioms (i)-(ii). For axiom (iii), use either a “complete the square” or quadratic formula argument on the expression \(\langle \boldx, \boldx\rangle=2x_1^2+2x_1x_2+3x_2^2\text{.}\)

7.

We work within the inner product space given by \(V=P_2\) together with the evaluation at 0, 1, 2 inner product.
Let \(q(x)=x\text{.}\) Give a parametric description of the set
\begin{equation*} W=\{p(x)\in P_2\colon \langle p(x), q(x)\rangle =0\}\text{.} \end{equation*}

8.

We work in the inner product space given by \(V=C([-\pi,\pi])\) together with the integral inner product.
  1. Let \(f(x)=\cos x, g(x)=\sin x\text{.}\) Compute \(\langle f,g \rangle \) and \(\norm{g}\text{.}\)
  2. Show that if \(f(x)\) is an odd function (i.e., \(f(x)=-f(-x)\) for all \(x\)) and \(g(x)\) is an even function (\(g(-x)=g(x)\) for all \(x\)), then \(\langle f, g \rangle=0 \text{.}\) Hint: use the area interpretation of the integral and properties of even/odd functions.

Exercise Group.

Compute the angle between the given vectors with respect to the given inner product. The exercises are designed to be done by hand: i.e., do not express your answer in terms of \(\arccos\text{.}\)
9.
\(V=\R^4\) with the standard dot product; \(\boldv=(1,1,1,1), \boldw=(1,-1,1,1)\)
10.
\(V=\R^2\) with the dot product with weights \(1,2\text{;}\) \(\boldv=(1,0), \boldw=(-2,\sqrt{2})\)
11.
\(V=C([0,1])\) with the integral inner product; \(f(x)=1, g(x)=x\)
12.
\(V=P_2\) with evaluation at \(-1, 1\) inner product; \(p(x)=-\frac{1}{2}x+\frac{1}{2}, q(x)=2x\)

13.

Let \((V,\langle\, , \rangle)\) be an inner product space. Prove that \(\langle \boldv, \boldzero\rangle=0\) for all \(\boldv\in V\text{.}\)

14.

Let \(\boldv\) and \(\boldw\) be nonzero vectors of the inner product space \((V, \langle\, , \rangle)\text{,}\) and let \(\theta\) be the angle between them. Prove the following equivalence:
\begin{equation*} \norm{\boldv+\boldw}=\norm{\boldv}+\norm{\boldw}\text{ if and only if } \theta=0\text{.} \end{equation*}
Your proof should be a chain of equivalences with each step justified.
Hint.
The equality is true if and only if it is true after squaring both sides. (Why?) Use the definition
\begin{equation*} \norm{\boldv+\boldw}^2=\langle \boldv+\boldw, \boldv+\boldw\rangle \end{equation*}
and expand the inner product.

15.

Let \((V, \langle , \rangle )\) be an inner product space. Suppose vectors \(\boldv, \boldw\in V\) satisfy \(\norm{\boldv}=2\) and \(\norm{\boldw}=3\text{.}\) Using the Cauchy-Schwarz inequality (4.1.23) find the maximum and minimum possible values of \(\norm{\boldv-\boldw}\text{,}\) and give explicit examples where those values occur.

17.

Prove each inequality below using the Cauchy-Schwarz inequality (4.1.23) applied to a judicious choice of inner product space, and possibly a judicious choice of vector in said inner product space.
  1. For all \(f, g\in C([a,b])\)
    \begin{equation*} \left(\int_a^b f(x)g(x) \ dx\right)^2\leq \int_a^b f^2(x)\ dx\int_a^b g^2(x) \ dx\text{.} \end{equation*}
  2. For all \((x_1,x_2,\dots, x_n)\in\R^n\text{,}\)
    \begin{equation*} (x_1+x_2+\cdots +x_n)\leq\sqrt{x_1^2+x_2^2+\cdots +x_n^2}\sqrt{n}\text{.} \end{equation*}
  3. For all \(a,b,\theta\in\R\)
    \begin{equation*} (a\cos\theta+b\sin\theta)^2\leq a^2+b^2\text{.} \end{equation*}

18. Isometries of inner product spaces.

Let \((V,\angvec{ \ , })\) be an inner product space. An isometry of \(V\) is a function \(f\colon V\rightarrow V\) that preserves distance: i.e.,
\begin{equation*} d(f(\boldv), f(\boldw))=d(\boldv, \boldw) \text{ for all \(\boldv, \boldw\in V\) }\text{.} \end{equation*}
In this exercise we will show that any isometry that maps \(\boldzero\) to \(\boldzero\) is a linear transformation. This is a very useful fact. For example, it implies the linearity of many geometric transformations we have considered: rotation about the origin in \(\R^2\text{,}\) reflection through a line in \(\R^2\text{,}\) etc.
In what follows assume that \(f\) is an isometry of \(V\) satisfying \(f(\boldzero)=\boldzero\text{.}\)
  1. Prove that \(\norm{f(\boldv)}=\norm{\boldv}\text{:}\) i.e., \(f\) preserves norms.
  2. Prove \(\angvec{f(\boldv), f(\boldw)}=\angvec{\boldv, \boldw}\text{:}\) i.e., \(f\) preserves inner products. Hint: first prove that \(\angvec{\boldv, \boldw}=\frac{1}{2}(\norm{\boldv}^2+\norm{\boldw}^2-\norm{\boldv-\boldw}^2)\text{.}\)
  3. To prove \(f\) is linear it is enough to show \(f(\boldv+c\boldw)=f(\boldv)+cf(\boldw)\) for all \(\boldv, \boldw\in V\text{,}\) \(c\in \R\text{.}\) To do so, use the above parts to show that
    \begin{equation*} \norm{f(\boldv+c\boldw)-(f(\boldv)+cf(\boldw))}=0\text{.} \end{equation*}