Chapter 4 Linear Transformations
Let \(F\) be a field (e.g. \(F = \mathbb{R}, \mathbb{Q}, \mathbb{C}\) or \(\mathbb{F}_{2}\)).
Definition 4.1 An \(m \times n\)-matrix \(A\) over a field \(F\) is an array \[ A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix} \] with entries \(a_{ij}\) in \(F\). We use the notation \(M_{m \times n}(F)\) for the set of all \((m \times n)\)-matrices over \(F\) (see also 2.7(b)). We define addition and multiplication of matrices (and other notions) in the same way as in the case \(F = \mathbb{R}\) (as seen in Linear Algebra I).
For example:
- \(\begin{pmatrix} 1 & 1+i \\ 2 & 1-i \end{pmatrix} \begin{pmatrix} 1-i \\ 3 \end{pmatrix} = \begin{pmatrix} 1-i + 3+3i \\ 2-2i + 3-3i \end{pmatrix} = \begin{pmatrix} 4+2i \\ 5-5i \end{pmatrix}\) (matrices over \(\mathbb{C}\))
- \(\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\) (as matrices over \(\mathbb{F}_{2}\))
- but \(\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix}\) (as matrices over \(\mathbb{R}\))
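Aside: the last two products can be checked quickly in sympy. Since `Matrix` computes over the integers/rationals by default, we reduce the entries mod 2 afterwards to realise the product over \(\mathbb{F}_2\) (a verification sketch, not part of the formal development):

```python
from sympy import Matrix

A = Matrix([[0, 1], [1, 1]])
B = Matrix([[1, 1], [0, 1]])

print(A * B)                               # over the rationals: Matrix([[0, 1], [1, 2]])
print((A * B).applyfunc(lambda e: e % 2))  # over F_2: Matrix([[0, 1], [1, 0]])
```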
Definition 4.2 Let \(V, W\) be vector spaces over a field \(F\). A map \(L: V \to W\) is called a linear transformation if the following two conditions hold:
- For all \(x,y \in V\) we have \(L(x + y) = L(x) + L(y)\) in \(W\).
- For all \(a \in F\) and \(x \in V\) we have \(L(ax) = a(L(x))\) in \(W\).
Note: Then we also have \(L( 0_{V}) = 0_{W}\) and \(L(x-y) = L(x) - L(y)\) for all \(x,y \in V\).
Proof: \(L(0_{V}) = L(0_{V} + 0_{V}) = L(0_{V}) + L(0_{V})\)
\(\implies\) \(L(0_{V}) = 0_{W}\). (by cancelling \(L(0_{V})\))
\(L(x-y) = L(x+(-1)y) = L(x) + L((-1)y) = L(x) + (-1)L(y) = L(x) - L(y)\).
(using 2.6(d)) \(\square\)
Example 4.3
Let \(A \in M_{m \times n}(F)\). Then the map \[\begin{align*} {\color{red}{L_{A}}}: &F^{n} \to F^{m} \\ &\underline{x} = \begin{pmatrix} x_{1} \\ \vdots \\ x_{n} \end{pmatrix} \mapsto A \underline{x} = \begin{pmatrix} a_{11}x_{1} + \dots + a_{1n}x_{n} \\ \vdots \\ a_{m1}x_{1} + \dots + a_{mn}x_{n} \end{pmatrix} \end{align*}\] is a linear transformation. (Compare with Lemma 5.3 in L.A.I.)
For example, if \(A = \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix} \in M_{2 \times 2}(\mathbb{R})\) for some \(a \in \mathbb{R}\) then \(L_{A} : \mathbb{R}^{2} \to \mathbb{R}^{2}\) is given by \(\underline{x} \mapsto a \underline{x}\), i.e. it is a stretch of the plane by a factor of \(a\).
If \(A = \begin{pmatrix} \cos(\phi) & \sin(\phi) \\ -\sin(\phi) & \cos(\phi) \end{pmatrix}\) for some \(0 \leq \phi < 2\pi\) then \(L_{A}: \mathbb{R}^{2} \to \mathbb{R}^{2}\) is the clockwise rotation by the angle \(\phi\).
Proof that \(L_A\) is a linear transformation:
1/ Let \(\underline{x},\underline{y} \in F^{n}\).
\(\implies\) \(L_{A}( \underline{x}+\underline{y}) = A(\underline{x}+\underline{y}) = A \underline{x} + A \underline{y} = L_{A}(\underline{x}) + L_{A}(\underline{y})\).
2/ Let \(a \in F\) and \(\underline{x} \in F^{n}\).
\(\implies\) \(L_{A}(a \underline{x}) = A(a \underline{x}) = a (A \underline{x}) = a(L_{A}(\underline{x}))\).
(The middle equality of both chains of equalities has been proved in Linear Algebra I for \(F= \mathbb{R}\), see Thm 2.13(i) and (ii); the same proof works for any field \(F\).) \(\square\)

Let \(V\) be a vector space over a field \(F\). Then the following maps are linear transformations (cf. Example 5.4(c),(d) in L.A.I.):
- id\(: V \to V\), \(x \mapsto x\) (identity)
- \(\underline{0}\)\(:V \to V\), \(x \mapsto 0_{V}\) (zero map)
- the map \(V \to V\), given by \(x \mapsto ax\), for any given \(a\in F\) fixed (stretch)
Let \(L: V \to W\) and \(M: W \to Z\) be linear transformations between vector spaces over a field \(F\). Then their composition \(M \circ L: V \to Z\) is again a linear transformation. (See also Section 5.3 of L.A.I.)
Proof that \(M\circ L\) is a linear transformation:
1/ Let \(x,y \in V\).
\(\implies\) \((M \circ L)(x+y) =\) \(M(L(x+y)) =\) \(M(L(x) + L(y))\)
\(= M(L(x)) + M(L(y)) =\) \((M \circ L)(x) + (M \circ L)(y)\).
2/ Let \(a\in F\) and \(x \in V\).
\(\implies\) \((M \circ L)(ax) = M(L(ax)) = M(a(L(x))) = a(M(L(x))) = a(M \circ L)(x)\). \(\square\)

Let \(V\) be the subspace of \(\mathbb{R}^{\mathbb{R}}\) consisting of all differentiable functions. Then differentiation \(D: V \to \mathbb{R}^{\mathbb{R}}\), \(f \mapsto f'\), is a linear transformation.
Proof:
1/ Let \(f,g \in V\) \(\implies\) \(D(f+g) = (f+g)' = f' + g' = D(f) + D(g)\).
2/ Let \(a \in \mathbb{R}\) and \(f \in V\) \(\implies\) \(D(af) = (af)' = a f' = a(D(f))\).
(The middle equality in both chains of equalities has been proved in Calculus.) \(\square\)

The map \(L: \mathbb{R}^{2} \to \mathbb{R}\), \(\begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix} \mapsto x_{1}x_{2}\), is not a linear transformation.
Proof: Let \(a=2\) and \(\underline{x} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \in \mathbb{R}^{2}.\) Then:
\(L(a \underline{x}) = L \Bigg( \begin{pmatrix} 2 \\ 2 \end{pmatrix} \Bigg) = 4\), but \(aL( \underline{x}) = 2 \cdot 1 = 2\). \(\square\)
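Aside: a single numerical counterexample suffices, and it is easy to reproduce; the sketch below (assuming nothing beyond the definition of \(L\) above) prints different values for \(L(a\underline{x})\) and \(aL(\underline{x})\):

```python
import numpy as np

L = lambda x: x[0] * x[1]  # the map L(x1, x2) = x1 * x2 from above

x = np.array([1.0, 1.0])
a = 2.0
print(L(a * x), a * L(x))  # 4.0 2.0, so L(ax) != a L(x) and L is not linear
```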
4.1 Matrix representation I
Proposition 4.4
(Matrix representation I)
Let \(F\) be a field.
Let \(L: F^{n} \to F^{m}\) be a linear transformation. Then there exists a unique matrix \(A \in M_{m \times n}(F)\) such that \(L = L_{A}\) (as defined in 4.3(a)). In this case we say that \(A\) represents \(L\) (with respect to the standard bases of \(F^{n}\) and \(F^{m}\)).
For example, the map \(\mathbb{R}^{3} \to \mathbb{R}^{2}, \begin{pmatrix} c_{1} \\ c_{2} \\ c_{3} \end{pmatrix} \mapsto \begin{pmatrix} 2c_{1} + c_{3} - 4c_{2} \\ c_{2} \end{pmatrix}\), is represented by \(A = \begin{pmatrix} 2 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix} \in M_{2 \times 3}( \mathbb{R})\).
Proof: Let \(\underline{e}_{1}:= \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} , \dots, \underline{e}_{n} := \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}\) denote the standard basis of \(F^{n}\).
Uniqueness: Suppose \(A \in M_{m \times n}(F) \text{ satisfies } L=L_{A}\).
\(\implies\) The \(j^{th}\) column of \(A\) is \(A \underline{e}_{j} = L_{A}( \underline{e}_{j}) = L( \underline{e}_{j}) \ \ (\text{for } j=1, \dots, n)\)
\(\implies\) \(A\) is the \((m \times n)\)-matrix with columns \(L( \underline{e}_{1}), \dots, L( \underline{e}_{n})\).
Existence: Let \(A\) be the matrix with columns \(L( \underline{e}_{1}), \dots, L( \underline{e}_{n})\). We want to show \(L = L_{A}\).
Let \(\underline{c} = \begin{pmatrix} c_{1} \\ \vdots \\ c_{n} \end{pmatrix} \in F^{n} \implies \underline{c} = c_{1} \underline{e}_{1} + \dots + c_{n} \underline{e}_{n};\)
\(\implies L( \underline{c}) = L(c_{1} \underline{e}_{1}) + \dots + L(c_{n} \underline{e}_{n}) = c_{1}L( \underline{e}_{1}) + \dots + c_{n}L( \underline{e}_{n})\)
and similarly \(L_{A}( \underline{c}) = c_{1}L_{A}( \underline{e}_{1}) + \dots + c_{n}L_{A}( \underline{e}_{n})\)
(because \(L\) and \(L_{A}\) are linear transformations)
\(\implies\) \(L( \underline{c}) = L_{A}( \underline{c})\) (because \(L( \underline{e}_{j}) = L_{A}( \underline{e}_{j})\) for all \(j = 1, \dots, n\)).
\(\square\)
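Aside: the uniqueness argument doubles as an algorithm: the \(j^{th}\) column of \(A\) is \(L(\underline{e}_{j})\). A minimal numpy sketch (the helper name `matrix_of` is ours, not from the notes), applied to the map from the example above:

```python
import numpy as np

def matrix_of(L, n):
    # Build the representing matrix column by column: column j is L(e_j).
    return np.column_stack([L(np.eye(n)[:, j]) for j in range(n)])

# The map R^3 -> R^2 from the example after Proposition 4.4:
L = lambda c: np.array([2*c[0] + c[2] - 4*c[1], c[1]])
print(matrix_of(L, 3))  # [[ 2. -4.  1.]
                        #  [ 0.  1.  0.]]
```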
4.2 Kernel and image
Definition 4.5 Let \(L: V \to W\) be a linear transformation between vector spaces \(V,W\) over a field \(F\). Then \[ {\color{red}{\mathrm{ker}(L)}} := \{ x \in V : \ L(x)= 0_{W} \} \] is called the kernel of \(L\), and \[ {\color{red}{\mathrm{im}(L)}}:= \{ y \in W : \ \exists x \in V : \ y = L(x) \} \] is called the image of \(L\).
Remark 4.6 Let \(F\) be a field and \(A \in M_{m \times n}(F)\). Then \[ \mathrm{ker}(L_{A}) = \mathrm{N}(A) \] where \(\mathrm{N}(A)\) \(= \{ \underline{c} \in F^{n} : A \underline{c} = \underline{0} \}\) denotes the nullspace of \(A\) (see also Section 6.2 of L.A.I.) and \[ \mathrm{im}(L_{A}) = \mathrm{Col}(A) \] where \(\mathrm{Col}(A)\) denotes the column space of \(A\); i.e. \(\mathrm{Col}(A)\) \(= \mathrm{Span}(\underline{a}_{1}, \dots, \underline{a}_{n})\), where \(\underline{a}_{1}, \dots, \underline{a}_{n}\) denote the \(n\) columns of \(A\). (See also Section 6.4 of L.A.I.)
Proof:
First assertion: by definition.
Second assertion: follows from 4.9(a) applied to the standard basis of \(F^{n}\).
\(\square\)
Proposition 4.7 Let \(V\) and \(W\) be vector spaces over a field \(F\) and let \(L: V \to W\) be a linear transformation. Then:
- \(\mathrm{ker}(L)\) is a subspace of \(V\).
- \(\mathrm{im}(L)\) is a subspace of \(W\).
Proof: (a) We verify the three subspace axioms.
- We have \(0_{V} \in \mathrm{ker}(L)\) (see Note after Defn 4.2.)
- Let \(x,y \in \mathrm{ker}(L)\);
\(\implies L(x+y) = L(x) + L(y) = 0_{W} + 0_{W} = 0_{W}\);
\(\implies x+y \in \mathrm{ker}(L)\).
- Let \(a \in F\) and \(x \in \mathrm{ker}(L)\);
\(\implies L(ax) = a(L(x)) = a 0_{W} = 0_{W}\);
\(\implies ax \in \mathrm{ker}(L)\).
(b) We verify the three subspace axioms.
- We have \(0_{W} = L(0_{V}) \in \mathrm{im}(L)\).
- Let \(x,y \in \mathrm{im}(L)\);
\(\implies \exists v,w \in V\) such that \(x=L(v)\) and \(y=L(w)\);
\(\implies x+y = L(v) + L(w) = L(v+w) \in \mathrm{im}(L)\).
- Let \(y \in \mathrm{im}(L)\) and \(a \in F\);
\(\implies \exists x \in V\) such that \(y = L(x)\);
\(\implies ay = a(L(x)) = L(ax) \in \mathrm{im}(L)\). \(\square\)
Example 4.8 Let \(A \in M_{4 \times 4}(\mathbb{R})\) be as in 3.8(d). Find a basis of the image, \(\mathrm{im}(L_{A})\), of \(L_{A} : \mathbb{R}^{4} \to \mathbb{R}^{4}, \underline{c} \mapsto A \underline{c}\).
Solution: We perform column operations:
\[\begin{align*}
A =
\begin{pmatrix} 1 & -1 & 3 & 2 \\ 2 & -1 & 6 & 7 \\ 3 & -2 & 9 & 9 \\ -2 & 0 & -6 & -10 \end{pmatrix}
& \xrightarrow[\substack{C3 \mapsto C3 - 3C1 \\ C4 \mapsto C4 - 2C1}]{C2 \mapsto C2 + C1}
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 3 \\ 3 & 1 & 0 & 3 \\ -2 & -2 & 0 & -6 \end{pmatrix}
\\
& \xrightarrow{C4 \mapsto C4 - 3C2}
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 3 & 1 & 0 & 0 \\ -2 & -2 & 0 & 0 \end{pmatrix} =: \widetilde{A}
\end{align*}\]
\(\implies\) The two vectors \(\begin{pmatrix} 1 \\ 2 \\ 3 \\ -2 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \\ -2 \end{pmatrix}\) span \(\mathrm{im}(L_{A})\)
(because \(\mathrm{im}(L_{A}) = \mathrm{Col}(A) = \mathrm{Col}(\tilde{A})\) by 4.6 and 3.3(b))
\(\implies\) They form a basis of \(\mathrm{im}(L_A)\)
(because they are also L.I., as they are not multiples of each other). \(\square\)
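Aside: sympy's `columnspace` confirms \(\dim(\mathrm{im}(L_A)) = 2\); note that it may return a different basis of \(\mathrm{Col}(A)\) than the one found by column operations, since it picks out the pivot columns of \(A\):

```python
from sympy import Matrix

A = Matrix([[1, -1, 3, 2],
            [2, -1, 6, 7],
            [3, -2, 9, 9],
            [-2, 0, -6, -10]])
for v in A.columnspace():  # a basis of Col(A) = im(L_A): the pivot columns of A
    print(v.T)             # (1, 2, 3, -2) and (-1, -1, -2, 0); same span as above
print(A.rank())            # 2, so dim(im(L_A)) = 2
```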
Proposition 4.9 Let \(V\) and \(W\) be vector spaces over a field \(F\) and let \(L: V \to W\) be a linear transformation. Let \(x_{1}, \dots, x_{n} \in V\). Then:
- If \(x_{1}, \dots, x_{n}\) span \(V\), then \(L(x_{1}), \dots, L(x_{n})\) span \(\mathrm{im}(L)\).
- If \(L(x_{1}), \dots, L(x_{n})\) are linearly independent, then \(x_{1}, \dots, x_{n}\) are linearly independent.
Proof:
(a) First, \(\mathrm{Span}(L(x_1),\dots,L(x_n))\subseteq \mathrm{im}(L)\) (by 3.3 Note (i)).
For the other inclusion, let \(y \in \mathrm{im}(L)\);
\(\implies \exists x \in V\) such that \(y = L(x)\)
and \(\exists a_{1}, \dots, a_{n} \in F\) such that \(x = a_{1}x_{1} + \dots + a_{n}x_{n}\) (since \(V= \mathrm{Span}(x_{1}, \dots, x_{n})\))
\(\implies y = L(x) = L(a_{1}x_{1} + \dots + a_{n}x_{n}) = a_{1}L(x_{1}) + \dots + a_{n}L(x_{n}) \in \mathrm{Span}(L(x_{1}), \dots, L(x_{n}))\);
\(\implies \mathrm{im}(L)\subseteq \mathrm{Span}(L(x_1),\dots,L(x_n))\);
\(\implies \mathrm{im}(L)=\mathrm{Span}(L(x_1),\dots,L(x_n))\). (i.e. \(L(x_{1}), \dots, L(x_{n})\) span \(\mathrm{im}(L)\))

(b) Let \(a_{1} , \dots, a_{n} \in F\) such that \(a_{1}x_{1} + \dots + a_{n}x_{n} = 0_{V}\);
\(\implies 0_{W} = L(0_{V}) = L(a_{1}x_{1} + \dots + a_{n}x_{n}) = a_{1}L(x_{1}) + \dots + a_{n}L(x_{n});\)
\(\implies a_{1} = \dots = a_{n} = 0\) (since \(L(x_{1}), \dots, L(x_{n})\) are linearly independent)
\(\implies\) \(x_{1} , \dots, x_{n}\) are linearly independent. \(\square\)
Proposition 4.10
(Kernel Criterion)
Let \(V\) and \(W\) be vector spaces over a field \(F\), and let \(L:V \to W\) be a linear transformation. Then:
\[
L \text{ is injective} \iff \mathrm{ker}(L) = \{ 0_{V} \}.
\]
Proof: “\(\Longrightarrow\)”:
Let \(x\in\mathrm{ker}(L) \implies L(x)=0_W\).
We also have \(L(0_V)=0_W\).
\(\implies\) \(x=0_V\). (by injectivity)
“\(\Longleftarrow\)”:
Let \(x,y \in V\) such that \(L(x) = L(y)\);
\(\implies\) \(L(x-y) = L(x) - L(y) = 0_{W}\);
\(\implies\) \(x - y = 0_{V}\) (since \(\mathrm{ker}(L) = \{ 0_{V} \}\))
\(\implies\) \(x=y\). \(\square\)
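Aside: for matrix maps the Kernel Criterion is directly computable, since \(\mathrm{ker}(L_A) = \mathrm{N}(A)\) by 4.6 and `A.nullspace()` returns a basis of \(\mathrm{N}(A)\); an empty basis means \(\mathrm{ker}(L_A) = \{\underline{0}\}\). A sketch (the helper name `is_injective` is ours):

```python
from sympy import Matrix

def is_injective(A):
    # Kernel Criterion: L_A is injective  <=>  N(A) = {0}  <=>  empty nullspace basis
    return len(A.nullspace()) == 0

print(is_injective(Matrix([[1, 0], [0, 1], [1, 1]])))  # True: the columns are L.I.
print(is_injective(Matrix([[1, 2], [2, 4]])))          # False: column 2 = 2 * column 1
```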
4.3 Isomorphism
Definition 4.11 Let \(V,W\) be vector spaces over a field \(F\). A bijective linear transformation \(L: V \to W\) is called an isomorphism. The vector spaces \(V\) and \(W\) are called isomorphic if there exists an isomorphism \(L: V \to W\); we then write \(V \cong W\).
Example 4.12 (a) For any vector space \(V\) over a field \(F\), the identity \(\mathrm{id}: V \to V\) is an isomorphism.
(b) If \(L: V \to W\) is an isomorphism then the inverse map \(L^{-1} : W \to V\) is an isomorphism as well. (See also Def 5.21 from L.A.I.)
(c) If \(A \in M_{n \times n}(\mathbb{R})\) is invertible then \(L_{A}: \mathbb{R}^{n} \to \mathbb{R}^{n}\) is an isomorphism.
(d) The map \(L: \mathbb{R}^{2} \to \mathbb{C}, \begin{pmatrix} a \\ b \end{pmatrix} \mapsto a + bi\), is an isomorphism between the vector spaces \(\mathbb{R}^{2}\) and \(\mathbb{C}\) over \(\mathbb{R}\).
(e) For any \(n \in \mathbb{N}\), the map \[ L: \mathbb{R}^{n+1} \to \mathbb{P}_{n}, \ \begin{pmatrix} a_{0}\\ \vdots\\ a_{n} \end{pmatrix} \mapsto a_{0} + a_{1}t + \dots + a_{n}t^{n} \] is an isomorphism between the vector spaces \(\mathbb{R}^{n+1}\) and \(\mathbb{P}_{n}\) over \(\mathbb{R}\).
(f) For any \(m,n \in \mathbb{N}\) we have \(\mathbb{R}^{m n} \cong M_{m \times n}(\mathbb{R})\).
Proof:
(b) and (c) see Coursework.
(d) and (e) follow from the following proposition and 3.8(c) and (d), respectively.
(f) (only in the case \(m=n=2\)) The map
\[
\mathbb{R}^4 \to M_{2\times 2}(\mathbb{R}), \qquad \begin{pmatrix}
a_1\\a_2\\a_3\\a_4
\end{pmatrix} \mapsto \begin{pmatrix}
a_1 & a_2\\
a_3 & a_4
\end{pmatrix}
\]
is clearly an isomorphism. \(\square\)
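Aside: the isomorphism in (f) is just a re-indexing of the entries; in numpy it is literally `reshape` (row-major), with flattening as its inverse. A sketch for \(m = n = 2\):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])     # a vector in R^4
M = a.reshape(2, 2)                     # the corresponding 2x2 matrix
print(M)                                # [[1. 2.]
                                        #  [3. 4.]]
print(np.array_equal(M.reshape(4), a))  # True: the map is invertible, hence bijective
```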
Proposition 4.13 Let \(V\) be a vector space over a field \(F\) with basis \(x_{1}, \dots, x_{n}\). Then the map \[ L: F^{n} \to V, \ \begin{pmatrix} a_{1} \\ \vdots \\ a_{n} \end{pmatrix} \mapsto a_{1}x_{1} + \dots + a_{n}x_{n} \] is an isomorphism. (We will later use the notation \(I_{x_{1}, \dots, x_{n}}\) for the map \(L\).)
Proof:
Let \(\underline{a} = \begin{pmatrix} a_{1} \\ \vdots \\ a_{n} \end{pmatrix}\) and \(\underline{b} = \begin{pmatrix} b_{1} \\ \vdots \\ b_{n} \end{pmatrix} \in F^{n}\);
\(\implies L(\underline{a} + \underline{b}) = L \left( \begin{pmatrix} a_{1} + b_{1} \\ \vdots \\ a_{n} + b_{n} \end{pmatrix} \right)\)
\(= (a_{1} + b_{1})x_{1} + \dots + (a_{n} + b_{n})x_{n}\) (by definition of \(L\))
\(= (a_{1}x_{1} + \dots + a_{n}x_{n}) + (b_{1}x_{1} + \dots + b_{n}x_{n})\)
(by distributivity, commutativity and associativity)
\(= L(\underline{a}) + L(\underline{b})\). (by definition of \(L\))

Let \(a \in F\) and \(\underline{b} = \begin{pmatrix} b_{1} \\ \vdots \\ b_{n} \end{pmatrix} \in F^{n}\);
\(\implies L(a \underline{b}) = L \left( \begin{pmatrix} a b_{1} \\ \vdots \\ a b_{n} \end{pmatrix} \right)\)
\(= (a b_{1})x_{1} + \dots + (a b_{n})x_{n}\) (by definition of \(L\))
\(= a(b_{1}x_{1} + \dots + b_{n}x_{n})\) (using the axioms of a vector space)
\(= a(L(\underline{b}))\). (by definition of \(L\))

\(\mathrm{ker}(L) = \left\{ \begin{pmatrix} a_{1} \\ \vdots \\ a_{n} \end{pmatrix} \in F^{n} : a_{1}x_{1} + \dots + a_{n}x_{n} = 0_{V} \right\} = \left\{ \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix} \right\}\)
(because \(x_{1}, \dots, x_{n}\) are linearly independent)
\(\implies\) \(L\) is injective. (by 4.10)

\(\mathrm{im}(L) = \mathrm{Span}(L(\underline{e}_{1}), \dots, L(\underline{e}_{n}))\) (by 4.9(a))
\(= \mathrm{Span}(x_{1}, \dots, x_{n}) = V\) (because \(x_{1}, \dots, x_{n}\) span \(V\))
\(\implies\) \(L\) is surjective. \(\square\)
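Aside: for \(V = \mathbb{P}_{2}\) with basis \(1, t, t^{2}\), the coordinate isomorphism \(I_{1,t,t^{2}}\) and its inverse can be sketched in sympy (the helper name `I` is ours); `all_coeffs` recovers the coordinates of a polynomial:

```python
from sympy import symbols, Poly

t = symbols('t')
basis = [1, t, t**2]  # the basis 1, t, t^2 of P_2

def I(a):
    # I_{1, t, t^2}: R^3 -> P_2, (a0, a1, a2) |-> a0 + a1*t + a2*t^2
    return sum(ai * xi for ai, xi in zip(a, basis))

p = I([5, 0, -3])
print(p)                              # the polynomial 5 - 3*t^2
print(Poly(p, t).all_coeffs()[::-1])  # [5, 0, -3]: the coordinates, recovered
```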
Theorem 4.14 Let \(V\) and \(W\) be vector spaces over a field \(F\) of dimension \(n\) and \(m\), respectively. Then \(V\) and \(W\) are isomorphic if and only if \(n=m\).
Proof: “\(\Longleftarrow\)”:
We assume that \(n = m\).
\(\implies\) We have isomorphisms \(L_{V}: F^{n} \to V\) and \(L_{W}: F^{n} \to W\) (by 4.13)
\(\implies\) \(L_{W} \circ L_{V}^{-1}\) is an isomorphism between \(V\) and \(W\). (by 4.3(b) and 4.12(b))
“\(\Longrightarrow\)”:
We assume that \(V\) and \(W\) are isomorphic.
Let \(L: V \to W\) be an isomorphism and let \(x_{1}, \dots, x_{n}\) be a basis of \(V\).
\(\implies\) \(L(x_{1}), \dots, L(x_{n})\) span \(\mathrm{im}(L)=W\) (by 4.9(a))
and are linearly independent (by 4.9(b) applied to \(L^{-1}\) and \(L(x_{1}), \dots, L(x_{n})\))
\(\implies\) \(L(x_{1}), \dots, L(x_{n})\) form a basis of \(W\)
\(\implies\) \(n = \mathrm{dim}_{F}(W) = m\). \(\square\)
4.4 Dimension Theorem
Theorem 4.15 (Dimension Theorem) Let \(V\) be a vector space over a field \(F\) of finite dimension and let \(L: V \to W\) be a linear transformation from \(V\) to another vector space \(W\) over \(F\). Then: \[ \boxed{\mathrm{dim}_{F}(\mathrm{ker}(L)) + \mathrm{dim}_{F}(\mathrm{im}(L)) = \mathrm{dim}_{F}(V).} \]
(In the textbooks this is sometimes called the rank–nullity theorem.)
Example 4.16
\[ \begin{array}{ l | l | l | l | l } \mathbf{L: V \to W} & \mathbf{dim(ker(L))} & \mathbf{dim(im(L))} & \mathbf{dim(V)} & \mathbf{Verification} \\ \hline L_{A} \text{ for} & =2 & =2 & =4 & 2 + 2 = 4 \\ A \in M_{4 \times 4}(\mathbb{R}) & \text{by 3.8(d)} & \text{by 4.8} & & \\ \text{as in 4.8} & & & & \\ \hline L_{A} \text{ for} & =3 & =2 & =5 & 3 + 2 = 5 \\ A \in M_{3 \times 5}(\mathbb{R}) & & & & \\ \text{below} & & & & \\ \hline \text{Isomorphism} & =0 & = \text{dim}(W) & = \text{dim}(V) & 0 + \text{ dim}(W) \\ & \text{by 4.10} & & & = \text{ dim}(V) \\ & & & & \text{by 4.14} \\ \hline \text{Zero map} & = \text{dim}(V) & =0 & = \text{dim}(V) & \text{dim}(V) + 0 \\ & & & & = \text{dim}(V) \end{array} \]
Let \[ A = \begin{pmatrix} 1&-2&2&3&-1\\ -3&6&-1&1&-7\\ 2&-4&5&8&-4\end{pmatrix}. \] We want to find \(\dim_{\mathbb{R}}(\mathrm{ker}(L_A))\) and \(\dim_{\mathbb{R}}(\mathrm{im}(L_A))\) – we do this by finding bases for both \(\text{ker}(L_A)\) and \(\text{im}(L_A)\).
We find a basis of the nullspace \(N(A)\):
\[
A = \begin{pmatrix} 1&-2&2&3&-1\\ -3&6&-1&1&-7\\ 2&-4&5&8&-4\end{pmatrix}
\xrightarrow{\text{Gaussian elimination}}
\begin{pmatrix} 1&-2&0&-1&3\\ 0&0&1&2&-2\\ 0&0&0&0&0\end{pmatrix} =: \tilde A.
\]
Then
\[\begin{align*}
N(A) &= \{\underline{x}\in\mathbb{R}^5 : A\underline{x} = \underline{0}\} = \{\underline{x}\in\mathbb{R}^5: \tilde A\underline{x} = \underline{0}\} =\\
&= \left\{\begin{pmatrix}x_1\\x_2\\x_3\\x_4\\x_5\end{pmatrix} \in\mathbb{R}^5: x_1=2x_2+x_4-3x_5,\, x_3=-2x_4+2x_5\right\} = \\
&= \left\{\begin{pmatrix}2x_2+x_4-3x_5\\x_2\\-2x_4+2x_5\\x_4\\x_5\end{pmatrix}: x_2,x_4,x_5\in\mathbb{R}\right\} = \\
&= \left\{x_2\begin{pmatrix}2\\1\\0\\0\\0\end{pmatrix} + x_4\begin{pmatrix} 1\\0\\-2\\1\\0 \end{pmatrix} + x_5\begin{pmatrix} -3\\0\\2\\0\\1\end{pmatrix}: x_2,x_4,x_5\in\mathbb{R} \right\}=\\
&=\mathrm{Span}_{\mathbb{R}}\left(\begin{pmatrix}2\\1\\0\\0\\0\end{pmatrix}, \begin{pmatrix} 1\\0\\-2\\1\\0 \end{pmatrix}, \begin{pmatrix} -3\\0\\2\\0\\1\end{pmatrix}\right).
\end{align*}\]
The three vectors above are linearly independent, since if \(a_1,a_2,a_3\in\mathbb{R}\) and
\[
a_1\begin{pmatrix}2\\1\\0\\0\\0\end{pmatrix} + a_2\begin{pmatrix} 1\\0\\-2\\1\\0 \end{pmatrix} + a_3\begin{pmatrix} -3\\0\\2\\0\\1\end{pmatrix} = \begin{pmatrix}0\\0\\0\\0\\0\end{pmatrix},
\]
then \(a_1=a_2=a_3=0\) by looking at the second, fourth and fifth coordinates, respectively.
Thus these three vectors are a basis of \(N(A)\), so \(\dim_{\mathbb{R}}(\mathrm{ker}(L_A))=\mathrm{dim}_{\mathbb{R}}(N(A))=3\).
We now find a basis of the image \(\mathrm{im}(L_A)\): \[ A = \begin{pmatrix} 1&-2&2&3&-1\\ -3&6&-1&1&-7\\ 2&-4&5&8&-4\end{pmatrix} \xrightarrow{\text{column operations}} \begin{pmatrix} 1&0&0&0&0\\ -3&0&5&0&0\\ 2&0&1&0&0\end{pmatrix}. \] So the vectors \(\begin{pmatrix}1\\-3\\2\end{pmatrix}, \begin{pmatrix}0\\5\\1\end{pmatrix}\) span \(\mathrm{im}(L_A)\). Since they are obviously not multiples of each other, they are linearly independent, hence form a basis of \(\mathrm{im}(L_A)\). Consequently, \(\mathrm{dim}_{\mathbb{R}}(\mathrm{im}(L_A))=2\).
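Aside: sympy cross-checks the whole example, and the final line below is exactly the Dimension Theorem for \(L_A\) (a verification sketch):

```python
from sympy import Matrix

A = Matrix([[1, -2, 2, 3, -1],
            [-3, 6, -1, 1, -7],
            [2, -4, 5, 8, -4]])
null_basis = A.nullspace()    # a basis of ker(L_A) = N(A)
col_basis = A.columnspace()   # a basis of im(L_A) = Col(A)
print(len(null_basis), len(col_basis))             # 3 2
print(len(null_basis) + len(col_basis) == A.cols)  # True: 3 + 2 = 5 = dim(R^5)
```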
Proof of the Dimension Theorem:
Let \(x_1,\dots,x_r\) be a basis of \(\mathrm{ker}(L)\).
We extend \(x_1,\dots,x_r\) to a basis \(x_1,\dots,x_n\) of the whole \(V\) for some \(n\geq r\) (by adding L.I. vectors in \(V\) until we obtain a maximal L.I. subset) and show below that \(L(x_{r+1}),\dots,L(x_{n})\) form a basis of \(\mathrm{im}(L)\). Then we have
\[
\dim_F(\mathrm{ker}(L)) + \dim_F(\mathrm{im}(L)) = r + (n-r) = n = \dim_F(V),
\]
as we wanted to prove.
Proof that \(L(x_{r+1}),\dots, L(x_n)\) form a basis of \(\mathrm{im}(L)\):
\(L(x_{r+1}),\dots, L(x_n)\) span \(\mathrm{im}(L)\):
Let \(y\in \text{im}(L)\).
\(\implies\) \(\exists x\in V\) such that \(y=L(x)\) (by definition of \(\mathrm{im}(L)\))
and \(\exists a_1,\dots, a_n\in F\) such that \(x=a_1x_1+\dots+a_nx_n\) (since \(x_1,\dots, x_n\) span \(V\))
\(\implies\) \(y=L(x)=L(a_1x_1+\dots+a_nx_n)\)
\(=a_1L(x_1)+\dots+a_nL(x_n)\) (because \(L\) is a linear transformation)
\(=a_{r+1}L(x_{r+1})+\dots + a_nL(x_n)\) (because \(x_1,\dots,x_r\in\mathrm{ker}(L)\))
\(\in \mathrm{Span}(L(x_{r+1}),\dots, L(x_n))\)
\(\implies \mathrm{im}(L) \subseteq \mathrm{Span}(L(x_{r+1}),\dots, L(x_n))\).
We also have \(\mathrm{Span}(L(x_{r+1}),\dots, L(x_n))\subseteq \mathrm{im}(L)\) (by 3.3/Note(i))
\(\implies\) \(\mathrm{im}(L) = \mathrm{Span}(L(x_{r+1}),\dots, L(x_n))\).

\(L(x_{r+1}),\dots, L(x_n)\) are linearly independent:
Let \(a_{r+1},\dots,a_n\in F\) such that \(a_{r+1}L(x_{r+1})+\dots+a_nL(x_n)=0_W\).
\(\implies\) \(L(a_{r+1}x_{r+1}+\dots+ a_nx_n)=0_W\) (because \(L\) is a linear transformation)
\(\implies\) \(a_{r+1}x_{r+1}+\dots + a_nx_n\in\mathrm{ker}(L)\) (by definition of kernel)
\(\implies\) \(\exists a_1,\dots, a_r\in F\) such that \(a_{r+1}x_{r+1}+\dots+a_nx_n = a_1x_1+\dots+a_rx_r\)
(because \(x_1,\dots,x_r\) span \(\mathrm{ker}(L)\))
\(\implies\) \(a_1x_1+\dots+a_rx_r-a_{r+1}x_{r+1}-\dots-a_nx_n=0_V\)
\(\implies\) \(a_1=\dots=a_r=-a_{r+1}=\dots=-a_{n}=0\) (because \(x_1,\dots,x_n\) are linearly independent)
\(\implies\) \(a_{r+1}=\dots=a_n=0\). \(\square\)
4.5 Matrix representation II
Proposition 4.17 (Matrix representation II) Let \(V\) and \(W\) be vector spaces over a field \(F\) with bases \(x_{1} , \dots, x_{n}\) and \(y_{1}, \dots, y_{m}\), respectively. Let \(L: V \to W\) be a linear transformation. Then there exists a unique matrix \(A \in M_{m \times n}(F)\) that represents \(L\) with respect to \(x_{1}, \dots, x_{n}\) and \(y_{1}, \dots, y_{m}\). Here we say that \(A = (a_{ij}) \in M_{m \times n}(F)\) represents \(L\) with respect to \(x_{1}, \dots, x_{n}\) and \(y_{1}, \dots, y_{m}\) if for all \(c_{1}, \dots, c_{n}, d_1, \dots, d_m \in F\) we have \[ L(c_{1}x_{1} + \dots + c_{n}x_{n}) = d_{1}y_{1} + \dots + d_{m}y_{m} \quad\iff\quad \begin{pmatrix} d_{1} \\ \vdots \\ d_{m} \end{pmatrix} = A \begin{pmatrix} c_{1} \\ \vdots \\ c_{n} \end{pmatrix}. \]
Proof:
Let \(A \in M_{m \times n}(F)\). Then:
\(A\) represents \(L\) with respect to \(x_{1}, \dots, x_{n}\) and \(y_{1}, \dots, y_{m}\)
\(\stackrel{(\star)}{\iff}\) The diagram
\[\begin{array}{r c c c l} &V & \xrightarrow{ \phantom{a} L \phantom{a}} & W \\
I_{x_1,\dots,x_n} & \uparrow & & \uparrow& I_{y_1,\dots,y_m} \\
& F^{n} & \xrightarrow{\phantom{a} L_{A} \phantom{a}} & F^{m} \end{array}
\]
commutes, i.e. \(L \circ I_{x_{1}, \dots, x_{n}} = I_{y_{1}, \dots, y_{m}} \circ L_{A}\) (see proof below)
\(\iff\) \(L_{A} = I_{y_{1}, \dots, y_{m}}^{-1} \circ L \circ I_{x_{1}, \dots, x_{n}} =: M\)
\(\iff\) \(A\) represents \(M: F^{n} \to F^{m}\) (with respect to the standard bases of \(F^{n}\) and \(F^{m}\)).
Hence 4.17 follows from 4.4.
Proof of (\(\star\)):
Let \(c_{1} , \dots, c_{n} \in F\) and let \(d_{1}, \dots, d_{m} \in F\) be given by
\(A \begin{pmatrix} c_{1} \\ \vdots \\ c_{n} \end{pmatrix} = \begin{pmatrix} d_{1} \\ \vdots \\ d_{m} \end{pmatrix}\)
\(\implies\) \((L \circ I_{x_{1} , \dots, x_{n}}) \left( \begin{pmatrix} c_{1} \\ \vdots \\ c_{n} \end{pmatrix} \right) = L(c_1x_1+\dots+c_nx_n)\)
and \((I_{y_1,\dots,y_m}\circ L_A)\left(\begin{pmatrix} c_{1} \\ \vdots \\ c_{n} \end{pmatrix} \right) = I_{y_1,\dots,y_m}\left(\begin{pmatrix} d_{1} \\ \vdots \\ d_{m} \end{pmatrix}\right) = d_1y_1+\dots+d_my_m\).
Hence: \(L(c_{1}x_{1} + \dots + c_{n}x_{n}) = d_{1}y_{1} + \dots + d_{m}y_{m}\)
\(\iff\) \((L \circ I_{x_{1}, \dots, x_{n}}) \left( \begin{pmatrix} c_{1} \\ \vdots \\ c_{n} \end{pmatrix} \right) = ( I_{y_{1}, \dots, y_{m}} \circ L_{A}) \left( \begin{pmatrix} c_{1} \\ \vdots \\ c_{n} \end{pmatrix} \right)\)
Therefore: \(A\) represents \(L \iff L \circ I_{x_{1}, \dots, x_{n}} = I_{y_{1}, \dots, y_{m}} \circ L_{A}\).
\(\square\)
Note: Given \(L, x_{1}, \dots, x_{n}\) and \(y_{1}, \dots, y_{m}\) as in 4.17 we find the corresponding matrix \(A\) as follows: For each \(i = 1, \dots, n\) we compute \(L(x_{i})\), represent \(L(x_{i})\) as a linear combination of \(y_{1}, \dots, y_{m}\) and write the coefficients of this linear combination into the \(i^{th}\) column of \(A\).
Example 4.18 Find the matrix \(A \in M_{3 \times 4}(\mathbb{R})\) representing differentiation \(D: \mathbb{P}_{3} \to \mathbb{P}_{2}, f \mapsto f'\), with respect to the bases \(1,t,t^{2},t^{3}\) and \(1,t,t^{2}\) of \(\mathbb{P}_{3}\) and \(\mathbb{P}_{2}\), respectively.
Solution: We have \[\begin{align*} & D(1) = 0 = 0 + 0t + 0t^{2} \\ & D(t) = 1 = 1 + 0t + 0t^{2} \\ & D(t^{2}) = 2t = 0 + 2t + 0t^{2} \\ & D(t^{3}) = 3t^{2} = 0 + 0t + 3t^{2} \end{align*}\] \(\implies\) \(A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix}\).
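Aside: the recipe from the Note can be automated for this example; the sketch below (helper names ours) differentiates each basis polynomial of \(\mathbb{P}_{3}\) and writes its coordinates with respect to \(1, t, t^{2}\) into a column:

```python
from sympy import symbols, diff, Poly, Matrix

t = symbols('t')
src_basis = [1, t, t**2, t**3]  # basis of P_3
m = 3                           # dim(P_2); coordinates w.r.t. 1, t, t^2

def coords(p):
    # ascending coefficients of p w.r.t. 1, t, t^2, padded to length m
    c = Poly(p, t).all_coeffs()[::-1]
    return c + [0] * (m - len(c))

A = Matrix([coords(diff(f, t)) for f in src_basis]).T  # columns = images of the basis
print(A)  # Matrix([[0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 3]])
```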
Example 4.19 Let \(B := \begin{pmatrix} 1 & -1 \\ 2 & 4 \end{pmatrix} \in M_{2 \times 2}(\mathbb{R})\). Find the matrix \(A \in M_{2 \times 2}(\mathbb{R})\) representing the linear transformation \(L_{B}: \mathbb{R}^{2} \to \mathbb{R}^{2}, \underline{x} \mapsto B \underline{x}\), with respect to the basis \(\begin{pmatrix} 1 \\ -2 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix}\) of \(\mathbb{R}^{2}\) (used for both source and target space).
Solution: \[\begin{align*} L_{B} \left( \begin{pmatrix} 1 \\ -2 \end{pmatrix} \right) &= \begin{pmatrix} 1 & -1 \\ 2 & 4 \end{pmatrix} \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} 3 \\ -6 \end{pmatrix} = 3 \begin{pmatrix} 1 \\ -2 \end{pmatrix} + 0 \begin{pmatrix} 1 \\ -1 \end{pmatrix}\\ L_{B} \left( \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right) &= \begin{pmatrix} 1 & -1 \\ 2 & 4 \end{pmatrix} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 2 \\ -2 \end{pmatrix} = 0 \begin{pmatrix} 1 \\ -2 \end{pmatrix} + 2 \begin{pmatrix} 1 \\ -1 \end{pmatrix} \end{align*}\] \(\implies\) \(A = \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}\).
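Aside: writing \(P\) for the matrix whose columns are the new basis vectors, the representing matrix is \(A = P^{-1}BP\); this follows from 4.17, since for \(V = F^{n}\) with basis the columns of \(P\) the coordinate map \(I_{x_{1},\dots,x_{n}}\) is just \(L_{P}\). A quick numeric check of the result (our own verification, not from the notes):

```python
from sympy import Matrix

B = Matrix([[1, -1], [2, 4]])
P = Matrix([[1, 1], [-2, -1]])  # columns: the basis vectors (1, -2) and (1, -1)
print(P.inv() * B * P)          # Matrix([[3, 0], [0, 2]])
```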