Chapter 6 Diagonalisability

6.1 Eigen-things

Definition 6.1 Let V be a vector space over a field \(F\) and let \(L: V \to V\) be a linear transformation from \(V\) to itself.

  1. For any \(\lambda \in F\) the set \[ {\color{red}{E_{\lambda}(L)}} := \{ x \in V : L(x)= \lambda x \} \] is called the eigenspace of \(L\) corresponding to \(\lambda\).

  2. An element \(\lambda \in F\) is called an eigenvalue of \(L\) if \(E_{\lambda}(L)\) is not the zero space. In this case any vector \(x\) in \(E_{\lambda}(L)\) different from the zero vector is called an eigenvector of \(L\) with eigenvalue \(\lambda\).

  3. Let \(A \in M_{n \times n}(F)\). The eigenspaces, eigenvalues and eigenvectors of \(A\) are, by definition, those of \(L_{A} : F^{n} \to F^{n}, \underline{x} \mapsto A \underline{x}\).

(Compare Definition 7.1 of L.A.I.)

Proposition 6.2 Let \(F\), \(V\) and \(L\) be as in Defn 6.1. Then \(E_{\lambda}(L)\) is a subspace of \(V\) for every \(\lambda \in F\).

Proof:

  1. We have \(0_{V} \in E_{\lambda}(L)\) because \(L(0_{V}) = 0_{V} = \lambda \cdot 0_{V}\).
  2. Let \(x,y \in E_{\lambda} (L)\)
    \(\implies L(x+y) = L(x) + L(y) = \lambda x + \lambda y = \lambda(x+y)\)
    \(\implies x + y \in E_{\lambda}(L)\).
  3. Let \(a \in F\) and \(x \in E_{\lambda }(L)\)
    \(\implies L(ax) = aL(x) = a(\lambda x) = (a \lambda)x = (\lambda a)x = \lambda(ax)\)
    \(\implies ax \in E_{\lambda}(L)\).  \(\square\)

Proposition 6.3 Let \(F\) be a field. Let \(A \in M_{n \times n} (F)\) and \(\lambda \in F\). Then: \[\lambda \text{ is an eigenvalue of } A \iff \mathrm{det}(\lambda I_{n} - A) = 0. \] (The polynomial \(p_A(\lambda) := \mathrm{det}(\lambda I_{n} - A)\) is called the characteristic polynomial of \(A\). See also Proposition 7.5 in L.A.I.)

Proof: \(\lambda\) is an eigenvalue of A
\(\iff \exists \underline{x} \in F^{n}, \; \underline{x} \neq \underline{0},\;\) such that \(A \underline{x} = \lambda \underline{x}\)
\(\iff \exists \underline{x} \in F^{n}, \; \underline{x} \neq \underline{0}, \;\) such that \((\lambda I_{n} - A) \underline{x} = \underline{0}\)
\(\iff N(\lambda I_{n} - A) \neq \{ \underline{0} \}\)
\(\iff\) \(\mathrm{det}(\lambda I_{n} - A) = 0\).  (by 5.6 (c) \(\iff\) (d)) \(\square\)

Example 6.4 Determine the (complex!) eigenvalues of the matrix \[A:= \begin{pmatrix} 5i & 3 \\ 2 & -2i \end{pmatrix} \in M_{2 \times 2} (\mathbb{C}) \] and a basis of the eigenspace of \(A\) for each eigenvalue of \(A\).

Solution: \[\begin{align*} p_{A}(\lambda) &= \mathrm{det}(\lambda I_{2} - A) = \mathrm{det} \begin{pmatrix} \lambda - 5i & -3 \\ -2 & \lambda + 2i \end{pmatrix}\\ &= (\lambda - 5i)(\lambda + 2i) - 6 = \lambda^{2} -(3i)\lambda + 4 \end{align*}\] The two roots of this polynomial are \(\lambda_{1,2} = \frac{3i \pm \sqrt{-9 -16}}{2} =\frac{3i \pm 5i}{2} = 4i\) or \(-i\);
\(\implies\) eigenvalues of \(A\) are \(4i\) and \(-i\).

Basis of \(E_{4i}(A)\): Apply Gaussian elimination to \[ 4i I_{2} - A = \begin{pmatrix} -i & -3 \\ -2 & 6i \end{pmatrix} \xrightarrow{R1 \mapsto iR1} \begin{pmatrix} 1 & -3i \\ -2 & 6i \end{pmatrix} \xrightarrow{R2 \mapsto R2 + 2R1} \begin{pmatrix} 1 & -3i \\ 0 & 0 \end{pmatrix} \] \[\begin{align*}\implies E_{4i}(A) &= \left\{ \begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix} \in \mathbb{C}^{2} : x_{1} = (3i)x_{2} \right\} \\ &= \left\{ \begin{pmatrix} (3i)x_{2} \\ x_{2} \end{pmatrix} : x_{2} \in \mathbb{C} \right\} = \mathrm{Span} \left(\begin{pmatrix} 3i \\ 1 \end{pmatrix} \right) \end{align*}\] \(\implies\) a basis of \(E_{4i}(A)\) is \(\begin{pmatrix} 3i \\ 1 \end{pmatrix}\) (as it is L.I.).

Basis of \(E_{-i}(A)\): Apply Gaussian elimination to \[ -i I_{2} - A = \begin{pmatrix} -6i & -3 \\ -2 & i \end{pmatrix} \xrightarrow{R1 \leftrightarrow R2} \begin{pmatrix} -2 & i \\ -6i & -3 \end{pmatrix} \xrightarrow{R2 \mapsto R2 -(3i)R1} \begin{pmatrix} -2 & i \\ 0 & 0 \end{pmatrix} \] \[\begin{align*} \implies E_{-i}(A) &= \left\{ \begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix} \in \mathbb{C}^2 : x_{1} = \frac{i}{2} x_{2} \right\} \\ &=\left\{ \begin{pmatrix} \frac{i}{2} x_{2} \\ x_{2} \end{pmatrix} : x_{2} \in \mathbb{C} \right\} = \mathrm{Span} \left( \begin{pmatrix} \frac{i}{2} \\ 1 \end{pmatrix} \right) \end{align*}\] \(\implies\) a basis of \(E_{-i}(A)\) is \(\begin{pmatrix} i \\ 2 \end{pmatrix}\) (a non-zero scalar multiple of \(\begin{pmatrix} \frac{i}{2} \\ 1 \end{pmatrix}\), chosen to avoid fractions; a single non-zero vector is L.I.).
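As an optional aside, the computations of Example 6.4 can be checked with the Python library SymPy (assuming it is available); this is only a sanity check, not part of the argument.

```python
import sympy as sp

# The matrix from Example 6.4, with exact complex entries
A = sp.Matrix([[5*sp.I, 3], [2, -2*sp.I]])

# Characteristic polynomial det(lambda*I - A), which equals lambda**2 - 3*I*lambda + 4
lam = sp.symbols('lambda')
print(A.charpoly(lam).as_expr())

# Eigenvalues with their algebraic multiplicities: 4*I and -I, each occurring once
print(A.eigenvals())

# The basis vectors found above satisfy the eigenvector equation A*x = lambda*x
x1 = sp.Matrix([3*sp.I, 1])   # basis of E_{4i}(A)
x2 = sp.Matrix([sp.I, 2])     # basis of E_{-i}(A)
assert A * x1 == 4*sp.I * x1
assert A * x2 == -sp.I * x2
```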

Example 6.5 Let \(V\) be the real vector space of infinitely often differentiable functions from \(\mathbb{R}\) to \(\mathbb{R}\) and let \(D: V \rightarrow V, \ f \mapsto f',\) denote differentiation (cf. 4.3(d)). Then for every \(\lambda \in \mathbb{R}\) the eigenspace of \(D\) with eigenvalue \(\lambda\) is of dimension 1 with basis given by the function \(\exp_{\lambda}: \mathbb{R} \rightarrow \mathbb{R}, \ t \mapsto e^{\lambda t}\).

Proof: \((e^{\lambda t})' = \lambda (e^{\lambda t})\)  (by the chain rule)
\(\implies \exp_{\lambda} \in E_{\lambda}(D)\).
Conversely, suppose \(f \in E_{\lambda}(D)\)
\(\implies\) \((f(t) e^{-\lambda t})' = f'(t) e^{-\lambda t} + f(t)(e^{-\lambda t})'\)  (by the product rule)
             \(= \lambda f(t) e^{-\lambda t} - \lambda f(t) e^{-\lambda t}\)  (because \(f \in E_{\lambda}(D)\) and by the chain rule)
             \(= 0\)
\(\implies\) \(f(t) e^{-\lambda t}\) is a constant, say \(a \in \mathbb{R}\)  (by Calculus)
\(\implies\) \(f(t) = a e^{\lambda t}\), i.e. \(f=a\exp_{\lambda}\).
Hence \(E_{\lambda}(D) = \mathrm{Span} ( \exp_{\lambda} )\), and since \(\exp_{\lambda}\) is not the zero function this eigenspace has dimension 1 with basis \(\exp_{\lambda}\).  \(\square\)
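The defining identity \((e^{\lambda t})' = \lambda e^{\lambda t}\) can also be confirmed symbolically; the following is a minimal sketch assuming SymPy is available (it does not, of course, replace the converse direction of the proof).

```python
import sympy as sp

# exp_lambda(t) = e^{lambda*t} is an eigenvector of D with eigenvalue lambda
t, lam = sp.symbols('t lambda', real=True)
f = sp.exp(lam * t)

# D(f) - lambda*f should be identically zero
assert sp.simplify(sp.diff(f, t) - lam * f) == 0
```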

6.2 Diagonalisability

Definition 6.6

  1. Let \(F\), \(V\) and \(L: V \rightarrow V\) be as in Defn 6.1. We say that \(L\) is diagonalisable if there exists a basis \(x_{1} , \dots, x_{n}\) of \(V\) such that the matrix \(D\) representing \(L\) with respect to this basis is a diagonal matrix.

  2. Let \(F\) be a field. We say that a square matrix \(A \in M_{n \times n}(F)\) is diagonalisable if the linear transformation \(L_{A} : F^{n} \to F^{n}, \ \underline{x} \mapsto A \underline{x}\), is diagonalisable.

6.2.1 Diagonalisability (version 1)

Proposition 6.7 Let \(F\), \(V\) and \(L: V \rightarrow V\) be as in Defn 6.1. Then \(L\) is diagonalisable if and only if \(V\) has a basis \(x_{1}, \dots, x_{n}\) consisting of eigenvectors of \(L\).

Proof: “\(\Longrightarrow\)”:
Suppose \(\exists\) a basis \(x_{1}, \dots, x_{n}\) of \(V\) such that the matrix \(D\) representing \(L\) is diagonal, with some \(\lambda_{1}, \dots, \lambda_{n} \in F\) on the main diagonal.
\(\implies\) for any \(c_1,\dots,c_n\in F\) we have \[\begin{align*} L(c_1x_1+\dots+c_nx_n) &= (\lambda_1c_1)x_1+\dots+(\lambda_nc_n)x_n,\\ \text{because }\begin{pmatrix}\lambda_1c_1\\ \vdots \\ \lambda_nc_n\end{pmatrix} &= \begin{pmatrix} \lambda_{1} & & 0 \\ & \ddots & \\ 0 & & \lambda_{n} \end{pmatrix}\cdot \begin{pmatrix}c_1\\ \vdots\\ c_n\end{pmatrix}. \end{align*}\] \(\implies\) in particular when \(c_1=0,\dots,c_{i-1}=0,c_i=1,c_{i+1}=0,\dots,c_n=0\) for some \(i\in\{1,\dots,n\}\), we get \[ L(x_i)=\lambda_i x_i \] \(\implies\) \(x_{1}, \dots, x_{n}\) are eigenvectors of \(L\) with eigenvalues \(\lambda_{1}, \dots, \lambda_{n}\), respectively.

“\(\Longleftarrow\)”:
Let \(x_{1}, \dots, x_{n}\) be a basis of \(V\) consisting of eigenvectors of \(L\) and let \(\lambda_{i} \in F\) denote the eigenvalue corresponding to \(x_{i}\).
Define a diagonal matrix \(D\) by \(D = \begin{pmatrix} \lambda_{1} & & 0 \\ & \ddots & \\ 0 & & \lambda_{n} \end{pmatrix}\).
\(\implies\) \(D\) represents \(L\) with respect to \(x_{1}, \dots, x_{n}\) because
\[\begin{align*} L(c_{1}x_{1} + \dots + c_{n}x_{n}) &= c_{1}L(x_{1}) + \dots + c_{n}L(x_{n}) = \lambda_{1}c_{1}x_{1} + \dots + \lambda_{n}c_{n}x_{n}\\ \text{and }\begin{pmatrix} \lambda_{1}c_{1} \\ \vdots \\ \lambda_{n}c_{n} \end{pmatrix} &= D \begin{pmatrix} c_{1} \\ \vdots \\ c_{n} \end{pmatrix} \ \ \forall c_{1}, \dots, c_{n} \in F. \end{align*}\]  \(\square\)

6.2.2 Diagonalisability (version 2)

Proposition 6.8 Let \(F\) be a field. Let \(A \in M_{n \times n}(F)\). Then \(A\) is diagonalisable if and only if there exists an invertible matrix \(M \in M_{n\times n}(F)\) such that \(M^{-1}AM\) is a diagonal matrix. (In this case we say that \(M\) diagonalises \(A\).)

Proof preparation: Let \(M \in M_{n \times n}(F)\) with column vectors \(\underline{x}_{1}, \dots, \underline{x}_{n}\). Suppose \(M\) is invertible. Then:
            \(\underline{x}_{i}\) is an eigenvector of \(A\) with eigenvalue \(\lambda_{i}\)
            \(\iff A \underline{x}_{i} = \lambda_{i} \underline{x}_{i}\)
            \(\iff AM \underline{e}_{i} = \lambda_{i} (M \underline{e}_{i})\)  (because \(\underline{x}_{i} = M \underline{e}_{i}\) is the \(i^\mathrm{th}\) column of \(M\))
            \(\iff AM \underline{e}_{i} = M(\lambda_{i} \underline{e}_{i})\)
            \(\iff M^{-1}AM \underline{e}_{i} = \lambda_{i} \underline{e}_{i}\)  (multiply by \(M^{-1}\) on the left; this step is reversible since \(M\) is invertible)
            \(\iff i^{\mathrm{th}}\) column of \(M^{-1}AM\) is \(\begin{pmatrix} 0 \\ \vdots \\ 0 \\ \lambda_{i} \\ 0 \\ \vdots \\ 0 \end{pmatrix} \leftarrow i^{\mathrm{th}}\) place.

Proof of “\(\Longrightarrow\)”:
\(A\) is diagonalisable
\(\implies\) \(\exists\) a basis \(\underline{x}_{1}, \dots, \underline{x}_{n}\) of \(F^n\) consisting of eigenvectors of \(A\)  (by 6.7)
\(\implies\) The matrix \(M\in M_{n\times n}(F)\) with columns \(\underline{x}_{1}, \dots, \underline{x}_{n}\) is invertible  (by 5.6(b)\(\implies\)(a))
     and \(M^{-1}AM\) is diagonal.  (by “preparation” above)

Proof of “\(\Longleftarrow\)”:
There exists an invertible \(M\in M_{n\times n}(F)\) such that \(M^{-1}AM\) is diagonal
\(\implies\) the columns of \(M\) are eigenvectors of \(A\)  (by “preparation” above)
     and they form a basis of \(F^n\).  (by 5.6(a)\(\implies\)(b) and 3.15)
\(\implies\) \(A\) is diagonalisable.  (by 6.7) \(\square\)

Example 6.9 Show that the matrix \[ A:= \begin{pmatrix} 0 & -1 & 1 \\ -3 & -2 & 3 \\ -2 & -2 & 3 \end{pmatrix} \in M_{3 \times 3}(\mathbb{R}) \] is diagonalisable and find an invertible matrix \(M \in M_{3\times 3}(\mathbb{R})\) that diagonalises it.

Solution: First compute the characteristic polynomial of \(A\): \[\begin{align*} p_{A}(\lambda ) &= \mathrm{det}( \lambda I_{3} - A) = \mathrm{det} \begin{pmatrix} \lambda & 1 & -1 \\ 3 & \lambda + 2 & -3 \\ 2 & 2 & \lambda - 3 \end{pmatrix} \\ &= \lambda(\lambda +2)(\lambda -3) + 1(-3)2 + (-1)3 \cdot 2 - (-1)( \lambda + 2)2 - \lambda(-3)(2) - 1 \cdot 3 (\lambda -3) \\ &= \lambda (\lambda^{2} - \lambda - 6) - 6 - 6 + 2 \lambda + 4 + 6 \lambda - 3 \lambda + 9 \\ &= \lambda^{3} - \lambda^{2} - \lambda + 1 = \lambda^{2} (\lambda - 1) - (\lambda - 1) = (\lambda^{2} - 1)(\lambda - 1) = (\lambda - 1)^{2} (\lambda + 1). \end{align*}\] \(\implies\) Eigenvalues of \(A\) are \(1\) and \(-1\).

Basis of \(E_1 (A)\): We apply Gaussian elimination to \(1\cdot I_3-A\): \[ 1\cdot I_{3} - A = \begin{pmatrix} 1 & 1 & -1 \\ 3 & 3 & -3 \\ 2 & 2 & -2 \end{pmatrix} \xrightarrow[R3 \mapsto R3 - 2R1]{R2 \mapsto R2 - 3R1} \begin{pmatrix} 1 & 1 & -1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} =: \tilde A \] \[\begin{align*} \implies\quad E_{1}(A) &= N(1\cdot I_3-A) = N(\tilde A) = \left\{ \begin{pmatrix} b_{1} \\ b_{2} \\b_{3} \end{pmatrix} \in \mathbb{R}^{3} : b_{1} + b_{2} - b_{3} = 0 \right\} \\ &= \left\{ \begin{pmatrix} -b_{2} + b_{3} \\ b_{2} \\ b_{3} \end{pmatrix} : b_{2}, b_{3} \in \mathbb{R} \right\} \\ &= \left\{ b_{2} \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + b_{3} \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} : b_{2}, b_{3} \in \mathbb{R} \right\} = \mathrm{Span} \left( \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} \right) \end{align*}\] Also \(\underline{x}_{1} := \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}, \underline{x}_{2} := \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}\) are L.I.  (as they are not multiples of each other)
\(\implies\) A basis of \(E_{1}(A)\) is \(\underline{x}_1, \underline{x}_2\).

Basis of \(E_{-1}(A)\): We apply Gaussian elimination to \((-1)I_3 -A\): \[\begin{align*} -I_3-A =& \begin{pmatrix} -1 & 1 & -1 \\ 3 & 1 & -3 \\ 2 & 2 & -4 \end{pmatrix} \xrightarrow[R3 \mapsto R3 + 2R1]{R2 \mapsto R2 + 3 R1} \begin{pmatrix} -1 & 1 & -1 \\ 0 & 4 & -6 \\ 0 & 4 & -6 \end{pmatrix}\\ & \xrightarrow[R3 \mapsto R3 - R2]{R1 \mapsto R1 - \frac{1}{4}R2} \begin{pmatrix} -1 & 0 & \frac{1}{2} \\ 0 & 4 & -6 \\ 0 & 0 & 0 \end{pmatrix} =: \hat{A} \end{align*}\] \[\begin{align*} \implies\quad E_{-1}(A) &= N((-1)\cdot I_3-A) = N(\hat{A}) \\ &= \left\{ \begin{pmatrix} b_{1} \\ b_{2} \\ b_{3} \end{pmatrix} \in \mathbb{R}^{3} : -b_{1} + \frac{1}{2}b_{3} = 0 \text{ and } 4b_{2} - 6b_{3} =0 \right\} \\ &= \left\{ \begin{pmatrix} \frac{1}{2}b_{3} \\ \frac{3}{2}b_{3} \\ b_{3} \end{pmatrix} : b_{3} \in \mathbb{R} \right\} = \mathrm{Span} \left( \begin{pmatrix} \frac{1}{2} \\ \frac{3}{2} \\ 1 \end{pmatrix} \right) \end{align*}\] Also \(\underline{x}_{3} := \begin{pmatrix} \frac{1}{2} \\ \frac{3}{2} \\ 1 \end{pmatrix}\) is linearly independent  (as it is not \(\underline{0}\))
\(\implies\) A basis of \(E_{-1}(A)\) is \(\underline{x}_3\).

For \(M:=( \underline{x}_{1}, \underline{x}_{2}, \underline{x}_{3}) = \begin{pmatrix} -1 & 1 & \frac{1}{2} \\ 1 & 0 & \frac{3}{2} \\ 0 & 1 & 1 \end{pmatrix}\) we have \(\mathrm{det}(M) = \frac{1}{2} + \frac{3}{2} - 1 = 1 \neq 0\)
\(\implies\) \(\underline{x}_{1}, \underline{x}_{2}, \underline{x}_{3}\) form a basis of \(\mathbb{R}^{3}\) consisting of eigenvectors of \(A\)  (by 5.6 and 3.13)
\(\implies A\) is diagonalisable and \(M\) diagonalises \(A\).  (by the proof of 6.8)
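As an optional check of Example 6.9 (assuming SymPy is available), one can confirm in exact arithmetic that \(M\) is invertible and that \(M^{-1}AM\) is the expected diagonal matrix.

```python
import sympy as sp

# The matrix A and the diagonalising matrix M from Example 6.9
A = sp.Matrix([[0, -1, 1], [-3, -2, 3], [-2, -2, 3]])
M = sp.Matrix([[-1, 1, sp.Rational(1, 2)],
               [ 1, 0, sp.Rational(3, 2)],
               [ 0, 1, 1]])

# Characteristic polynomial factors as (lambda - 1)**2 * (lambda + 1)
lam = sp.symbols('lambda')
print(sp.factor(A.charpoly(lam).as_expr()))

assert M.det() == 1                            # M is invertible
assert M.inv() * A * M == sp.diag(1, 1, -1)    # M diagonalises A
```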

6.2.3 Diagonalisability (version 3)

Definition 6.10 Let \(F\) be a field. Let \(A \in M_{n \times n}(F)\) and \(\lambda \in F\) be an eigenvalue of \(A\).

  1. The algebraic multiplicity \(a_{\lambda}(A)\) of \(\lambda\) is its multiplicity as a root of the characteristic polynomial of \(A\).
  2. The geometric multiplicity \(g_{\lambda}(A)\) of \(\lambda\) is the dimension of the eigenspace \(E_{\lambda}(A)\).

Example 6.11 (a) In Example 6.9 we had \(p_{A}(\lambda) = ( \lambda-1)^{2}( \lambda + 1)\), so \(a_{1}(A) = 2\) and \(a_{-1}(A)=1\). Looking at the eigenspaces, we had \(g_{1}(A)=2\) and \(g_{-1}(A) = 1\).

(b) Let \(A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \in M_{2 \times 2}({F})\) (for any field \(F\))
    \(\implies\) \(p_{A}(\lambda) = (\lambda - 1)^{2}\) and a basis of \(E_{1}(A)\) is \(\begin{pmatrix} 1 \\ 0 \end{pmatrix}\)
    \(\implies\) \(a_{1}(A) = 2\) but \(g_{1}(A) = 1\).
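Both multiplicities in Example 6.11(b) can be read off with SymPy as well (an optional sketch, assuming the library is available):

```python
import sympy as sp

# The matrix of Example 6.11(b)
A = sp.Matrix([[1, 1], [0, 1]])

# Algebraic multiplicity: eigenvalue 1 occurs twice in the characteristic polynomial
print(A.eigenvals())                 # {1: 2}, i.e. a_1(A) = 2

# Geometric multiplicity: the eigenspace basis returned for eigenvalue 1 has length 1
val, alg_mult, basis = A.eigenvects()[0]
print(val, alg_mult, len(basis))     # i.e. g_1(A) = 1

# Consistent with Theorem 6.12: A is not diagonalisable
assert not A.is_diagonalizable()
```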

Theorem 6.12 Let \(F\) be a field. Let \(A \in M_{n \times n}(F)\). Then \(A\) is diagonalisable if and only if the characteristic polynomial of \(A\) splits into linear factors and the algebraic multiplicity equals the geometric multiplicity for each eigenvalue of \(A\).

Proof: Omitted.

Example 6.13 Determine whether the matrix \(A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\) is diagonalisable when viewed as an element of \(M_{2 \times 2}(\mathbb{R})\), of \(M_{2 \times 2}(\mathbb{C})\) and of \(M_{2 \times 2}(\mathbb{F}_{2})\). If \(A\) is diagonalisable then determine an invertible matrix \(M\) that diagonalises \(A\).

Solution: \(p_{A}(\lambda) = \mathrm{det} \begin{pmatrix} \lambda & -1 \\ 1 & \lambda \end{pmatrix} = \lambda^{2} + 1\).

For \(\mathbb{R}\):
\(\lambda^{2} + 1\) does not split into linear factors
\(\implies\) as an element of \(M_{ 2 \times 2}(\mathbb{R})\) the matrix \(A\) is not diagonalisable.  (by 6.12)
(Geometrically, \(L_{A}\) is a rotation by \(90^{\circ}\) (clockwise) about the origin; such a rotation maps no line through the origin to itself, so it can have no real eigenvector.)

For \(\mathbb{C}\):
\(p_A(\lambda)= \lambda^{2} + 1 = (\lambda + i)(\lambda - i)\)
\(\implies\) \(a_{i}(A) = 1\) and \(a_{-i}(A) = 1\).

Basis of \(E_{i}(A)\): We apply Gaussian elimination to \(iI_2-A\): \[ iI_{2} - A = \begin{pmatrix} i & -1 \\ 1 & i \end{pmatrix} \xrightarrow{R1 \mapsto (-i)R1} \begin{pmatrix} 1 & i \\ 1 & i \end{pmatrix} \xrightarrow{R2 \mapsto R2 - R1} \begin{pmatrix} 1 & i \\ 0 & 0 \end{pmatrix} =: \tilde A \] \[\begin{align*} \implies\quad E_{i}(A) &= N(iI_2-A) = N(\tilde A) = \left\{ \begin{pmatrix} b_{1} \\ b_{2} \end{pmatrix} \in \mathbb{C}^{2} : b_{1} + ib_{2} = 0 \right\} \\ &= \left\{ \begin{pmatrix} -ib_{2} \\ b_{2} \end{pmatrix} : b_{2} \in \mathbb{C} \right\} = \mathrm{Span} \left( \begin{pmatrix} -i \\ 1 \end{pmatrix} \right) \end{align*}\]        Also \(\begin{pmatrix}-i\\1\end{pmatrix}\) is linearly independent  (as it is not \(\underline{0}\))
\(\implies\) \(\begin{pmatrix}-i\\1\end{pmatrix}\) is a basis of \(E_i(A)\)
\(\implies\) \(g_{i}(A) = 1\).

Basis of \(E_{-i}(A)\): We apply Gaussian elimination to \((-i)I_2-A\): \[ -iI_{2} - A = \begin{pmatrix} -i & -1 \\ 1 & -i \end{pmatrix} \xrightarrow{R1 \mapsto iR1} \begin{pmatrix} 1 & -i \\ 1 & -i \end{pmatrix} \xrightarrow{R2 \mapsto R2 -R1} \begin{pmatrix} 1 & -i \\ 0 & 0 \end{pmatrix} =: \hat A \] \[\begin{align*} \implies\quad E_{-i}(A) &= N((-i)I_2-A) = N(\hat A) = \left\{ \begin{pmatrix} b_{1} \\ b_{2} \end{pmatrix} \in \mathbb{C}^{2} : b_{1} - i b_{2} = 0 \right\}\\ &= \left\{ \begin{pmatrix} ib_{2} \\ b_{2} \end{pmatrix} : b_{2} \in \mathbb{C} \right\} = \mathrm{Span} \left( \begin{pmatrix} i \\ 1 \end{pmatrix} \right) \end{align*}\]        Also \(\begin{pmatrix}i\\1\end{pmatrix}\) is linearly independent  (as it is not \(\underline{0}\))
\(\implies\) \(\begin{pmatrix}i\\1\end{pmatrix}\) is a basis of \(E_{-i}(A)\)
\(\implies\) \(g_{-i}(A) = 1\).

\(\implies\) \(A\) is diagonalisable when viewed as an element of \(M_{2 \times 2}(\mathbb{C})\)  (by 6.12)
             and \(M = \begin{pmatrix} -i & i \\ 1 & 1 \end{pmatrix}\) diagonalises \(A\).

For \(\mathbb{F}_{2}\): \(p_A(\lambda) = \lambda^{2} + 1 = (\lambda + 1)^{2}\)  (since \(1 + 1 =0\) in \(\mathbb{F}_{2}\))
\(\implies\) \(A\) has a single eigenvalue \(1 = -1\) and \(a_{1}(A) =2\).

Basis of \(E_{1}(A)\): We apply Gaussian elimination to \(1\cdot I_2-A\): \[ I_{2} - A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \xrightarrow{R2 \mapsto R2 - R1} \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} = \widehat{A} \] \[\begin{align*} \implies\quad E_{1}(A) &= N(I_2-A) = N(\widehat{A}) = \left\{ \begin{pmatrix} b_{1} \\ b_{2} \end{pmatrix} \in \mathbb{F}_{2}^{2} : b_{1} + b_{2} = 0 \right\}\\ &= \left\{ \begin{pmatrix} b_{2} \\ b_{2} \end{pmatrix} : b_{2} \in \mathbb{F}_{2} \right\} = \mathrm{Span} \left( \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right) \end{align*}\]        Also \(\begin{pmatrix}1\\1\end{pmatrix}\) is linearly independent  (as it is not \(\underline{0}\))
\(\implies\) \(\begin{pmatrix}1\\1\end{pmatrix}\) is a basis of \(E_{1}(A)\)
\(\implies\) \(g_1(A) = 1\).

\(\implies\) \(A\) is not diagonalisable.  (by 6.12, since \(a_1(A)=2\neq 1=g_1(A)\))
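The \(\mathbb{R}\) and \(\mathbb{C}\) parts of Example 6.13 can be checked with SymPy (an optional sketch, assuming the library is available; the \(\mathbb{F}_{2}\) computation above is done by hand).

```python
import sympy as sp

A = sp.Matrix([[0, 1], [-1, 0]])
lam = sp.symbols('lambda')

# Over R: lambda**2 + 1 has no real root, so A has no real eigenvalue
print(sp.solveset(lam**2 + 1, lam, domain=sp.S.Reals))   # EmptySet

# Over C: the matrix M built from the two eigenvectors diagonalises A
M = sp.Matrix([[-sp.I, sp.I], [1, 1]])
assert M.det() != 0
assert M.inv() * A * M == sp.diag(sp.I, -sp.I)
```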

6.3 Cayley–Hamilton Theorem

Theorem 6.14 (Cayley–Hamilton Theorem) Let \(F\) be a field, let \(A \in M_{n \times n}(F)\) and let \(p_A\) be the characteristic polynomial of \(A\). Then \(p_{A}(A)\) is the zero matrix.

Example 6.15 Let \(A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \in M_{2 \times 2}(F)\) (for any field \(F\))
\(\implies\) \(p_{A}(\lambda) = \lambda^{2} + 1\)  (see Example 6.13)
\(\implies\) \(p_{A}(A) = A^{2} + 1\cdot I_{2} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}\).
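As an optional check (assuming SymPy), one can confirm this computation, and also evaluate \(p_{A}(A)\) for the \(3 \times 3\) matrix of Example 6.9 using its factorised characteristic polynomial.

```python
import sympy as sp

# Example 6.15: p_A(A) = A**2 + I_2 is the zero matrix
A = sp.Matrix([[0, 1], [-1, 0]])
assert A**2 + sp.eye(2) == sp.zeros(2, 2)

# Example 6.9: p_A(lambda) = (lambda - 1)**2 (lambda + 1), so p_A(B) = (B - I)**2 (B + I)
B = sp.Matrix([[0, -1, 1], [-3, -2, 3], [-2, -2, 3]])
I3 = sp.eye(3)
assert (B - I3)**2 * (B + I3) == sp.zeros(3, 3)
```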

Proof of the Cayley–Hamilton Theorem 6.14:

First Case: When \(A = \left(\begin{smallmatrix} a_{1} & & 0 \\ & \ddots & \\ 0 & & a_{n} \end{smallmatrix}\right)\) is a diagonal matrix. \[\begin{align*} &\implies&\quad p_{A}(\lambda) &= \mathrm{det}(\lambda\cdot I_n-A) = (\lambda - a_{1}) \dots (\lambda - a_{n})\\ &\implies&\quad p_{A}(A) &= (A - a_{1}I_{n}) \cdots (A - a_{n}I_{n})\\ &&&= \left( \begin{smallmatrix} 0 & & & 0 \\ & a_{2} - a_{1} & & \\ & & \ddots & \\ 0 & & & a_{n} - a_{1} \end{smallmatrix} \right) \ \left( \begin{smallmatrix} a_{1} - a_{2} & & & 0 \\ & 0 & & \\ & & \ddots & \\ 0 & & & a_{n} - a_{2} \end{smallmatrix} \right) \dots \left( \begin{smallmatrix} a_{1} - a_{n} & & & 0 \\ & \ddots & & \\ & & a_{n-1} - a_{n} & \\ 0 & & & 0 \end{smallmatrix} \right) \\ &&&= \underline{0}, \end{align*}\] because the product of diagonal matrices with diagonal entries \(b_{1}, \dots, b_{n}\) and \(c_{1}, \dots, c_{n}\) respectively is the diagonal matrix with diagonal entries \(b_{1}c_{1}, \dots, b_{n}c_{n}\), and for each \(i\) the \(i\)-th factor \(A - a_{i}I_{n}\) has \(0\) as its \(i\)-th diagonal entry, so every diagonal entry of the product is \(0\).

Preparatory Step: If \(A,M,D\in M_{n\times n}(F)\) are such that \(M\) is invertible and \(D = M^{-1}AM\), then: \[\begin{align*} p_D(\lambda) &= \mathrm{det}(\lambda\cdot I_n-D) = \mathrm{det}(\lambda\cdot I_n-M^{-1}AM)\\ &= \mathrm{det}(M^{-1}(\lambda I_n)M - M^{-1}AM) = \mathrm{det}(M^{-1}(\lambda I_n-A)M)\\ &= \mathrm{det}(M)^{-1}\mathrm{det}(\lambda I_n -A)\mathrm{det}(M) = \mathrm{det}(\lambda I_n -A)\\ &= p_A(\lambda). \end{align*}\] In other words, the characteristic polynomials of \(A\) and \(D\) are the same.

Another Preparatory Computation: If \(M,D\in M_{n\times n}(F)\), \(M\) invertible, and \(k\geq 1\), then: \[\begin{align*} (MDM^{-1})^k &= (M D M^{-1})(M D M^{-1}) \cdots (M D M^{-1}) \\ &= M D (M^{-1} M) D (M^{-1} M) \cdots (M^{-1} M) D M^{-1}\\ &= M D^{k} M^{-1}. \end{align*}\] (For \(k=0\) both sides equal \(I_{n}\).)

Second Case: When \(A\) is a diagonalisable matrix.
\(\implies\) \(\exists M \in GL_{n}(F)\) such that \(M^{-1} AM = D\) where \(D\) is a diagonal matrix  (by 6.8)
Write \(p_{A}( \lambda ) = \lambda^{n} + a_{n-1} \lambda^{n-1} + \dots + a_{1} \lambda + a_{0}\) for the characteristic polynomial of \(A\)
\(\implies\) \(p_{A}(A) = A^{n} + a_{n-1} A^{n-1} + \dots + a_{1} A + a_{0} I_{n}\)
             \(= (M D M^{-1})^{n} + a_{n-1} (M D M^{-1})^{n-1} + \dots + a_{1} (M D M^{-1}) + a_{0} I_{n}\)  (since \(A = M D M^{-1}\))
             \(= M D^{n} M^{-1} + a_{n-1} M D^{n-1} M^{-1} + \dots + a_{1} M D M^{-1} + a_{0} M M^{-1}\)  (by the Preparatory Computation above)
             \(= M( D^{n} + a_{n-1} D^{n-1} + \dots + a_{1} D + a_{0} I_{n} )M^{-1}\)
             \(= M p_{A}(D) M^{-1}\)
             \(= M p_{D}(D) M^{-1}\)  (by Preparatory Step above)
             \(= M \underline{0} M^{-1} = \underline{0}\).  (by the First Case)

General Case: Omitted.  \(\square\)
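Although the general case is omitted here, the \(2 \times 2\) case of the Cayley–Hamilton Theorem can be verified by a direct symbolic computation (an optional sketch, assuming SymPy is available):

```python
import sympy as sp

# A general 2x2 matrix with symbolic entries
a, b, c, d = sp.symbols('a b c d')
A = sp.Matrix([[a, b], [c, d]])

# Characteristic polynomial: lambda**2 - (a + d)*lambda + (a*d - b*c)
lam = sp.symbols('lambda')
print(A.charpoly(lam).as_expr())

# Substitute A into its own characteristic polynomial and expand
p_of_A = A**2 - (a + d)*A + (a*d - b*c)*sp.eye(2)
assert p_of_A.expand() == sp.zeros(2, 2)
```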