Why are Hermitian matrices diagonalizable?

“Since we are working with a Hermitian matrix, we may take an eigenbasis of the space …”

“Wait, sorry, why are Hermitian matrices diagonalizable, again?”

“Umm … it’s not quick to explain.”

This exchange happens often when I give talks about spectra of graphs and digraphs in Bojan’s graph theory meeting. I know several ways to prove that Hermitian matrices are diagonalizable, but I couldn’t think of a simple, succinct statement which gives the right intuitive thing. So, in this post, I’d like to arrive at such a one-liner.

Definitions

We consider $n\times n$ matrices with entries in $\mathbb{C}$. A matrix $H$ is Hermitian if $H^* = H$, where $H^* = \bar{H}^T$ is the conjugate transpose of $H$. A matrix $H$ is diagonalizable if $H$ is similar to a diagonal matrix: i.e. there exist invertible matrix $P$ such that $P^{-1} H P = D$ where $D$ is a diagonal matrix.

An eigenvalue of $H$ is $\lambda$ such that there exist $v \in \mathbb{C}^n$ such that $$Hv = \lambda v$$ and $v$ is said to be an eigenvector of $H$. The characteristic polynomial of $H$ is $$\phi(H,t) = \det(tI – H).$$ If $\lambda$ is a number such that $\phi(H,\lambda) = 0$, then $\lambda I – H$ has a non-trivial kernel and so there exists a vector $v$ such that $(\lambda I – H)v = 0$. Thus, every root of $\phi(H,t)$ is an eigenvalue of $H$. However, the roots of the characteristic polynomial are not the same as the multiset of eigenvalues because there is a question of multiplicity.

Geometric vs algebraic multiplicities

The geometric multiplicity, $m_g(\lambda)$, of an eigenvalue $\lambda$ of $H$ is the dimension of the subspace of $\mathbb{C}^n$ generated by all eigenvectors $H$ with eigenvalue $\lambda$ (this space is called the eigenspace of $\lambda$). The algebraic multiplicity, $m_a(\lambda)$, is the multiplicity of $\lambda$ as a root of $\phi(H,t)$.

Proposition. A Hermitian matrix $H$ is diagonalizable if and only if $m_a(\lambda) = m_g(\lambda)$ for each eigenvalue $\lambda$ of $H$.

Proof. Suppose $H$ is a $n\times n$ Hermitian matrix.

“$\Leftarrow$” It is easy to see that the characteristic polynomial has degree $n$ and hence $n$ roots. Since the algebraic and geometric multiplicities agree, we see that $H$ has $n$ orthonormal eigenvectors (details left as an exercise) which can be used to form the columns of a matrix $P$. Then $P^*P = I$ and $P^*HP$ is a diagonal matrix with the eigenvalues of $H$ on the diagonal.

“$\Rightarrow$” Let $P$ be a matrix such that $P^{-1}HP = D$, where $D$ is a diagonal matrix. Then $$HP =PD$$ and we may consider the $j$th column of both sides. On the left side, we get $Hv$ where $v$ is the $j$th column of $P$. On the right side, we get $D_{jj}v$, and so $v$ is an eigenvector of $H$ with eigenvalue $D_{jj}$. Since $H$ and $D$ are similar, $$\phi(H,t) = \phi(D,t) = \prod_{j = 1}^n (t-D_{jj}).$$ Thus the multiset $\{D_{jj}\}_{j=1}^n$ is the set of eigenvalues of $H$ with both geometric and with algebraic  multiplicities. $\Box$

Diagonalizability

We will now show that Hermitian matrices are diagonalizable by showing that every eigenvalue has the same algebraic and geometric multiplicities.

Theorem. If $H$ is a Hermitian matrix with eigenvalue $\lambda$, then $m_g(\lambda) = m_a(\lambda)$.

Proof. We take $H$ to be a $n\times n$ Hermitian matrix and $\lambda$ an eigenvalue of $H$. We proceed by induction on $n$.

If $m_a(\lambda) = 1$, we are done, since there must be exactly one eigenvector of $\lambda$. We may assume $a = m_a(\lambda) >1$. Let $x$ be an eigenvector of $H$ such that $Hx = \lambda x$. We may assume that $x$ is normalized, i.e. $x^*x = 1$. We may extend $\{x\}$ to an orthonormal basis of $\mathbb{C}^n$, say $\{x, v_2, \ldots, v_n\}$.

Let $V= \langle v_2, \ldots, v_n \rangle = \langle x \rangle ^{\perp}$. We may consider $\mathbb{C}^n$ as the direct sum $\langle x \rangle \oplus V$. Let $v \in V$. Observe that $$x^*(Hv) = x^*H^*v= (Hx)^*v = (\lambda x)^*v = \bar{\lambda} x^*v = 0.$$ Thus $Hv \in V$ and $V$ is a $H$-invariant subspace of $\mathbb{C}^n$.

Let $P$ be the unitary matrix with $\{x, v_2, \ldots, v_n\}$ as its columns. The above gives that $$P^* HP = \left( \begin{array}{cccc} \lambda & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & B & \\0 &&& \end{array} \right)$$ and $B$ is Hermitian since the left side is. We see that $$\phi(H,t) = \phi(P^* HP, t) = (t-\lambda)\phi(B,t).$$ Thus, $\lambda$ is an eigenvalue of $B$ with algebraic and geometric multiplicity  $a-1$ (by induction) and $B$ has pair-wise orthogonal eigenvectors $x’_2, \ldots, x’_a$ of $\lambda$. For $j = 2, \ldots, a$, let $x_j = P\left( \begin{array}{c} 0 \\ x’_j \end{array} \right) P^*$. It is easy to see that $x, x_2, \ldots, x_a$ are pair-wise orthogonal eigenvectors of $H$ with eigenvalue $\lambda$, which proves the theorem. $\Box$

The Take-away

There are many (mostly equivalent ways) to show this; we could have used induction to prove $H$ is diagonalizable, without consider geometric vs algebraic multiplicities, we could have proved the decomposition into Jordan blocks, or we could have proven the spectral decomposition theorem.

The crux of the proof is that, when $H$ is Hermitian, the vector space $W^{\perp}$ is $H$-invariant when $W$ is. In our proof, this allowed us to, colloquially speaking, keep pulling eigenvectors out of $\mathbb{C}^n$. In general, given an $H$-invariant subspace $W$ of $\mathbb{C}^n$, we can consider the action of $H$ (by multiplication on the left) on $W$ and find the minimal polynomial of $H$ over $W$. If $\psi_1$ and $\psi_2$ are the minimal polynomials of $H$ over $W$ and $W^{\perp}$, respectively, then $\phi(H,t) = \psi_1 \psi_2$.

Intuitively, a Hermitian matrix $H$ is diagonalizable because we can break $\mathbb{C}^n$ into $H$-invariant, pairwise orthogonal, subspaces and diagonalize $H$ over each subspace.

To see that this a property that is not true of all matrices, consider the following matrix: $$N = \left(\begin{array}{cccc} 0 & 1 & 1 & 1 \\ 0& 0& 1 & 1 \\ 0& 0 & 0 & 1 \\ 0&0&0&0 \end{array} \right).$$ For graph theorists, this is the adjacency matrix of a transitive tournament on $4$ vertices. It is also a nilpotent matrix; that is, $N^4 = 0$. The characteristic polynomial of $N$ is $t^4$ and $N$ has $0$ as an eigenvalue with algebraic multiplicity $4$. However, $N$ has only one linearly independent eigenvector $e_1$ (the elementary basis vector, $(1\,\, 0\,\, 0\,\, 0)^T$).

Here, $\langle e_1 \rangle$ is an eigenspace of $N$ and hence $N$-invariant. Observe that $e_2 \in \langle e_1 \rangle^{\perp}$ but $Ne_2 = e_1$ and so $\langle e_1 \rangle^{\perp}$ is certainly not $N$-invariant.

References

Roman, Steven. Advanced Linear Algebra. (Springer Graduate Texts in Mathematics, Vol. 135)

Prasolov, V. V. Problems and Theorems in Linear Algebra. (AMS Translations of Mathematical Monographs, Vol. 134)