Fundamental Theorem Of Algebra

Gauss (1797)

This reworked proof draws from the explanation in [3] of Gauss's initial 1797 proof and some background details. The initial sketch in that source does not detail how the two curves are constrained to provide a common solution.

Theorem. Any non-constant complex polynomial has a complex root.

Because…

Note that $a_i, z \in \mathbb{C}$.

Any complex polynomial $P(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0$ with $n \geq 1$ and $a_n \neq 0$ has a complex root $z_0$.

Because…

$P(z_0) = 0$ for some $z_0 \in \mathbb{C}$

Because…

$P(z_0) = u(x_0,y_0) + \mathrm{i} v(x_0,y_0)$ with $x_0,y_0 \in \mathbb{R}$ $\,\,\ensuremath{\blacktriangleright}$

and

$u(x_0,y_0) + \mathrm{i} v(x_0,y_0) = 0$. $\,\,\ensuremath{\blacktriangleright\blacktriangleright}$

$\,\,\ensuremath{\blacktriangleright}$Because…

$P(z) = u(x,y) + \mathrm{i} v(x,y)$ for any $z\in\mathbb{C}$ for which $P(z)$ is defined.

Because…

$P(z)$ may include some terms with $\mathrm{i}$, and other terms without $\mathrm{i}$.

Because…

$z = x + \mathrm{i} y$, and for some $P(z)$ there may be terms with $\mathrm{i}$ and some without.

Because…

As an example, if $f(z) = z^2$ then $f(z)=f(x+\mathrm{i} y) = (x+\mathrm{i} y)^2 = x^2 + 2 x \mathrm{i} y + \mathrm{i}^2 y^2 = (x^2 - y^2) + \mathrm{i}(2xy)$.$\,\,\ensuremath{\blacksquare}$
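The same split can be checked numerically for any polynomial. What follows is a minimal Python sketch of my own, not something taken from [3]; the helper names `eval_poly` and `split_uv` are made up for illustration. It evaluates $P(x + \mathrm{i} y)$ with ordinary complex arithmetic and reads off $u$ and $v$ as the real and imaginary parts.

```python
def eval_poly(coeffs, z):
    """Evaluate a_0 + a_1 z + ... + a_n z^n by Horner's rule.

    coeffs lists the coefficients in ascending order: [a_0, a_1, ..., a_n].
    """
    result = 0j
    for a in reversed(coeffs):
        result = result * z + a
    return result


def split_uv(coeffs, x, y):
    """Return (u(x, y), v(x, y)) where P(x + iy) = u + i v."""
    w = eval_poly(coeffs, complex(x, y))
    return w.real, w.imag


# f(z) = z^2, so u should equal x^2 - y^2 and v should equal 2xy.
x, y = 3.0, 2.0
u, v = split_uv([0, 0, 1], x, y)
print(u, x * x - y * y)
print(v, 2 * x * y)
```

Each printed line should show the same number twice, matching the hand expansion of $z^2$ above.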

$\,\,\ensuremath{\blacktriangleright\blacktriangleright}$Because…

$u(x_0, y_0) = 0$ and $v(x_0, y_0) = 0$ for some $x_0,y_0\in\mathbb{R}$.

Because…

Gauss showed $u(x,y)=0$ and $v(x,y)=0$ represent curves in the plane $\mathbb{R}^2$ and found a common solution $(x_0, y_0)$.

Because…

There are constraints on possible functions $u(x,y)$ and $v(x,y)$ if we have $P(z) = u(x,y) + \mathrm{i} v(x,y)$.

These constraints lead to the common solution $(x_0, y_0)$.$\,\,\ensuremath{\blacksquare\blacksquare}$
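As a concrete instance, take $P(z) = z^2 + 1$, whose root $z_0 = \mathrm{i}$ corresponds to $(x_0, y_0) = (0, 1)$. Expanding by hand gives $u(x,y) = x^2 - y^2 + 1$ and $v(x,y) = 2xy$. The Python sketch below only confirms that this known point lies on both curves; it does not reproduce Gauss's argument for why a common point must exist. The function names are hypothetical.

```python
def u(x, y):
    # Real part of (x + iy)^2 + 1.
    return x * x - y * y + 1.0


def v(x, y):
    # Imaginary part of (x + iy)^2 + 1.
    return 2.0 * x * y


def is_common_zero(x, y, tol=1e-12):
    """True when (x, y) lies on both curves u = 0 and v = 0."""
    return abs(u(x, y)) <= tol and abs(v(x, y)) <= tol


print(is_common_zero(0.0, 1.0))   # the point for z = i
print(is_common_zero(1.0, 0.0))   # the point for z = 1, which is not a root
```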

There's more detail in an appendix of [3], which I may rework once I get there.

Older Material

Please note that I created this entry in the process of trying to understand the theorem. What follows is not the explanation of an expert, and may be inaccurate or incomplete.

The Theorem

“A polynomial of positive degree over the field $\mathbb{C}$ of complex numbers has a root in $\mathbb{C}$.”[2]

Overview

The polynomial $x^2 + 1$ has no “root” in real numbers because $x^2 + 1 = 0$ if and only if $x^2 = -1$, and any real number squared equals $0$ or a positive number.

Expanding one’s search to the complex numbers produces an answer, the square root of $-1,$ also known as $\mathrm{i}:$ $\mathrm{i}^2 = -1,$ so $\mathrm{i}^2 + 1 = -1 + 1 = 0$. The Fundamental Theorem of Algebra says that every non-constant polynomial (which may have complex coefficients) has a root if we're allowed to search through the complex numbers too.

A corollary (a relatively simple extension of the theorem) says that if the largest power in the polynomial is $n$, then the polynomial has exactly $n$ complex roots, provided each repeated root is counted as many times as it occurs.
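To see what "counted with repeats" means, here is a small Python sketch (my own illustration, with made-up helper names) that builds a degree-3 polynomial from the root list $1, 1, -2$ and checks that every listed root really is a root. The polynomial has only two distinct roots, but three once the repeat is counted.

```python
def poly_from_roots(roots):
    """Expand (z - r_1)(z - r_2)...(z - r_n) into ascending coefficients."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial p(z) by (z - r).
        shifted = [0] + coeffs                   # z * p(z)
        scaled = [-r * c for c in coeffs] + [0]  # -r * p(z)
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs


def eval_poly(coeffs, z):
    """Evaluate a polynomial given ascending coefficients (Horner's rule)."""
    result = 0
    for a in reversed(coeffs):
        result = result * z + a
    return result


# (z - 1)^2 (z + 2) = z^3 - 3z + 2: degree 3, root 1 repeated twice.
roots = [1, 1, -2]
coeffs = poly_from_roots(roots)
print(coeffs)
print([eval_poly(coeffs, r) for r in roots])
```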

Clark[2] cites Ankeny[1] for his proof.

Groundwork

Notation and terms

Following Clark's[2] practice $fz$ is written instead of $f(z)$. It's not particularly clear to me, however, why he then uses the parenthesized notation $\phi(z)$ and $\psi(z)$, although he does write $\phi \alpha$.

(1)
\begin{align} fz = c_0 + c_1 z + \ldots + c_{n-1}z^{n-1} + c_n z^n \text{, where } n \geq 1 \end{align}
(2)
\begin{align} \bar{f}z = \bar{c}_0 + \bar{c}_1 z + \ldots + \bar{c}_{n-1}z^{n-1} + \bar{c}_n z^n \text{, where } n \geq 1 \end{align}
(3)
\begin{align} c = a + b\mathrm{i} \text{ if and only if } \bar{c}=a-b\mathrm{i} \end{align}
(4)
\begin{align} \phi = f\bar{f} = a_0 + a_1 z + \ldots + a_{2n} z^{2n} \end{align}
(5)
\begin{align} \phi = az^{2n} - \psi(z). \end{align}
(6)
\begin{align} -\psi(z) = a_0 + a_1 z + \ldots + a_m z^m, \text{ for } m<2n \end{align}
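The construction in Eqs. (1), (2) and (4) can be tried out on a small example. The Python sketch below is my own and uses hypothetical helper names; it conjugates the coefficients of a sample $f$, multiplies the two polynomials, and prints the coefficients of $\phi = f\bar{f}$. For the sample chosen here they come out real, and the degree is $2n$.

```python
def conjugate_poly(coeffs):
    """Coefficients of f-bar: conjugate each coefficient of f, as in Eq. (2)."""
    return [complex(c).conjugate() for c in coeffs]


def multiply_polys(p, q):
    """Multiply two polynomials given as ascending coefficient lists."""
    product = [0j] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            product[i + j] += a * b
    return product


# Sample f(z) = (1 + 2i) + (3 - i) z, so n = 1 and phi should have degree 2.
f = [1 + 2j, 3 - 1j]
phi = multiply_polys(f, conjugate_poly(f))
print(phi)
print(all(c.imag == 0 for c in phi))
```

In this example $a = 10$ is the leading coefficient of $\phi$ and the lower-order part is $5 + 2z$, which is $-\psi(z)$ in the notation of Eqs. (5) and (6).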

Restate theorem

If $f$ is a polynomial of positive degree over the field $\mathbb{C}$ of complex numbers then $f$ has a root in $\mathbb{C}$.

Assumption and Goal

Assume

$f$ is a polynomial of positive degree over the field $\mathbb{C}$ of complex numbers.

Deduce

$f$ has a root in $\mathbb{C}$.

Proof in outline

Level 1

  1. $f$ has a root because $\phi$ has a root
  2. $\phi$ has a root because $1/\phi$ is not analytic
  3. $1/\phi$ is not analytic because one of its line integrals gets bigger and another gets smaller as we increase their upper bound.

Level 2

  1. $f$ has a root because $\phi$ has a root
    1. $f$ has a root because $f$ or $\bar{f}$ has a root (if $\bar{f}\alpha = 0$, then conjugating both sides gives $f\bar{\alpha} = 0$)
    2. $f$ or $\bar{f}$ has a root because $f$ or $\bar{f}$ equals $0$
    3. $f$ or $\bar{f}$ equals 0 because the product $f\bar{f}$ equals $0$
    4. The product $f\bar{f}$ equals $0$ because $\phi = f\bar{f} = 0$
    5. $\phi = f\bar{f} = 0$ because $\phi$ has a root
  2. $\phi$ has a root because $1/\phi$ is not analytic
    1. $\phi$ has a root because $\phi \alpha = 0$ for some $\alpha$
    2. $\phi \alpha=0$ for some $\alpha$ because $1/\phi$ is undifferentiable for some $\alpha$
    3. $1/\phi$ is undifferentiable for some $\alpha$ because $1/\phi$ is not analytic
  3. $1/\phi$ is not analytic because one of its line integrals gets bigger and another gets smaller as we increase their upper bound
    1. $1/\phi$ is not analytic because $1/\phi$ does not behave like an analytic function according to Cauchy's Theorem
    2. $1/\phi$ does not behave like an analytic function according to Cauchy's Theorem because $1/\phi$ has two line integrals with the same start and end points but different values
    3. $1/\phi$ has two line integrals with the same start and end points but different values because one line integral gets bigger and the other gets smaller as we increase their upper bound.
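Step 1 of the outline (a root of $\phi$ hands us a root of $f$) can be sketched in Python. This is my own illustration with a hypothetical function name, and it assumes the caller already knows a root $\alpha$ of $\phi = f\bar{f}$. It relies on the fact that conjugating $\bar{f}\alpha = 0$ gives $f\bar{\alpha} = 0$.

```python
def eval_poly(coeffs, z):
    """Evaluate a polynomial given ascending coefficients (Horner's rule)."""
    result = 0j
    for a in reversed(coeffs):
        result = result * z + a
    return result


def root_of_f_from_root_of_phi(f, alpha, tol=1e-12):
    """Given phi(alpha) = 0 with phi = f * f-bar, return a root of f.

    Either f(alpha) = 0 already, or f-bar(alpha) = 0, in which case
    conjugating the equation shows f(conjugate(alpha)) = 0.
    """
    if abs(eval_poly(f, alpha)) <= tol:
        return alpha
    f_bar = [complex(c).conjugate() for c in f]
    if abs(eval_poly(f_bar, alpha)) <= tol:
        return alpha.conjugate()
    raise ValueError("alpha is not a root of phi = f * f-bar")


# f(z) = z - i has the single root i; phi(z) = z^2 + 1 has roots i and -i.
f = [-1j, 1]
print(root_of_f_from_root_of_phi(f, 1j))
print(root_of_f_from_root_of_phi(f, -1j))
```

Both calls should return the same root of $f$, whichever root of $\phi$ we start from.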

Lots of equations

[Figure: integral_diagram_small.jpg]

In order to show that $1/\phi$ is not an analytic function we have to find paths with common start and end points that produce different line integrals.

Imagine two paths that start at $-R$ on the real number line and end at $+R$, also on the real number line.

Let the first line integral just be the path on the real number line from $-R$ to $+R$:

(7)
\begin{align} \int_{-R}^{+R} \frac{dx}{\phi(x)}. \end{align}

We conclude that the absolute value of this line integral,

(8)
\begin{align} \left | \int_{-R}^{+R} \frac{dx}{\phi(x)} \right |, \end{align}

gets larger as $R$ gets larger, because the longer the path the larger the line integral.

Why?

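Under the working assumption that $\phi$ has no root, $1/\phi$ never changes sign on the real line (there $\phi(x)$ is real, since $\phi = f\bar{f}$ has real coefficients), so the integral keeps adding values of the same sign as $R$ grows.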

This completes our examination of Eq. (8).
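A quick numerical illustration of Eq. (8), as a Python sketch of my own. The stand-in $\phi(x) = x^2 + 1$ is an assumption chosen only because it is positive everywhere on the real line; it does have complex roots, so it illustrates the "same sign" behaviour and nothing more. The function name `real_line_integral` is hypothetical.

```python
import math


def phi(x):
    # Illustrative stand-in: x^2 + 1 is positive on the whole real line.
    return x * x + 1.0


def real_line_integral(R, steps=100_000):
    """Midpoint-rule approximation of the integral of 1/phi over [-R, R]."""
    h = 2.0 * R / steps
    return sum(h / phi(-R + (k + 0.5) * h) for k in range(steps))


radii = (1.0, 10.0, 100.0)
values = [real_line_integral(R) for R in radii]
for R, value in zip(radii, values):
    # For this particular phi the exact value is 2 * atan(R).
    print(R, value, 2.0 * math.atan(R))
```

The values increase with $R$ because every slice of the path contributes a positive amount.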

Let the second line integral follow path $\Gamma$, the half-circle “above” the real number line from $-R$ to $+R.$

(9)
\begin{align} \int_{\Gamma} \frac{dz}{\phi(z)}. \end{align}

The absolute value of the line integral over this path $\Gamma$ gets smaller as $R$ gets bigger because

(10)
\begin{align} \left | \int_{\Gamma} \frac{dz}{\phi(z)} \right | \leq \frac{\pi}{|a|R^{2n-1}(1-\varepsilon)} \end{align}

because

(11)
\begin{align} \left | \int_{\Gamma} \frac{dz}{\phi(z)} \right | \leq \int_{\Gamma} \frac{|dz|}{|\phi(z)|} \leq \int_{\Gamma}\frac{|dz|}{|a|R^{2n}(1-\varepsilon)} = \frac{\pi}{|a|R^{2n-1}(1-\varepsilon)}. \end{align}

We know the rightmost relationship, an equality, of Eq. (11)

(12)
\begin{align} \int_{\Gamma}\frac{|dz|}{|a|R^{2n}(1-\varepsilon)} = \frac{\pi}{|a|R^{2n-1}(1-\varepsilon)} \end{align}

because

(13)
\begin{align} \int_{\Gamma}\frac{|dz|}{|a|R^{2n}(1-\varepsilon)} = \frac{\pi R}{|a|R^{2n}(1-\varepsilon)} = \frac{\pi}{|a|R^{2n-1}(1-\varepsilon)}. \end{align}

because path $\Gamma$ is a semicircle with radius $R$, so the length is $\pi R$.
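The length claim can be checked by brute force. In this Python sketch (my own, with a hypothetical function name) the integral of $|dz|$ over $\Gamma$ is approximated by adding up the lengths of many short chords along the half-circle.

```python
import cmath
import math


def semicircle_length(R, steps=10_000):
    """Approximate the integral of |dz| over Gamma by summing chord lengths."""
    # Points on the upper half-circle from -R (theta = pi) to +R (theta = 0).
    points = [R * cmath.exp(1j * math.pi * (1.0 - k / steps))
              for k in range(steps + 1)]
    return sum(abs(points[k + 1] - points[k]) for k in range(steps))


R = 5.0
print(semicircle_length(R), math.pi * R)
```

The two printed numbers should agree to several decimal places.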

We know the leftmost relationship, an inequality, of Eq. (11)

(14)
\begin{align} \left | \int_{\Gamma} \frac{dz}{\phi(z)} \right | \leq \int_{\Gamma} \frac{|dz|}{|\phi(z)|}. \end{align}

because

(15)
\begin{align} \left | \int_{\Gamma} \frac{dz}{\phi(z)} \right | = \left | \lim_{\Delta s_i \rightarrow 0} \sum_{i=0}^{N} \frac{1}{\phi({z_i})} \Delta s_i \right | \leq \lim_{\Delta s_i \rightarrow 0} \sum_{i=0}^{N} \left |\frac{1}{\phi({z_i})}\right | | \Delta s_i | = \int_{\Gamma} \frac{|dz|}{|\phi(z)|} \end{align}

because

(16)
\begin{align} | t_0 + t_1 + \cdots + t_k | \leq |t_0| + |t_1| + \cdots + |t_k| \end{align}

where $t_i \in \mathbb{C}$.
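The triangle inequality of Eq. (16) is easy to probe numerically. The Python sketch below is my own illustration; `triangle_gap` is a made-up name. It returns the right-hand side minus the left-hand side, which should never be negative (up to floating-point rounding) and is zero when all the terms point in the same direction.

```python
def triangle_gap(terms):
    """Return (|t_0| + ... + |t_k|) - |t_0 + ... + t_k|."""
    return sum(abs(t) for t in terms) - abs(sum(terms))


# Terms pointing in different directions: the gap is strictly positive.
print(triangle_gap([3 + 4j, -3 + 4j]))

# Terms pointing in the same direction: the gap is zero up to rounding.
print(triangle_gap([1 + 1j, 2 + 2j]))
```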

We know the middle relationship, an inequality, of Eq. (11),

(17)
\begin{align} \int_{\Gamma} \frac{|dz|}{|\phi(z)|} \leq \int_{\Gamma} \frac{|dz|}{|a|R^{2n}(1-\varepsilon)} \end{align}

because

(18)
\begin{align} |\phi(z)| \geq |a|R^{2n}(1-\varepsilon) \end{align}

because

(19)
\begin{align} |\phi(z)| \geq |a|R^{2n}(1-\varepsilon) \geq |a|R_\varepsilon^{2n}(1-\varepsilon) \end{align}

because $|z| = R$ for every $z$ on $\Gamma$, with $R \geq R_\varepsilon$, in

(20)
\begin{align} |\phi(z)| \geq |az^{2n}|(1-\varepsilon) \geq |a|R_\varepsilon^{2n}(1-\varepsilon). \end{align}

We know the leftmost inequality of Eq. (20)

(21)
\begin{align} |\phi(z)| \geq |az^{2n}|(1-\varepsilon) \end{align}

because

(22)
\begin{align} \left | \frac{\phi(z)}{az^{2n}} \right | \geq 1 - \varepsilon. \end{align}

because

(23)
\begin{align} \left | \frac{\phi(z)}{az^{2n}} \right | \geq 1 - \left | \frac{\psi(z)}{az^{2n}} \right | \geq 1 - \varepsilon. \end{align}

We know the leftmost inequality of Eq. (23)

(24)
\begin{align} \left | \frac{\phi(z)}{az^{2n}} \right | \geq 1 - \left | \frac{\psi(z)}{az^{2n}} \right | \end{align}

because

(25)
\begin{align} |\phi(z)| \geq |az^{2n}| - |\psi(z)| \end{align}

because

(26)
\begin{align} \phi(z) = az^{2n} - \psi(z) \end{align}

because

(27)
\begin{align} \text{deg} \phi = 2n. \end{align}
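The step from Eq. (26) to Eq. (25) is the triangle inequality read backwards. Spelled out (my own filling-in, not a quotation from Clark[2]), with $az^{2n} = \phi(z) + \psi(z)$ from Eq. (26):

\begin{align} |az^{2n}| = |\phi(z) + \psi(z)| \leq |\phi(z)| + |\psi(z)|, \quad \text{so} \quad |\phi(z)| \geq |az^{2n}| - |\psi(z)|. \end{align}

Dividing through by $|az^{2n}|$, which is nonzero for $z \neq 0$, gives Eq. (24).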

We know the rightmost inequality of Eq. (23)

(28)
\begin{align} 1 - \left | \frac{\psi(z)}{az^{2n}} \right | \geq 1 - \varepsilon \end{align}

because

(29)
\begin{align} \left | \frac{\psi(z)}{az^{2n}} \right | \leq \varepsilon \end{align}

for sufficiently large $|z|$ and a given $\varepsilon$ because

(30)
\begin{align} \left | \frac{\psi(z)}{az^{2n}} \right | \leq \frac{|a_0| + |a_1||z| + \cdots + |a_m||z|^m}{|a||z|^{2n}} \leq \frac{|a_0| + |a_1| + \cdots + |a_m|}{|a||z|^{2n-m}} \leq \varepsilon. \end{align}

We know the leftmost inequality of Eq. (30)

(31)
\begin{align} \left | \frac{\psi(z)}{az^{2n}} \right | \leq \frac{|a_0| + |a_1||z| + \cdots + |a_m||z|^m}{|a||z|^{2n}} \end{align}

because

(32)
\begin{align} |\psi(z)| = |-a_0 - a_1 z - \cdots -a_m z^m| \leq |a_0| + |a_1||z| + \cdots + |a_m||z|^m. \end{align}

(See also Eq. (26).)

We know the middle inequality of Eq. (30)

(33)
\begin{align} \frac{|a_0| + |a_1||z| + \cdots + |a_m||z|^m}{|a||z|^{2n}} \leq \frac{|a_0| + |a_1| + \cdots + |a_m|}{|a||z|^{2n-m}} \end{align}

because, for $|z| \geq 1$,

(34)
\begin{align} \frac{|a_0| + |a_1| + \cdots + |a_m|}{|a_0| + |a_1||z| + \cdots + |a_m||z|^m} \geq \frac{|a||z|^{2n-m}}{|a||z|^{2n}} = \frac{1}{|z|^m}. \end{align}

We know the rightmost inequality of Eq. (30)

(35)
\begin{align} \frac{|a_0| + |a_1| + \cdots + |a_m|}{|a||z|^{2n-m}} \leq \varepsilon. \end{align}

because we calculate

(36)
\begin{align} \varepsilon \in (0,1) \end{align}

from the left-hand side of Eq. (35), or we start with a particular $\varepsilon$ and increase $|z|$ until Eq. (35) holds (which it will, because the numerator is constant and the denominator grows without bound since $2n - m > 0$).

This completes our examination of Eqs. (30) and (23).
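Eq. (35) can be solved for $|z|$ directly, which gives a radius beyond which Eq. (29) is guaranteed. The Python sketch below is my own; the function names are hypothetical, and the radius it returns is merely sufficient, not necessarily the smallest one that works. It also enforces $|z| \geq 1$, which the middle inequality of Eq. (30) needs.

```python
import cmath
import math


def sufficient_radius(lower_coeffs, a, degree, epsilon):
    """A radius beyond which |psi(z) / (a z^degree)| <= epsilon.

    lower_coeffs holds a_0, ..., a_m (ascending) with m < degree = 2n.
    Solves Eq. (35) for |z| and clamps the result to at least 1.
    """
    m = len(lower_coeffs) - 1
    total = sum(abs(c) for c in lower_coeffs)
    radius = (total / (abs(a) * epsilon)) ** (1.0 / (degree - m))
    return max(1.0, radius)


def ratio(lower_coeffs, a, degree, z):
    """|psi(z) / (a z^degree)|; the sign of psi does not affect the modulus."""
    psi = sum(c * z**k for k, c in enumerate(lower_coeffs))
    return abs(psi) / (abs(a) * abs(z) ** degree)


# Sample phi(z) = 10 z^2 + 2 z + 5: a = 10, 2n = 2, lower coefficients 5 and 2.
lower_coeffs, a, degree, epsilon = [5, 2], 10, 2, 0.5
radius = sufficient_radius(lower_coeffs, a, degree, epsilon)

# Largest ratio found at 360 sample points on the circle |z| = radius.
worst = max(
    ratio(lower_coeffs, a, degree, radius * cmath.exp(2j * math.pi * k / 360))
    for k in range(360)
)
print(radius, worst, worst <= epsilon)
```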

We know the rightmost inequality of Eq. (20)

(37)
\begin{align} |az^{2n}|(1-\varepsilon) \geq |a|R_\varepsilon^{2n}(1-\varepsilon) \end{align}

because we know such a $z$ exists, and we just define

(38)
\begin{align} R_\varepsilon = \min |z| \end{align}

such that Eq. (29)

(39)
\begin{align} \left | \frac{\psi(z)}{az^{2n}} \right | \leq \varepsilon \end{align}

holds for all $|z|\geq R_\varepsilon$.

This completes our examination of Eq. (20) and hence Eq. (11).
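Eq. (10) can be sanity-checked numerically. The Python sketch below is my own and assumes the stand-in $\phi(z) = z^2 + 1$, so $a = 1$, $2n = 2$ and $\psi(z) = -1$; on $\Gamma$ we then have $|\psi(z)/(az^{2n})| = 1/R^2$, so $\varepsilon = 1/R^2$ is a valid choice for $R > 1$. The function names are hypothetical.

```python
import cmath
import math


def arc_integral(R, steps=20_000):
    """Midpoint-rule approximation of the integral of dz / (z^2 + 1) over Gamma.

    Gamma is parametrized as z = R e^{i theta} with theta running from pi
    down to 0, so dz = i R e^{i theta} d(theta) = i z d(theta).
    """
    d_theta = -math.pi / steps
    total = 0j
    for k in range(steps):
        theta = math.pi + (k + 0.5) * d_theta
        z = R * cmath.exp(1j * theta)
        total += (1j * z * d_theta) / (z * z + 1.0)
    return total


def bound(R):
    """Right-hand side of Eq. (10) with a = 1, 2n = 2 and epsilon = 1 / R^2."""
    epsilon = 1.0 / R**2
    return math.pi / (R * (1.0 - epsilon))


for R in (2.0, 10.0, 100.0):
    print(R, abs(arc_integral(R)), bound(R))
```

In each printed row the middle number should sit below the last one, and both should shrink as $R$ grows.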

To do

We should also note that $\phi$ has only real coefficients (assuming it's pertinent).

Also, I should add a short blurb about the history of the theorem, and perhaps provide links to alternate proofs (eventually to be summarized in a separate but related entry).

Notes

Sources

1. Ankeny, N.C. “One more proof of the fundamental theorem of algebra.” Am. Math Monthly, 54 (1947) 464, cited in Clark.
2. Clark, A. Elements of Abstract Algebra. New York: Dover, 1984 [orig. 1971].
3. Fine, B., and G. Rosenberger. The Fundamental Theorem of Algebra. New York: Springer, 1997.
4. Dennery, P., and A. Krzywicki. Mathematics for Physicists. New York: Dover, 1996 [orig. 1967].
5. Irving, R.S. Integers, Polynomials, and Rings. New York: Springer, 2004.
6. Shankar, R. Basic Training in Mathematics: A Fitness Program for Science Students. New York: Plenum, 1995.

Advisory

Please note that I am not a mathematician and so the presentation of proofs that I make may be deeply flawed. I'm using this writing process to figure out what I'm reading. Please consult more authoritative sources as well.

Feel free to contact me by leaving a comment or sending me a private message.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License