Happy New Year y’all! I’m looking forward to writing some real posts soon – this one’s just for fun. (Maybe they all are?) My first effort at catching up on my blog reading was to read the last eight posts from f(t). Kate gives us a post on her favorite theorem, including the proof. It turns out to be Cantor’s diagonal proof of the nondenumerability of the reals. I really enjoyed it and got excited to catalogue some of mine.

Here’s a top-10 list. It’s a little capricious. But I’ll stand behind each of these theorems being amazing. They span a lot of levels, from early elementary school thru graduate level stuff. Partly, especially for the lower-level ones, I’m making a case that these guys deserve more respect than they typically get. Below the list I put some comments on each. I got kinda technical toward the end, and I feel a bit self-conscious about that. I’ll be excited if you read any of this but definitely feel free to stop, and then give me sh*t about it, when I get too abstruse.

I. The commutativity of multiplication. (I’m not kidding. I think this is dope. More below.)
II. a÷b (division) = a/b (fraction)
III. The Pythagorean theorem.
IV. Every rational number has a terminating or repeating decimal expansion.
V. If a is a root of a polynomial p(x), then x-a divides p(x).
VI. The fundamental theorem of calculus.
VII. Every prime of the form 4n+1 is a sum of squares in exactly one way.
VIII. Every element of SOn is a composition of rotations in orthogonal planes.
IX. A function C->C that is analytic in a neighborhood has a convergent Taylor expansion equal to itself in that neighborhood.
X. The fundamental theorem of Galois theory.

I. The commutativity of multiplication
Most students and teachers I know treat this as so trivial and obvious that I must be some sort of ridiculous person to dignify it with the word “theorem.” But seriously now. When they told you what 5×7 was, depending slightly on your teacher, it was something like “five 7’s.” (This is the language used by my first grade teacher Judy Lazrus. She was amazing, but this isn’t especially an example of that.) 7+7+7+7+7. That means 7×5 is 5+5+5+5+5+5+5. Is it obvious that these are going to come out the same?

NO, it’s NOT. Not even to me, and I’ve had a lot of time to get used to it. Certainly not to a six year old. There really is a theorem in here. If you’ve worked with kids who are first learning about multiplication, you know this.

A formal, utterly rigorous proof would begin with something like the Peano axioms and build up addition and then multiplication in a formal way from scratch. This is cool, but I have a different proof in mind. It’s less formal; less “rigorous” by professional standards; however, it’s the one that gives this theorem such import to me, it’s totally convincing, and it’s accessible to a six-year-old. You probably already know it; perhaps it’s how you convinced yourself multiplication really is commutative way back when; but I think it deserves special attention.

Represent 5×7 as an array – five rows of 7 dots:
* * * * * * *
* * * * * * *
* * * * * * *
* * * * * * *
* * * * * * *
Now interchange the roles of the rows and columns. Instead of seeing the array as a stack of rows, one on top of the other, see it as a line of columns, side by side. Each column is formed with one dot from each row, so the size of each column equals the number of rows, while the number of dots in each row becomes the number of columns. 5 rows of 7 becomes 7 columns of 5. So 5×7 = 7×5. The same argument could be used with any pair of natural numbers: mxn = nxm.

Why am I making a big deal of this? Two reasons. One is that learning multiplication’s commutativity is something that happens very early in a typical math education, and this is a chance for students who are just beginning their mathematical journey to have a taste of real mathematics – an elegant, powerful argument giving a totally convincing proof of an unexpected result. The other is that the particular method of proof here (interchanging rows and columns) is one with great power, whose usefulness extends well beyond high school. For example, this is the idea behind the proof of Burnside’s formula in group theory, or the proof in number theory that the number of partitions of n with at most k parts equals the number with each part at most k, and many other advanced results. I love the idea of putting an idea of this vast power and scope into the hands of tiny children.

II. a÷b (division) = a/b (fraction)
I can’t back this one up quite as well – I don’t even know how to state this claim in precise mathematical language, and I can’t give you anything I would consider a proof. But I know there’s something surprising in here because I’ve watched a lot of kids be surprised by it.

When I learned division in school, I learned that 10÷2 means the number of 2’s that are needed to add up to 10. 10/2 means what you get if you accumulate ten halves. These are definitely not the same thing. In the former case, I am imagining stacking up 2’s till I get to 10. The answer, five, is counting boxes of 2. I’m thinking about whole things, and the 5 is counting boxes of more than 1 whole thing. There’s certainly no fraction anywhere in sight – everything is whole. In the latter, the 5 is constructed from gluing together 10 halves. When I imagine it I can still see the glue. Such a difference in visual images – why is the number coming out the same?

I think there are a lot of ways to get convinced, and unlike my first theorem I’m not attached to a specific argument. What I feel sure of is that as kids come to learn about division, and fractions, this identity deserves a lot of thought, and offers something to get excited about.

III. The Pythagorean theorem
You know it, y’all, this thing is amazing. I think the Pythagorean theorem is for many of us one of the great missed opportunities of our math education. Because most adults can recite it like a mantra but only a very few people (one is my former roommate Thierry – big ups to your middle school math teacher, Thierry!) recall it with any of the awe it deserves.

This is the earliest example I can think of in the curriculum of a really surprising connection between algebra and geometry. Somehow right-anglyness is precisely captured by sum-of-squaresiness and vice versa. Somehow “right triangle” is a geometrical picture, and “a^2+b^2=c^2” an algebraic description, of precisely the same underlying mathematical structure. I think that’s amazing. I’m still not over it.

I have three favorite proofs:
a) Euclid’s. Not very accessible to the kiddies but has the virtue of actually partitioning the square on the hypotenuse into two parts equal in area to the squares on the legs.
b) The one where you put four copies of the triangle inside a square of side length a+b and if you arrange them one way the remaining area is c^2 and if you arrange them another way it’s a^2+b^2. Here’s a picture.
c) The one where you draw an altitude from the hypotenuse, like this, and then work from the similarity of the three triangles.

There are a zillion other proofs. There are books and books. One I’m very excited about, by Bob & Ellen Kaplan (who wrote Out of the Labyrinth, which I was also very excited about), is called Hidden Harmonies: The Lives and Times of the Pythagorean Theorem and is due out in August. Part of what I love about this theorem is the amazing diversity in the ways it is proven.

Okay now I have to speed through the rest:

IV. Every rational number has a terminating or repeating decimal expansion.
The awesome Larry Zimmerman says that a good math problem “has a future.” Meaning that the ideas in the problem have a reach that extends to other problems, possibly to other whole domains of mathematics. My excitement about the proof of commutativity of multiplication, above, has to do with this. Likewise, this theorem. First of all, it’s a handy characterization of rationals, and looks ahead in a natural way to interesting calculus-laden questions having to do with convergence of series, especially geometric series. But maybe even more important to me is the proof, which involves the division algorithm and the pigeonhole principle. Both of which spin off great showers of mathematical consequences…

V. If a is a root of a polynomial p(x), then x-a divides p(x).
One of which is this! The division algorithm generalizes in a natural way from the natural numbers…
b divided by a has a remainder less than a
…to polynomials (techically, polynomial rings over a field):
p(x) divided by q(x) has a remainder which is a polynomial of degree less than q.
Because of this, we get this awesome connection between the behavior of a polynomial as a function (for what x is p(x) equal to zero?) and its behavior as an object to be factored (what other polynomials divide p(x)?)

The converse is less startling to me – if x-a is a factor of p(x), then when x=a, x-a=0, so p(x)=0 at x=a. The interesting part is the result I stated – all we have to know is p(a)=0 to know x-a is a factor. And as I mentioned, this comes from the division algorithm:
p(x) / (x-a) has a remainder of degree lower than x-a – i.e. a constant.
So p(x) = (some polynomial)*(x-a) + const.
If p(a) = 0, then evaluating this equation at a gives
0 = (some polynomial, evaluated at a)*0 + const.
The constant must be 0, i.e. the remainder must be 0, i.e. x-a is a factor of p(x). Boom!

VI. The fundamental theorem of calculus
Oh my goodness. Don’t even get me started. How am I trying to do this quickly? If the pythagorean theorem felt like it was connecting disparate realms (the geometry of a right angle and the algebra of a^2+b^2=c^2), the fundamental theorem of calculus is connecting disparate planets. If f tells you about g’s speed, then g tells you about f’s… area? WHAT??? I need to move on because I’m going to wax poetical all night long if I don’t stop. This was the theorem that sealed the deal between me and math.

VII. Every prime of the form 4n+1 is a sum of squares in exactly one way.
VIII. Every element of SOn is a composition of rotations in orthogonal planes.
IX. A function C->C that is analytic in a neighborhood has a convergent Taylor expansion equal to itself in that neighborhood.

I have a thing for when you’re confused about some question about really familiar, concrete objects and you eventually answer it by considering it in the broader context of really fanciful, crazy objects. All three of these theorems represent for me the wierd power of complex numbers and other crazy abstractions to answer questions that are about real numbers. (In the first case, natural numbers.)

There are proofs of Fermat’s theorem about primes of the form 4n+1 being sums of squares that don’t involve complex numbers, but the proof I relate to the most certainly does. I just wrote it out but then I realized that it is kind of technical so I’m putting it at the end as an appendix. I don’t know if the theorem about SOn has proofs that don’t involve complex numbers but the proof I know certainly and powerfully does. A sketch of it is at the end too.

The Taylor expansion is a slightly different but related story for me. It used to confuse me that functions like sin x and e^x are equal to their Taylor expansion everywhere, but then there is that funky example: f(x)=e^-(1/x^2) if x isn’t 0 and f(x)=0 if x is 0
This function is continuous and differentiable for all real x. At x=0, it and all its derivatives are 0. So its Taylor expansion at 0 is identically 0. But the function itself is totally not 0 at all. What the hell?

Once again, broadening out into the complex plane explains everything. The function f(x) = e^-(1/x^2) isn’t defined at x=0. When you’re just looking at the real line, it looks like you can plug the hole with f(0)=0 and everything will be smooth and nice. But if you look at the whole complex plane, x=0 is an essential singularity of f. f’s behavior in the neighborhood of 0 is actually completely bananas. They fooled us by only showing us a tiny, well-behaved slice when they showed us the real line. If a function is really continuous and differentiable in a whole complex neighborhood, it will equal its Taylor expansion there. Whew!

X. The fundamental theorem of Galois theory
I’m not even going to state this one. I could use avoiding the technicalities as an excuse, but the real reason is that I’m avoiding spoilers. The F. T. of G. T. is the punchline of an amazing mathematical journey. The way I learned it kind of killed it. The teacher stated it, showed us examples, and then spent a few classes proving it. Then, it was applied to the problem it had originally come into being to deal with (the insolubility of the quintic). I had been told the answer before I had asked the question, as it were. I feel a sense of loss that I never got to see it shimmering in the distance as I searched. This spring I’m teaching an informal course on group theory, hopefully culminating in Galois theory, for some teacher friends. I’m gonna do my darndest to bring out the fundamental theorem in an organic, motivated and natural way. I’ll let you know how it goes.


Proof that an odd prime p is a sum of 2 squares if and only if p has the form 4n+1.

Let p be a prime. If p=a^2+b^2 then p factors in the ring of Gauss integers – it equals (a+bi)(a-bi). Likewise, the only way for p to factor in the ring of Gauss integers is (a+bi)(a-bi). (This is because if (a+bi) is a factor of p, say p=(a+bi)(c+di), then its conjugate (a-bi) is too, and (a+bi)*(a-bi) is a real factor of p^2, so it is 1, p or p^2. But if it were 1 or p^2, the factorization p=(a+bi)(c+di) would be trivial – in the first case, a+bi would be a unit, in the second case, c+di would. So if (a+bi) is really a nontrivial factor of p, then p=(a+bi)(a-bi).) So p is a sum of squares if and only if p stops being prime when you venture out into the ring of Gauss integers.

I.e. p is a sum of squares if and only if (p) is not a prime ideal of Z[x]/(x^2+1).
I.e. p is a sum of squares if and only if Z[x]/(x^2+1,p) is not an integral domain.
I.e. p is a sum of squares if and only if Fp[x]/(x^2+1) is not an integral domain.
I.e. p is a sum of squares if and only if x^2+1 factors in Fp[x].
I.e. p is a sum of squares if and only if -1 is a square in the field Fp.

Now the multiplicative group of Fp is cyclic and order p-1, and (if p>2) -1 has order 2 in this group, so if g is a generator for the group, -1=g^[(p-1)/2]. -1 is a square if and only if the exponent is even, i.e. if p-1 is a multiple of 4.

So if p is a prime >2, it is a sum of squares if and only if p-1 is a multiple of 4, in other words if p has the form 4n+1.

The uniqueness of the representation as a sum of two squares follows from the fact that Z[x]/(x^2+1) is a unique factorization domain, which comes from the fact that it has a Euclidean division algorithm (big ups to the division algorithm again!). Thus p only factors one way: p=a^2+b^2=(a+bi)(a-bi). If p=c^2+d^2 too, then p would =(c+di)(c-di) too, but this contradicts unique factorization.

Outline of proof that any element of SOn is a composition of rotations in orthogonal planes:
Take an element P of SOn and regard it as having complex entries, even though they’re really real. Then P is unitary, so by the spectral theorem for unitary operators, it has an orthonormal basis of eigenvectors (which may be complex). The eigenvalues all have absolute value 1, their product is 1, and the complex ones come in conjugate pairs because P’s entries, and hence the coefficients of its characteristic polynomial, are real. Consider the subspace A of C^n spanned by a pair of eigenvectors associated to a pair of conjugate eigenvalues. Then consider the subspace A’ of A consisting of strictly real vectors (i.e. A’ is A intersect R^n). A’ is a plane, and the restriction of the action of P to A’ is a rotation. (This step takes some doing.) Other similar subspaces constructed as A’ was are all orthogonal to it since the basis of eigenvectors is orthonormal. If there are real eigenvalues, they are all 1 or pairs of -1s since the product of all eigenvalues is det P=1. Every pair of -1s gives a rotation thru pi in yet a new orthogonal plane, and the 1’s represent fixed subspaces. Thus the net effect of P is a composition of orthogonal rotations.

One corollary is the amazing fact that the composition of two rotations in 3-space about completely different axes, is a rotation! This is a corollary because in 3-space there isn’t room for 2 orthogonal planes – that would require 4 dimensions. So every element of SO3 is a rotation in a single plane. Since SO3, as a group, is closed, composing 2 rotations must give a new element of SO3 which must also be a rotation.