Another of the many reasons I’m in grad school. I benefit as a teacher from understanding the content I teach in way more depth than I teach it. (I think everybody does, but it’s easiest to talk about myself.)

This does a number of things for me. The simplest is that it makes the content more exciting to me. Something that previously seemed routine can become pregnant with significance if I know where it’s going, and there’s a corresponding twinkle that shows up in my eye the whole time my students are dealing with it. A second benefit is that it gives me both tools and inspiration to find more different ways of explaining things. A third is that it helps me see (and therefore develop lessons aimed at) connections between different ideas.

So, this post is a catalogue of some insights about K-12 math that PhD study has led me to. The title of the post is a reference to Felix Klein’s classic books of the same name. The catalogue is mostly for my own benefit, and I don’t have all that much time, so I’m going to try to suppress the impulse to fully explain some of the more esoteric vocabulary. But I never want to write something here that requires expert knowledge to avoid being useless, so I’ll try to be both clear and pithy. (Wish me luck.)

Elementary level: Multiplication is function composition.

I’m developing the opinion that it’s especially important for middle and high school teachers to have this language. The upshot is that in addition to the usual models of multiplication as (a) repeated addition and (b) arrays and area of rectangles (and, if you’re lucky, (c) double number lines), multiplication is also the net effect of doing two things in a row, such as stretching (and possibly reversing) a number line.

The big thing I want to say here is that understanding this is key to understanding multiplication of signed numbers. I would go so far as to wager that anybody who feels they know intuitively why $-\cdot -=+$ understands multiplication as composition on some level, consciously or not.

When somebody asks me why a negative times a negative is a positive, I have often had the inclination to answer with, “well, what’s the opposite of the opposite of something?” (I have seen many teachers use metaphors with the same upshot.) The problem is that if you understand multiplication only as repeated addition and as the area of rectangles, I’ve changed the subject with this answer. It is a complete non sequitur. It’s probably clear why it has to do with negatives, but why does it have to do with multiplication?

On the other hand, if on any level you realize that one meaning of $2\times 3$ is “double then triple”, then it’s natural for $(-2)\times(-3)$ to mean “double and oppositize, then triple and oppositize.” But for this you had to be able to see multiplication as “do something then do something else.”
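For anyone who likes to see this run, here is a minimal sketch of “do something, then do something else.” The names `scale` and `compose` are my own, chosen only for illustration; the point is just that composing the two stretch-and-flip maps gives the same map as multiplying by the product.

```python
def scale(k):
    """Return the map x -> k*x: stretch the number line by |k|, flipping it if k < 0."""
    return lambda x: k * x

def compose(first, second):
    """Return the net effect of doing `first` and then `second`."""
    return lambda x: second(first(x))

double_and_oppositize = scale(-2)
triple_and_oppositize = scale(-3)
net = compose(double_and_oppositize, triple_and_oppositize)

# Where does the point 1 on the number line end up after both steps?
print(net(1))  # 6
```

Two flips cancel, two stretches combine, and the point 1 lands on 6, which is what $(-2)\times(-3)$ ought to be.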

Algebra I and Algebra II: Substitution is calculation inside a coordinate ring.

I just realized this today, and that’s what inspired this blog post. So far, I’m not sure what the benefit of this one is to my teaching beyond the twinkle it will bring to my eye, though perhaps that will become clear later. It’s certainly helping me understand something about algebraic geometry. The basic idea is this: say you’re finding the intersections of some graphs like $y=3x+5$ and $2x+y=30$. You’re like, “alright, substitute using the fact that $y=3x+5$. $2x+(3x+5)=30$, so $5x+5=30$…” and you solve that to find $x=5$, for an intersection point of $(5,20)$. A way to look at what you’re doing when you make the substitution $y=3x+5$ is that you’re working in a special algebraic system determined by the line $y=3x+5$, in particular the (tautological) fact that on this line, $y$ is exactly three $x$ plus five. In this system, polynomials in $x$ or $y$ alone work the usual way, but polynomials in $x$ and $y$ both can often be simplified using the relation $y=3x+5$ connecting $x$ and $y$. This algebraic system is called “the coordinate ring of the line $y=3x+5$.”

I can’t tell if it will even seem that I’ve said anything at all here. The point, for me, is just a subtle shift in perspective. I imagine myself sitting on the line $y=3x+5$; then this line determines an algebraic system (the coordinate ring) which, as long as I’m on the line, is the right system; and when I substitute $3x+5$ for $y$, what I’m doing is using the rules of that system.
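Here is a rough sketch of what “using the rules of that system” could look like mechanically. The representation (a dictionary from exponent pairs to coefficients) and the names `poly_mul` and `reduce_on_line` are assumptions of mine for illustration, not standard notation. The function rewrites any polynomial in $x$ and $y$ as a polynomial in $x$ alone by applying the relation $y=3x+5$.

```python
def poly_mul(p, q):
    """Multiply two one-variable polynomials given as coefficient lists [c0, c1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def reduce_on_line(poly_xy, line=(5, 3)):
    """Rewrite a polynomial in x and y as one in x alone, using y = 3x + 5.

    poly_xy maps (power of x, power of y) to a coefficient.
    line holds 3x + 5 as (constant term, coefficient of x).
    """
    result = [0]
    for (i, j), c in poly_xy.items():
        term = [0] * i + [c]              # the monomial c * x^i
        for _ in range(j):                # one factor of (3x + 5) per power of y
            term = poly_mul(term, list(line))
        if len(term) > len(result):       # make room for higher powers of x
            result += [0] * (len(term) - len(result))
        for k, a in enumerate(term):
            result[k] += a
    return result

# The second line from the example, 2x + y - 30, as {(x power, y power): coefficient}
second_line = {(1, 0): 2, (0, 1): 1, (0, 0): -30}
print(reduce_on_line(second_line))  # [-25, 5], i.e. 5x - 25
```

On the line, $2x+y-30$ is the same thing as $5x-25$, which vanishes exactly at $x=5$.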

Calculus: The chain rule is the functoriality of the derivative.

“Functoriality” is a word from category theory which I will avoid defining. The point is really about the chain rule. The main ways the derivative is presented in a first-year calculus class are as speed, or rate of change, on the one hand (like, you’re always thinking of the independent variable as time, whatever it really is), and the slope of the tangent line of a graph, on the other. There is a third way to look at it, which I learned from differential geometry. If you look at a function as a mapping from the real line to itself, then the derivative describes the factor by which it stretches small intervals. For example, $f(x)=x^2$ has a derivative of $6$ at $x=3$. What this is saying is that very small intervals around $x=3$ get mapped to intervals that are about 6 times as long. (To illustrate: the interval $[3,3.01]$ gets mapped to $[9,9.0601]$, about 6 times as long.)

Seen in this way, the strange formula $[f(g(x))]'=f'(g(x))\cdot g'(x)$ for the chain rule becomes the only sensible way it could be. The function $f(g(x))$ is the net effect of doing $g$ to $x$ and then doing $f$ to the answer $g(x)$. If I want to know how much this function stretches intervals, well, when $g$ is applied to $x$ they are stretched by a factor of $g'(x)$. Then when $f$ is applied to $g(x)$ they are stretched by a factor of $f'(g(x))$. (Note it is clear why you evaluate $f'$ at $g(x)$: that is the number to which $f$ got applied.) So you stretched first by a factor of $g'(x)$ and then by a factor of $f'(g(x))$; net effect, $f'(g(x))\cdot g'(x)$, just like the formula says.

(As an aside, for the sake of being thematic, note the role here of the fact that the multiplication comes from the composition of the two stretches – multiplication is function composition. When I say “the derivative is functorial” what I really mean is that it turns composition of functions into composition of stretches.)
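The stretching picture is easy to check numerically. This is only a sketch: the particular functions `g` and `f`, the point $x=3$, and the interval length `h` are choices I made for illustration, and `stretch` is a plain difference quotient, so everything is approximate.

```python
def stretch(func, x, h=1e-6):
    """Estimate the factor by which `func` stretches the small interval [x, x + h]."""
    return (func(x + h) - func(x)) / h

def g(x):
    return x ** 2

def f(u):
    return u ** 3

def f_after_g(x):
    return f(g(x))

x = 3.0
g_stretch = stretch(g, x)             # close to g'(3) = 6
f_stretch = stretch(f, g(x))          # close to f'(9) = 243
net_stretch = stretch(f_after_g, x)   # close to 6 * 243 = 1458

print(g_stretch, f_stretch, net_stretch)
print(g_stretch * f_stretch)
```

The product of the two separate stretch factors should agree with the stretch factor of the composite up to a small approximation error.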

Calculus: The intermediate value theorem is $f(\text{connected})=\text{connected}$. The extreme value theorem is $f(\text{compact})=\text{compact}$.

This is a good example of what I was talking about at the beginning about the twinkle in my eye, and connections between ideas. When I used to teach AP calculus, the extreme value theorem and the intermediate value theorem were things I had trouble connecting to the rest of the curriculum. They were these miscellaneous, intuitively obvious factoids about continuous functions that were stuck into the course in awkward places. They both had the same clunky hypothesis, “if $f$ is a function that is continuous on a closed interval $[a,b]$…” I didn’t do much with them, because I didn’t care about them.

I started to see a bigger picture about three years ago, in a course for calculus teachers taught by the irrepressible Larry Zimmerman. He referred to that clunky hypothesis as something to the effect of “a lilting refrain calling like a siren song.” I was also left with the image of a golden thread weaving through the fabric of calculus but I’m not sure if he said that. The point is, he made a big deal about that hypothesis, making me notice how thematic it is.

Last year when I taught a course on algebra and analysis, having benefited from this education, I made these theorems important goals of the course. But something further clicked into place this fall, when I started to need to draw on point-set topology knowledge as I studied differential geometry. Two fundamental concepts in topology are compactness and connectedness. They have technical definitions for which you can follow the links. Intuitively, connectedness is what it sounds like (all one piece), and compactness means (very loosely) that a set “ends, and reaches everywhere it heads toward.” (A closed interval is compact. The whole real line is not compact because it doesn’t end. An open interval is not compact because it wants to include its endpoints but it doesn’t. A professor of mine described compactness as, “everything that should happen [in the set] does happen.”)

Two basic theorems of point-set topology are that under a continuous mapping, the image of any connected set is connected and the image of any compact set is compact. These theorems are very general: they are true in the setting of any continuous map between any two topological spaces. (They could be multidimensional, curved or twisted, or even more exotic…) What I realized is that the intermediate value theorem is just the theorem about connectedness specialized to the real line, and the extreme value theorem is just the theorem about compactness. What is a compact, connected subset of $\mathbb{R}$? It is precisely a closed interval. Under a continuous function, the image must therefore be compact and connected. Therefore, it must attain a maximum and minimum, because if not, the image either “doesn’t end” or “doesn’t reach its ends,” either of which would make it noncompact. And, for any two values hit by the image, it must hit every value between them; any missing value would disconnect it. So, “if $f$ is a function that is continuous on a closed interval $[a,b]$…”
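One practical face of “it must hit every value between them” is that you can hunt down the input for any in-between value by repeatedly halving the interval. The sketch below is my own illustration, not part of the theorem: the name `hit_value` and the sample function are made up, and it assumes the function is continuous on $[a,b]$ with the target strictly between $f(a)$ and $f(b)$.

```python
def hit_value(f, a, b, target, tol=1e-9):
    """Find x in [a, b] with f(x) close to target.

    Assumes f is continuous on [a, b] and target lies strictly
    between f(a) and f(b).
    """
    lo, hi = a, b
    lo_below = f(lo) < target        # which side of the target the left end is on
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) < target) == lo_below:
            lo = mid                 # mid is on the same side as the left end
        else:
            hi = mid                 # the target is hit somewhere in [lo, mid]
    return (lo + hi) / 2

def cubic(t):
    return t ** 3 - t

# cubic(1) = 0 and cubic(2) = 6, so the value 3 must be hit somewhere in [1, 2].
x = hit_value(cubic, 1, 2, 3)
print(x, cubic(x))
```

The halving only works because the image of the interval has no gaps; for a function with a jump, the search could close in on the jump and never find the target.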