I recently read Mathematical Explanation [gated], by Mark Steiner (1978). My summary follows, and my commentary follows that. I am aware that others have written things since 1978 on this topic, but I don’t have time to read them right now.
We seem to think there is a distinction between explaining a mathematical fact and merely demonstrating it to be the case. We have proofs that do both things, and perhaps a sliding scale of explanatoriness between them. One big question then is what makes a proof actually explain the thing it proves? Or at least what makes it seem that way to us?
One suggestion has been the level of generality or abstractness. Perhaps if we show a particular fact follows from some much bigger theory, the fact feels more explained. But then consider this fact:
1+2+3+…+n = n(n+1)/2
There is an inductive proof of this:
S(1) = 1(1+1)/2 = 1
S(n+1) = S(n) + (n + 1) = n(n+1)/2 + 2(n+1)/2 = (n + 1)(n+2)/2
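A quick numerical spot-check of the identity (my own sketch, not part of Steiner's discussion) can be done by comparing the term-by-term sum against the closed form:

```python
# Check 1 + 2 + ... + n == n(n+1)/2 for a range of n.
for n in range(1, 1001):
    direct = sum(range(1, n + 1))   # add the terms one by one
    closed = n * (n + 1) // 2       # the closed-form expression
    assert direct == closed, n
print("identity holds for n = 1..1000")
```

Of course, checking cases verifies rather than explains, which is exactly the distinction at issue.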
This is not taken to be very explanatory. Whereas this is:
● ○ ○ ○ ○
● ● ○ ○ ○
● ● ● ○ ○
● ● ● ● ○
[the filled circles make a triangle of 1+2+3+4. Any such triangle can be made into a rectangle of area n x (n+1) with another identical triangle. So the triangle is half of n(n+1).]
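The picture proof can be rendered programmatically; here is a minimal sketch (the function name and layout are my own) in which filled circles form one triangle and open circles form its mirror image, tiling an n-by-(n+1) rectangle:

```python
def triangle_picture(n):
    """Rows of an n-by-(n+1) rectangle of circles: row i holds i filled
    circles and (n + 1 - i) open ones, so two copies of the staircase
    triangle tile the whole rectangle."""
    return ["● " * i + "○ " * (n + 1 - i) for i in range(1, n + 1)]

n = 4
for row in triangle_picture(n):
    print(row)

# The triangle is half the rectangle: n(n+1)/2 filled circles.
filled = sum(row.count("●") for row in triangle_picture(n))
assert filled == n * (n + 1) // 2
```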
It seems the latter is if anything less general, yet it seems a much better explanation (I remember learning it this way as a preteen in a book about fun math magic). There are other examples.
This case and others suggest that being able to visualize a proof is key to its seeming to be an explanation. Steiner discards this immediately as too subjective, and claims there are also counterexamples.
He also quickly dismisses a third hypothesis that others have forwarded: that a proof is explanatory if it could have been used to discover the fact, rather than just to verify it. His counterexample is the Eulerian identity, which I shan’t go into here. I take it this hypothesis isn’t very plausible anyway, since often we discover a fact first then hope to explain it better.
Steiner offers his own theory: that a proof is explanatory if it makes use of a ‘characterizing property’ of an entity that is mentioned in the theorem. ‘Characterizing properties’ characterize an entity relative to other entities in some similar family. For instance, 18 might be characterized as 2*3*3, since other numbers don’t have that property. 18 might also be characterized as being one more than 17, or in a huge number of other ways.
If I understand, the idea is that if we are clear on how a result depends on a particular characterizing property, we will feel that the result has been explained. If we don’t see how something unique about the entities in question ‘caused’ the outcome, the outcome seems arbitrary. He explains further that this means we can see that if we change the properties of the entity, perhaps swapping out 18 for 20, we would get a different result.
Steiner argues that the various proofs he has presented, which we consider explanatory, do in fact depend on characterizing properties, and thus takes his theory to be well supported.
Perhaps I misunderstand this notion of ‘characterizing properties’. It seems to me that of course all proofs depend on properties specific to the entities they are about (relative to whatever entities the proof is not about). So to distinguish the explanatory proofs, Steiner needs a narrower notion of a characterizing property. For instance, a property that is particularly saliently related to the entity in question. Or he needs to claim that explanatoriness requires the observer to actually notice or understand the connection between the explanatory property and the outcome. In which case the explanatoriness of a proof would be a function of the observer’s psychology as well as the proof. Any proof would be perfectly explanatory if the reader followed it carefully enough.
At any rate, he doesn’t seem to be thinking of either of those things (though again I may be misunderstanding just what he is claiming at the end here). He rather claims that the various proofs he examines do in fact rely on properties that characterize the entities involved. The class seemed to agree with me here.
My tentative theory of when we feel something has been explained, which goes for scientific explanations as well as mathematical ones, is as follows. We feel like we understand a bunch of things that we are very familiar with: chunks of matter moving through space and knocking into each other, liquids, shapes, basic agenthood, that sort of thing.
Anything that happens that only involves these things acting in their usual ways doesn’t feel like it needs any extra explanation. It is obvious. To ‘explain’ less familiar things, we can do one of two things. We can frame them in terms of something we already intuitively grasp in the above way. This is what is usually called an explanation. For instance we can think of electricity as being like water, or of the first n integers as being like bits of a triangle. Or of the mysterious murder being like a waitress putting poison in the soup. Alternatively we can just keep interacting with the entity in question until we become familiar with its properties, and then we think them obvious and not requiring explanation. For instance I no longer feel like I need an explanation for x^2 making a parabola shape, because I’m so familiar with it.
This arguably fits with many of the characteristics we have noted are associated with explanatoriness. Instances of generalizations that we understand feel explanatory. Pictures tend to be explanatory, especially diagrams with simple shapes. We feel like we could have discovered a thing ourselves if it follows from behavior of entities we can manipulate intuitively.
While this seems to me a decent characterization of what feels explanatory, I can’t see that it is a particularly useful category outside of psychology, for instance for use in saying what it is that science is meant to be doing. Something like unification seems more apt there, but that’s a topic for another time.
I think the explanations that feel the most explanatory are the ones we encounter closest in time to our coming into an understanding of that topic, or ones we come up with just after.
Wait a minute, doesn’t the second step of the inductive proof just assume the theorem to be proven? Or is my rusty algebra missing something? How do we get S(n) + (n + 1) = n(n+1)/2 + 2(n+1)/2 without assuming the theorem?
Paul, inductive proofs always assume the proposition holds for the inductive step. Induction looks like:
* Prove S(K) where K is some starting number (typically, 1)
* Prove that S(N) implies S(N+1)
This is the step that assumes the proposition holds, but the reasoning is not circular: it’s only trying to demonstrate that if the proposition holds for N, then it holds for N+1. Assuming P is generally taken for granted when trying to show P implies Q.
After this, then axiomatically the proposition holds for all integers greater than or equal to K.
Right, but the step I’m looking at also looks to me like it assumes S(n+1)=(n+1)(n+2)/2. That is, it assumes the truth of the proposition for the general case, not just for the case of S(n). But I’m probably misreading it.
S(n+1) = S(n) + (n+1) [definition of S(n)]
S(n) = n(n+1)/2 [inductive assumption]
S(n+1) = n(n+1)/2 + (n+1) [substitution]
S(n+1) = (n(n+1)+ 2(n+1))/2 [common denominator]
S(n+1) = (n+1)(n+2)/2 [factoring]
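One way to see that nothing circular is going on (my own sketch, not from the original thread): generate the values of S purely from the recurrence S(n+1) = S(n) + (n+1) and the base case S(1) = 1, and only afterwards compare them against the closed form:

```python
# Build S(n) from the recurrence alone; the closed form is never used
# to construct s, only to check it after the fact.
s = 1                              # base case: S(1) = 1
for n in range(1, 100):
    assert s == n * (n + 1) // 2   # closed form agrees with S(n)
    s = s + (n + 1)                # inductive step: S(n+1) = S(n) + (n+1)
```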
I agree with your theory; it matches something I’ve long thought: that (a) any mathematical proof is effectively a tautology [ x = something that is equal to x = something else that is equal to x = a third thing that is equal to x, therefore x = x ] and (b) “explanation” is a concept that only makes sense given a model of a human [ as opposed to a general thinking engine ].
I first came across the model that the human brain is not a general purpose computer, but in fact has specialized “coprocessors” that are optimized for things like biology, social relations, counting the number of bears that enter a cave and then leave it, etc. via Steven Pinker almost 20 years ago (Language Instinct ch 13).
That insight (although obvious in retrospect, it was new to me at the time – perhaps a statement about the prevailing orthodoxy of the day regarding the plasticity of the human mind and human culture) made me realize that the word “explanation” really means “a tautology that chains back to some concept that we’re hardwired to find ‘obvious’”.
– Prediction (data) seems the main value and goal. The rest seems like local word play.
– There is no such thing as “science” or the scientific methodology independent of specific studies and sets of data. “Science” is a rhetorical straw man.
– Animal studies should solve these matters. Human knowledge wouldn’t seem any different from that of other mammals. Language and consciousness appear trivial and of little informational value.
Proofs and Refutations by Imre Lakatos has a nice treatment of these issues.
I have noticed a subjective difference between types of proofs, which may be interpreted as a measure of how well the result is `explained’. The central question is, after having read and understood the proof, do counterexamples seem *impossible* or merely *prohibited*?
Some results have proofs which are so natural and inevitable that, after having seen the proof, it becomes impossible to imagine the existence of a counterexample. By this I mean, the brain regards the idea of a counterexample as an absurdity, and cannot entertain such a thing even in the abstract. These results are rarely cited when used; rather, they are incorporated into the mathematician’s understanding of the object at hand. Simple proofs and visual proofs often have this quality. Examples (for me; individual results may vary):
1) If xy=0, then either x=0 or y=0.
2) A triangle is determined (up to congruence) by the lengths of its sides.
3) Any finite group of prime order is commutative.
4) The proof of the Pythagorean Theorem in which a square of size c is placed inside a square of size a+b in two different ways.
Otherwise, a result seems to only prohibit the existence of a counterexample; that is, one could imagine them existing in the abstract, but the result assures us that this is not the case. Proofs by contradiction, classification, multiple cases or the axiom of choice often feel this way. Examples (again, for me):
1) the infinitude of primes
2) (Cantor’s diagonal theorem) the impossibility of a bijection between Z (the integers) and R (the reals)
3) there are 2 groups of order 6 (up to isomorphism)
4) Euclid’s original proof of the Pythagorean Theorem
This question seems to belong to psychology. For one thing, who’s to say people find the same things explanatory? Shouldn’t we try to learn to find explanatory what is explanatory, rather than try to produce explanations that feel right?
The argument from introspection is a much more interesting argument, since it tries to appeal to the direct experience of everyman. But the argument is deeply suspect, in that it assumes that our faculty of inner observation or introspection reveals things as they really are in their innermost nature. This assumption is suspect because we already know that our other forms of observation — sight, hearing, touch, and so on — do no such thing. The red surface of an apple does not look like a matrix of molecules reflecting photons at certain critical wave-lengths, but that is what it is. The sound of a flute does not sound like a sinusoidal compression wave train in the atmosphere, but that is what it is. The warmth of the summer air does not feel like the mean kinetic energy of millions of tiny molecules, but that is what it is. If one’s pains and hopes and beliefs do not introspectively seem like electrochemical states in a neural network, that may be only because our faculty of introspection, like our other senses, is not sufficiently penetrating to reveal such hidden details. Which is just what one would expect anyway. The argument from introspection is therefore entirely without force, unless we can somehow argue that the faculty of introspection is quite different from all other forms of observation.