a3nm's blog

Proving vs. explaining in mathematics

— updated

Imagine a high school level maths exam in which students are requested to solve the following simple equation: x2 - 3x + 2 = 0. How would you grade the two following answers:

  1. We proved in class that such an equation has at most two solutions. [A few lines of maths in which the student computed them.] Therefore, the solutions to this equation are x = 1 and x = 2.
  2. We proved in class that such an equation has at most two solutions. We check that 12 - 3×1 + 2 = 0 and 22 - 3×2 + 2 = 0. Therefore, the solutions to this equation are x = 1 and x = 2.

Many teachers dislike the second solution, because they think that the student must have cheated. Indeed, the solution seems highly suspicious, because we cannot guess where the solutions come from. (Alright, those were easy to find, but you get the idea.) Notice however that, from a logical standpoint, the second solution is perfectly valid: it justifies the fact that they exists at most two solutions, and explicits them in a perfectly rigorous fashion.

[Interestingly, I had at least one maths teacher who preferred solutions of the second type, because he only wanted us to prove that our answers were correct, and believed that the process leading to the solution was none of his business.]

The point I would like to draw attention to is the following: whereas both solutions prove what they state, only the first one explains where it comes from. Of course, there are more complex examples of this distinction: while some math papers, books and teachers make a great job of explaining things, others just prove theorems mechanically without any effort to show what's going on behind the scenes.

This distinction isn't limited to numerical solutions to equations either. Even with no equations involved, there are proofs in which you can see that things work out as planned but can't understand why they do; proofs in which you build ridiculously complicated concepts with which you unexpectedly manage to prove simpler claims; proofs in which, to put it simply, the author writes things but does not explain what gave him the idea to do things this way. Proofs which, in a way, look like a program with no comments.

The important fundamental difference between explaining and proving is, in my opinion, the following. As far as proving is concerned, you can, theoretically, write proofs which are undoubtedly correct, using only the axioms and core deduction rules. Of course, you never do that in practice, but it means that you could reach perfection if you wanted to. However, when you explain things, you have no reason to believe that it is possible to write something which is undoubtedly understandable. To do this, you would need to make each step seem natural, that is, not only do things, but also explain, at each step, what gave you the idea to do so, and why it gave you such an idea, and so on, and so forth. [Philosophically, the idea is that you cannot understand well enough what's going on in your head when you think to explain why you are thinking in this way...]

Keep in mind that this issue of writing proofs which explain things in addition to proving them should not be confused with the well-known problem of writing proofs with the right level of detail, ie. the tantalizing fact that when you prove something, justifying every step is infeasible in practice but just writing "Trivial from the axioms." isn't acceptable either so you have to find a compromise between these two extremes. The issue I am dealing with has nothing to do with this: the second answer in the example above could formally justify every logical deduction (even going back to the axioms) and still manage to cook up arbitrary values for x and y without explaining how they were found.

It is quite sad that many math books and courses seem to be written for machines rather than humans in that they prove things but don't really try to explain what they do. In my opinion, things would be better if the two were present and clearly distinguished, with a machine-readable formal proof, and an informal discussion explaining how the proof is built and why it is built that way.

comments welcome at a3nm<REMOVETHIS>@a3nm.net