
On 12/07/07, Albert Y. C. Lai
Cale Gibbard wrote:
As someone with a background in mathematics, I'd say that the idea of mathematical symbols is to make things more concise, and easier to manipulate mechanically. I'm not entirely certain that their intent is to make things clearer, though often they can make things more precise (which is a bit of a double edged sword when it comes to clarity). I quite often try to avoid symbols as much as possible, only switching to formulas when the argument I'm making is very mechanical or computational. After all, in most circumstances, the reader is going to have to translate the symbols back into concepts and images in their head, and usually natural language is a little farther along in that process, making things easier to read.
I always have some beef with the common perception expressed in some of the above.
This is quite a long reply!
I suppose I can speak concretely by listing again the three different presentations appearing above and comparing them.
(A) ∀ pred : Nat → Prop . pred 0 → (∀ n:Nat . pred n → pred (n + 1)) → ∀ n : Nat . pred n
(B) For all statements about natural numbers, if the statement applies to 0, and if it applies to a number it applies to the next number, then it applies to all numbers.
(C) Mathematical induction
This case does call for some names of things. Here's one way I might write it: Theorem (Mathematical Induction for Naturals): Let P be a predicate on natural numbers. If P(0) is true, and whenever P(k) is true, we have that P(k+1) is true as well, then P(n) holds for every natural number n. I'm fairly sure that I prefer that to both A and B, at least for the purposes of teaching.
I have re-formatted (A) and (B) to be fair, i.e., they both receive nice formatting, and both formatting are more or less equivalent, dispelling such myths as "formal logic is unorganized" or "English is unorganized". I have formatted them to be equally well-organized.
(C) is the clearest to those people who have already learned it and haven't lost it. I grant you that if you talk to a carefully selected audience, you just say (C) and leave it at that. But if you need to explain (C), saying (C) again just won't do.
Most people claim (B) is clearer than (A) when it comes to explaining (C). One unfair factor at work is that most people have spent 10 years learning English; if they have also spent 10 years learning symbols, we would have a fairer comparison. Someone who know C but not Haskell is bound to say that C is clearer than Haskell; we who know both languages know better.
However, if they've spent 10 years on English, and, say, 6 more on learning a formal system, it seems reasonable to take advantage of both, wherever one outstrips the other.
(B) is less clear than (A) on several counts, and they are all caused by well-known unclarity problems in most natural languages such as English.
"if it applies to a number it applies to the next number" is unclear about whether "a number" is quantified by ∀ or ∃. It could be fixed but I think I know why the author didn't and most people wouldn't. If we actually try to insert "every" or "all" somewhere there, the whole of (B) sounds strange. Part of the strangeness is due to the lack of scope delimiters: Later in (B) there is another "all numbers" coming up, and two "all numbers" too close to each other is confusing. You need scope delimiters to separate them.
Just name your variables differently.
English provides just a handful of pronouns: I, we, you, he, she, it, they, this, that, these, those. The amount is further decimated when you make mathematical statements (pure or applied). Any statement that needs to refer to multiple subjects must run into juggle trouble. In (B), fortunately 95% of the time is spent around one single predicate, so you can just get by with "it". You can't pull the same trick if you are to explain something else that brings up two predicates and three numbers. To this end, an unlimited supply of variables is a great invention from formal logic. (I am already using this invention by designating certain objects as (A), (B), (C) and thereby referring to them as such. There is no more tractible way.) Of course with the use of variables comes again the issue of scope delimiters. Actually even pronouns can be helped by scope delimiters.
You can use variables to name things, and you can also use quantifiers without actually using the formal symbols for them.
English badly lacks and needs scope delimiters. Quantifiers need them, variables need them, and here is one more: nested conditionals such as (B): if if X then Y then Z X implies Y implies Z are unparsible. Perhaps you can insert a few commas (as in (B)) or alternate between the "if then" form and the "implies" form to help just this case: if X implies Y then Z but it doesn't scale. It also proves to be fragile as we insert actual X, Y, Z:
if it applies to a number implies it applies to the next number then it applies to all numbers
Now, (B) is actually parsible, but that's only because I formatted it thoughtfully. Formatting is a form of scope delimiting, and that solves the problem. But formatting is tricky, as we have all learned from Haskell indentation rules. It is best to combine explicit delimiters such as parentheses with aesthetic formatting. Both the textually inclined and the visually inclined can parse it with ease, and (A) shows it. Actually, I formatted (A) first, since that's more obvious to format; then I copied its formatting to (B).
Of course, there are nesting limitations to English, but when they get heavy, you will also find that people will automatically start finding new names for things, in order to avoid having to awkwardly insert pauses into their speech to indicate where the brackets should go. To some extent, this is where new abstractions come from, the limitations of our brains in holding onto arbitrarily nested constructions. It's also possible to make (numbered) lists of conditions or results, which makes it more apparent where each component begins and ends. There are phrases like "The following are equivalent". And if all that fails, you can of course still fall back on formal symbols, but you're probably better off making a new definition or two.
You may counter me with this part:
(A') ∀ n : Nat . pred n (B') pred applies to all numbers
Not really, I actually like that variable to be there sometimes too.
Surely the introduction of the variable n in (A') is an extra burden, and (B') streamlines it away? Yes, but (A') is not the best expression possible in formal logic anyway. Following the formal logic of HOL (hol.sf.net) and many others, ∀ is just another polymorphic operator: ∀ :: (a -> Prop) -> Prop and furthermore (A') is just syntactic sugar for ∀ (λn : Nat . pred n) which is
(A*) ∀ pred
Now it is streamlined: It only mentions the two essences of (B'): about pred, and being universally true. No unnecessary variable. If you also want it to say something about Nat, just throw in some types.
I must admit that I'm far more used to first order logic with some set theoretic axioms stacked on top of it.
Cale may counter that I have only been addressing precision, not "clarity" as he defined by ease of reading and inducing images in heads. Besides replying him with "if it doesn't have to be correct, I can write one that's even easier to read" (with apology to Gerald Weinberg), let me play along that definition just a bit. What image should I form in my head, if "a number" may be universal or existential and I don't know which to image, if "it" juggles between three subjects and I don't know which to image at which instance, and if the whole thing lacks scoping so much that it is not even parsible?
The point is that if that's really imprecise in the sense that the intended listener might mistake it to mean something else, then the text should certainly be improved, but it's easy to use quantifiers in mathematical English too. We can say "for all numbers n" or "there exists a number n", or some variation meaning one of these. We can also say things like "Let n be a number", or "Suppose there were an n such that P(n)."
As I write in another post:
"I think of formal logic as clarifying thought and semantics, cleaning up the mess caused by idiosyncracies in natural languages (both syntax and semantics) such as English. But not many people realize they are in a mess needing cleanup."
There are two ways to clean up anything. Take governments for example. If a government is corrupted, you could help it and fix it up, or you could start a revolution and remove it.
Natural languages are also corrupted when it comes to clarity of thought, reasoning, and semantics. I take the extremist approach of siding with the revolutionaries, and that is starting over with a brand new language divorcing natural words, i.e., formal logic. I do not invest hope in fixing up natural languages because of these sentiments: natural languages are too fragile to fix, natural words are too overloaded to be safe to reuse, and if you fix it up it will look just like formal logic with bloat. It is also a benefit, not a drawback, to learn a different language and broaden one's mind; it is a drawback, not a benefit, to avoid a different language and narrow one's mind. It is also a benefit, not a drawback, to keep natural languages around for its strength: poems, puns, visions, sweeping generalizations; but do let's use a different language for a different task: a logical one for a logical task.
Yeah, so let's suppose you don't have a blackboard handy, let's say you're talking to your friend on the phone. How do you convey mathematical statements? I certainly hope that you don't describe each character in turn! You'll probably render the logic into some spoken form, which is quite likely to be just the sort of thing which I'd tend to write down in common use. Don't get me wrong, when it's appropriate, I'll actually use explicit quantifier symbols (∀,∃), and symbolic forms of things like "implies", "and", "or", and so on, but usually this is when either I'm simply unravelling some definitions, or I feel that a particularly symbolic/computational approach to logic seems to be the best level at which to consider the given statements. Usually though, I like to keep the reader's mind focused on the subject matter at hand, and not the formal logic framework that we happen to be using. When you're specifically doing mathematical logic, of course it's more appropriate to use the symbolic form, since the whole point is in the manipulation and discussion of symbolic sentences. When you're doing geometry or combinatorics, it's generally inappropriate to get involved in minutiae like that too often.
Textually-inclined people feel at home with formal logic. Visually-inclined people are disadvantaged, but they are just as disadvantaged with textual natural languages. What they need are visual languages. Formal logic enjoys a more well-defined grammar and is a better basis than natural languages for developing visual languages.
I'm not entirely sure this last claim is true. For me at least, the visual language which I use to reason about mathematical objects is almost entirely divorced from the original English or formal statements, except that each of the components readily has an interpretation in terms of them. Let's take a simple example from the context of topology. Consider the perhaps still somewhat vague statement: Let S be a surface obtained by removing an open disc from each of the torus and the projective plane, and gluing a cylinder between the two circles at the boundaries of the removed discs. This is a translation into English of some shapes in my head. Note that this involves no explicit maps of any kind. I don't even really mention that there's a quotient, except for using the word 'gluing'. Let's assume we don't yet have some operation which does this all formally. You can probably still see quite clearly what I want, geometrically. You can also probably formalise it for yourself, perhaps in even more generality than this specific case. I would say that, among the right audience, it does a fairly adequate job of defining a surface up to homeomorphism, putting it in the heads of the people in the room. I can go on to talk about paths in that space fairly adequately, and even perhaps draw a picture of it without people shouting that it's not at all the picture that they had (at least insofar as I can draw a recognisable picture of the projective plane!). If I *do* happen to have the operation which does this formally, then by saying what I've said above, I convey that I want that operation to those people who are aware of it, and I convey something which is still reasonably meaningful to those who don't -- they might even be able to determine what the formal definition must be just from that. On the other hand, if I'm reasonably sure that everyone at hand is aware of the formalism, I might say: "Let S be the surface obtained by taking the connected sum of the projective plane and the torus." Or even the symbolic: "Let S = P # T", requiring a little more context, of course. These are both far more concise and far more opaque to those unfamiliar with the context. To define the connected sum more formally would take just a little more time and care, which may or may not be worthwhile depending on how detailed a discussion one wants to get into. Obviously, there are many circumstances which warrant that, and any extended period of work would. As an introduction to the whole idea of the connected sum, studying an example similar to this might be useful before moving to the general idea. Even when doing so, writing the definition of the connected sum in terms of formal logic will generally not make it clearer. In any proper treatment though, the quotient and the isomorphism between the two circles would be made explicit. I might remove the cylinder from the formalisation, since it is mostly there as a visual aid rather than something which is mathematically necessary. I might even manage to bore people by spelling the details out though. 'Oh, come on, we know what you mean already! It was enough to say "glue the circles together!"' My point is not that the formalism shouldn't be there, somewhere. It's just that writing it down is often an exercise in tedium, for any but the most primitive of tasks. If you don't believe it, have a look at metamath.org. While that level of rigour is a worthy goal as a separate project, it's not something you'll ever convince any mathematician to do on an everyday basis in their own work, unless the person in question is a hardcore logician. Even then, it's likely going to be an uphill battle. The reason is that this sort of translation into the mechanical is usually extremely boring! It can be complicated certainly, but usually long before you've finished, perhaps even before you've begun, you've convinced yourself that it's doable. The rest is busy work, and typically doesn't serve to elucidate anything apart from possibly some details about the formalism in question rather than the theorem you're embedding into it. My algebraic topology professor had a word for the sort of language which is typically used, and what most mathematicians are aiming for: rigourisable. Specifically, the goal is to pin enough details down so that anyone in the audience could sit down and, with knowledge of a particular formalism, turn it into something that a computer would accept. Humans are at present much better than machines at bridging all the small gaps in a consistent fashion (like applying the appropriate isomorphisms and so on), and there's little reason that we should suffer to communicate with each other at their level. On the other hand, for many modern papers, putting some extra details somewhere seems like something worthwhile, and you might even be able to get graduate students to do a bit of it. Writing everything into a formal, machine-checkable system still seems pretty unlikely though. There's also a good philosophical reason for working in a way which is slightly detached from any particular formalism when possible -- any particular formalism for mathematics might happen to be inconsistent. The normal level of detachment from formalism would cushion many results from the shock-waves an inconsistency would send through every branch of mathematics, as it is quite likely that many things will still have justifiable translations in terms of whatever new system people would then come up with. Things written directly at the level of the formalism would be comparatively brittle, and possibly need translation up into softer language and then back down into the new system. Anyway, that's just my own perspective on the matter. Sorry for going on for so long in this somewhat off-topic discussion. :)