Re: [Haskell-cafe] Very freaky

13 Jul 2007

      On 12/07/07, Albert Y. C. Lai  wrote:
...
Cale Gibbard wrote:
...
As someone with a background in mathematics, I'd say that the idea of
mathematical symbols is to make things more concise, and easier to
manipulate mechanically. I'm not entirely certain that their intent is
to make things clearer, though often they can make things more precise
(which is a bit of a double edged sword when it comes to clarity). I
quite often try to avoid symbols as much as possible, only switching
to formulas when the argument I'm making is very mechanical or
computational. After all, in most circumstances, the reader is going
to have to translate the symbols back into concepts and images in
their head, and usually natural language is a little farther along in
that process, making things easier to read.
I always have some beef with the common perception expressed in some of
the above.
This is quite a long reply!
...
I suppose I can speak concretely by listing again the three different
presentations appearing above and comparing them.
(A) ∀ pred : Nat → Prop .
         pred 0
       → (∀ n:Nat . pred n → pred (n + 1))
       → ∀ n : Nat . pred n
(B) For all statements about natural numbers,
       if    the statement applies to 0,
       and   if it applies to a number it applies to the next number,
       then  it applies to all numbers.
(C) Mathematical induction
This case does call for some names of things. Here's one way I might write it:

Theorem (Mathematical Induction for Naturals):
Let P be a predicate on natural numbers. If P(0) is true, and whenever
P(k) is true, we have that P(k+1) is true as well, then P(n) holds for
every natural number n.

I'm fairly sure that I prefer that to both A and B, at least for the
purposes of teaching.
...
I have re-formatted (A) and (B) to be fair, i.e., they both receive nice
formatting, and both formatting are more or less equivalent, dispelling
such myths as "formal logic is unorganized" or "English is unorganized".
I have formatted them to be equally well-organized.
(C) is the clearest to those people who have already learned it and
haven't lost it. I grant you that if you talk to a carefully selected
audience, you just say (C) and leave it at that. But if you need to
explain (C), saying (C) again just won't do.
Most people claim (B) is clearer than (A) when it comes to explaining
(C). One unfair factor at work is that most people have spent 10 years
learning English; if they have also spent 10 years learning symbols, we
would have a fairer comparison. Someone who know C but not Haskell is
bound to say that C is clearer than Haskell; we who know both languages
know better.
However, if they've spent 10 years on English, and, say, 6 more on
learning a formal system, it seems reasonable to take advantage of
both, wherever one outstrips the other.
...
(B) is less clear than (A) on several counts, and they are all caused by
well-known unclarity problems in most natural languages such as English.
"if it applies to a number it applies to the next number" is unclear
about whether "a number" is quantified by ∀ or ∃. It could be fixed but
I think I know why the author didn't and most people wouldn't. If we
actually try to insert "every" or "all" somewhere there, the whole of
(B) sounds strange. Part of the strangeness is due to the lack of scope
delimiters: Later in (B) there is another "all numbers" coming up, and
two "all numbers" too close to each other is confusing. You need scope
delimiters to separate them.
Just name your variables differently.
...
English provides just a handful of pronouns: I, we, you, he, she, it,
they, this, that, these, those. The amount is further decimated when you
make mathematical statements (pure or applied). Any statement that needs
to refer to multiple subjects must run into juggle trouble. In (B),
fortunately 95% of the time is spent around one single predicate, so you
can just get by with "it". You can't pull the same trick if you are to
explain something else that brings up two predicates and three numbers.
To this end, an unlimited supply of variables is a great invention from
formal logic. (I am already using this invention by designating certain
objects as (A), (B), (C) and thereby referring to them as such. There is
no more tractible way.) Of course with the use of variables comes again
the issue of scope delimiters. Actually even pronouns can be helped by
scope delimiters.
You can use variables to name things, and you can also use quantifiers
without actually using the formal symbols for them.
...
English badly lacks and needs scope delimiters. Quantifiers need them,
variables need them, and here is one more: nested conditionals such as (B):
     if if X then Y then Z
     X implies Y implies Z
are unparsible. Perhaps you can insert a few commas (as in (B)) or
alternate between the "if then" form and the "implies" form to help just
this case:
   if X implies Y then Z
but it doesn't scale. It also proves to be fragile as we insert actual
X, Y, Z:
if it applies to a number implies it applies to the next number
then it applies to all numbers
Now, (B) is actually parsible, but that's only because I formatted it
thoughtfully. Formatting is a form of scope delimiting, and that solves
the problem. But formatting is tricky, as we have all learned from
Haskell indentation rules. It is best to combine explicit delimiters
such as parentheses with aesthetic formatting. Both the textually
inclined and the visually inclined can parse it with ease, and (A) shows
it. Actually, I formatted (A) first, since that's more obvious to
format; then I copied its formatting to (B).
Of course, there are nesting limitations to English, but when they get
heavy, you will also find that people will automatically start finding
new names for things, in order to avoid having to awkwardly insert
pauses into their speech to indicate where the brackets should go. To
some extent, this is where new abstractions come from, the limitations
of our brains in holding onto arbitrarily nested constructions.

It's also possible to make (numbered) lists of conditions or results,
which makes it more apparent where each component begins and ends.
There are phrases like "The following are equivalent". And if all that
fails, you can of course still fall back on formal symbols, but you're
probably better off making a new definition or two.
...
You may counter me with this part:
(A') ∀ n : Nat . pred n
(B') pred applies to all numbers
Not really, I actually like that variable to be there sometimes too.
...
Surely the introduction of the variable n in (A') is an extra burden,
and (B') streamlines it away? Yes, but (A') is not the best expression
possible in formal logic anyway. Following the formal logic of HOL
(hol.sf.net) and many others, ∀ is just another polymorphic operator:
     ∀ :: (a -> Prop) -> Prop
and furthermore (A') is just syntactic sugar for
     ∀ (λn : Nat . pred n)
which is
(A*) ∀ pred
Now it is streamlined: It only mentions the two essences of (B'): about
pred, and being universally true. No unnecessary variable. If you also
want it to say something about Nat, just throw in some types.
I must admit that I'm far more used to first order logic with some set
theoretic axioms stacked on top of it.
...
Cale may counter that I have only been addressing precision, not
"clarity" as he defined by ease of reading and inducing images in heads.
Besides replying him with "if it doesn't have to be correct, I can write
one that's even easier to read" (with apology to Gerald Weinberg), let
me play along that definition just a bit. What image should I form in my
head, if "a number" may be universal or existential and I don't know
which to image, if "it" juggles between three subjects and I don't know
which to image at which instance, and if the whole thing lacks scoping
so much that it is not even parsible?
The point is that if that's really imprecise in the sense that the
intended listener might mistake it to mean something else, then the
text should certainly be improved, but it's easy to use quantifiers in
mathematical English too. We can say "for all numbers n" or "there
exists a number n", or some variation meaning one of these. We can
also say things like "Let n be a number", or "Suppose there were an n
such that P(n)."
...
As I write in another post:
"I think of formal logic as clarifying thought and semantics, cleaning
up the mess caused by idiosyncracies in natural languages (both syntax
and semantics) such as English. But not many people realize they are in
a mess needing cleanup."
There are two ways to clean up anything. Take governments for example.
If a government is corrupted, you could help it and fix it up, or you
could start a revolution and remove it.
Natural languages are also corrupted when it comes to clarity of
thought, reasoning, and semantics. I take the extremist approach of
siding with the revolutionaries, and that is starting over with a brand
new language divorcing natural words, i.e., formal logic. I do not
invest hope in fixing up natural languages because of these sentiments:
natural languages are too fragile to fix, natural words are too
overloaded to be safe to reuse, and if you fix it up it will look just
like formal logic with bloat. It is also a benefit, not a drawback, to
learn a different language and broaden one's mind; it is a drawback, not
a benefit, to avoid a different language and narrow one's mind. It is
also a benefit, not a drawback, to keep natural languages around for its
strength: poems, puns, visions, sweeping generalizations; but do let's
use a different language for a different task: a logical one for a
logical task.
Yeah, so let's suppose you don't have a blackboard handy, let's say
you're talking to your friend on the phone. How do you convey
mathematical statements? I certainly hope that you don't describe each
character in turn! You'll probably render the logic into some spoken
form, which is quite likely to be just the sort of thing which I'd
tend to write down in common use.

Don't get me wrong, when it's appropriate, I'll actually use explicit
quantifier symbols (∀,∃), and symbolic forms of things like "implies",
"and", "or", and so on, but usually this is when either I'm simply
unravelling some definitions, or I feel that a particularly
symbolic/computational approach to logic seems to be the best level at
which to consider the given statements. Usually though, I like to keep
the reader's mind focused on the subject matter at hand, and not the
formal logic framework that we happen to be using. When you're
specifically doing mathematical logic, of course it's more appropriate
to use the symbolic form, since the whole point is in the manipulation
and discussion of symbolic sentences. When you're doing geometry or
combinatorics, it's generally inappropriate to get involved in
minutiae like that too often.
...
Textually-inclined people feel at home with formal logic.
Visually-inclined people are disadvantaged, but they are just as
disadvantaged with textual natural languages. What they need are visual
languages. Formal logic enjoys a more well-defined grammar and is a
better basis than natural languages for developing visual languages.
I'm not entirely sure this last claim is true. For me at least, the
visual language which I use to reason about mathematical objects is
almost entirely divorced from the original English or formal
statements, except that each of the components readily has an
interpretation in terms of them.

Let's take a simple example from the context of topology. Consider the
perhaps still somewhat vague statement: Let S be a surface obtained by
removing an open disc from each of the torus and the projective plane,
and gluing a cylinder between the two circles at the boundaries of the
removed discs.

This is a translation into English of some shapes in my head.

Note that this involves no explicit maps of any kind. I don't even
really mention that there's a quotient, except for using the word
'gluing'. Let's assume we don't yet have some operation which does
this all formally. You can probably still see quite clearly what I
want, geometrically. You can also probably formalise it for yourself,
perhaps in even more generality than this specific case. I would say
that, among the right audience, it does a fairly adequate job of
defining a surface up to homeomorphism, putting it in the heads of the
people in the room. I can go on to talk about paths in that space
fairly adequately, and even perhaps draw a picture of it without
people shouting that it's not at all the picture that they had (at
least insofar as I can draw a recognisable picture of the projective
plane!). If I *do* happen to have the operation which does this
formally, then by saying what I've said above, I convey that I want
that operation to those people who are aware of it, and I convey
something which is still reasonably meaningful to those who don't --
they might even be able to determine what the formal definition must
be just from that.

On the other hand, if I'm reasonably sure that everyone at hand is
aware of the formalism, I might say: "Let S be the surface obtained by
taking the connected sum of the projective plane and the torus."

Or even the symbolic: "Let S = P # T", requiring a little more
context, of course.

These are both far more concise and far more opaque to those
unfamiliar with the context.

To define the connected sum more formally would take just a little
more time and care, which may or may not be worthwhile depending on
how detailed a discussion one wants to get into. Obviously, there are
many circumstances which warrant that, and any extended period of work
would. As an introduction to the whole idea of the connected sum,
studying an example similar to this might be useful before moving to
the general idea.

Even when doing so, writing the definition of the connected sum in
terms of formal logic will generally not make it clearer. In any
proper treatment though, the quotient and the isomorphism between the
two circles would be made explicit. I might remove the cylinder from
the formalisation, since it is mostly there as a visual aid rather
than something which is mathematically necessary. I might even manage
to bore people by spelling the details out though. 'Oh, come on, we
know what you mean already! It was enough to say "glue the circles
together!"'

My point is not that the formalism shouldn't be there, somewhere. It's
just that writing it down is often an exercise in tedium, for any but
the most primitive of tasks. If you don't believe it, have a look at
metamath.org. While that level of rigour is a worthy goal as a
separate project, it's not something you'll ever convince any
mathematician to do on an everyday basis in their own work, unless the
person in question is a hardcore logician. Even then, it's likely
going to be an uphill battle. The reason is that this sort of
translation into the mechanical is usually extremely boring! It can be
complicated certainly, but usually long before you've finished,
perhaps even before you've begun, you've convinced yourself that it's
doable. The rest is busy work, and typically doesn't serve to
elucidate anything apart from possibly some details about the
formalism in question rather than the theorem you're embedding into
it.

My algebraic topology professor had a word for the sort of language
which is typically used, and what most mathematicians are aiming for:
rigourisable. Specifically, the goal is to pin enough details down so
that anyone in the audience could sit down and, with knowledge of a
particular formalism, turn it into something that a computer would
accept. Humans are at present much better than machines at bridging
all the small gaps in a consistent fashion (like applying the
appropriate isomorphisms and so on), and there's little reason that we
should suffer to communicate with each other at their level.

On the other hand, for many modern papers, putting some extra details
somewhere seems like something worthwhile, and you might even be able
to get graduate students to do a bit of it. Writing everything into a
formal, machine-checkable system still seems pretty unlikely though.

There's also a good philosophical reason for working in a way which is
slightly detached from any particular formalism when possible -- any
particular formalism for mathematics might happen to be inconsistent.
The normal level of detachment from formalism would cushion many
results from the shock-waves an inconsistency would send through every
branch of mathematics, as it is quite likely that many things will
still have justifiable translations in terms of whatever new system
people would then come up with. Things written directly at the level
of the formalism would be comparatively brittle, and possibly need
translation up into softer language and then back down into the new
system.

Anyway, that's just my own perspective on the matter. Sorry for going
on for so long in this somewhat off-topic discussion. :)