
On 17/08/06, Brian Hulley wrote:
Jared Updike wrote:
In other words, superscripts bind tighter than prefix ops but prefix ops bind tighter than infix.
I see. My point is that there already exists a convention[1] that the way to type in -4² is -4^2, which means -(4^2), not (-4)^2, because - as a prefix op has the same precedence as binary subtraction, not super tight like normal prefix ops (i.e. normal function application) as you would like it to be (if I understand correctly). You are welcome to break an existing (unofficial) convention for the sake of lexical syntax[2].

[2] http://wadler.blogspot.com/2006/01/bikeshed-coloring.html
This choice of precedence for unary - conflicts with the normal usage in languages like C, where unary ops "obviously" bind tighter than infix.
The typesetting in maths conveys a lot of info, e.g. to distinguish f -x from f - x or f-x, and so the relationship between the visual representation and the meaning depends on a knowledge of various conventions that have evolved over time, and of when to apply them in a given context.
In contrast, a programming language should be based on general concepts uniformly applied. In Haskell we have operators, identifiers, prefix application using an identifier and infix application using a symbol, and a uniform way to convert a symbol to an identifier and vice versa, and a uniform way of forming sections.
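(Concretely -- this is all ordinary Haskell, nothing beyond the standard report:)

    -- a symbol becomes an identifier with parentheses,
    -- and an identifier becomes infix with backticks:
    plus :: Int -> Int -> Int
    plus = (+)
    seven = 3 `plus` 4

    -- sections are formed uniformly from any infix operator:
    squares     = map (^2) [1,2,3]   -- right section: \x -> x^2
    powersOfTwo = map (2^) [1,2,3]   -- left section:  \x -> 2^x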
All this machinery should be enough to be satisfied with. However, someone somewhere decided that one trivial arithmetic operation, namely unary minus, should be allowed to totally ruin everything, and not only that, but that half of any number line, the positives, should (literally!) have a special advantage over the other half, the negatives.
Thus while I can agree with Wadler that it's easy to have different opinions on "little" issues, I think that in this case the goal of uniformity leads to an objective answer.
Of course not all languages care about being uniform or neat ;-)
Best regards, Brian.
First, f - x, f -x, and f-x all tend to mean the same thing in mathematics, though f -x would be considered poorly typeset (and, to some degree, they're all poorly typeset, because we're using hyphens rather than the minus symbol, which really don't look the same). We tend to write f(-x) when applying a function f to the negation of x, even in circumstances where application is normally written without parentheses.

Secondly, I think it's quite a reasonable thing to treat unary negation as a separate operation. Doing so follows quite naturally from the definition of a ring. While having separate literals for negative numbers might be okay, it seems unnecessary in light of the fact that we *do* want a nice-looking unary negation symbol, which doesn't strictly apply to literals. If -x suddenly became a non-expression, and I had to write 0-x, -1*x or (negate x) instead, I'd consider that a severe enough bug that I would avoid upgrading my compiler until it was fixed.

In mathematics, we don't use separate symbols for negative integers and negated positive integers, even though in the underlying representation of the integers as equivalence classes of pairs of naturals, we can write things like -[(1,0)] = [(0,1)], which expressed in ordinary notation just says that -1 = -1. This doesn't bother us, because the two things are always equal.

Another thing to note is that the natural literals are not, as one might initially think, plain values, but actually represent the embedding of that natural number into the ring (instance of Num), by way of 0 and 1. They simply provide a convenient notation for getting particular values of many rings, but in many cases don't get one very far at all before other functions must be introduced to construct the constant values one wants. While there is always a homomorphism from Z to a ring (represented in Haskell by fromInteger), one would get similar expressiveness with just the nullary operators 0 and 1, together with unary negation, addition and multiplication (albeit with an often severe performance hit, and some annoyance; I'm not recommending we really do this, simply characterising the purpose of numeric literals).

If the performance issue regarding the polymorphic literal -5 meaning negate (fromInteger 5) is a problem, it would be easy enough for the compiler to find and rewrite literals like that as fromInteger (-5) instead, where -5 is the precomputed integer -5. Assuming that fromInteger is not broken, that will always mean the same thing (because fromInteger is supposed to be a homomorphism). Similarly, when the type of (fromInteger x) is statically known to be Integer, the compiler can rewrite it as x. In any event, this is a tiny constant-factor performance hit.

Anyway, the point of all this is that 0, 1, 2, ... are not really literals at all. They're nullary operators which give particular elements of any given instance of Num. Perhaps at some level in the compiler, after performing the fromInteger transformation, they may be taken as literal integers, but there is no reason that this level has to be exposed to the user.

Additionally, consider things like Rational. It is possible to write some elements of Rational in terms of integer "literals", but not all of them, even if negative literals become included. Floating-point literals help a bit here, but not really all that much. (Consider things like 1/3, or 1/7.) In particular, any rational number with a denominator greater than 1 is inaccessible from the integer-literal interface.
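To make the desugaring concrete, here is a small sketch (ordinary Haskell; Data.Ratio and its (%) operator are in the standard libraries):

    import Data.Ratio ((%))

    -- per the Haskell report, a numeric literal denotes an embedding,
    -- not a raw value:
    --   5   stands for  fromInteger 5
    --   -5  stands for  negate (fromInteger 5)
    five :: Rational
    five = 5          -- i.e. fromInteger 5, which is 5 % 1

    -- most rationals are unreachable from literals alone; (%) is needed:
    third :: Rational
    third = 1 % 3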
Based on your previously mentioned design principle that all values of a type should be expressible via literals, or none of them should be, we should in fact remove the polymorphic interface for 0, 1, etc. and force the user to type 1%1 for the rational 1. But this is annoying, and destroys polymorphism! I think that design principle is broken. If it were extended to say something like "All values of a type should be possible to write solely in terms of its constructors, or none of them should be.", then potentially infinite data structures would be excluded from having exposed constructors, for no good reason other than that there are infinite values which require other operations to define. This is, in a way, rather similar to the problem with rationals.

I'd also like to say that the exponentiation example is also a good one. -4^2 is *always* -16, in every sane mathematical context, since unary negation is treated as an additive operation, and thus should happen after exponentiation and multiplication (though under normal circumstances it doesn't matter whether it's done before or after multiplication).

Though this is a little off-topic, another important thing to note about parsing exponentiation is that a^b^c always means a^(b^c) and not (a^b)^c. This is fairly standard in mathematics, because of the tendency to automatically rewrite (a^b)^c as a^(b*c), which looks nicer (and wouldn't normally involve parentheses on the page), and because no such rule exists for the other association.

While I've considered that requiring spaces to separate operator symbols from their arguments might actually be a decent thing to have, I wouldn't recommend doing things in the way that you're suggesting. With spaces required, we could have negative integer literals (provided that people really care that strongly), but that's no reason to drop unary negation altogether -- just require that a space occur between the unary minus and its parameter. However, there are certain operators, especially exponentiation, and multiplication inside an additive expression, around which putting spaces just looks "wrong" to me, and though I might be able to get used to it, I'd probably end up recompiling things all the time over syntax errors related to it. Newcomers to the language would also probably dislike it when they typed x+y at the ghci prompt and got some error saying that x+y is not in scope.

 - Cale
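P.S. For concreteness, the precedences discussed above can be checked at a ghci prompt (unary minus parses at the precedence of binary subtraction, and (^) is declared infixr 8):

    Prelude> -4^2       -- negate (4^2): (^) binds tighter than unary minus
    -16
    Prelude> (-4)^2
    16
    Prelude> 2^3^2      -- infixr: 2^(3^2) = 2^9 = 512, not (2^3)^2 = 64
    512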