Why shouldn't variable names be capitalized?

Hi, I'm wondering what the rationale was for not allowing capitalized variable names (and uncapitalized type names and constructors). I can only think of two arguments, and IMHO both of them are bad:
1. Enforces a naming convention. Fine - but my view is that this doesn't belong in the language definition (it belongs in the user's coding standards). I get annoyed, for example, that when I write code that manipulates matrices and vectors, I can't refer to the matrices with capital letters as is common in the literature. And to anyone who says that it's good to enforce naming consistency, I have this to say: Any language that requires me to learn about category theory in order to write imperative code should treat me like an adult when it comes to the naming of variables as well. ;-)
2. It makes it easier to write the compiler. I don't think I need to explain why this is bad...
I imagine that someone is just itching to "sort me out". Do your worst! ;-)
Thx Martin

On Aug 4, 2006, at 1:12 PM, Martin Percossi wrote:
Hi, I'm wondering what the rationale was for not allowing capitalized variable names (and uncapitalized type names and constructors). I can only think of two arguments, and IMHO both of them are bad:
1. Enforces a naming convention. Fine - but my view is that this doesn't belong in the language definition (it belongs in the user's coding standards). I get annoyed, for example, that when I write code that manipulates matrices and vectors, I can't refer to the matrices with capital letters as is common in the literature.
This is occasionally irritating.
And to anyone who says that it's good to enforce naming consistency, I have this to say: Any language that requires me to learn about category theory in order to write imperative code should treat me like an adult when it comes to the naming of variables as well. ;-)
2. It makes it easier to write the compiler. I don't think I need to explain why this is bad...
Eh? I'm not convinced this is a bad reason. It obviously needs to be balanced against other competing factors, but ease of implementation should always be a consideration when designing a language.
3. It removes a whole class of possible ambiguities from the language. You the programmer (and the compiler, as an added bonus) can always identify the syntactic class of an identifier from _purely local_ context. Suppose I remove the case restriction. Is the following a pattern match or a function definition? Is M a variable or a data constructor?
let f x M = z M in ....
You can't tell! Worse, it could change depending on what identifiers are in scope. It could happen that you import a module and it silently causes your function definition to change to a pattern match.
The situation is similar with type classes and type variables. You could magically end up with an instance declaration that is less polymorphic than you expect (if you have extensions turned on).
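For comparison, here is a minimal sketch of how the existing case rule settles that question from local context alone. The type Shape and the functions describeAny and describeM are hypothetical names made up for this illustration; they do not come from the thread.

```haskell
module Main where

-- Hypothetical type for illustration only.
data Shape = M | N deriving (Show, Eq)

-- Lowercase 'm' is a variable pattern: it binds whatever Shape is passed in.
describeAny :: Int -> Shape -> String
describeAny x m = "any shape: " ++ show m ++ " with " ++ show x

-- Uppercase 'M' is a constructor pattern: this clause matches only M.
describeM :: Int -> Shape -> String
describeM x M = "exactly M with " ++ show x
describeM x _ = "something else with " ++ show x

main :: IO ()
main = do
  putStrLn (describeAny 1 N)
  putStrLn (describeM 1 M)
  putStrLn (describeM 1 N)
```

Because the reading depends only on the first letter, importing a module that happens to export a constructor or variable with a similar name cannot silently turn one definition into the other.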
I imagine that someone is just itching to "sort me out". Do your worst! ;-)
Thx Martin
Rob Dockins
Speak softly and drive a Sherman tank. Laugh hard; it's a long way to the bank. -- TMBG

Martin Percossi wrote:
Hi, I'm wondering what the rationale was for not allowing capitalized variable names (and uncapitalized type names and constructors). I can only think of two arguments, and IMHO both of them are bad:
I'm not so sure about variable names and constructors, but the type syntax just wouldn't work without a lexical distinction between type names and type variables. Is (Int -> Int) supposed to be polymorphic with a type variable named "Int", or is it talking about a type "Int"? Perhaps you'd be happier reserving names beginning with apostrophes for variables? I think case is a bit easier to see - subattentive visual processing, and all that.
Mostly, case is used so you know what basic sort of thing some object is, without reviewing everything in scope. Mathematicians use typesetting similarly to tell basic kinds of things apart. Imagine a mathematician complaining that he was forced to learn category theory to get a degree, and people still don't let him use a letter with an arrow over it to denote a scalar quantity. It's just that ASCII is more restricted, so we don't have things like fonts, Greek letters, and accents.
Lexical syntax is the least important kind of linguistic freedom.
Brandon
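The type-level distinction can be sketched in a few lines of Haskell. The function names identity and increment are made up for this illustration and are not from the thread.

```haskell
module Main where

-- Lowercase 'a' is a type variable, so this signature is polymorphic:
-- the function can be used at any type.
identity :: a -> a
identity x = x

-- Capitalised 'Int' names a concrete type, so this signature is monomorphic.
increment :: Int -> Int
increment n = n + 1

main :: IO ()
main = do
  print (identity 'c')   -- used at Char
  print (identity True)  -- used at Bool
  print (increment 41)   -- only usable at Int
```

Without the case rule, a reader of `Int -> Int` would need to know whether a type called Int is in scope before deciding which of these two kinds of signature it is.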

Ok, you asked for it, so here's my worst :-)
1) Here's what the "History of Haskell" has to say about this:
Namespaces were a point of considerable discussion in the Haskell Committee. We wanted the user to have as much freedom as possible, while avoiding any form of ambiguity. So we carefully defined a set of lexemes for each namespace that were orthogonal when they needed to be, and overlapped when context was sufficient to distinguish their meaning. As an example of overlap, capitalised names such as Foo can, in the same lexical scope, refer to a type constructor, a data constructor, and a module, since whenever the name Foo appears, it is clear from context to which entity it is referring. As an example of orthogonality, we designed normal variables, infix operators, normal data constructors, and infix data constructors to be mutually exclusive.
We adopted from Miranda the convention that data constructors are capitalised while variables are not; and added a similar convention for infix constructors, which in Haskell must start with a colon. ...
The key point here is that we wanted data constructors to be orthogonal to formal parameters. For example, in:
foo x y = ...
We know that x and y are formal parameters, whereas if they were capitalized we'd know that they were constructors. Some of us had had experience with ML where this distinction is not made, and we didn't like that. There are surely other ways to achieve this, but capitalization was one of the least painful, as we saw it.
2) Note that this is not a compiler issue -- the compiler won't have much problem either way -- but it is a readability issue.
3) I suspect that you are mostly kidding, but Haskell doesn't require you to know any category theory to write imperative code!
I hope this helps, -Paul
Martin Percossi wrote:
Hi, I'm wondering what the rationale was for not allowing capitalized variable names (and uncapitalized type names and constructors). I can only think of two arguments, and IMHO both of them are bad:
1. Enforces a naming convention. Fine - but my view is that this doesn't belong in the language definition (it belongs in the user's coding standards). I get annoyed, for example, that when I write code that manipulates matrices and vectors, I can't refer to the matrices with capital letters as is common in the literature. And to anyone who says that it's good to enforce naming consistency, I have this to say: Any language that requires me to learn about category theory in order to write imperative code should treat me like an adult when it comes to the naming of variables as well. ;-)
2. It makes it easier to write the compiler. I don't think I need to explain why this is bad...
I imagine that someone is just itching to "sort me out". Do your worst! ;-)
Thx Martin
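The overlap and orthogonality described in the "History of Haskell" excerpt quoted above can be sketched as follows. This is only an illustration: the names Pair, (:&), (&+), unFoo and swapPair are hypothetical, and Foo is borrowed from the excerpt's own example.

```haskell
module Main where

-- Overlap: 'Foo' names both a type constructor and a data constructor.
-- Context (type position vs. expression or pattern position) tells them apart.
data Foo = Foo Int deriving (Show, Eq)

-- Infix data constructors must start with a colon.
data Pair = Int :& Int deriving (Show, Eq)

-- Infix operators that are ordinary variables must not start with a colon.
(&+) :: Int -> Int -> Int
a &+ b = a + b

-- 'Foo' in the signature is the type; 'Foo' in the pattern is the constructor.
unFoo :: Foo -> Int
unFoo (Foo n) = n

-- ':&' can appear in a pattern because it is a constructor; '&+' could not.
swapPair :: Pair -> Pair
swapPair (a :& b) = b :& a

main :: IO ()
main = do
  print (unFoo (Foo 3))
  print (swapPair (1 :& 2))
  print (2 &+ 3)
```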

Paul Hudak wrote:
Ok, you asked for it, so here's my worst :-)
You're too gentle! I was expecting some serious community flagellation for my heretical remarks!
1) Here's what the "History of Haskell" has to say about this:
Namespaces were a point of considerable discussion in the Haskell Committee. We wanted the user to have as much freedom as possible, while avoiding any form of ambiguity. So we carefully defined a set of lexemes for each namespace that were orthogonal when they needed to be, and overlapped when context was sufficient to distinguish their meaning. As an example of overlap, capitalised names such as Foo can, in the same lexical scope, refer to a type constructor, a data constructor, and a module, since whenever the name Foo appears, it is clear from context to which entity it is referring. As an example of orthogonality, we designed normal variables, infix operators, normal data constructors, and infix data constructors to be mutually exclusive.
We adopted from Miranda the convention that data constructors are capitalised while variables are not; and added a similar convention for infix constructors, which in Haskell must start with a colon. ...
The key point here is that we wanted data constructors to be orthogonal to formal parameters. For example, in:
foo x y = ...
We know that x and y are formal parameters, whereas if they were capitalized we'd know that they were constructors. Some of us had had experience with ML where this distinction is not made, and we didn't like that. There are surely other ways to achieve this, but capitalization was one of the least painful, as we saw it.
I agree that naming can be abused. But I think it should be *me*, the programmer, or in the limit ghc, the glorious compiler (but only because of unresolvable ambiguities), who decides it -- not *you*, the language implementor!!! ;-)
2) Note that this is not a compiler issue -- the compiler won't have much problem either way -- but it is a readability issue.
Ok - that's what I suspected - contrary to some of the other replies which seem to imply that it would cause big problems in the compiler. While I have never written a compiler of anything near the complexity of Haskell (I just about managed an awk-like language! ;-), I still feel that it shouldn't be that difficult to handle these cases.
3) I suspect that you are mostly kidding, but Haskell doesn't require you to know any category theory to write imperative code!
True again - but I think you understood the general gist.
I hope this helps, -Paul
It does, thanks for your time. And now I will stop complaining! ;-) Martin

On Fri, 4 Aug 2006, Martin Percossi wrote:
I agree that naming can be abused. But I think it should be *me*, the programmer, or in the limit ghc, the glorious compiler (but only because of unresolvable ambiguities), who decides it -- not *you*, the language implementor!!! ;-)
The ML constructor/variable ambiguity introduces a nasty maintenance
headache: what if you upgrade to a new version of a library which
introduces a new constructor which happens to be the same as a variable
you have been using? Suddenly the meaning of your functions changes!
Tony.
--
f.a.n.finch

Martin Percossi wrote:
Paul Hudak wrote:
foo x y = ...
We know that x and y are formal parameters, whereas if they were capitalized we'd know that they were constructors.
I agree that naming can be abused. But I think it should be *me* ...
Oh, you like to decide lexical ambiguities. Well, I suppose you know a bit of C++. So what do you think this is:
*> int *foo ;
It's the declaration of a pointer to 'int' named 'foo', isn't it? Now what's this:
*> x * y ;
*Obviously* this multiplies x and y and throws the result away, doesn't it? Now look more closely. Do you see it? Or does it get more blurred the closer you look?
We don't have this problem in Haskell, and in a sane world, C++ shouldn't have it either. If you find second-guessing the programmer funny, try to write a parser for C++. You will have so much fun, it's almost impossible to describe.
Udo.
-- Even if you're on the right track, you'll get run over if you just sit there. -- Will Rogers

Haskell very specifically has the really vitally important property that when you change the imports of a module in any way whatsoever, only one of two possible results can occur:
1) the module behaves identically to the way it did before, or
2) the module fails to compile with an unambiguous compile-time error.
This is a very important property that I wouldn't be willing to give up.
Also, it is nice for a human to not have to know what is imported to be able to locally determine what a function does to some degree. This would not be possible if you couldn't tell what was a constructor and what was a variable locally. Heck, you can't even tell what is being defined. Think of
x + y * z = ...
This could be declaring three top-level names, x, y, and z, or the function (+), or perhaps even just y and x and not z (or even a couple more possibilities), depending on which were constructors and which were variable names, which you cannot determine without examining every import. Not even being able to tell what values an expression is defining is a pretty bad quality :)
John
-- John Meacham - ⑆repetae.net⑆john⑈
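A small sketch of how the current lexical rules make "what is being defined" visible without looking at any imports. The constructor (:+:), the operator (+++) and the names a and b are hypothetical, chosen only for this illustration.

```haskell
module Main where

-- Hypothetical type with an infix constructor (it starts with a colon).
data Expr = Int :+: Int

-- The operator does not start with a colon, so it is a variable:
-- this line defines the function (+++), and x and y are its parameters.
(+++) :: Int -> Int -> Int
x +++ y = x * 10 + y

-- The operator starts with a colon, so it is a constructor:
-- this line is a pattern binding that defines the top-level names a and b.
a, b :: Int
(a :+: b) = 3 :+: 4

main :: IO ()
main = do
  print (2 +++ 5)
  print (a, b)
```

The two definitions have the same shape, name-operator-name on the left of the equals sign, and only the leading colon says which names are being introduced.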

Martin Percossi wrote:
Hi, I'm wondering what the rationale was for not allowing capitalized variable names (and uncapitalized type names and constructors). I can only think of two arguments, and IMHO both of them are bad:
1. Enforces a naming convention. Fine - but my view is that this doesn't belong in the language definition (it belongs in the user's coding standards).
If everyone uses the same coding standards it's easier to understand other people's code. Also, if you're working in a team you'd usually have to agree to adhere to someone else's idea of how to name identifiers, which you might find really irritating, whereas with Haskell there is at least some global common ground that has already been established, so there would be fewer reasons to get irritated with actual people! ;-)
I get annoyed, for example, that when I write code that manipulates matrices and vectors, I can't refer to the matrices with capital letters as is common in the literature.
But you also can't write things like: v' = M v in a general purpose programming language and expect it to be interpreted as v' = M * v.
And to anyone who says that it's good to enforce naming consistency, I have this to say: Any language that requires me to learn about category theory in order to write imperative code should treat me like an adult when it comes to the naming of variables as well. ;-)
But it does! Haskell realises that as an adult you are more interested in getting as much feedback about the correctness of your program as possible, rather than glossing over possible errors to maintain an illusory world where the lure of extra choices magnifies childish whimsy! ;-)
2. It makes it easier to write the compiler. I don't think I need to explain why this is bad...
Why would you say this? If it's easier to write the compiler the chances are:
a) The compiler will have a simpler and cleaner design
a1) There will be fewer bugs in the compiler
a2) It is easier to modify so more language improvements can be explored
b) More people will be motivated to write or modify compilers or other language-processing tools leading to improvements in the language and better development environments
I imagine that someone is just itching to "sort me out". Do your worst! ;-)
The extra coding confidence you gain by having a fixed capitalisation rule probably allows you to feel more relaxed when coding so I would not be surprised if the capitalisation rule leads to health benefits and therefore a feeling of peace, well-being, and goodwill towards other coders because either one is in agreement or else there is a common enemy namely the rule that cannot be changed! ;-)
All we can do is pity those poor C++ souls locked in an abyss of inflated personality, case conflicts, chronic anxiety, and bug-ridden code...
Best regards, Brian.
-- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com

There are two places where confusion could arise if you didn't have the case distinction in Haskell: pattern matching (does a name refer to a constructor or not) and type expressions (is it a type variable or not). In Haskell the distinction is made by case, but this is far from the only choice. There are other ways to mark what is a variable and what is not. I don't necessarily think that Haskell did it the best way, but then this is a minor syntactic issue. Changing the case of variables is a pretty low price to pay to solve this problem.
-- Lennart
On Aug 4, 2006, at 13:12 , Martin Percossi wrote:
Hi, I'm wondering what the rationale was for not allowing capitalized variable names (and uncapitalized type names and constructors). I can only think of two arguments, and IMHO both of them are bad:
1. Enforces a naming convention. Fine - but my view is that this doesn't belong in the language definition (it belongs in the user's coding standards). I get annoyed, for example, that when I write code that manipulates matrices and vectors, I can't refer to the matrices with capital letters as is common in the literature. And to anyone who says that it's good to enforce naming consistency, I have this to say: Any language that requires me to learn about category theory in order to write imperative code should treat me like an adult when it comes to the naming of variables as well. ;-)
2. It makes it easier to write the compiler. I don't think I need to explain why this is bad...
I imagine that someone is just itching to "sort me out". Do your worst! ;-)
Thx Martin
participants (9): Brandon Moore, Brian Hulley, John Meacham, Lennart Augustsson, Martin Percossi, Paul Hudak, Robert Dockins, Tony Finch, Udo Stenzel