Re: Qualified identifiers opinion

Hi Isaac,
just to give you a reply at all, see below. I reply to glasgow-haskell-users@haskell.org since I'm not subscribed to haskell-prime. And I don't want to subscribe, because I'm more interested in Haskell becoming more stable (and standard). So here is my opinion:
1. The lexer should recognize keywords.
2. I would not mind if Haskell98 rejected all keywords that are also rejected by extensions, so that the lexer is extension independent. (Having to remove conflicting identifiers from Haskell98 code as soon as I switch on valuable extensions does not make sense.)
3. I'm against qualified identifiers whose unqualified part is a keyword, like "Foo.where". (The choice of qualification should be left to the user; usually one is not forced to use qualified names.)
4. However, "Foo.where" should always be rejected and not changed to "Foo.wher e"! (Longest matching, aka "maximal munch", must not consider keywords!)
(see end of: http://www.haskell.org/onlinelibrary/lexemes.html#sect2.4)
I would not mind if a name "F. " were plainly rejected. It only makes sense when a data constructor is the first argument of the composition operator "(.)".
Maybe "." and "$" as operators should require white space on both sides, since "$(" also indicates Template Haskell.
Cheers Christian

Isaac Dupree wrote:
Especially after writing a partial lexer for Haskell, I opine that this should be all legal:
module Foo where
--in case you didn't know, this is legal syntax:
Foo.f = undefined
Foo.mdo = undefined
Foo.where = undefined
x Foo.! y = undefined
x Foo... y = undefined --remember ".." is reserved id, e.g. [2..5]

{-# LANGUAGE RecursiveDo, BangPatterns #-}
module Bar where
import Foo
hello !x = mdo { y <- Foo.mdo Foo... ({-Foo.-}f x y); return y }

{- Haskell 98 -}
module Baz where
import Foo
goodbye x = x ! 12
(Foo.where) lexing as (Foo.wher e) or (Foo . where) does not make me happy. (Being a lexer error is a little less bad...) Especially not when the set of keywords is flexible. I don't see any good reason to forbid declaring keywords as identifiers/operators, since it is completely unambiguous, removes an extension-dependence from the lexer, and simplifies it (at least the mental lexer). Also I hear that the Haskell98 lexing is (Foo.wher e), which I'm sure no one relies on...
Well, that's my humble opinion on what should go into Haskell' on this issue.
Isaac

On Fri, Aug 17, 2007 at 12:53:11PM +0200, Christian Maeder wrote:
Hi Isaac,
just to give you a reply at all, see below. I reply to glasgow-haskell-users@haskell.org since I'm not subscribed to haskell-prime. And I don't want to subscribe, because I'm more interested in Haskell becoming more stable (and standard). So here is my opinion:
1. The lexer should recognize keywords.
2. I would not mind if Haskell98 rejected all keywords that are also rejected by extensions, so that the lexer is extension independent. (Having to remove conflicting identifiers from Haskell98 code as soon as I switch on valuable extensions does not make sense.)
3. I'm against qualified identifiers whose unqualified part is a keyword, like "Foo.where". (The choice of qualification should be left to the user; usually one is not forced to use qualified names.)
4. However, "Foo.where" should always be rejected and not changed to "Foo.wher e"! (Longest matching, aka "maximal munch", must not consider keywords!)
(see end of: http://www.haskell.org/onlinelibrary/lexemes.html#sect2.4)
I would not mind if a name "F. " were plainly rejected. It only makes sense when a data constructor is the first argument of the composition operator "(.)".
Maybe "." and "$" as operators should require white space on both sides, since "$(" also indicates Template Haskell.
What's wrong with the status quo? Our current lexical rules *seem* complicated to newbies, but just like everything else in Haskell they carry a deep simplicity; having only one rule (maximal-munch) gives a certain elegance that the proposals all lack. I'd hate to see Haskell become complex all the way down just to fix a few corner cases; I see this pattern of simplicity degenerating through well-intentioned attempts to fix things all over the language...
Stefan
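To make the disagreement concrete, here is a small self-contained sketch (my own code, not any compiler's lexer; all names are invented for illustration) of a pure maximal-munch rule in which the qualified-variable alternative excludes reserved words after the dot, which is the Haskell 98 reading discussed in this thread:

import Data.Char (isAlphaNum, isLower, isUpper)
import Data.List (inits)

-- Reserved identifiers of Haskell 98 (slightly abbreviated).
reservedIds :: [String]
reservedIds = ["case","class","data","default","deriving","do","else","if"
              ,"import","in","infix","infixl","infixr","instance","let"
              ,"module","newtype","of","then","type","where","_"]

isIdChar :: Char -> Bool
isIdChar c = isAlphaNum c || c == '\'' || c == '_'

isVarid, isConid, isQVarid :: String -> Bool
isVarid s = case s of { c:cs -> isLower c && all isIdChar cs && s `notElem` reservedIds; _ -> False }
isConid s = case s of { c:cs -> isUpper c && all isIdChar cs; _ -> False }
isQVarid s = case break (== '.') s of
  (m, '.':v) -> isConid m && isVarid v   -- the variable part must not be reserved
  _          -> False

-- Maximal munch: the longest prefix matching any alternative becomes the lexeme.
munch :: String -> (String, String)
munch s = case [ p | p <- inits s, isQVarid p || isConid p || isVarid p ] of
  [] -> ("", s)
  ps -> let p = last ps in (p, drop (length p) s)

With these alternatives, munch "Foo.where" returns ("Foo.wher", "e"), the very split Christian and Isaac object to, while munch "Foo.bar" returns ("Foo.bar", "").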

-----Original Message-----
From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users-bounces@haskell.org] On Behalf Of Stefan O'Rear
Sent: Friday, August 17, 2007 2:45 PM
To: Christian Maeder
Cc: Haskell Prime; GHC Users Mailing List; Isaac Dupree
Subject: Re: Qualified identifiers opinion

On Fri, Aug 17, 2007 at 12:53:11PM +0200, Christian Maeder wrote:
Hi Isaac,
just to give you a reply at all, see below. I reply to glasgow-haskell-users@haskell.org since I'm not subscribed to haskell-prime. And I don't want to subscribe, because I'm more interested in Haskell becoming more stable (and standard). So here is my opinion:
1. The lexer should recognize keywords.
2. I would not mind if Haskell98 rejected all keywords that are also rejected by extensions, so that the lexer is extension independent. (Having to remove conflicting identifiers from Haskell98 code as soon as I switch on valuable extensions does not make sense.)
3. I'm against qualified identifiers whose unqualified part is a keyword, like "Foo.where". (The choice of qualification should be left to the user; usually one is not forced to use qualified names.)
4. However, "Foo.where" should always be rejected and not changed to "Foo.wher e"! (Longest matching, aka "maximal munch", must not consider keywords!)
(see end of: http://www.haskell.org/onlinelibrary/lexemes.html#sect2.4)
I would not mind if a name "F. " were plainly rejected. It only makes sense when a data constructor is the first argument of the composition operator "(.)".
Maybe "." and "$" as operators should require white space on both sides, since "$(" also indicates Template Haskell.
What's wrong with the status quo? Our current lexical rules *seem* complicated to newbies, but just like everything else in Haskell they carry a deep simplicity; having only one rule (maximal-munch) gives a certain elegance that the proposals all lack. I'd hate to see Haskell become complex all the way down just to fix a few corner cases; I see this pattern of simplicity degenerating through well-intentioned attempts to fix things all over the language...
Stefan

I agree with Stefan, for the reasons he stated and for one additional reason: There would be a multitude of unintended behavior changes.

Seth Kurtzberg
Software Engineer
Specializing in Security, Reliability, and the Hardware/Software Interface

Stefan O'Rear wrote:
What's wrong with the status quo? Our current lexical rules *seem* complicated to newbies, but just like everything else in Haskell they carry a deep simplicity; having only one rule (maximal-munch) gives a certain elegance that the proposals all lack.
I'm quite in favour of "maximal munch", but after munching "Foo." or "Foo.where" saying:
Sorry I've munched too much, I meant to munch only "Foo" and "." (because Foo is a data constructor) or "Foo.wher" and "e" (because "where" is a keyword)
carries "simplicity" too "deep" for me.
Cheers Christian

Christian Maeder wrote:
Stefan O'Rear wrote:
What's wrong with the status quo? Our current lexical rules *seem* complicated to newbies, but just like everything else in Haskell they carry a deep simplicity; having only one rule (maximal-munch) gives a certain elegance that the proposals all lack.
I'm quite in favour of "maximal munch", but after munching "Foo." or "Foo.where" saying:
Sorry I've munched too much, I meant to munch only "Foo" and "." (because Foo is a data constructor) or "Foo.wher" and "e" (because "where" is a keyword)
carries "simplicity", to "deep" for me.
Apologies if the above sounded rude. In fact, I just realized that the tokenizer can decide what to do with "Foo." or "Foo.wher" when seeing the next character. However, it is not helpful at all when the lexer passes "Foo.wher" and "e" to the type checker. The programmer's input deserves more respect.
C.
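A rough sketch of the behaviour Christian asks for (again my own code, with an abbreviated keyword list): after "Foo." one character of lookahead decides whether a qualified name follows, and if the word after the dot turns out to be a keyword, the whole lexeme is rejected instead of being re-split:

import Data.Char (isAlphaNum, isLower, isUpper)

reserved :: [String]
reserved = ["where","let","do","module","import","mdo"]  -- abbreviated keyword set

lexQualified :: String -> Either String (String, String)
lexQualified s = case span isIdent s of
  (m@(c:_), '.':rest@(d:_)) | isUpper c, isLower d ->
      let (v, rest') = span isIdent rest
      in if v `elem` reserved
           then Left ("rejected qualified keyword: " ++ m ++ "." ++ v)  -- not re-split into "Foo.wher" and "e"
           else Right (m ++ "." ++ v, rest')
  _ -> Left "no qualified name here"
  where isIdent ch = isAlphaNum ch || ch == '_' || ch == '\''

Here lexQualified "Foo.where ..." reports a rejected qualified keyword, and lexQualified "Foo.bar xs" returns Right ("Foo.bar", " xs").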

Stefan O'Rear wrote:
On Fri, Aug 17, 2007 at 12:53:11PM +0200, Christian Maeder wrote:
Hi Isaac,
just to give you a reply at all, see below. I reply to glasgow-haskell-users@haskell.org since I'm not subscribed to haskell-prime. And I don't want to subscribe, because I'm more interested in Haskell becoming more stable (and standard). So here is my opinion:
1. The lexer should recognize keywords.
2. I would not mind if Haskell98 rejected all keywords that are also rejected by extensions, so that the lexer is extension independent. (Having to remove conflicting identifiers from Haskell98 code as soon as I switch on valuable extensions does not make sense.)
3. I'm against qualified identifiers whose unqualified part is a keyword, like "Foo.where". (The choice of qualification should be left to the user; usually one is not forced to use qualified names.)
4. However, "Foo.where" should always be rejected and not changed to "Foo.wher e"! (Longest matching, aka "maximal munch", must not consider keywords!)
(see end of: http://www.haskell.org/onlinelibrary/lexemes.html#sect2.4)
I would not mind if a name "F. " were plainly rejected. It only makes sense when a data constructor is the first argument of the composition operator "(.)".
Maybe "." and "$" as operators should require white space on both sides, since "$(" also indicates Template Haskell.
What's wrong with the status quo? Our current lexical rules *seem* complicated to newbies, but just like everything else in Haskell they carry a deep simplicity; having only one rule (maximal-munch) gives a certain elegance that the proposals all lack.
I'd hate to see Haskell become complex all the way down just to fix a few corner cases; I see this pattern of simplicity degenerating through well-intentioned attempts to fix things all over the language...
I believe the solution we adopted for GHC 6.8.1 (and I proposed for Haskell') strikes the right balance.
M.where is lexed as an identifier. This doesn't require adding any exceptions or corner cases to either the implementation or the specification of the grammar. It is much easier to implement than the existing Haskell 98 rule (I deleted 30 lines of code from GHC's lexer to implement it). It's easy to understand. It removes an opportunity for obfuscation. It must be the right thing!
Cheers, Simon
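For contrast, a minimal sketch of the rule Simon describes (my own toy code; the token and function names are not GHC's): any identifier-shaped word after "M." becomes part of a single qualified-name token, and reservedness only matters for unqualified words:

import Data.Char (isAlphaNum, isLower, isUpper)

data Token = Keyword String | VarId String | ConId String | QVarId String String
  deriving Show

isIdent :: Char -> Bool
isIdent c = isAlphaNum c || c == '_' || c == '\''

reserved :: [String]
reserved = ["where","let","do","module","import"]  -- abbreviated

-- Lex one word-like token from the front of the input.
lexWord :: String -> Maybe (Token, String)
lexWord s@(c:_)
  | isUpper c = case span isIdent s of
      (m, '.':rest@(d:_)) | isLower d ->
          let (v, rest') = span isIdent rest
          in Just (QVarId m v, rest')          -- "M.where" is a single token here
      (m, rest) -> Just (ConId m, rest)
  | isLower c = let (v, rest) = span isIdent s
                in Just ((if v `elem` reserved then Keyword else VarId) v, rest)
lexWord _ = Nothing

Under this rule lexWord "M.where" gives Just (QVarId "M" "where", ""), whereas lexWord "where" gives Just (Keyword "where", ""); nothing about qualified names depends on the keyword set.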

Simon Marlow wrote:
I believe the solution we adopted for GHC 6.8.1 (and I proposed for Haskell') strikes the right balance.
M.where is lexed as an identifier. This doesn't require adding any exceptions or corner cases to either the implementation or the specification of the grammar. It is much easier to implement than the existing Haskell 98 rule (I deleted 30 lines of code from GHC's lexer to implement it). It's easy to understand. It removes an opportunity for obfuscation. It must be the right thing!
Now I've found the h'-wiki page http://hackage.haskell.org/trac/haskell-prime/wiki/QualifiedIdentifiers
I _think_ the change to lexical syntax on that page is the one Simon mentions? and is also the same as what I am supporting?
(I am terribly confused about "Foo.f = " though, since I thought I _used_ some code that qualified its definitions that way, and thought it was odd. Maybe it was just referring to the things it defined by e.g. Foo.f (without importing itself), and I was confused, and further confused that definitions then COULDN'T be qualified that way? oh dear...)
Isaac

Simon Marlow wrote:
I believe the solution we adopted for GHC 6.8.1 (and I proposed for Haskell') strikes the right balance.
M.where is lexed as an identifier. This doesn't require adding any exceptions or corner cases to either the implementation or the specification of the grammar. It is much easier to implement than the existing Haskell 98 rule (I deleted 30 lines of code from GHC's lexer to implement it). It's easy to understand. It removes an opportunity for obfuscation. It must be the right thing!
Yes, and fortunately (for all tools that output names unqualified, like ghci's browse) an identifier like "M.where" is unusable, because GHC fails with "Qualified name in function definition". Otherwise (or nevertheless) I would invest an extra code line in the lexer to rule out such identifiers.
C.

Christian Maeder wrote:
Hi Isaac,
just to give you a reply at all, see below. I reply to glasgow-haskell-users@haskell.org since I'm not subscribed to haskell-prime. And I don't want to subscribe, because I'm more interested in Haskell becoming more stable (and standard).
Then maybe you can join haskell-prime and provide the energy that rounds up all the little fixes and tries to actually produce the thing! Drastic changes are not intended to go in. Haskell' should bring more stability and standardness (as long as it doesn't diverge too much from Haskell98, which would decrease stability and standardness).
So here is my opinion:
1. The lexer should recognize keywords.
2. I would not mind if Haskell98 rejected all keywords that are also rejected by extensions, so that the lexer is extension independent. (Having to remove conflicting identifiers from Haskell98 code as soon as I switch on valuable extensions does not make sense.)
Trouble is, extensions are just that: extensions, and more with their own keywords may be added in the future! Unless we want an internet-standard-like "x-keywordname" - but that doesn't solve this problem: standardized new keyword names clogging up the general namespace, as long as they don't have a symbol (like Objective-C has @class, @whatever...).
3. I'm against qualified identifiers whose unqualified part is a keyword, like "Foo.where". (The choice of qualification should be left to the user; usually one is not forced to use qualified names.)
4. However, "Foo.where" should always be rejected and not changed to "Foo.wher e"! (Longest matching, aka "maximal munch", must not consider keywords!)
(see end of: http://www.haskell.org/onlinelibrary/lexemes.html#sect2.4)
I would not mind if a name "F. " were plainly rejected. It only makes sense when a data constructor is the first argument of the composition operator "(.)".
I wouldn't mind if that was banned either. That case needs to be considered when implementing my lexer. In fact, banning that and qualified keywords allows the lexer proper not to know keywords and nevertheless ban qualified keywords (a bit of a hack). But... while I wouldn't _recommend_ using qualified keywords, and compilers could give a warning even for Haskell98 code that uses known extension-keyword names at all, it seems best to me to _allow_ them, in the interest of allowing code to remain fairly stable as extensions are developed (especially thinking of BangPatterns, which had an effect on existing definitions of (!)).
Maybe "." and "$" as operators should require white space on both sides, since "$(" also indicates Template Haskell.
But it's so convenient as it is... plenty of code uses (.) without spaces, and I don't like the way Template Haskell steals "$(" and "$id" (from the point of view of a person who has never tried to use Template Haskell). I think Haskell is more stable by allowing existing code, e.g.
f = fix (\rec -> .... rec ....) --'rec' is arrow-sugar keyword
than by banning some bunch of new keyword names. And allowing interim interoperability with old code that exports those names, like the unfortunate (!) or (.) (I know, those aren't exactly ever keywords/syms), seems like a good idea when it removes complexity rather than adding it. I don't want Haskell98 to become a language that has difficulty interoperating with libraries and using-applications that use newer Haskells.
From other comments:
What's wrong with the status quo? Our current lexical rules *seem* complicated to newbies, but just like everything else in Haskell they carry a deep simplicity; having only one rule (maximal-munch) gives a certain elegance that the proposals all lack.
I'd hate to see Haskell become complex all the way down just to fix a few corner cases; I see this pattern of simplicity degenerating through well-intentioned attempts to fix things all over the language...
I agree with Stefan, for the reasons he stated and for one additional reason: There would be a multitude of unintended behavior changes.
Well, GHC doesn't implement the aforementioned maximal munch re: keywords. I don't think it's good (compositional?) design for the set of keywords to be part of the lexer rather than a pass after it, when keywords behave so similarly to other words, and also when there are non-keywords like "as" and "qualified" and sometimes "forall" (whose non-reserved status I support).
lex --> keywords --> layout --> parse
Besides, I don't think any of the above proposals will generate behavior changes in real code. Some cause more errors (adding more keywords; banning adjacent '.' or '$') and some allow a few more things that were errors before.
f = Just.let x = x in id --a.k.a. f = Just
would break in my proposal, but it also breaks according to Haskell98...
Isaac
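A sketch of the pipeline Isaac has in mind (all the names below are mine, not taken from any implementation): the lexer emits plain words, and a separate pass, parameterized by the enabled extensions, decides which unqualified words are keywords; qualified names are never re-classified, so "Foo.mdo" stays a name:

data Extension = RecursiveDo | BangPatterns | TemplateHaskell
  deriving (Eq, Show)

data Tok = Word String | Qual String String | Sym String
  deriving (Eq, Show)

data Tok' = Keyword String | Ident Tok
  deriving (Eq, Show)

-- The keyword set depends on the enabled extensions...
keywords :: [Extension] -> [String]
keywords exts =
     ["case","class","data","do","else","if","import","in"
     ,"instance","let","module","of","then","type","where"]   -- Haskell 98 core (abbreviated)
  ++ ["mdo" | RecursiveDo `elem` exts]                         -- extension keyword

-- ...but the lexer that produced the [Tok] does not.
keywordPass :: [Extension] -> [Tok] -> [Tok']
keywordPass exts = map classify
  where
    kws = keywords exts
    classify (Word w) | w `elem` kws = Keyword w
    classify t                       = Ident t

For example, keywordPass [RecursiveDo] [Word "mdo", Qual "Foo" "mdo"] marks the first token as Keyword "mdo" and leaves the second alone, while keywordPass [] marks neither.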

Christian Maeder wrote:
3. I'm against qualified identifiers whose unqualified part is a keyword, like "Foo.where". (The choice of qualification should be left to the user; usually one is not forced to use qualified names.)
Okay, here's a thought experiment... one may follow along, and agree or not as one likes. (I'm not sure how much I agree with it myself, though it might be an interesting way to structure a compiler.)
{-# LANGUAGE ForeignFunctionInterface #-}
module Foo where
Suppose all modules have an implicit, unavoidable
import ":SpecialSyntax" (module, where, let, [], -- ... , foreign --because that extension is enabled )
Now let's import some imaginary already-existing modules that use "keywords"
import A (foreign)
import B (mdo)
This turns up a problem already: explicitly naming things in an import or export list might not work unambiguously, because keywords are sometimes used to mean special things there. Going on... maybe we imported the whole modules. According to standard Haskell import rules, there is no conflict until the ambiguous word is used. Either "f" here works fine, because ":SpecialSyntax" in this module did not import "mdo":
f = mdo
f = B.mdo
Whereas the possibilities with "foreign"...
g = foreign --error, ambiguous!!!!
foreign import ccall ........ --error, ambiguous!!!!
g = A.foreign --okay, unambiguous
":SpecialSyntax".foreign import ccall .... -- can't write in Haskell!
Now, if we want to avoid the understandably undesirable matter of imports interfering with keywords (after all, keywords can appear before the import list is finished, such as "module", "where" and "import"), we could tweak the import-conflict rules for this special case. In this module where "foreign" is imported from ":SpecialSyntax", the mere phrase "import A" could raise an error, as if all imported syntax were used (unqualified, as always) in the module. Whereas "import qualified A" would not. (And what about "import A hiding ..."?)
By the way, we are lucky that pragmas have their own namespace {-# NAME arguments #-}. But as I mentioned, we lack a namespace for extensions that have a semantic impact on the annotated code. Certain preprocessors invent things like {-! !-} ... or add template-haskell syntax, or some arbitrary other keyword syntax like "proc...do"... or even steal large portions of existing comment syntax (Haddock, which doesn't even have a semantic impact on the code!).
BTW #2: The simpler and less variable the lexer is, the easier it is to scan for LANGUAGE pragmas. That search doesn't need to deal with keywords at all. (Although it may be popular not to use the usual lexer in order to search for those pragmas, I don't know.)
Isaac
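The import-conflict rule in this thought experiment can be modelled in a few lines (entirely toy code of my own; ":SpecialSyntax" is Isaac's hypothetical pseudo-module, and qualified-only imports and hiding lists are left out for brevity): an unqualified word is ambiguous exactly when more than one unqualified import, including the implicit syntax import, provides it:

import Data.List (nub)

-- (module name, names it brings into unqualified scope)
type Import = (String, [String])

resolve :: [Import] -> String -> Either String String
resolve imports w =
  case nub [ m | (m, names) <- imports, w `elem` names ] of
    [m] -> Right m
    []  -> Left ("not in scope: " ++ w)
    ms  -> Left ("ambiguous: " ++ w ++ " provided by " ++ show ms)

-- e.g. with imps = [(":SpecialSyntax", ["module","where","foreign"]), ("A", ["foreign"]), ("B", ["mdo"])],
--      resolve imps "foreign" is ambiguous, while resolve imps "mdo" is Right "B".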

Isaac Dupree wrote:
{-# LANGUAGE ForeignFunctionInterface #-}
module Foo where
Suppose all modules have an implicit, unavoidable
import ":SpecialSyntax" (module, where, let, [], -- ... , foreign --because that extension is enabled )
Now let's import some imaginary already-existing modules that use "keywords"
I think it is nonsense to make an extension-dependent lexer, because - as said before - I don't want to change my old code when switching on an extension. The syntax of extensions should simply be illegal for code without that extension. Breaking old code just because a new keyword has been introduced for a new extension is a smaller problem than trying to maintain (and call) different lexers.
http://www.haskell.org/ghc/docs/6.6.1/html/users_guide/ghc-language-features...
It would be nice if the "Syntax stolen" bits could be streamlined for haskell-prime. In particular "[e|", "[p|", "[d|", "[t|" are ugly -- a keyword token for Template Haskell (-fth), three tokens (for comprehensions) otherwise. If that -fth syntax is to be kept, one should disallow any combination "[<letter>|" without white space for comprehensions!
Cheers Christian
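For concreteness, a small plain-Haskell module (the module and function names are mine) in which both spellings below are ordinary Haskell 98 expressions; with TemplateHaskell enabled, GHC's documented "syntax stolen" rules instead read "[e|" as the start of an expression quotation and "$(" as the start of a splice:

module StolenSyntax where

-- Without TemplateHaskell: a list comprehension whose binder happens to be "e".
evens :: [Int] -> [Int]
evens xs = [e| e <- xs, even e]

-- Without TemplateHaskell: plain application of ($); with the extension,
-- "$(" would open a splice instead.
applySucc :: Int -> Int
applySucc x = succ$(x + 1)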
participants (5)
- Christian Maeder
- Isaac Dupree
- Seth Kurtzberg
- Simon Marlow
- Stefan O'Rear