Wanted: warning option for usages of unary minus

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Since the unary negation operator `-' is often considered a wart in Haskell's syntax and in many cases saying "negate" is arguably clearer anyway, I propose adding options to GHC to warn about its use. The only case I don't normally want to be warned about is "negative numeric literals", e.g. (-1), even if in current Haskell they do mean (negate (fromInteger 1)) not (fromInteger (negate 1)) (or equivalently, (fromInteger (negate (fromInteger 1))) as the inner fromInteger === id). I know this would be useful to someone since I caught myself manually cleaning some of my code in this way without the help of warning messages :) Proposed flag names and descriptions (could use a little tidying up) : - -fwarn-unary-minus Causes a warning to be emitted for all usages of the unary operator `-', except those in which its argument is a numeric literal that is not separated from the minus sign by any space (space includes comments). - -fwarn-all-unary-minus For those who hate it with a passion, this option warns about ALL uses of the unary `-' operator, even code like `(-1)'. Opinions? (Should I put a feature request in Trac? Is that what I perhaps should have done in the first place?) Is this simple enough to try to implement myself so I can start getting to know GHC's internals? (I suspect that detecting the adjacency of the "-" to a numeric literal without changing the actual precedence may be a little tricky... -2^6 parses as negate(2^6) and should get a warning, as should (- 2), IMO) Oh, the actual warning message... file:line:column: Warning: Unary minus in expression foo (or, on the next line, In the expression: foo ?) for an appropriate "foo", which has parentheses added to describe the effects of operator precedence, as usual for GHC diagnostic messages (it's just particularly important for making this one easy to understand in some cases) Isaac -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGGBpSHgcxvIWYTTURAt5sAKCEIohHMiuUy9AofhWKzvDOf4rwmgCeJcbj e2RmW+UX2E6omuLpFfIH9fs= =1SnR -----END PGP SIGNATURE-----

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Now I understand why negative unboxed numeric literals are parsed weirdly, after poking around a little! "The parser parses all infix applications as right-associative, regardless of fixity." http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/Renamer A negative sign on the left of an expression is parsed as a special case, binding tighter than all infix ops (until the renamer reassociates it) (but '-' is not parsed that way when it _follows_ an expression: ( process -1# ) is treated as _infix_ minus, i.e. subtraction, i.e. likely compile error). Then, before reassociating based on fixity, negation of an unboxed number is performed (in order to allow a sort of numeric literals that are negative and unboxed). Here is a result of this funny treatment: \begin{code} {-# OPTIONS_GHC -fglasgow-exts #-} import GHC.Base main = do putStrLn $ "boxed: " ++ show ( ( - 2 ^ 6 ) :: Int ) -- output: boxed: -64 -- === ( -(2 ^ 6 )) putStrLn $ "unboxed: " ++ show ( I# ( - 2# ^# 6# ) ) -- output: unboxed: 64 -- === ((- 2#)^# 6# ) infixr 8 ^# --just like ^, binds tighter than - (which is infixl 6) ( ^# ) :: Int# -> Int# -> Int# base ^# 0# = 1# base ^# exponent = base *# (base ^# ( exponent -# 1# )) \end{code} This particular combination of behavior, unfortunately, doesn't seem useful for implementing sensible numeric literals, IMHO. My desired warning scheme would have to wait for the renamer to sort out fixities... unless I want to warn about (-1==1) which is ((-1)==1), as well (do I want that warning? how about (1 == -1), or (1 ^^ -1), which both must parse with negation being tightly binding? I hadn't considered those very well yet...). Isaac -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGGOF8HgcxvIWYTTURAiT5AKC1Zl9JYuSLBPdey/YdmCriY7FaUQCgqzNQ clHWTS162IZWHhlXKJR8NhQ= =zqzy -----END PGP SIGNATURE-----

I definitely think that -1# should be parsed as a single lexeme. Presumably it was easier at the time to do it the way it is, I don't remember exactly. I'd support a warning for use of prefix negation, or alternatively you could implement the Haskell' proposal to remove prefix negation completely - treat the unary minus as part of a numeric literal in the lexer only. This would have to be optional for now, so that we can continue to support Haskell 98 of course. Cheers, Simon Isaac Dupree wrote:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Now I understand why negative unboxed numeric literals are parsed weirdly, after poking around a little! "The parser parses all infix applications as right-associative, regardless of fixity." http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/Renamer
A negative sign on the left of an expression is parsed as a special case, binding tighter than all infix ops (until the renamer reassociates it) (but '-' is not parsed that way when it _follows_ an expression: ( process -1# ) is treated as _infix_ minus, i.e. subtraction, i.e. likely compile error).
Then, before reassociating based on fixity, negation of an unboxed number is performed (in order to allow a sort of numeric literals that are negative and unboxed). Here is a result of this funny treatment:
\begin{code} {-# OPTIONS_GHC -fglasgow-exts #-}
import GHC.Base
main = do putStrLn $ "boxed: " ++ show ( ( - 2 ^ 6 ) :: Int ) -- output: boxed: -64 -- === ( -(2 ^ 6 ))
putStrLn $ "unboxed: " ++ show ( I# ( - 2# ^# 6# ) ) -- output: unboxed: 64 -- === ((- 2#)^# 6# )
infixr 8 ^# --just like ^, binds tighter than - (which is infixl 6) ( ^# ) :: Int# -> Int# -> Int# base ^# 0# = 1# base ^# exponent = base *# (base ^# ( exponent -# 1# )) \end{code}
This particular combination of behavior, unfortunately, doesn't seem useful for implementing sensible numeric literals, IMHO. My desired warning scheme would have to wait for the renamer to sort out fixities... unless I want to warn about (-1==1) which is ((-1)==1), as well (do I want that warning? how about (1 == -1), or (1 ^^ -1), which both must parse with negation being tightly binding? I hadn't considered those very well yet...).
Isaac -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFGGOF8HgcxvIWYTTURAiT5AKC1Zl9JYuSLBPdey/YdmCriY7FaUQCgqzNQ clHWTS162IZWHhlXKJR8NhQ= =zqzy -----END PGP SIGNATURE----- _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Simon Marlow wrote:
I definitely think that -1# should be parsed as a single lexeme. Presumably it was easier at the time to do it the way it is, I don't remember exactly.
I'd support a warning for use of prefix negation, or alternatively you could implement the Haskell' proposal to remove prefix negation completely - treat the unary minus as part of a numeric literal in the lexer only. This would have to be optional for now, so that we can continue to support Haskell 98 of course.
Cheers, Simon
Yes, I've been thinking about how to implement both - details will come later when I have more time. I think I have a reasonably working idea of how to divide up the cases for warnings for ambiguous-looking use of both infix and prefix minus, as well as actual syntax changes... Isaac -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGHLqHHgcxvIWYTTURAu4YAJ9v7fd8tkJLztqQxCblRGZy21UxfwCgn7++ OvLrEoLJtP9Uq8oQGeVhwA8= =hTdv -----END PGP SIGNATURE-----

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Isaac Dupree wrote:
Simon Marlow wrote:
I definitely think that -1# should be parsed as a single lexeme. Presumably it was easier at the time to do it the way it is, I don't remember exactly.
I'd support a warning for use of prefix negation, or alternatively you could implement the Haskell' proposal to remove prefix negation completely - treat the unary minus as part of a numeric literal in the lexer only. This would have to be optional for now, so that we can continue to support Haskell 98 of course.
Cheers, Simon
Yes, I've been thinking about how to implement both - details will come later when I have more time. I think I have a reasonably working idea of how to divide up the cases for warnings for ambiguous-looking use of both infix and prefix minus, as well as actual syntax changes...
not considering warnings, just syntax: 123abc is two valid Haskell tokens. for example: \begin{code} main = (\n c -> print (n,c)) 123Abc data Abc = Abc deriving Show \end{code} prints (123,Abc). So does this suggest that under a negation-is-part-of-numeric-token regime, 123-456 should be two tokens (a positive number then a negative number, here), as is signum-456 ... Presently, GHC doesn't even warn about the first thing (123abc) ^_^ Isaac -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGHgT9HgcxvIWYTTURAmhLAJ0Zwd8fRYWRIWDjsTRaPx84x80RBwCgjsMA RxcCEg+2T/fraJmnsBYVEhE= =HsSp -----END PGP SIGNATURE-----

Isaac Dupree wrote:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Isaac Dupree wrote:
Simon Marlow wrote:
I definitely think that -1# should be parsed as a single lexeme. Presumably it was easier at the time to do it the way it is, I don't remember exactly.
I'd support a warning for use of prefix negation, or alternatively you could implement the Haskell' proposal to remove prefix negation completely - treat the unary minus as part of a numeric literal in the lexer only. This would have to be optional for now, so that we can continue to support Haskell 98 of course.
Cheers, Simon Yes, I've been thinking about how to implement both - details will come later when I have more time. I think I have a reasonably working idea of how to divide up the cases for warnings for ambiguous-looking use of both infix and prefix minus, as well as actual syntax changes...
not considering warnings, just syntax: 123abc is two valid Haskell tokens. for example: \begin{code} main = (\n c -> print (n,c)) 123Abc data Abc = Abc deriving Show \end{code} prints (123,Abc). So does this suggest that under a negation-is-part-of-numeric-token regime, 123-456 should be two tokens (a positive number then a negative number, here), as is signum-456 ...
Yes, absolutely.
Presently, GHC doesn't even warn about the first thing (123abc) ^_^
and remember that while '123e 4' is 3 tokens, '123e4' is only 1. Cheers, Simon

This last piece of conversation was *so* reminiscent of a paper[1] I once
read, I was almost convinced it was late by 11 days...until I checked :)
Cheers,
Dinko
[1] http://www.research.att.com/~bs/whitespace98.pdf
On 4/12/07, Simon Marlow
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Isaac Dupree wrote:
Simon Marlow wrote:
I definitely think that -1# should be parsed as a single lexeme. Presumably it was easier at the time to do it the way it is, I don't remember exactly.
I'd support a warning for use of prefix negation, or alternatively you could implement the Haskell' proposal to remove prefix negation completely - treat the unary minus as part of a numeric literal in
Isaac Dupree wrote: the
lexer only. This would have to be optional for now, so that we can continue to support Haskell 98 of course.
Cheers, Simon Yes, I've been thinking about how to implement both - details will come later when I have more time. I think I have a reasonably working idea of how to divide up the cases for warnings for ambiguous-looking use of both infix and prefix minus, as well as actual syntax changes...
not considering warnings, just syntax: 123abc is two valid Haskell tokens. for example: \begin{code} main = (\n c -> print (n,c)) 123Abc data Abc = Abc deriving Show \end{code} prints (123,Abc). So does this suggest that under a negation-is-part-of-numeric-token regime, 123-456 should be two tokens (a positive number then a negative number, here), as is signum-456 ...
Yes, absolutely.
Presently, GHC doesn't even warn about the first thing (123abc) ^_^
and remember that while '123e 4' is 3 tokens, '123e4' is only 1.
Cheers, Simon _______________________________________________ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
-- Cheers, Dinko

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Simon Marlow wrote:
So does this suggest that under a negation-is-part-of-numeric-token regime, 123-456 should be two tokens (a positive number then a negative number, here), as is signum-456 ...
Yes, absolutely.
[see note 1 at the end responding irrelevantly to that] Okay, here we go with the through descriptions... Warn about any "-" that precedes without spaces a numeric literal, that is not an application of "negate" to that literal. This includes when it's infix (n-1) and when it's out-precedenced (-2^6). ===> A file that does not trigger this warning is safe to have negative numeric literals added to the syntax / lexer. [see Note 2 at the end about how commonly this warning might occur in practice] Warn about any "-" that DOES NOT precede-without-spaces a numeric literal, that nonetheless means negate. ===> A file that triggers neither this nor the previous warning is safe to have negative numeric literals added AND interpretation of unqualified operator "-" as negate removed. "Reverse" warnings, for those who want to take advantage of negative numeric literal syntax and then possibly convert to Haskell98 syntax easily: If a "-" isn't followed immediately by a numeric literal, the only thing to watch out for (and warn about) is the "forbidden section" (- 1), which could mean an actual section (\x -> x - 1) in the "new" syntax. For actual negative literals: warn when literal is the left-hand-side an infix expression with relevant precedence ((> 6, which changes program behaviour) or (= 6 and not left-associative, which causes a parse error)). (being on the right-hand side, e.g. (x ^^ -1) is completely unambiguous, and expressions like (-1 + 2) mean the same thing either way). Also, warn if the literal is part of a function application: either it would become infix in '98 syntax (e.g. (signum -2)) or just negate multiple things to the right (e.g. (-1 foo)) (some of these are type errors assuming (->) isn't made an instance of Num, but that's a later stage in the compilation process). Should we allow "positive numeric literals" +37 as well, for symmetry, so we can also break (n+1) as well as (n-1)? (and also break (+1), which is actually an asymmetric problem since that isn't a section in the first place in Haskell98) Implementation notes: I haven't looked at the part of GHC's code where it deals with fixity resolution yet, but I'm concerned that GHC might throw away information about where parentheses were in the original code at the same time - which is important information for determining whether some of the warnings are valid, it seems. For the purpose of warnings, I would explicitly keep track, for unqualified operator "-", whether it was followed by a digit (which is the unique and certain determiner that a numeric literal follows. Octal and hexadecimal start with 0c for some "c" and floating-point always starts with a decimal digit). This would probably involve adding an argument isomorphic to Bool to the constructor "ITminus". Then in compiler/parser/Lexer.x just before the @varsym rule (since alex is first maximal-munch, then top-to-bottom in the .x file, in matching choice), add rules "-" / [0-9] { minus followed by number } "-" { minus not followed by number } ( the [0-9] pattern could be refined perhaps... ) Then this notation has to be carried on through the Parser.y, which shouldn't be too hard. For negative numeric literals, I think extra rules in the lexer would be added, '-' followed by the various numeric literal types (this seems a little repetitious, is there an easier way?). The varieties of literals that were standard in the first place (i.e. non-unboxed) will get " / { extension is on }" qualifications to their patterns. mkHsNegApp (in RdrHsSyn.lhs) will be simplified or removed, since we are moving towards a more sensible treatment of negative literals. Another implementation choice could be to recognize the "minus followed by number" in the parser, but then it might be hard to distinguish between '98-syntax negate, subtraction, and negative unboxed literals, without ambiguity in the parser? (Negative) numeric literals can occur in patterns, not just expressions; that may or may not need tweaks specific to it. Test cases!!!! I suppose I should make a bunch of them, that deal with every oddity I can think of, since I have already been thinking about them... (1 Prelude.-1) is infix with either syntax, and shouldn't (probably) be warned about, etc., etc. -- which explain better what the intended behaviour is anyway. Note 1: I happen to think it's silly to allow two such tokens such that one begins at the same character-location that the previous one ends, but that's clearly a completely separate issue. I have been bitten by - -fglasgow-exts and x$y z (template haskell syntax $identifier, which is rather similar to the proposed negative literal syntax) before; maybe I don't even want infix operators adjacent to identifiers normally! (but in practice everything tends to work out without difficulty) Note 2: looking through the results for http://www.google.com/codesearch for lang:haskell [0-9a-zA-Z_'#)]-[0-9] suggests that expressions like (n-1) without spaces are mildly popular. I wouldn't trust the "number of results" though, because (1) results in comments are included, (2) who knows what code it's searching, and (3) searching for lang:haskell [-][0-9] gave me fewer results than the more restrictive lang:haskell [^0-9a-zA-Z_'#)]-[0-9] . The "#" was included in case there were glasgowIdentifiers#, and the rest of the symbols could have been useful if *&$%- didn't make one infix operator. Feeling excessively thorough, Isaac -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGHqihHgcxvIWYTTURAk54AJ9rsqBgu1kKJqudazzuBm6u5WujiACg2f1Y sTrl1AZrHXxzMtnpez6OSEY= =ktjn -----END PGP SIGNATURE-----

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Er,
For the purpose of warnings, I would explicitly keep track, for unqualified operator "-", whether it was followed by a digit (which is the unique and certain determiner that a numeric literal follows. Octal and hexadecimal start with 0c for some "c" and floating-point always starts with a decimal digit). This would probably involve adding an argument isomorphic to Bool to the constructor "ITminus". Then in compiler/parser/Lexer.x just before the @varsym rule (since alex is first maximal-munch, then top-to-bottom in the .x file, in matching choice), add rules "-" / [0-9] { minus followed by number } "-" { minus not followed by number } ( the [0-9] pattern could be refined perhaps... ) Then this notation has to be carried on through the Parser.y, which shouldn't be too hard.
For negative numeric literals, I think extra rules in the lexer would be added, '-' followed by the various numeric literal types (this seems a little repetitious, is there an easier way?). The varieties of literals that were standard in the first place (i.e. non-unboxed) will get " / { extension is on }" qualifications to their patterns. mkHsNegApp (in RdrHsSyn.lhs) will be simplified or removed, since we are moving towards a more sensible treatment of negative literals. Another implementation choice could be to recognize the "minus followed by number" in the parser, but then it might be hard to distinguish between '98-syntax negate, subtraction, and negative unboxed literals, without ambiguity in the parser?
When the "new" syntax is switched on, assuming this includes removing "-" as general prefix negate, ITminus would always be not followed by a number (by design; those become single negative-number tokens). Furthermore, we don't really want to treat "-" specially in this case. So I guess the rule
"-" { minus not followed by number }
should be more like "-" / { not "new" syntax } { minus not followed by number } , and the case that interprets "..", "=>", "->", etc. would have its "-" case removed (whether "new" syntax or not). The only this this "don't treat '-' specially in this case" might fall afoul of is this proposed warning option:
If a "-" isn't followed immediately by a numeric literal, the only thing to watch out for (and warn about) is the "forbidden section" (- 1), which could mean an actual section (\x -> x - 1) in the "new" syntax.
, if it proves difficult to detect at the appropriate point whether an infix-operator was written as the unqualified "-". Isaac -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGHsG+HgcxvIWYTTURAl7sAJsFFNEcjTA6l5iPSwSqbx8zs6IkSQCcCyJY F2ng1MXJ0WN1v2scSDe72gM= =JHlF -----END PGP SIGNATURE-----

Okay, first steps: 1. A Trac ticket (#1318, http://hackage.haskell.org/trac/ghc/ticket/1318) (is "feature request" a good category, versus "task"?) 2. A test-case to make sure I don't break anything with existing '-' syntax. I'm guessing it should go in testsuite/tests/ghc-regress/parser/should_run/, although maybe since it checks Haskell-98 compatibility it should go in the testsuite/tests/h98 directory? (tested ghc and hugs, which both pass) Isaac (test-case attached in case anyone wants to look at or review it; I'll send a darcs patch adding the testcase once I know where to put it)

On Wed, Apr 11, 2007 at 09:05:21AM +0100, Simon Marlow wrote:
I definitely think that -1# should be parsed as a single lexeme. Presumably it was easier at the time to do it the way it is, I don't remember exactly.
I'd support a warning for use of prefix negation, or alternatively you could implement the Haskell' proposal to remove prefix negation completely - treat the unary minus as part of a numeric literal in the lexer only. This would have to be optional for now, so that we can continue to support Haskell 98 of course.
yes please! odd that I look forward to such a minor change in the big scheme of things, but the current treatment of negation has annoyed me more than any other misfeature I think. John -- John Meacham - ⑆repetae.net⑆john⑈

John Meacham wrote:
On Wed, Apr 11, 2007 at 09:05:21AM +0100, Simon Marlow wrote:
I definitely think that -1# should be parsed as a single lexeme. Presumably it was easier at the time to do it the way it is, I don't remember exactly.
I'd support a warning for use of prefix negation, or alternatively you could implement the Haskell' proposal to remove prefix negation completely - treat the unary minus as part of a numeric literal in the lexer only. This would have to be optional for now, so that we can continue to support Haskell 98 of course.
yes please! odd that I look forward to such a minor change in the big scheme of things, but the current treatment of negation has annoyed me more than any other misfeature I think.
Really? I'm beginning to have second thoughts about the proposed change to negation for Haskell'. The main reason, and this isn't pointed out as well as it should be on the wiki, is that "x-1" will cease to be an infix application of (-), it will parse as x applied to the literal (-1). And this is different from "x - 1" (syntax in which whitespace matters should be avoided like the plague, IMO). I think this would be worse than the current situation. Cheers, Simon

Hello,
I agree with Simon on this one: "x-1" should parse as expected (i.e.,
the infix operator "-" applied to two arguments "x" and "1"). Having
this result in a type error would be confusing to both beginners and
working Haskell programmers.
I think that if we want to change anything at all, we should simply
eliminate the unary negation operator without changing the lexer
(i.e., we would have only positive literals). Then we would have to
be explicit about what is currently happening implicitly in
Haskell98---we would write "negate 1" instead of "-1".
However, I don't thinks that this change is justified---as far as I
can see, the only benefit is that it simplifies the parser. However,
the change is not backward compatible and may break some programs.
-Iavor
On 5/14/07, Simon Marlow
John Meacham wrote:
On Wed, Apr 11, 2007 at 09:05:21AM +0100, Simon Marlow wrote:
I definitely think that -1# should be parsed as a single lexeme. Presumably it was easier at the time to do it the way it is, I don't remember exactly.
I'd support a warning for use of prefix negation, or alternatively you could implement the Haskell' proposal to remove prefix negation completely - treat the unary minus as part of a numeric literal in the lexer only. This would have to be optional for now, so that we can continue to support Haskell 98 of course.
yes please! odd that I look forward to such a minor change in the big scheme of things, but the current treatment of negation has annoyed me more than any other misfeature I think.
Really? I'm beginning to have second thoughts about the proposed change to negation for Haskell'. The main reason, and this isn't pointed out as well as it should be on the wiki, is that "x-1" will cease to be an infix application of (-), it will parse as x applied to the literal (-1). And this is different from "x - 1" (syntax in which whitespace matters should be avoided like the plague, IMO). I think this would be worse than the current situation.
Cheers, Simon
_______________________________________________ Haskell-prime mailing list Haskell-prime@haskell.org http://www.haskell.org/mailman/listinfo/haskell-prime

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Iavor Diatchki wrote:
Hello,
I agree with Simon on this one: "x-1" should parse as expected (i.e., the infix operator "-" applied to two arguments "x" and "1"). Having this result in a type error would be confusing to both beginners and working Haskell programmers.
I think that if we want to change anything at all, we should simply eliminate the unary negation operator without changing the lexer (i.e., we would have only positive literals). Then we would have to be explicit about what is currently happening implicitly in Haskell98---we would write "negate 1" instead of "-1".
However, I don't thinks that this change is justified---as far as I can see, the only benefit is that it simplifies the parser. However, the change is not backward compatible and may break some programs.
Simplifies the _mental_ parser, much more important than the compilers' parsers which are already implemented. Here is what I am thinking to do: In my own code, since there seems to be so much difficulty with the matter, don't use (-X) to mean negative for any kind of X whatsoever. For this I want a warning for ALL usages of the unary minus operator. I'll define a function for my negative literals that calls fromInteger and negate in the order I would prefer to my sensibilities, which is actually different from the order that the Report specifies for (-x) :
{-# INLINE negative #-} negative :: Num a => Integer -> a negative a = fromInteger (negate a)
I might feel like having a parallel
{-# INLINE positive #-} positive :: Num a => Integer -> a positive a = fromInteger a
(e.g. C has a unary + operator... and "positive" even has the same number-of-characters length as "negative"!). For GHC's unboxed negative literals I think I will still change the lexer/parser since the current way it's done is rather confusing anyway (as previously described) I don't know what else is worth implementing... NOT an option to turn off parsing of unary minus, since warnings are good and it would just create more incompatibility. John Meacham, since you seem to be interested, what are your thoughts now? Advice on flag names - or any other discussion! is anyone interested in having something, say so? - would be appreciated. Isaac -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGTDBQHgcxvIWYTTURAt14AJ9+Avd3FJ54+f0eNzUBFM7tOPy5TgCfRys8 usEFDx9uNH2UjUHBbG9kyGs= =M3CU -----END PGP SIGNATURE-----

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I wrote:
negative :: Num a => Integer -> a negative a = fromInteger (negate a)
Oops, I forgot Rational literals, they make things a little more complicated :( Isaac -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGTJKxHgcxvIWYTTURAtGMAJ9oetioh1rfTF1o+bqCWqWxG/LSiwCgghq9 pOBHdfUp625ll1lpTbW0X+w= =X0oP -----END PGP SIGNATURE-----

On Mon, May 14, 2007 at 10:19:07AM +0100, Simon Marlow wrote:
Really? I'm beginning to have second thoughts about the proposed change to negation for Haskell'. The main reason, and this isn't pointed out as well as it should be on the wiki, is that "x-1" will cease to be an infix application of (-), it will parse as x applied to the literal (-1). And this is different from "x - 1" (syntax in which whitespace matters should be avoided like the plague, IMO). I think this would be worse than the current situation.
White space already matters when it comes to numbers quite a bit 0 x 123 vs 0x123 1.5 vs 1 . 5 3e4 vs 3 e 4 etc. I think this change is more than worth it. I mean, having to write (-4) everywhere is bad enough, but when writing polymorphic code, (fromInteger (-4)) is horrific to embed everywhere. :) another option would be to only count it as a negative if there is a non-identifier character preceeding it. A little ugly. but still better than the current situation IMHO. John -- John Meacham - ⑆repetae.net⑆john⑈

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 John Meacham wrote:
another option would be to only count it as a negative if there is a non-identifier character preceeding it. A little ugly. but still better than the current situation IMHO.
I think Ghc's lexer "Alex" can do this although this functionality is not used anywhere else... it seems a little out of character. I don't really like that "(3-2)-1" would be parsed differently because it's a parenthesized expression; consider "3^2-1" vs. "(3^2)-1" ... Isaac -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGTwMCHgcxvIWYTTURAkzHAKCdekuA6rUw4QcnIV3Qq9WJ8ZkljQCfTH5G c0jDDrAGLtBVZ4WVRdTDJu8= =1BDf -----END PGP SIGNATURE-----

John Meacham wrote:
On Mon, May 14, 2007 at 10:19:07AM +0100, Simon Marlow wrote:
Really? I'm beginning to have second thoughts about the proposed change to negation for Haskell'. The main reason, and this isn't pointed out as well as it should be on the wiki, is that "x-1" will cease to be an infix application of (-), it will parse as x applied to the literal (-1). And this is different from "x - 1" (syntax in which whitespace matters should be avoided like the plague, IMO). I think this would be worse than the current situation.
White space already matters when it comes to numbers quite a bit
0 x 123 vs 0x123 1.5 vs 1 . 5 3e4 vs 3 e 4
etc.
Yes, I happen to think that whitespcae should only be significant where it separates two lexemes of the same category. I'm prepared to make an exception for numbers, because the syntax of numbers is already so familiar to almost everyone. I think that we could easily remove the '3e4' lexical syntax though, since '3*10^^4' works just as well (I often write the latter anyway) (and guess what, I just had to look up the difference between ^ and ^^, only to discover I picked the wrong one). The '3e4' syntax is a common source of compiler bugs, becuase it is rarely used and hence rarely tested. Cheers, Simon

On Mon, May 21, 2007 at 10:33:56AM +0100, Simon Marlow wrote:
I think that we could easily remove the '3e4' lexical syntax though, since '3*10^^4' works just as well (I often write the latter anyway) (and guess what, I just had to look up the difference between ^ and ^^, only to discover I picked the wrong one). The '3e4' syntax is a common source of compiler bugs, becuase it is rarely used and hence rarely tested.
but they have substantially different translations. 3e2 -> fromRational (300 % 1) 3*10^^2 -> (fromInteger 3) * (fromInteger 10) ^^ (2 :: Foo) where Foo is whatever 4 defaults to, probably Integer, but could be a compile error if defaulting is off or changed. Though, the current floating point support in haskell is pretty funky as is... John -- John Meacham - ⑆repetae.net⑆john⑈
participants (5)
-
Dinko Tenev
-
Iavor Diatchki
-
Isaac Dupree
-
John Meacham
-
Simon Marlow