[GHC] #9573: Add warning for invalid digits in integer literals

#9573: Add warning for invalid digits in integer literals
-------------------------------------+-------------------------------------
       Reporter:  vlopez             |             Owner:
           Type:  bug                |            Status:  new
       Priority:  normal             |         Milestone:  7.10.1
      Component:  Compiler (Parser)  |           Version:  7.9
       Keywords:  parsing, integer,  |  Operating System:  Unknown/Multiple
  octal, binary, hexadecimal,        |
   Architecture:  Unknown/Multiple   |   Type of failure:  None/Unknown
     Difficulty:  Difficult (2-5     |         Test Case:
  days)                              |
     Blocked By:                     |          Blocking:
Related Tickets:                     |  Differential Revisions:
-------------------------------------+-------------------------------------

In its latest version, GHC can parse binary (with `-XBinaryLiterals`), octal and hexadecimal literals:

{{{#!hs
0b101010
0o52
0x2A
}}}
Currently, the parser/lexer reads digits from the input as long as they are valid for the specified radix. All subsequent digits are interpreted as a new, separate token.

If the user writes a digit which isn't valid for the radix, the result may be a non-obvious error message, or the input may be interpreted in surprising ways:

{{{#!hs
Prelude> :t 0o567
0o567 :: Num a => a
Prelude> :t 0o5678
0o5678 :: (Num (a -> t), Num a) => t
Prelude> :t 0x1bfah

<interactive>:1:7: Not in scope: ‘h’
Prelude> replicate 0o5678
[8,8,8,8,8,8,8...
}}}
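The splitting behaviour described above can be illustrated with a small stand-alone sketch. This is not GHC's lexer code; `lexRadixLiteral` and `digitPredicate` are hypothetical names, and the sketch accepts the `0b` prefix unconditionally, whereas GHC only does so with `-XBinaryLiterals`.

```haskell
import Data.Char (isHexDigit, isOctDigit, toLower)

-- | Consume a radix-prefixed literal from the front of the input:
-- digits are taken as long as they are valid for the radix, and whatever
-- follows is left for the next token. Returns (literal, remaining input).
-- Hypothetical helper for illustration only.
lexRadixLiteral :: String -> Maybe (String, String)
lexRadixLiteral ('0' : p : rest)
  | Just valid <- digitPredicate (toLower p)
  , (digits@(_ : _), remainder) <- span valid rest
  = Just ('0' : p : digits, remainder)
lexRadixLiteral _ = Nothing

-- | Which characters count as digits after a given radix prefix.
digitPredicate :: Char -> Maybe (Char -> Bool)
digitPredicate 'b' = Just (`elem` "01")
digitPredicate 'o' = Just isOctDigit
digitPredicate 'x' = Just isHexDigit
digitPredicate _   = Nothing
```

With these definitions, `lexRadixLiteral "0o5678"` yields `Just ("0o567","8")`: the `8` is not part of the literal but the start of the next token, which is why `0o5678` ends up being read as the application of `0o567` to `8`.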
We suggest warning the user when a literal of this sort is written, while respecting any other error messages and the original behaviour.

More specifically, the parser or lexer would give a warning if a token starting with an alphanumeric character is found immediately after a numeric literal, without a blank between them.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9573
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
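As a rough illustration of the check proposed in the description, the following sketch lexes one integer literal and reports when the next character is alphanumeric with no blank in between. The names `lexInteger`, `radixDigits` and `adjacentTokenWarning` are hypothetical, the warning text is made up, and floating-point literals such as `1e5` are deliberately out of scope; the attached patch may well work differently.

```haskell
import Data.Char (isAlphaNum, isDigit, isHexDigit, isOctDigit, toLower)

-- | Consume one integer literal (binary, octal, hexadecimal or decimal)
-- from the front of the input. Returns the literal and the remaining input.
lexInteger :: String -> Maybe (String, String)
lexInteger ('0' : p : rest)
  | Just valid <- lookup (toLower p) radixDigits
  , (ds@(_ : _), remainder) <- span valid rest
  = Just ('0' : p : ds, remainder)
lexInteger s = case span isDigit s of
  ([], _)         -> Nothing
  (ds, remainder) -> Just (ds, remainder)

-- | Valid digit characters for each radix prefix.
radixDigits :: [(Char, Char -> Bool)]
radixDigits = [('b', (`elem` "01")), ('o', isOctDigit), ('x', isHexDigit)]

-- | Produce a warning when the character right after the literal is
-- alphanumeric. The literal itself is still lexed exactly as before,
-- so existing behaviour and error messages are unaffected.
adjacentTokenWarning :: String -> Maybe String
adjacentTokenWarning input = do
  (lit, next : _) <- lexInteger input
  if isAlphaNum next
    then Just ("literal " ++ lit ++ " is immediately followed by '" ++ [next] ++ "'")
    else Nothing
```

Under these assumptions, `adjacentTokenWarning "0o5678"` and `adjacentTokenWarning "0x1bfah"` both produce a warning, while `adjacentTokenWarning "0o567 8"` returns `Nothing` because a blank separates the two tokens.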

#9573: Add warning for invalid digits in integer literals
Comment (by vlopez):

Here's a partial patch (attachment `invalid-digit-warning-6c9246.patch`), which gives a warning in most cases.

However, the feature is not very useful for helping the user make sense of a type error, as GHC only seems to display warnings when the module can be compiled correctly.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9573#comment:1

#9573: Add warning for invalid digits in integer literals
Changes (by vlopez):

 * owner:  => vlopez

New description:

In its latest version, GHC can parse binary (with `-XBinaryLiterals`), octal and hexadecimal literals:

{{{#!hs
0b101010
0o52
0x2A
}}}

Currently, the parser/lexer reads digits from the input as long as they are valid for the specified radix. All subsequent digits are interpreted as a new, separate token.

If the user writes a digit which isn't valid for the radix, the result may be a non-obvious error message, or the input may be interpreted in surprising ways:

{{{#!hs
> :t 0o567
0o567 :: Num a => a
> :t 0o5678
0o5678 :: (Num (a -> t), Num a) => t
> :t 0x1bfah

<interactive>:1:7: Not in scope: ‘h’
> replicate 0o5678
[8,8,8,8,8,8,8...
}}}

We suggest warning the user when a literal of this sort is written, while respecting any other error messages and the original behaviour.

More specifically, the parser or lexer would give a warning if a token starting with an alphanumeric character is found immediately after a numeric literal, without a blank between them.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9573#comment:2

#9573: Add warning for invalid digits in integer literals
Comment (by andreas.abel):

I think a parse *error* would be more appropriate (and useful) than a warning. I consider it a bug rather than a feature that the following parses:

{{{
test h = (+)92837492837h
}}}

Whitespace isn't that expensive anymore, and even cheaper on the black market ;-), so this "feature" does not seem worth preserving.

The moral: number literals should have to be terminated by one of the following: whitespace, punctuation (including parentheses), operators etc., but not just by an alphabetic character.

Maybe one can first parse a number literal as if it were an identifier (i.e., allow leading numeric characters in identifiers), and then postprocess identifiers that start with a numeric character as number literals, giving parse errors if they do not make sense.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9573#comment:3
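The two-step approach suggested in this comment (lex an identifier-like run first, then validate it as a number) might look roughly like the sketch below. It only covers integer literals, the names `lexStrictInteger` and `wellFormed` and the error strings are hypothetical, and it says nothing about how this would fit into GHC's real lexer.

```haskell
import Data.Char (isAlphaNum, isDigit, isHexDigit, isOctDigit, toLower)

-- | Step 1: take the longest run of alphanumeric characters, as if
-- identifiers were allowed to start with a digit.
-- Step 2: accept the run only if it is a well-formed integer literal;
-- otherwise report a parse error instead of silently splitting it.
lexStrictInteger :: String -> Either String (String, String)
lexStrictInteger input =
  case span isAlphaNum input of
    (tok@(c : _), rest)
      | not (isDigit c) -> Left ("not a number literal: " ++ tok)
      | wellFormed tok  -> Right (tok, rest)
      | otherwise       -> Left ("malformed number literal: " ++ tok)
    _ -> Left "no token"

-- | A token is well formed if it is a radix prefix followed only by
-- digits valid for that radix, or consists of decimal digits only.
wellFormed :: String -> Bool
wellFormed ('0' : p : ds@(_ : _))
  | toLower p == 'b' = all (`elem` "01") ds
  | toLower p == 'o' = all isOctDigit ds
  | toLower p == 'x' = all isHexDigit ds
wellFormed tok = all isDigit tok
```

In this sketch `lexStrictInteger "92837492837h"` and `lexStrictInteger "0o5678"` are both rejected with a `Left`, whereas `lexStrictInteger "0o567 8"` succeeds with `Right ("0o567"," 8")` because whitespace terminates the literal.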

#9573: Add warning for invalid digits in integer literals
Comment (by vlopez):

I fully agree with making it an error, and with simplifying the lexing of numeric literals.

However, there's at least one test case which would fail.¹ As for the Haskell report, there's no explicit rule about number literals being separated by whitespace.

I'd suggest giving a deprecation warning for 7.10. Then, for 7.12, the lexer can be switched to the new behaviour, and the language report updated accordingly.

1. [http://git.haskell.org/ghc.git/blob/c0fa383d9109800a4e46a81b418f1794030ba1bd...]

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9573#comment:4

#9573: Add warning for invalid digits in integer literals
Comment (by hvr):

Replying to [comment:4 vlopez]:
> However, there's at least one test case which would fail.¹

...damn, I spent quite a bit of time to come up with that test-case to make sure I covered all code-paths and didn't deviate from the octal literal's lexing ;-)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9573#comment:5

#9573: Add warning for invalid digits in integer literals
-------------------------------------+-------------------------------------
        Reporter:  vlopez            |             Owner:  vlopez
            Type:  bug               |            Status:  new
        Priority:  normal            |         Milestone:  8.0.1
       Component:  Compiler (Parser) |           Version:  7.9
      Resolution:                    |          Keywords:  report-impact,
                                     |  parsing, integer, octal, binary,
                                     |  hexadecimal,
Operating System:  Unknown/Multiple  |      Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown      |         Test Case:
      Blocked By:                    |          Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by thomie):

 * keywords:  parsing, integer, octal, binary, hexadecimal, => report-impact, parsing, integer, octal, binary, hexadecimal,

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9573#comment:8