[GHC] #11632: Data.Char repeated readLitChar barfs on output from show "ó1"

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: libraries/base | Version: 7.10.3
Keywords: | Operating System: Linux
Architecture: Unknown/Multiple | Type of failure: Incorrect result at runtime
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
"ó1" is "\243\&1" and when shown that's "\"\\243\\&1\""

{{{#!hs
readLitChar "\"\\243\\&1\"" = [("\"", "\243\\&1")]
readLitChar "\243\\&1" = [("\243", "\\&1")] --should have consumed "\\&"
readLitChar "\\&1" = []
}}}

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: libraries/base | Version: 7.10.3
Resolution: | Keywords:
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Description changed by inversemot:

@@ -1,1 +1,1 @@
- "ó1" is "\243\&1" and when shown that's "\"\\243\\&1\""
+ "ó1" is "\243\&1" and when shown that's `"\"\\243\\&1\""`

New description:

"ó1" is "\243\&1" and when shown that's `"\"\\243\\&1\""`

{{{#!hs
readLitChar "\"\\243\\&1\"" = [("\"", "\243\\&1")]
readLitChar "\243\\&1" = [("\243", "\\&1")] --should have consumed "\\&"
readLitChar "\\&1" = []
}}}

--

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632#comment:1
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#11632: read (show "ó1") fails
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner:
Type: bug | Status: new
Priority: high | Milestone:
Component: libraries/base | Version: 7.10.3
Resolution: | Keywords:
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by rwbarton):

* priority: normal => high

@@ -0,0 +1,5 @@
+ {{{
+ Prelude> let s = "ó1" in read (show s)
+ *** Exception: Prelude.read: no parse
+ }}}
+
@@ -4,3 +9,4 @@
- readLitChar "\"\\243\\&1\"" = [("\"", "\243\\&1")]
- readLitChar "\243\\&1" = [("\243", "\\&1")] --should have consumed "\\&"
- readLitChar "\\&1" = []
+ readLitChar "\"\\243\\&1\"" = [('"', "\\243\\&1\"")]
+ readLitChar "\\243\\&1\"" = [('\243', "\\&1\"")] --should have consumed
+ "\\&"
+ readLitChar "\\&1\"" = []

New description:

{{{
Prelude> let s = "ó1" in read (show s)
*** Exception: Prelude.read: no parse
}}}

"ó1" is "\243\&1" and when shown that's `"\"\\243\\&1\""`

{{{#!hs
readLitChar "\"\\243\\&1\"" = [('"', "\\243\\&1\"")]
readLitChar "\\243\\&1\"" = [('\243', "\\&1\"")] --should have consumed "\\&"
readLitChar "\\&1\"" = []
}}}

--

Comment:

Thanks for the report. I fixed some errors in the ticket description, hope this is what you meant.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632#comment:2
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: libraries/base | Version: 7.10.3
Resolution: | Keywords:
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by rwbarton):

* priority: high => normal

@@ -1,5 +1,0 @@
- {{{
- Prelude> let s = "ó1" in read (show s)
- *** Exception: Prelude.read: no parse
- }}}
-

New description:

"ó1" is "\243\&1" and when shown that's `"\"\\243\\&1\""`

{{{#!hs
readLitChar "\"\\243\\&1\"" = [('"', "\\243\\&1\"")]
readLitChar "\\243\\&1\"" = [('\243', "\\&1\"")] --should have consumed "\\&"
readLitChar "\\&1\"" = []
}}}

--

Comment:

Wait no, I'm wrong. `read (show "ó1")` works fine, I just forgot to ask for a string.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632#comment:3
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
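
The retraction in comment 3 can be checked with a small program. This is only a sketch: it assumes that the "no parse" in the GHCi session came from the result type of `read` defaulting to `()` when no annotation is given, and it uses `reads` for that case so the failure shows up as an empty result list rather than an exception.

```haskell
main :: IO ()
main = do
  let s = "\243\&1"   -- the two-character string "ó1"
  -- With the target type spelled out, the round trip through show/read works.
  print ((read (show s) :: String) == s)
  -- At type (), a string literal cannot be parsed, so reads finds no result.
  print (null (reads (show s) :: [((), String)]))
```

Both lines should print `True`, which matches the conclusion that `read` and `show` on `String` are not affected and that the report is about `readLitChar` only.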

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: libraries/base | Version: 7.10.3
Resolution: | Keywords:
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by inversemot):

By the way, this applies to any numeric escape followed by a digit. ó1 just happens to be the string QuickCheck found while checking that Unicode characters were parsed correctly. So technically this affects megaparsec's charLiteral, which uses readLitChar.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632#comment:4
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
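
To illustrate the point in comment 4, the sketch below prints what `show` produces for a few sample strings. The samples are chosen here for illustration and are not taken from the ticket, apart from "ó1".

```haskell
main :: IO ()
main = mapM_ print samples
  where
    -- print applies show, so each line is the shown form of the string.
    samples =
      [ "\243\&1"   -- numeric escape followed by a digit: show emits \&
      , "\1234\&5"  -- same situation with a different code point
      , "\243x"     -- followed by a non-digit: no \& is needed
      , "\SO\&H"    -- the non-numeric case mentioned in the Haskell report
      ]
```

Every string whose shown form contains `\&` is a potential input for the `readLitChar` problem described in this ticket.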

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: libraries/base | Version: 7.10.3
Resolution: | Keywords: newcomer
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by thomie):

* keywords: => newcomer

Comment:

inversemot: what should `lexLitChar "\\243\\&1"` return in your opinion:
1. unchanged `[("\\243","\\&1")]`
2. consume the `\\&1` `[("\\243","")]`
3. consume and include the `\\&1` `[("\\243\\&1","")]`

I suppose option 2.

For a newcomer: I think you'll want to change the function `lexChar` in the file `libraries/base/Text/Read/Lex.hs`, and/or the functions `lexLitChar` and `readLitChar` in `libraries/base/GHC/Read.hs`.
* Why not change the function `lexCharE`? Because it is used by the function `lexLitChar`, which lexes a character surrounded by single quotes, but `'x\&'` isn't a valid character (maybe it should be? It would simplify things.). Also note that the function `lexString` handles `\&` by itself in `lexEmpty`.

Don't forget to add a [wiki:Building/RunningTests/Adding test] and [wiki:WorkingConventions/FixingBugs submit] your patch to Phabricator.

For reference, the [https://www.haskell.org/onlinereport/haskell2010/haskellch2.html#x7-200002.6 Haskell 2010 report] has this to say:
> The escape character \& is provided as a “null character” to allow strings such as "\137\&9" and "\SO\&H" to be constructed (both of length two). Thus "\&" is equivalent to "" and the character '\&' is disallowed.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632#comment:5
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
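
The behaviour discussed in comment 5 can be approximated on the caller's side. The following is a minimal sketch, not the change proposed for `base`: `decodeBody` and `stripQuotes` are hypothetical helpers introduced here, and the approach assumes that dropping a `\&` wherever it appears between two characters is acceptable, which is what the quoted report text suggests.

```haskell
import Data.Char (readLitChar)

-- Hypothetical helper, not part of base: decode the text between the
-- quotes of a shown string, one literal character at a time.
decodeBody :: String -> Maybe String
decodeBody []                  = Just []
decodeBody ('\\' : '&' : rest) = decodeBody rest   -- "\&" stands for the empty string
decodeBody s =
  case readLitChar s of
    (c, rest) : _ -> (c :) <$> decodeBody rest
    []            -> Nothing                       -- not a valid character literal

-- Hypothetical helper: drop the surrounding double quotes, if present.
stripQuotes :: String -> Maybe String
stripQuotes ('"' : rest)
  | not (null rest), last rest == '"' = Just (init rest)
stripQuotes _ = Nothing

main :: IO ()
main = do
  let s = "\243\&1"
  -- The report's example: the \& does not count as a character.
  print (length "\137\&9")
  -- Round trip: show, strip the quotes, decode character by character.
  print ((stripQuotes (show s) >>= decodeBody) == Just s)
```

Because the `\&` is skipped only at character boundaries, an escaped backslash followed by `&` is still decoded as two ordinary characters. The sketch should give the same answer whether or not `readLitChar` itself consumes a trailing `\&`.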

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner: kgupta
Type: bug | Status: new
Priority: normal | Milestone:
Component: libraries/base | Version: 7.10.3
Resolution: | Keywords: newcomer
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by kgupta):

* owner: => kgupta

Comment:

I would like to try to fix this bug.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632#comment:6
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner: kgupta
Type: bug | Status: new
Priority: normal | Milestone:
Component: libraries/base | Version: 7.10.3
Resolution: | Keywords: newcomer
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by kgupta):

How do I add a test case for this? There aren't any existing tests as it is (the only thing that ever even imports `ReadP` is something that imports it just for the sake of importing it). Is there a separate test suite for base? Thanks!

Replying to [comment:5 thomie]:
> inversemot: what should `lexLitChar "\\243\\&1"` return in your opinion:
> 1. unchanged `[("\\243","\\&1")]`
> 2. consume the `\\&1` `[("\\243","")]`
> 3. consume and include the `\\&1` `[("\\243\\&1","")]`
>
> I suppose option 2.
>
> For a newcomer: I think you'll want to either change the function `lexChar` in the file `libraries/base/Text/Read/Lex.hs`, and/or the functions `lexLitChar` and `readLitChar` in `libraries/base/GHC/Read.hs`.
> * Why not change the function `lexCharE`? Because it is used by the function `lexLitChar`, which lexes a character surrounded by single quotes, but `'x\&'` isn't a valid character (maybe it should be? It would simplify things.). Also note that the function `lexString` handles `\&` by itself in `lexEmpty`.
>
> Don't forget a [wiki:Building/RunningTests/Adding test] and [wiki:WorkingConventions/FixingBugs submit] your patch to Phabricator.
>
> For reference, the [https://www.haskell.org/onlinereport/haskell2010/haskellch2.html#x7-200002.6 Haskell 2010 report] has this to say:
> The escape character \& is provided as a “null character” to allow strings such as "\137\&9" and "\SO\&H" to be constructed (both of length two). Thus "\&" is equivalent to "" and the character '\&' is disallowed.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632#comment:7
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner: kgupta
Type: bug | Status: new
Priority: normal | Milestone:
Component: libraries/base | Version: 7.10.3
Resolution: | Keywords: newcomer
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by thomie):

Tests for base are in `libraries/base/tests`.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632#comment:8
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner: kgupta
Type: bug | Status: new
Priority: normal | Milestone:
Component: libraries/base | Version: 7.10.3
Resolution: | Keywords: newcomer
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by kgupta):

I think for option 2 you meant "consume the `\\&`", right? Because the `1` isn't part of the null character? Or am I misinterpreting? Thanks for your patience.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632#comment:9
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner: kgupta
Type: bug | Status: new
Priority: normal | Milestone:
Component: libraries/base | Version: 7.10.3
Resolution: | Keywords: newcomer
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by thomie):

Replying to [comment:9 kgupta]:
> I think for option 2 you meant "consume the `\\&`", right?

Yes, I updated comment:5.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632#comment:10
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner: kgupta
Type: bug | Status: new
Priority: normal | Milestone:
Component: libraries/base | Version: 7.10.3
Resolution: | Keywords: newcomer
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case: readLitChar
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): D2391
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by kgupta):

* testcase: => readLitChar
* differential: => D2391

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632#comment:11
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner: kgupta
Type: bug | Status: patch
Priority: normal | Milestone: 8.2.1
Component: Core Libraries | Version: 7.10.3
Resolution: | Keywords: newcomer
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case: readLitChar
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D2391
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by thomie):

* cc: ekmett (added)
* status: new => patch
* differential: D2391 => Phab:D2391
* component: libraries/base => Core Libraries
* milestone: => 8.2.1

Comment:

Could someone from the CLC please review Phab:D2391? Thanks.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632#comment:12
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner: kgupta
Type: bug | Status: patch
Priority: normal | Milestone: 8.2.1
Component: Core Libraries | Version: 7.10.3
Resolution: | Keywords: newcomer
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case: readLitChar
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D2391
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner: kgupta
Type: bug | Status: merge
Priority: normal | Milestone: 8.0.2
Component: Core Libraries | Version: 7.10.3
Resolution: | Keywords: newcomer
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case: readLitChar
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D2391
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

* status: patch => merge
* milestone: 8.2.1 => 8.0.2

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632#comment:14
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner: kgupta
Type: bug | Status: merge
Priority: normal | Milestone: 8.0.2
Component: Core Libraries | Version: 7.10.3
Resolution: | Keywords: newcomer
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case: readLitChar
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D2391
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#11632: Data.Char repeated readLitChar barfs on output from show "ó1"
-------------------------------------+-------------------------------------
Reporter: inversemot | Owner: kgupta
Type: bug | Status: closed
Priority: normal | Milestone: 8.0.2
Component: Core Libraries | Version: 7.10.3
Resolution: fixed | Keywords: newcomer
Operating System: Linux | Architecture: Unknown/Multiple
Type of failure: Incorrect result at runtime | Test Case: readLitChar
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s): Phab:D2391
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

* status: merge => closed
* resolution: => fixed

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11632#comment:16
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler