[GHC] #15525: Unicode 8.0 and later characters are invariably lexical errors

#15525: Unicode 8.0 and later characters are invariably lexical errors
-------------------------------------+-------------------------------------
Reporter: ChaiTRex | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone: 8.6.1
Component: Compiler | Version: 8.4.3
(Parser) |
Keywords: | Operating System: Unknown/Multiple
Architecture: | Type of failure: None/Unknown
Unknown/Multiple |
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
 I've tried a few added alphabet characters and emojis from various Unicode
 versions. It seems like Unicode 7.0 works fine. It seems like characters
 from Unicode 8.0 and later are lexical errors.

 For example, with the Unicode 10.0 [https://emojipedia.org/t-rex/ T. rex
 emoji], there are three lexical errors below:

 {{{#!hs
 module NoTRex where

 tRex :: String
 tRex = "🦖"

 🦖 :: String
 🦖 = "🦖"
 }}}

 produces:

 {{{
 [1 of 1] Compiling NoTRex           ( NoTRex.hs, NoTRex.o )

 NoTRex.hs:4:9: error:
     lexical error in string/character literal at character '\129430'
   |
 4 | tRex = "🦖"
   |         ^
 }}}

 If that's removed, the name of the function `🦖` is also shown to be a
 lexical error.

 Also, pasting the fourth line into GHCi pastes only the characters before
 the first `🦖`, like the `🦖` and everything afterward weren't pasted in.

 ----

 System information:

 {{{
 $ ghc --version
 The Glorious Glasgow Haskell Compilation System, version 8.4.3
 $ lsb_release -ds
 Ubuntu 16.04.5 LTS
 }}}

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15525
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
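 A possible interim workaround for the string-literal case in the report
 above, sketched under the assumption that only the raw character in the
 source file trips the lexer: write the character as a numeric escape, so
 the source stays ASCII. The escape `\129430` is the same code point that
 the error message prints. The helper name `codePointLabel` is hypothetical
 and exists only for this illustration; it does not help with the
 identifier case (`🦖 :: String`), which has no escape form.

```haskell
import Data.Char (generalCategory, ord, toUpper)
import Numeric (showHex)

-- Same character as in the report, written as a decimal escape so that
-- the source file itself contains only ASCII (assumed workaround).
tRex :: String
tRex = "\129430"

-- Hypothetical helper: render a character as a "U+XXXX" code point label.
codePointLabel :: Char -> String
codePointLabel c = "U+" ++ map toUpper (showHex (ord c) "")

main :: IO ()
main = mapM_ describe tRex
  where
    -- The category printed depends on the Unicode tables shipped with the
    -- installed base library, so no particular value is assumed here.
    describe c = putStrLn (codePointLabel c ++ " " ++ show (generalCategory c))
```

 Running this prints the code point label U+1F996 followed by whatever
 general category the installed `base` assigns to the character, which may
 help when comparing toolchains.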

#15525: Unicode 8.0 and later characters are invariably lexical errors

Changes (by ulysses4ever):

 * related: => 5518

Comment:

 Yes, the issue was raised in #5518 and there is a
 [https://phabricator.haskell.org/D5066 patch] waiting to be merged for this.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15525#comment:1

#15525: Unicode 8.0 and later characters are invariably lexical errors

Changes (by ulysses4ever):

 * related: 5518 => #5518

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15525#comment:2

#15525: Unicode 8.0 and later characters are invariably lexical errors

Changes (by ChaiTRex):

 * status: new => closed
 * resolution: => duplicate

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15525#comment:3

#15525: Unicode 8.0 and later characters are invariably lexical errors
Comment (by Ben Gamari

#15525: Unicode 8.0 and later characters are invariably lexical errors

Changes (by ChaiTRex):

 * differential: => Phab:D5066

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15525#comment:5