
#10196: Regression regarding Unicode subscript characters in identifiers
-------------------------------------+-------------------------------------
        Reporter:  thomie            |                   Owner:  thoughtpolice
            Type:  bug               |                  Status:  patch
        Priority:  normal            |               Milestone:  7.10.3
       Component:  Compiler (Parser) |                 Version:  7.10.1
      Resolution:                    |                Keywords:
Operating System:  Unknown/Multiple  |            Architecture:  Unknown/Multiple
 Type of failure:  GHC rejects valid |               Test Case:
  program                            |
      Blocked By:                    |                Blocking:
 Related Tickets:  #5108             |  Differential Revisions:  Phab:D969
-------------------------------------+-------------------------------------

Comment (by thomie):

Replying to [comment:5 hvr]:
> We're planning to allow `Lm` from the 2nd character on in an identifier for 7.10.2

The current patch does exactly this. It still needs a changelog entry.

I would have preferred to only allow `Lm` in the suffix of an identifier, but we can leave that for 7.12 or later, as there is a slight chance it breaks someone's code. We could mention it in the docs.

There's also the issue that ModifierLetter perhaps brings in too many weird characters:

  "15-06-18T11:46:27"< hvr@> thomie: can we easily list all modifier letters in Haskell?
  "15-06-18T11:47:55"< hvr@> [ c | c <- ['\0'..], generalCategory c == ModifierLetter ]
  "15-06-18T11:47:56"< hvr@> got it
  "15-06-18T11:48:56"< hvr@> ok, there's a lot in there one doesn't want to allow in identifiers :-/
  "15-06-18T11:49:31"< thomie > booh
  "15-06-18T11:49:50"< hvr@> these look nasty:
  "15-06-18T11:50:46"< hvr@> so many column variants, theres also "ː"

hvr: do you think this is a big enough issue to not proceed with the current patch?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10196#comment:9
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
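
For anyone who wants to reproduce the listing, hvr's one-liner from the log can be expanded into a small standalone program. This is only a sketch: `isVarIdSketch` is a hypothetical helper that approximates the "`Lm` from the 2nd character on" rule using Unicode general categories. It is not the code in GHC's lexer or in Phab:D969, and the set of characters it reports depends on the Unicode tables shipped with your `base`.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- True for characters in Unicode general category Lm (ModifierLetter).
isModifierLetter :: Char -> Bool
isModifierLetter c = generalCategory c == ModifierLetter

-- hvr's list comprehension from the log: every Lm character known to base.
modifierLetters :: [Char]
modifierLetters = [c | c <- ['\0' ..], isModifierLetter c]

-- Hypothetical approximation of the proposed rule (an assumption, not GHC's
-- actual lexer): a variable identifier starts with a lowercase letter or '_',
-- and Lm characters are accepted only from the second character on.
isVarIdSketch :: String -> Bool
isVarIdSketch [] = False
isVarIdSketch (c : cs) = isStart c && all isRest cs
  where
    isStart x = generalCategory x == LowercaseLetter || x == '_'
    isRest x =
      x `elem` "_'"
        || generalCategory x
          `elem` [ UppercaseLetter
                 , LowercaseLetter
                 , TitlecaseLetter
                 , DecimalNumber
                 , ModifierLetter
                 ]

main :: IO ()
main = do
  putStrLn ("Number of Lm characters: " ++ show (length modifierLetters))
  -- Show code points rather than glyphs so the output is terminal-independent.
  mapM_ (print . fromEnum) (take 10 modifierLetters)
  -- U+1D62 is LATIN SUBSCRIPT SMALL LETTER I, the kind of subscript this
  -- ticket is about; U+02D0 is the triangular colon mentioned in the log.
  print (isVarIdSketch ['x', '\x1D62']) -- subscript as a suffix
  print (isVarIdSketch ['\x1D62', 'x']) -- Lm in first position
  print (isVarIdSketch ['x', '\x02D0']) -- triangular colon as a suffix
```

Under this approximation the subscript suffix and the triangular colon are both accepted, which is exactly the concern raised in the log: allowing all of `Lm` admits the wanted subscripts together with the unwanted characters.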