[GHC] #14589: The isUpper function should return true for the '\9438' character

#14589: The isUpper function should return true for the '\9438' character
-------------------------------------+-------------------------------------
           Reporter:  mrkkrp         |             Owner:  (none)
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:
          Component:  libraries/base |           Version:  8.2.1
           Keywords:                 |  Operating System:  Unknown/Multiple
       Architecture:                 |   Type of failure:  None/Unknown
  Unknown/Multiple                   |
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
 {{{
 λ> toLower '\9438'
 '\9438'
 λ> toUpper '\9438'
 '\9412'
 λ> isUpper '\9438'
 False
 λ> isLower '\9438'
 False
 }}}

 Here we can observe a contradiction. The `toLower` function does not alter
 its argument, but `toUpper` does, which tells us that the character (1) has
 a notion of case and (2) must be lower-case. On the other hand, both
 `isUpper` and `isLower` return `False` for `\9438`, suggesting that it has
 no notion of case.

 Indeed, `\9438` is lower-case and `\9412` is its upper-case counterpart:

 {{{
 λ> putStrLn "\9438"
 ⓞ
 λ> putStrLn "\9412"
 Ⓞ
 }}}

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14589
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
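The contradiction described in the ticket can be phrased as a predicate and checked over the whole `Char` range. The following is a minimal sketch, not part of the ticket: the names `caseMismatch` and `mismatches` are made up for illustration, and only `Data.Char` functions already used in the report are assumed.

```haskell
import Data.Char (isLower, isUpper, toLower, toUpper)

-- | Hypothetical helper (not in base): True when one of the case
-- mappings changes the character, yet neither 'isUpper' nor 'isLower'
-- classifies it as cased. This is the situation reported for '\9438'.
caseMismatch :: Char -> Bool
caseMismatch c =
  (toUpper c /= c || toLower c /= c) && not (isUpper c || isLower c)

-- | Every character in the full 'Char' range that shows the mismatch.
-- The contents depend on the Unicode tables shipped with the compiler.
mismatches :: [Char]
mismatches = filter caseMismatch [minBound .. maxBound]
```

Loading this in GHCi and evaluating `length mismatches` or `take 10 mismatches` gives a rough idea of how many characters are affected; the exact result will vary with the Unicode version of the installed base library.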

Changes (by int-index):

 * cc: int-index (added)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14589#comment:1

Comment (by nomeata):

 This is documented behavior. `isUpper` says:

 > Selects upper-case or title-case alphabetic Unicode characters (letters).
 > Title case is used by a small number of letter ligatures like the
 > single-character form of Lj.

 Note that it says “letter”, and the code in
 libraries/base/cbits/WCsubst.c explicitly selects only upper-case and
 title-case letters.

 I see how this is a bit unfortunate, but I am sure we should not change
 the semantics of `isUpper`. Maybe we are missing a function?
 `isUpperAnything` or something?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14589#comment:2
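One way the suggested extra function might look is sketched below. This is an assumption for illustration only: `isUpperAnything` is the name floated in the comment, `isLowerAnything` is an invented counterpart, and neither is claimed to exist in base. The sketch leaves `isUpper` and `isLower` untouched and additionally treats a character as cased when the opposite case mapping changes it.

```haskell
import Data.Char (isLower, isUpper, toLower, toUpper)

-- | Hypothetical (not in base): upper-case in the broad sense, i.e.
-- either 'isUpper' already accepts it or 'toLower' maps it elsewhere.
isUpperAnything :: Char -> Bool
isUpperAnything c = isUpper c || toLower c /= c

-- | Hypothetical (not in base): lower-case in the broad sense, i.e.
-- either 'isLower' already accepts it or 'toUpper' maps it elsewhere.
isLowerAnything :: Char -> Bool
isLowerAnything c = isLower c || toUpper c /= c
```

With these definitions `isLowerAnything '\9438'` holds while `isUpperAnything '\9438'` does not, which matches the reporter's reading of the character. The heuristic is deliberately crude: it misses cased characters that have no single-character mapping, and title-case characters can satisfy both predicates.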

Changes (by lelf):

 * keywords:  => unicode

Comment:

 > I see how this is a bit unfortunate, but I am sure we should not change
 > the semantics of isUpper.

 Why? It's broken. ⓞ is lowercase. Also it's wrong wrt title-cased ones.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14589#comment:3