[GHC] #7853: UTF encodings do not detect overlong forms

#7853: UTF encodings do not detect overlong forms
----------------------------------------+-----------------------------------
Reporter:  batterseapower               |          Owner:
    Type:  bug                          |         Status:  new
Priority:  normal                       |      Component:  libraries/base
 Version:  7.6.3                        |       Keywords:
      Os:  Unknown/Multiple             |   Architecture:  Unknown/Multiple
 Failure:  Incorrect result at runtime  |      Blockedby:
Blocking:                               |        Related:
----------------------------------------+-----------------------------------
 Overlong UTF-{8,16} sequences can have security implications
 (http://www.cl.cam.ac.uk/~mgk25/unicode.html). Decoders for these
 encodings should detect them and flag them as invalid characters. GHC's
 implementations of these decoders do not do so!

 This problem has additional implications for GHC: because we do not
 reject overlong sequences, trying to roundtrip 0xC0 0xB1 through
 UTF-8//ROUNDTRIP results in 0x31 rather than the expected sequence
 0xC0 0xB1. The roundtripping fails because the overlong sequence is not
 flagged as invalid by the UTF-8 decoder, so the surrogate escape
 mechanism never gets a chance to work.

-- 
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7853
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
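
To make the 0xC0 0xB1 example concrete, here is a minimal sketch of a 2-byte UTF-8 decode step that performs no overlong check. The name `naiveDecode2` is hypothetical and is not taken from GHC's base library; it only illustrates why the two bytes collapse to 0x31.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- | Hypothetical helper (not GHC's actual decoder): combine a 2-byte
-- UTF-8 sequence of the shape 110xxxxx 10yyyyyy into a code point
-- WITHOUT rejecting overlong forms.
naiveDecode2 :: Word8 -> Word8 -> Char
naiveDecode2 b0 b1 =
  -- Take the low 5 bits of the lead byte and the low 6 bits of the
  -- continuation byte, then concatenate them.
  chr (((fromIntegral b0 .&. 0x1F) `shiftL` 6) .|. (fromIntegral b1 .&. 0x3F))

-- naiveDecode2 0xC0 0xB1 == '1'   (U+0031)
-- The lead byte 0xC0 contributes only zero bits, so the result is just
-- the 6 payload bits of 0xB1, i.e. 0x31. Re-encoding U+0031 yields the
-- single byte 0x31, so the original two bytes are lost.
```

Under these assumptions the decoder reports no error for the overlong pair, so a surrogate-escape scheme keyed on decoding errors would have nothing to escape.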

#7853: UTF encodings do not detect overlong forms

Comment(by batterseapower):

 Addendum: luckily GHC already detects overlong 3/4 byte forms. It merely
 lacks the check for 2-byte forms. I have a marvelous patch on a private
 branch that remedies this flaw, but alas this margin is too small to
 contain it.

-- 
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7853#comment:1
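
For illustration only, a check for overlong 2-byte forms could look roughly like the sketch below. This is not the patch mentioned in the comment, and `decode2` is a hypothetical name. It relies on the fact that a well-formed 2-byte sequence encodes a code point of at least U+0080, which means the lead byte must be 0xC2 or higher.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- | Hypothetical checked decoder for a 2-byte UTF-8 sequence.
-- Returns Nothing for malformed or overlong input.
decode2 :: Word8 -> Word8 -> Maybe Char
decode2 b0 b1
  | b0 .&. 0xE0 /= 0xC0 = Nothing  -- not a 2-byte lead byte (110xxxxx)
  | b1 .&. 0xC0 /= 0x80 = Nothing  -- not a continuation byte (10yyyyyy)
  | b0 < 0xC2           = Nothing  -- 0xC0/0xC1 encode < U+0080: overlong
  | otherwise           = Just (chr cp)
  where
    cp :: Int
    cp = ((fromIntegral b0 .&. 0x1F) `shiftL` 6) .|. (fromIntegral b1 .&. 0x3F)

-- decode2 0xC0 0xB1 == Nothing        (overlong form of U+0031)
-- decode2 0xC2 0xA9 == Just '\169'    (U+00A9, a valid 2-byte form)
```

With a rejection like this in place, a roundtripping decoder would get the chance to treat 0xC0 and 0xB1 as invalid bytes and escape them individually.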

#7853: UTF encodings do not detect overlong forms
---------------------------------+------------------------------------------
    Reporter:  batterseapower    |       Owner:
        Type:  bug               |      Status:  closed
    Priority:  normal            |   Component:  libraries/base
     Version:  7.6.3             |  Resolution:  fixed
    Keywords:                    |          Os:  Unknown/Multiple
Architecture:  Unknown/Multiple  |     Failure:  Incorrect result at runtime
   Blockedby:                    |    Blocking:
     Related:                    |
---------------------------------+------------------------------------------
Changes (by batterseapower):

  * status:  new => closed
  * resolution:  => fixed

Comment:

 Fixed in 2b4705254638f5b06a0e83359e28e361f40d2ac4

-- 
Ticket URL: http://hackage.haskell.org/trac/ghc/ticket/7853#comment:2