
#9114: Invalid UTF8 not round-tripped correctly
-------------------------------------+------------------------------------
        Reporter:  nomeata           |            Owner:
            Type:  bug               |           Status:  new
        Priority:  normal            |        Milestone:
       Component:  libraries/base    |          Version:  7.6.3
      Resolution:                    |         Keywords:
Operating System:  Unknown/Multiple  |     Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown      |       Difficulty:  Unknown
       Test Case:                    |       Blocked By:
        Blocking:                    |  Related Tickets:
-------------------------------------+------------------------------------
Description changed by tibbe:

Old description:

As reported by Robert Bihlmeyer at http://bugs.debian.org/748125, the promised round-tripping of invalid UTF8 sequences in filenames through String does not work:

```
$ mkdir foo
$ touch foo/$(echo -e '\xC0\xB7.txt')
$ ghc -e 'System.Directory.getDirectoryContents "foo" >>= print . last'
"7.txt"
```

The sequence 0xC0B7 is an (invalid, overlong) encoding of 0x37, i.e. `'7'`, so if it is decoded to `'7'`, no round-tripping is possible. (Other invalid byte sequences are round-tripped.)
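
To make the arithmetic concrete, here is a minimal sketch of a two-byte UTF-8 decoder that does not reject overlong forms. The names `naiveDecode2` and `isOverlong2` are hypothetical and exist only for illustration; this is not GHC's decoder, and the ticket does not say how the real one arrives at `'7'`. It only shows why a decoder that accepts 0xC0 0xB7 necessarily loses the original bytes.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- | Decode a two-byte UTF-8 sequence WITHOUT checking for overlong forms.
-- Hypothetical helper for illustration only; not taken from GHC's sources.
-- The lead byte contributes its low 5 bits, the continuation byte its low 6.
naiveDecode2 :: Word8 -> Word8 -> Char
naiveDecode2 b1 b2 =
  chr (((fromIntegral b1 .&. 0x1F) `shiftL` 6) .|. (fromIntegral b2 .&. 0x3F))

-- | A two-byte sequence is overlong when it encodes a code point below 0x80.
-- That is the case exactly when the lead byte is 0xC0 or 0xC1.
isOverlong2 :: Word8 -> Bool
isOverlong2 b1 = b1 == 0xC0 || b1 == 0xC1
```

Loaded into GHCi, `naiveDecode2 0xC0 0xB7` evaluates to `'7'`, the same `Char` that the single valid byte 0x37 decodes to. Re-encoding `'7'` yields only 0x37, so the file name on disk can no longer be reconstructed. A decoder that first checks `isOverlong2 0xC0` could instead treat the bytes as invalid and escape them like any other invalid sequence.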

New description:

As reported by Robert Bihlmeyer at http://bugs.debian.org/748125, the promised round-tripping of invalid UTF8 sequences in filenames through String does not work:

{{{
$ mkdir foo
$ touch foo/$(echo -e '\xC0\xB7.txt')
$ ghc -e 'System.Directory.getDirectoryContents "foo" >>= print . last'
"7.txt"
}}}

The sequence 0xC0B7 is an (invalid, overlong) encoding of 0x37, i.e. `'7'`, so if it is decoded to `'7'`, no round-tripping is possible. (Other invalid byte sequences are round-tripped.)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9114#comment:2
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler