
#9114: Invalid UTF8 not round-tripped correctly
------------------------------------+-------------------------------------
       Reporter:  nomeata           |             Owner:
           Type:  bug               |            Status:  new
       Priority:  normal            |         Milestone:
      Component:  libraries/base    |           Version:  7.6.3
       Keywords:                    |  Operating System:  Unknown/Multiple
   Architecture:  Unknown/Multiple  |   Type of failure:  None/Unknown
     Difficulty:  Unknown           |         Test Case:
     Blocked By:                    |          Blocking:
Related Tickets:                    |
------------------------------------+-------------------------------------
 As reported by Robert Bihlmeyer at http://bugs.debian.org/748125, the
 promised round-tripping of invalid UTF8 sequences in filenames through
 String does not work:

 ```
 $ mkdir foo
 $ touch foo/$(echo -e '\xC0\xB7.txt')
 $ ghc -e 'System.Directory.getDirectoryContents "foo" >>= print . last'
 "7.txt"
 ```

 The sequence 0xC0 0xB7 is an (invalid, overlong) encoding of 0x37, i.e.
 `'7'`, so if it is decoded to `'7'`, no round-tripping is possible:
 encoding `'7'` again yields the single byte 0x37 rather than the original
 two bytes. (Other invalid byte sequences are round-tripped.)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9114
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
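
 To make the arithmetic behind the report concrete, here is a minimal
 sketch of a two-byte UTF-8 decoder that does not reject overlong forms.
 The names `naiveDecode2` and `isOverlongLead2` are hypothetical helpers
 written for illustration; this is not GHC's actual decoder, only a
 demonstration of why the bytes 0xC0 0xB7 collapse to `'7'` when the
 overlong check is missing.

 ```haskell
 import Data.Bits (shiftL, (.&.), (.|.))
 import Data.Char (chr)
 import Data.Word (Word8)

 -- Hypothetical helper: decode a two-byte sequence of the shape
 -- 110xxxxx 10yyyyyy by concatenating the payload bits, WITHOUT
 -- checking that the result actually needs two bytes.
 naiveDecode2 :: Word8 -> Word8 -> Char
 naiveDecode2 b1 b2 =
   chr (((fromIntegral b1 .&. 0x1F) `shiftL` 6) .|. (fromIntegral b2 .&. 0x3F))

 -- Hypothetical helper: the lead bytes 0xC0 and 0xC1 can only produce
 -- code points below 0x80, so any two-byte sequence starting with
 -- them is overlong and therefore invalid UTF-8.
 isOverlongLead2 :: Word8 -> Bool
 isOverlongLead2 b1 = b1 == 0xC0 || b1 == 0xC1
 ```

 In GHCi, `naiveDecode2 0xC0 0xB7` evaluates to `'7'`, and
 `isOverlongLead2 0xC0` evaluates to `True`. A decoder that wants to
 round-trip would presumably have to treat such a sequence as invalid
 input instead of decoding it.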