
On Tue, Nov 1, 2011 at 11:43, Max Bolingbroke wrote:
Hi John,
On 1 November 2011 17:14, John Millikin wrote:
GHC 7.2 assumes Linux/BSD paths are text, which 1) silently breaks all existing code and 2) makes it impossible to fix within the given API.
Please can you give an example of code that is broken with the new behaviour? The PEP 383 mechanism will unavoidably break *some* code but I don't expect there to be much of it. One thing that most likely *will* be broken is code that attempts to reinterpret a String as a "byte string" - i.e. assuming that it was decoded using latin1, but I expect that such code can just be deleted when you upgrade to 7.2.
Examples of broken code are Darcs, my system-fileio, and likely anything else which needs to open Unicode-named files in both 7.0 and 7.2. As a quick example, consider the case of files with encodings different from the user's locale. This is *very* common, especially when interoperating with foreign Windows systems.

    $ ghci-7.0.4
    GHC> import System.Directory
    GHC> createDirectory "path-test"
    GHC> writeFile "path-test/\xA1\xA5" "hello\n"
    GHC> writeFile "path-test/\xC2\xA1\xC2\xA5" "world\n"
    GHC> ^D

    $ ghci-7.2.1
    GHC> import System.Directory
    GHC> getDirectoryContents "path-test"
    ["\161\165","\61345\61349","..","."]
    GHC> readFile "path-test/\161\165"
    "world\n"
    GHC> readFile "path-test/\61345\61349"
    *** Exception: path-test/: openFile: does not exist (No such file or directory)
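To make the second entry in that listing easier to read: 61345 and 61349 are 0xEF00 + 0xA1 and 0xEF00 + 0xA5, so the undecodable bytes of the first file name appear to have been shifted into a private-use range. The sketch below only models that arithmetic. The names `escapeByte` and `escapeBytes` are hypothetical, and the 0xEF00 offset is inferred from the output above rather than taken from GHC's documentation.

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- Assumption: a byte that cannot be decoded in the current locale is
-- represented as the code point 0xEF00 + byte. This is inferred from
-- the "\61345\61349" entry in the directory listing, not from GHC docs.
escapeByte :: Word8 -> Char
escapeByte b = chr (0xEF00 + fromIntegral b)

-- Apply the escape to every byte of an undecodable file name.
escapeBytes :: [Word8] -> String
escapeBytes = map escapeByte
```

Under that assumption, `escapeBytes [0xA1, 0xA5]` evaluates to `"\61345\61349"`, which is the name reported for the file created as "path-test/\xA1\xA5".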
As I pointed out earlier in the thread, you can recover the old behaviour if you really want it by manually reencoding the strings, so I would dispute the claim that it is "impossible to fix within the given API".
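As a rough illustration of what "manually reencoding" could mean, here is a pure model that turns a decoded path back into the bytes it came from. It assumes a UTF-8 locale and assumes that undecodable bytes were escaped as 0xEF00 + byte (inferred from the ghci session in this thread). `encodeChar` and `pathToBytes` are hypothetical helpers, not part of base or of any API mentioned here, and this is not necessarily the method suggested earlier in the thread.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode one character of a decoded path back to bytes.
-- Assumption: code points 0xEF80..0xEFFF are escapes for the raw bytes
-- 0x80..0xFF; everything else is encoded as ordinary UTF-8.
encodeChar :: Char -> [Word8]
encodeChar c
  | n >= 0xEF80 && n <= 0xEFFF = [fromIntegral (n - 0xEF00)]
  | n < 0x80                   = [fromIntegral n]
  | n < 0x800                  = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000                = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise                  = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading bits of the code point, placed in the first byte.
    hi s = fromIntegral (n `shiftR` s)
    -- A UTF-8 continuation byte carrying six bits of the code point.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Recover the on-disk bytes of a whole path.
pathToBytes :: String -> [Word8]
pathToBytes = concatMap encodeChar
```

With these definitions, `pathToBytes "\61345\61349"` gives the bytes 0xA1 0xA5 and `pathToBytes "\161\165"` gives 0xC2 0xA1 0xC2 0xA5, matching the two file names created under 7.0.4. Recovering the bytes is only half of the problem, since something still has to pass them to the operating system unchanged.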
Please describe how I can, in GHC 7.2, read the contents of the file "path-test/\xA1\xA5" without changing my locale. As far as I can tell, there is no way to do this using the standard libraries. I would have to fall back to the "unix" package, or even FFI imports, to open that file.