
On 30 March 2011 20:53, Max Bolingbroke wrote:
On 30 March 2011 07:52, Michael Snoyman wrote:
I could manually do something like (utf8Decode . S8.pack), but that presumes that the character encoding on the system in question is UTF8. So two questions:
Funnily enough, I have been thinking about this quite hard recently. The situation is kind of a mess, and short of implementing PEP 383 (http://www.python.org/dev/peps/pep-0383/) in GHC I can't see how to make it easier on the programmer. As Jason points out, the best you can really do is probably:
1. Treat Strings that represent filenames as raw byte sequences, even though they claim to be strings
2. When presenting such Strings to the user, re-decode them using the current locale encoding (which will typically be UTF-8). You probably want some means of avoiding decoding errors here too -- ignoring or replacing undecodable bytes -- but presently this is not so straightforward. If you happen to be on a system with GNU iconv, however, you can use its "C//TRANSLIT//IGNORE" encoding to achieve this.
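
The two steps above can be sketched roughly as follows. This is only an illustration, not a settled approach: it assumes the locale encoding is UTF-8 (which need not hold), it assumes the text package is available, and the names rawBytes and displayName are made up for this example. Undecodable bytes are replaced with U+FFFD rather than raising an error.

```haskell
import qualified Data.ByteString.Char8 as S8
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)

-- Step 1: recover the raw bytes from a String whose Chars each stand for
-- one byte (code points 0..255). S8.pack truncates larger Chars to 8 bits.
-- (rawBytes is a hypothetical helper name.)
rawBytes :: String -> S8.ByteString
rawBytes = S8.pack

-- Step 2: decode for display only. ASSUMPTION: the locale encoding is
-- UTF-8. lenientDecode substitutes U+FFFD for bytes that do not decode,
-- so this never throws. (displayName is a hypothetical helper name.)
displayName :: String -> String
displayName = T.unpack . decodeUtf8With lenientDecode . rawBytes
```

The result of displayName is only fit for showing to the user. Any call that opens or renames the file should keep using the original byte-carrying String, since the replacement step throws information away.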
http://www.haskell.org/pipermail/libraries/2009-August/012493.html

I took from this discussion that FilePath really should be a pair of the actual filename ByteString, and the printable String (decoded from the ByteString, with encoding specified by the user's locale). The conversion from ByteString to String (and vice versa) is not guaranteed to be lossless, so you need to remember both.

Alistair
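
A minimal sketch of the pair described above, for illustration only. The type FileName, its field names, and fromBytes are hypothetical and not part of any existing library. The decoding assumes UTF-8 in place of a real locale lookup and uses the text package.

```haskell
import qualified Data.ByteString.Char8 as S8
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical type: a filename that remembers both representations.
data FileName = FileName
  { fileNameBytes :: S8.ByteString  -- exact bytes, to be used for OS calls
  , fileNameLabel :: String         -- lossy decoding, for display only
  } deriving (Eq, Show)

-- Build the pair from the raw bytes. ASSUMPTION: UTF-8 stands in for the
-- user's locale encoding; undecodable bytes become U+FFFD in the label.
fromBytes :: S8.ByteString -> FileName
fromBytes bs = FileName bs (T.unpack (decodeUtf8With lenientDecode bs))
```

For example, fromBytes (S8.pack "\255") and fromBytes (S8.pack "\254") end up with the same label but different bytes, which is the reason for keeping both fields.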