
On 6/25/10 02:42, Roman Cheplyaka wrote:
* Jason Dagit [2010-06-24 20:52:03-0700]
On Sat, Jun 19, 2010 at 1:06 AM, Roman Cheplyaka wrote:
While GHC 6.12 finally has proper locale support, core packages (such as unix) still use withCString and therefore work incorrectly when an argument (e.g. a file path) is not ASCII.
Pardon me if I'm misunderstanding withCString, but my understanding of Unix paths is that they are to be treated as strings of bytes. That is, unlike on Windows, they do not have a predefined encoding. Furthermore, because of this you could have two file paths in the same directory with different encodings.
You got everything right here. So, as you said, there is a mismatch between the representation in Haskell (a list of code points) and the representation in the operating system (a list of bytes), so we need to know the encoding. The encoding is supplied by the user via the locale (https://secure.wikimedia.org/wikipedia/en/wiki/Locale), in particular the LC_CTYPE variable.
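To make the code-point/byte mismatch concrete, here is a minimal sketch using only base. The names encodeUtf8 and narrow are hypothetical helpers written for this illustration; narrow is a naive one-byte-per-character conversion standing in for any marshalling that ignores the locale, not a copy of what withCString actually does. UTF-8 is assumed as the target encoding purely as an example; a real fix would consult LC_CTYPE.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode a single code point as UTF-8.
-- Assumes a valid scalar value (surrogates are not checked).
encodeCharUtf8 :: Char -> [Word8]
encodeCharUtf8 c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6, cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading bits of the code point, shifted down by s.
    hi :: Int -> Word8
    hi s = fromIntegral (n `shiftR` s)
    -- Continuation byte: 10xxxxxx carrying six bits of the code point.
    cont :: Int -> Word8
    cont s = 0x80 .|. (fromIntegral (n `shiftR` s) .&. 0x3F)

-- Hypothetical helper: encode a whole String (list of code points) as UTF-8 bytes.
encodeUtf8 :: String -> [Word8]
encodeUtf8 = concatMap encodeCharUtf8

-- Hypothetical helper: keep only the low 8 bits of each code point.
-- This is lossy for anything above U+00FF.
narrow :: String -> [Word8]
narrow = map (fromIntegral . ord)
```

Loaded into GHCi, encodeUtf8 "ж" gives [208,182], while narrow "ж" gives [54], the same byte as narrow "6". Two distinct Haskell strings can therefore collapse to the same file name once the encoding is ignored, and for ASCII-only input the two functions agree, which is why the problem only shows up with non-ASCII paths.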
You might want to look at how Python is dealing with this (including the pain involved; best to learn from example).

--
brandon s. allbery [linux,solaris,freebsd,perl] allbery@kf8nh.com
system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH