
12 Sep
2007
3:26 p.m.
On Wed, Sep 12, 2007 at 11:16:25AM -0400, Seth Gordon wrote:
It appears that in spite of the locale definition, hGetContents is treating each byte as a separate character without translating the multi-byte sequences *from* UTF-8, and then putStrLn sends each of those bytes to standard output without translating the non-ASCII characters *to* UTF-8. So the second line of your program's output is correct...but only by accident.
That's it indeed. As I said in the message I've just sent, I've read that the String/CString conversion is automatically done in ISO-8859-1, so "èèè", which is 6 bytes in UTF-8, is translated into 6 ISO-8859-1 characters. What puzzles me is the behavior of putStrLn.

Thanks for your time.

Andrea
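
For illustration, here is a minimal sketch of the round trip described above, using pure functions only. The names utf8Bytes, latin1Decode and latin1Encode are hypothetical stand-ins: they model the assumed one-Char-per-byte behaviour of hGetContents and putStrLn, not the actual library code.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)

-- UTF-8 encoding of "èèè": each U+00E8 is the two bytes 0xC3 0xA8.
utf8Bytes :: [Word8]
utf8Bytes = concat (replicate 3 [0xC3, 0xA8])

-- Assumed model of the input side: every byte becomes one Char (ISO-8859-1).
latin1Decode :: [Word8] -> String
latin1Decode = map (chr . fromIntegral)

-- Assumed model of the output side: every Char is written as its low 8 bits.
latin1Encode :: String -> [Word8]
latin1Encode = map (fromIntegral . ord)

main :: IO ()
main = do
  let s = latin1Decode utf8Bytes
  print (length s)                     -- 6 Chars, not 3
  print (latin1Encode s == utf8Bytes)  -- True: the original bytes come back out
```

Under these assumptions the String holds 6 characters, yet the bytes written out are identical to the bytes read in, so a UTF-8 terminal would still display the text correctly.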