
On Sun, Feb 24, 2008 at 05:46:35PM +0000, Duncan Coutts wrote:
> I've added readTextFile and writeTextFile to the Utils module and checked all other uses of readFile and writeFile.
> I've also switched the rawSystemStdout to assume UTF8 output format.
The read and write functions ought to open their files in binary mode. It's just wrong to read Unicode characters (which is what a plain text Handle promises you) and then treat them as bytes.

There's a similar problem with using toUTF8 on stdout and stderr. Haskell 98 is very clear that putChar on those Handles takes Unicode characters, though it does not specify how they are encoded in the environment. GHC has historically assumed an ISO-8859-1 encoding, truncating larger characters, but other implementations could map them to the current locale (as Hugs does). Perhaps a future GHC will map them to UTF-8. I think you should just hand the characters to putChar and leave their presentation to the implementation, flawed though GHC's currently is.
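
To make the binary-mode point concrete, here is a minimal sketch using only the base library. It is not the code in the Utils module: the names writeFileUtf8, readFileUtf8, encodeUtf8 and decodeUtf8 are hypothetical, as is the demo file name. The decoder is deliberately simplified: it replaces malformed input with U+FFFD and does not reject overlong encodings. The idea is that a binary Handle carries one byte per Char (values 0..255), so the UTF-8 conversion is done explicitly on both sides.

```haskell
module Main where

import Data.Bits ((.&.), (.|.), shiftL, shiftR)
import Data.Char (chr, ord)
import System.IO

-- Encode Unicode characters as a String whose Chars are byte values (0..255).
encodeUtf8 :: String -> String
encodeUtf8 = concatMap (map chr . enc . ord)
  where
    enc c
      | c < 0x80    = [c]
      | c < 0x800   = [0xC0 .|. (c `shiftR` 6), cont c]
      | c < 0x10000 = [0xE0 .|. (c `shiftR` 12), cont (c `shiftR` 6), cont c]
      | otherwise   = [ 0xF0 .|. (c `shiftR` 18), cont (c `shiftR` 12)
                      , cont (c `shiftR` 6), cont c ]
    cont x = 0x80 .|. (x .&. 0x3F)

-- Decode a String of byte-valued Chars back into Unicode characters.
-- Simplified: malformed sequences become U+FFFD; overlong forms are accepted.
decodeUtf8 :: String -> String
decodeUtf8 = go . map ord
  where
    go [] = []
    go (b:bs)
      | b < 0x80              = chr b : go bs
      | b >= 0xC0 && b < 0xE0 = multi 1 (b .&. 0x1F) bs
      | b >= 0xE0 && b < 0xF0 = multi 2 (b .&. 0x0F) bs
      | b >= 0xF0 && b < 0xF8 = multi 3 (b .&. 0x07) bs
      | otherwise             = '\xFFFD' : go bs
    multi n acc bs =
      let (cs, rest) = splitAt n bs
          v = foldl (\a c -> (a `shiftL` 6) .|. (c .&. 0x3F)) acc cs
      in if length cs == n && all isCont cs && v <= 0x10FFFF
           then chr v : go rest
           else '\xFFFD' : go bs
    isCont c = c .&. 0xC0 == 0x80

-- Hypothetical helper: write text as UTF-8 through a binary-mode Handle.
writeFileUtf8 :: FilePath -> String -> IO ()
writeFileUtf8 path s = do
  h <- openBinaryFile path WriteMode
  hPutStr h (encodeUtf8 s)
  hClose h

-- Hypothetical helper: read bytes through a binary-mode Handle, then decode.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path = do
  h <- openBinaryFile path ReadMode
  s <- hGetContents h
  length s `seq` hClose h  -- force the lazy read before closing the Handle
  return (decodeUtf8 s)

main :: IO ()
main = do
  let sample = "na\239ve \8364"        -- contains U+00EF and U+20AC
  writeFileUtf8 "utf8-demo.txt" sample  -- hypothetical file name
  s <- readFileUtf8 "utf8-demo.txt"
  print (s == sample)
```

For stdout and stderr the suggestion is the opposite: no encoding step at all, just putStr or putChar on the unencoded String, leaving the byte representation to the implementation.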