
On Tue, Feb 03, 2009 at 04:42:44PM +0000, Simon Marlow wrote:
Unicode-aware Handles ~~~~~~~~~~~~~~~~~~~~~
This is a significant restructuring of the Handle implementation with the primary goal of supporting Unicode character encodings.
The only change to the existing behaviour is that by default, text IO is done in the prevailing encoding of the system. Handles created by openBinaryFile use the Latin-1 encoding, as do Handles placed in binary mode using hSetBinaryMode.
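To make the Latin-1 point concrete, here is a minimal sketch using System.IO as it exists in current GHC. It assumes hSetEncoding and utf8 are available; the helper names bytesOfUtf8 and readUtf8 are made up for illustration. A string is written as UTF-8, then read back through a binary-mode Handle, where each byte shows up as one Char in the range 0..255.

```haskell
import System.IO

-- Hypothetical helper: write a string as UTF-8, then reopen the file in
-- binary mode and return the code point of every Char that is read.
-- In binary mode each byte is presented as one Latin-1 Char (0..255).
bytesOfUtf8 :: FilePath -> String -> IO [Int]
bytesOfUtf8 path s = do
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8
    hPutStr h s
  withBinaryFile path ReadMode $ \h -> do
    contents <- hGetContents h
    -- hGetContents is lazy: force the whole list before the handle closes.
    length contents `seq` return (map fromEnum contents)

-- Hypothetical helper: read the same file back as text, decoding UTF-8.
readUtf8 :: FilePath -> IO String
readUtf8 path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h utf8
    contents <- hGetContents h
    length contents `seq` return contents
```

For example, writing the single character '\233' (e with acute accent, U+00E9) produces two bytes in UTF-8, so bytesOfUtf8 returns two code points for it, while readUtf8 returns the original one-character string.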
I would like to see the API make a type distinction between text handles with Char-based operations, and binary handles with Word8-based operations. (And phase out openBinaryFile, hSetBinaryMode etc.) It's a vital distinction that often trips people up. Treating binary data as Chars with a Latin-1 encoding just feels like a kludge.
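As a rough sketch of what I mean (not a proposal for the actual API: every name below, TextHandle, BinaryHandle, withTextFile, withBinFile, tPutStr, tGetContents, bPutBytes and bGetBytes, is invented for illustration), two newtype wrappers around the existing Handle are enough to keep the Char-based and Word8-based operations apart at the type level. The binary side here assumes the bytestring package for the actual byte IO.

```haskell
import qualified Data.ByteString as BS
import Data.Word (Word8)
import System.IO

-- Hypothetical wrappers: the type records which kind of IO is allowed.
newtype TextHandle   = TextHandle Handle
newtype BinaryHandle = BinaryHandle Handle

-- Open a file for Char-based IO in the handle's text encoding.
withTextFile :: FilePath -> IOMode -> (TextHandle -> IO a) -> IO a
withTextFile path mode act = withFile path mode (act . TextHandle)

-- Open a file for Word8-based IO; no encoding is involved.
withBinFile :: FilePath -> IOMode -> (BinaryHandle -> IO a) -> IO a
withBinFile path mode act = withBinaryFile path mode (act . BinaryHandle)

-- Char-based operations exist only for TextHandle.
tPutStr :: TextHandle -> String -> IO ()
tPutStr (TextHandle h) = hPutStr h

tGetContents :: TextHandle -> IO String
tGetContents (TextHandle h) = do
  s <- hGetContents h
  -- Force the lazy contents before the caller closes the handle.
  length s `seq` return s

-- Word8-based operations exist only for BinaryHandle.
bPutBytes :: BinaryHandle -> [Word8] -> IO ()
bPutBytes (BinaryHandle h) = BS.hPut h . BS.pack

bGetBytes :: BinaryHandle -> IO [Word8]
bGetBytes (BinaryHandle h) = fmap BS.unpack (BS.hGetContents h)
```

With this split, something like tPutStr applied to a BinaryHandle is a type error rather than a silent Latin-1 conversion, which is the kind of mistake I would like the API to rule out.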