
On Tue, 2009-06-30 at 13:03 +0100, Simon Marlow wrote:
Ticket:
http://hackage.haskell.org/trac/ghc/ticket/3337
For the proposed new additions, see:
* http://www.haskell.org/~simonmar/base/System-IO.html#23 System.IO (Unicode encoding/decoding)
* http://www.haskell.org/~simonmar/base/System-IO.html#25 System.IO (Newline conversion)
Discussion period: 2 weeks (14 July).
A couple of things we brought up at the GHC IRC meeting yesterday:

* UTF-8 with or without a BOM? Should there be a utf8_bom variant? Do we
  need all three variants:

    (pass through bom, produce no bom)      -- raw utf8
    (accept and ignore bom, produce bom)    -- utf8 with bom
    (accept and ignore bom, produce no bom) -- permissive

  After thinking about it a bit, I think we can get away with just the
  existing utf8 and a utf8_bom that accepts a BOM and produces a BOM. The
  reason is that to get the third behaviour you just read with utf8_bom
  and write with utf8. Most operations on text files read or write the
  whole file; they do not both read and write a single file.

* For the moment we are not publicly exposing the TextEncoding type.
  Later we may want to consider making TextEncoding pure (using ST) and
  sharing it for pure conversions String/Text <-> ByteString.

Duncan
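
To illustrate the "read with utf8_bom, write with utf8" point, here is a minimal sketch. It assumes the encodings are exported from System.IO as utf8 and utf8_bom and are selected per handle with hSetEncoding, which is how the proposal reads; copyWithoutBom is a hypothetical helper name used only for this example.

```haskell
import System.IO

-- Hypothetical helper: copy a UTF-8 text file, giving the "permissive"
-- behaviour (accept and ignore a BOM on input, produce no BOM on output).
copyWithoutBom :: FilePath -> FilePath -> IO ()
copyWithoutBom src dst =
  withFile src ReadMode $ \hIn ->
    withFile dst WriteMode $ \hOut -> do
      -- Assumed: utf8_bom skips a BOM at the start of the input, if present.
      hSetEncoding hIn utf8_bom
      -- Plain utf8 does not write a BOM on output.
      hSetEncoding hOut utf8
      -- The lazy input is fully consumed by hPutStr before hIn is closed.
      contents <- hGetContents hIn
      hPutStr hOut contents
```

Swapping the two encodings (read with utf8, write with utf8_bom) would instead add a BOM to a file that lacks one, so the two encodings seem to cover the cases discussed above without a third variant.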