Re: Proposal #3337: expose Unicode and newline translation from System.IO

2 Jul 2009

On Tue, 2009-06-30 at 13:03 +0100, Simon Marlow wrote:
...
Ticket:
http://hackage.haskell.org/trac/ghc/ticket/3337
For the proposed new additions, see:
* http://www.haskell.org/~simonmar/base/System-IO.html#23
    System.IO (Unicode encoding/decoding)
* http://www.haskell.org/~simonmar/base/System-IO.html#25
    System.IO (Newline conversion)
Discussion period: 2 weeks (14 July).

A couple of things we brought up at the GHC IRC meeting yesterday:

* UTF-8 with or without a BOM, or a separate utf8_bom variant? Do we
need all three variants:
   (pass through BOM, produce no BOM)       -- raw utf8
   (accept and ignore BOM, produce BOM)     -- utf8 with BOM
   (accept and ignore BOM, produce no BOM)  -- permissive

After thinking about it a bit, I think we can get away with just the
existing utf8 plus a utf8_bom that accepts a BOM on input and produces
one on output. The reason is that to get the third behaviour you just
read with utf8_bom and write with utf8. Most operations on text files
read or write the whole file; it is rare to both read and write
through a single handle.
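
To make the read-with-utf8_bom, write-with-utf8 idea concrete, here is a
minimal sketch. It assumes that utf8, utf8_bom and hSetEncoding are
exported from System.IO as in the proposed docs. The helper names
(readFileWith, writeFileWith, readRawBytes, stripBom) and the file names
are made up for illustration and are not part of the proposal.

```haskell
module Main where

import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: read a whole text file with the given encoding.
-- The contents are forced before withFile closes the handle.
readFileWith :: TextEncoding -> FilePath -> IO String
readFileWith enc path =
  withFile path ReadMode $ \h -> do
    hSetEncoding h enc
    s <- hGetContents h
    _ <- evaluate (length s)
    return s

-- Hypothetical helper: write a whole text file with the given encoding.
writeFileWith :: TextEncoding -> FilePath -> String -> IO ()
writeFileWith enc path s =
  withFile path WriteMode $ \h -> do
    hSetEncoding h enc
    hPutStr h s

-- Hypothetical helper: read the raw bytes of a file, one Char per byte,
-- so we can check whether a BOM is physically present.
readRawBytes :: FilePath -> IO String
readRawBytes path =
  withBinaryFile path ReadMode $ \h -> do
    s <- hGetContents h
    _ <- evaluate (length s)
    return s

-- The "permissive" behaviour: accept and ignore a BOM, produce no BOM.
stripBom :: FilePath -> FilePath -> IO ()
stripBom src dst = readFileWith utf8_bom src >>= writeFileWith utf8 dst

-- The UTF-8 encoding of U+FEFF, as raw bytes.
bomBytes :: String
bomBytes = map toEnum [0xEF, 0xBB, 0xBF]

main :: IO ()
main = do
  writeFileWith utf8_bom "with-bom.txt" "hello\n"   -- written with a BOM
  stripBom "with-bom.txt" "no-bom.txt"              -- re-written without
  before <- readRawBytes "with-bom.txt"
  after  <- readRawBytes "no-bom.txt"
  print (take 3 before == bomBytes)
  print (take 3 after  == bomBytes)
```

If utf8_bom behaves as described above, this should print True for the
first file and False for the second, with no third encoding needed.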

* For the moment we are not publicly exposing the TextEncoding type.
Later we may want to consider making TextEncoding pure (using ST) and
sharing it for pure conversions String/Text <-> ByteString.

Duncan

Duncan Coutts