Re: [Haskell] Re: [Haskell-cafe] ANNOUNCE: enumerator, an alternative iteratee package

21 Aug 2010

On Sat, Aug 21, 2010 at 11:58, Judah Jacobson wrote:
...
You should note that in ghc>=6.12, hWaitForInput tries to decode the
next character of input based on the Handle's encoding.  As a
result, it will block if the next multibyte sequence is incomplete,
and it will throw an error if a multibyte sequence gets split between
two chunks.
I worked around this problem in Haskeline by temporarily setting stdin
to BinaryMode; you may want to do something similar.
Also, this issue caused a bug in bytestring with ghc-6.12:
http://hackage.haskell.org/trac/ghc/ticket/3808
which will be resolved by the new function 'hGetBufSome' (in ghc-6.14)
that blocks only when there's no data to read:
http://hackage.haskell.org/trac/ghc/ticket/4046
That function might be useful for your package, though not portable to
other implementations or older GHC versions.
You should not be reading bytestrings from text-mode handles.
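
For illustration, the BinaryMode workaround described above might look
roughly like the sketch below. This is not Haskeline's actual code; the
names withBinaryMode and readChunk are made up here, and the sketch
assumes a GHC recent enough to provide hGetEncoding and a bytestring
version that exports hGetSome.

```haskell
import Control.Exception (bracket)
import qualified Data.ByteString as B
import System.IO

-- | Run an action with the handle switched to binary mode, then restore
-- the text encoding it had before (if any). Hypothetical helper.
-- Caveat: hSetBinaryMode also turns off newline translation, and this
-- sketch does not restore the previous newline mode.
withBinaryMode :: Handle -> IO a -> IO a
withBinaryMode h act = bracket setBinary restore (const act)
  where
    setBinary = do
      enc <- hGetEncoding h        -- Nothing means already binary
      hSetBinaryMode h True
      return enc
    restore Nothing    = return ()
    restore (Just enc) = hSetEncoding h enc

-- | Read up to n raw bytes without going through the text decoder.
readChunk :: Handle -> Int -> IO B.ByteString
readChunk h n = withBinaryMode h (B.hGetSome h n)
```

Because the read happens while the handle is in binary mode, an
incomplete multibyte sequence at the end of a chunk is returned as plain
bytes instead of being handed to the decoder.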

The more I think about it, the more having a single Handle type for
both text and binary data causes problems. There should be some
separation so users don't accidentally use a text handle with binary
functions, and vice-versa:

openFile :: FilePath -> IOMode -> IO TextHandle
openBinaryFile :: FilePath -> IOMode -> IO BinaryHandle
hGetBuf :: BinaryHandle -> Ptr a -> Int -> IO Int
Data.ByteString.hGet :: BinaryHandle -> Int -> IO ByteString
-- etc

then the enumerators would simply require the correct handle type:

Data.Enumerator.IO.enumHandle :: BinaryHandle
    -> Enumerator SomeException ByteString IO b
Data.Enumerator.Text.enumHandle :: TextHandle
    -> Enumerator SomeException Text IO b
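
A minimal sketch of how such a separation could be approximated today
with newtype wrappers around the existing Handle type. Everything here
(TextHandle, BinaryHandle, openText, openBinary, hGetBytes, hGetTextLine,
closeText, closeBinary) is hypothetical and exists in no library; it only
shows that the type checker can reject mixing the two kinds of handle.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import System.IO (Handle, IOMode (..), hClose, openBinaryFile, openFile)

-- Hypothetical wrappers: the constructors would stay private to this
-- module so that a raw Handle cannot be relabelled by callers.
newtype TextHandle   = TextHandle Handle
newtype BinaryHandle = BinaryHandle Handle

openText :: FilePath -> IOMode -> IO TextHandle
openText path mode = fmap TextHandle (openFile path mode)

openBinary :: FilePath -> IOMode -> IO BinaryHandle
openBinary path mode = fmap BinaryHandle (openBinaryFile path mode)

-- Byte-oriented reads only accept a BinaryHandle.
hGetBytes :: BinaryHandle -> Int -> IO B.ByteString
hGetBytes (BinaryHandle h) n = B.hGet h n

-- Text-oriented reads only accept a TextHandle.
hGetTextLine :: TextHandle -> IO T.Text
hGetTextLine (TextHandle h) = TIO.hGetLine h

closeText :: TextHandle -> IO ()
closeText (TextHandle h) = hClose h

closeBinary :: BinaryHandle -> IO ()
closeBinary (BinaryHandle h) = hClose h
```

With these wrappers, calling hGetBytes on a TextHandle is a compile-time
type error rather than a decoding failure at run time.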

I suppose the enumerators could verify the handle mode and throw an
exception if it's incorrect -- at least that way, it will fail
consistently rather than only on rare occasions.
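
A rough sketch of that run-time check, assuming GHC's hGetEncoding
(which returns Nothing for a handle in binary mode). The helper names
isBinaryHandle, requireBinary and requireText are invented for this
example and are not part of the enumerator package.

```haskell
import Control.Monad (unless, when)
import System.IO
import System.IO.Error (illegalOperationErrorType, mkIOError)

-- | True when the handle has no text encoding, i.e. is in binary mode.
isBinaryHandle :: Handle -> IO Bool
isBinaryHandle h = fmap (maybe True (const False)) (hGetEncoding h)

-- | Throw an IOError unless the handle is in binary mode.
-- The first argument names the caller for the error message.
requireBinary :: String -> Handle -> IO ()
requireBinary caller h = do
  binary <- isBinaryHandle h
  unless binary $ ioError $
    mkIOError illegalOperationErrorType
      (caller ++ ": expected a binary-mode handle") (Just h) Nothing

-- | Throw an IOError unless the handle is in text mode.
requireText :: String -> Handle -> IO ()
requireText caller h = do
  binary <- isBinaryHandle h
  when binary $ ioError $
    mkIOError illegalOperationErrorType
      (caller ++ ": expected a text-mode handle") (Just h) Nothing
```

An enumerator could call requireBinary (or requireText) once before its
first read, so a mismatched handle fails immediately instead of only
when a multibyte sequence happens to straddle a chunk boundary.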

John Millikin