
wnoise:
On 2007-07-14, Donald Bruce Stewart wrote:
wnoise:
On 2007-07-13, Stefan O'Rear wrote:
He's not trying to report a bug; he's just complaining about base's long-known lack of support for non-latin1 encodings. (IIUC)
Which is a bug. Base needs to support (in an /obvious/ way) (1) direct I/O of octets (bytes), with no character interpretation set
Data.ByteString
And does this work for Non-GHC yet? And when does it get added to Haskell' and guaranteed to work?
Yes, Data.ByteString is available for GHC, Hugs and nhc98. Unsure about YHC, but it wouldn't be hard presuming the FFI support is up to speed.
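For illustration, a minimal sketch of (1) using Data.ByteString: the octets are written and read back with no character interpretation at any point. The helper name roundTrip and the file name octets.bin are made up for this example and come from nowhere in the thread.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)

-- Hypothetical helper: write raw octets to a file, then read them back.
-- No encoding is involved; a Word8 goes out and the same Word8 comes in.
roundTrip :: FilePath -> [Word8] -> IO [Word8]
roundTrip path octets = do
  B.writeFile path (B.pack octets)
  fmap B.unpack (B.readFile path)

main :: IO ()
main = do
  -- "octets.bin" is an arbitrary file name chosen for the sketch.
  octets <- roundTrip "octets.bin" [0x00, 0xC3, 0xA9, 0xFF]
  print octets  -- prints [0,195,169,255]
```

The same module also has hGet, hPut, getContents and putStr for working with handles and the standard streams.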
(2) I/O of text in UTF-8.
not in base, but see utf8-string on hackage.haskell.org.
Yes, this is a decent layering of (2) on top of (1), but for GHC only, since it depends on GHC reading the bytes and interpreting them as Latin-1.
Yeah, we can also layer it on Data.ByteString, which uses the FFI to avoid relying on any latin-1 behaviour.
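A rough sketch of what layering (2) on top of Data.ByteString could look like: encode each Char to UTF-8 octets by hand, then hand the octets to ByteString for output. This is not the utf8-string implementation; encodeChar, encodeUtf8 and putStrUtf8 are hypothetical names for this example, and the encoder does not reject surrogate code points.

```haskell
import qualified Data.ByteString as B
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode one code point as 1 to 4 UTF-8 octets.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6,  cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6,  cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    hi, cont :: Int -> Word8
    -- Leading payload bits, shifted down to sit under the length prefix.
    hi s = fromIntegral (n `shiftR` s)
    -- Continuation octet: 10xxxxxx carrying six payload bits.
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Encode a whole String into a strict ByteString of UTF-8 octets.
encodeUtf8 :: String -> B.ByteString
encodeUtf8 = B.pack . concatMap encodeChar

-- Write text to stdout as UTF-8, bypassing Char-based I/O entirely.
putStrUtf8 :: String -> IO ()
putStrUtf8 = B.putStr . encodeUtf8

main :: IO ()
main = do
  print (encodeChar '\xE9')   -- U+00E9; prints [195,169]
  putStrUtf8 "caf\xE9\n"
```

Because the octets go through ByteString, the output does not depend on how a particular implementation maps Char to bytes.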
(1) can currently be done, but it's not at all clear how to do so, or once you have figured out how to do so, why it works.
(This may be a bit out of date, but seeing this brought up again, I think not.)
I think it's a little out of date, given Data.ByteString and utf8-string?
It's not obvious that ByteString is the place to look for I/O, so it's not yet good enough. It should be as easy to use as character I/O, and as easy to find.
Agreed. -- Don