
wnoise:
On 2007-07-13, Stefan O'Rear wrote:
He's not trying to report a bug; he's just complaining about base's long-known lack of support for non-latin1 encodings. (IIUC)
Which is a bug. Base needs to support (in an /obvious/ way) (1) direct I/O of octets (bytes), with no character interpretation set
Data.ByteString
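For readers wondering what (1) looks like in practice, here is a minimal sketch using Data.ByteString from the bytestring package. The name copyOctets is made up for illustration; readFile and writeFile are the library's own functions and operate on raw bytes.

```haskell
import qualified Data.ByteString as B

-- Copy a file octet for octet. Data.ByteString.readFile and writeFile
-- work on raw bytes, so no character decoding is applied to the data.
-- (copyOctets is a hypothetical helper name, not part of any library.)
copyOctets :: FilePath -> FilePath -> IO ()
copyOctets src dst = B.readFile src >>= B.writeFile dst
```

The same module also offers B.getContents and B.putStr for the standard handles, so a byte-for-byte filter can be written as B.getContents >>= B.putStr.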
(2) I/O of text in UTF-8.
not in base, but see utf8-string on hackage.haskell.org.
In addition, it would be nice to support (3) (on Unix) use of the locale to determine text encoding, but users can work around this themselves, and will often need to, even
Hmm, there's System.Locale, but I've not used it for anything other than dates.
if (3) is supported.
(2) can also be layered atop (1), but something is wrong if you have to write your own layer to do simple text input and output. It's even worse if you can't without going to the FFI.
Yes, there have been a few encoding layers written on top of Data.ByteString for other non-latin1 encodings.
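To make the "layer (2) atop (1)" idea concrete, here is a rough sketch of a hand-rolled UTF-8 encoder on top of Data.ByteString. It is not the utf8-string implementation; encodeChar, encodeUtf8 and putStrUtf8 are names invented for this example, and it covers output only, with no decoding or validation.

```haskell
import qualified Data.ByteString as B
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode one code point as its UTF-8 octet sequence (1 to 4 bytes).
-- Assumption: the input is a valid scalar value; surrogate code points
-- are not rejected here, which a real library should do.
encodeChar :: Char -> [Word8]
encodeChar c
  | n < 0x80    = [fromIntegral n]
  | n < 0x800   = [0xC0 .|. hi 6,  cont 0]
  | n < 0x10000 = [0xE0 .|. hi 12, cont 6, cont 0]
  | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]
  where
    n = ord c
    -- Leading bits of the code point, placed in the first byte.
    hi :: Int -> Word8
    hi s = fromIntegral (n `shiftR` s)
    -- A continuation byte: 10xxxxxx carrying six bits of payload.
    cont :: Int -> Word8
    cont s = 0x80 .|. fromIntegral ((n `shiftR` s) .&. 0x3F)

-- Encode a whole String into a strict ByteString.
encodeUtf8 :: String -> B.ByteString
encodeUtf8 = B.pack . concatMap encodeChar

-- Write text to stdout as UTF-8 octets, bypassing the Char-based putStr.
putStrUtf8 :: String -> IO ()
putStrUtf8 = B.putStr . encodeUtf8
```

Going through a list of Word8 is slow for large inputs. It is only meant to show how thin such a layer is, which is roughly the point of the complaint above: it is easy to write, yet nobody should have to.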
(1) can currently be done, but it's not at all clear how to do so, or once you have figured out how to do so, why it works.
(This may be a bit out of date, but seeing this brought up again, I think not.)
I think it's a little out of date, given Data.ByteString and utf8-string? -- Don