
On Tue, Mar 23, 2010 at 00:27, Johann Höchtl wrote:
How are ByteStrings (Lazy, UTF8) and Data.Text meant to co-exist? When I read bytestrings over a socket which happen to be UTF16-LE encoded and identify a fitting function in Data.Text, I guess I have to transcode them with Data.Text.Encoding to make the type system happy?
There's no such thing as a UTF8 or UTF16 bytestring -- a bytestring is just a more efficient encoding of [Word8], just as Text is a more efficient encoding of [Char]. If the file format you're parsing specifies that some series of bytes is text encoded as UTF16-LE, then you can use the Text decoders to convert to Text. Poor separation between bytes and characters has caused problems in many major languages (C, C++, PHP, Ruby, Python) -- let's not abandon the advantages of correctness to chase a few percentage points of performance.
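
For what it's worth, here is a minimal sketch of that decoding step, assuming strict ByteStrings and the text package's Data.Text.Encoding module. The helper names decodeWire and encodeWire are made up for illustration, and the choice of lenientDecode (which substitutes U+FFFD for malformed input instead of throwing) is an assumption; a real protocol might prefer to reject bad input.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Data.Text.Encoding (decodeUtf16LEWith, encodeUtf16LE)
import Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical helper: bytes that the protocol declares to be UTF-16LE
-- become Text. Malformed sequences are replaced with U+FFFD rather than
-- raising an exception (an assumption made for this sketch).
decodeWire :: BS.ByteString -> T.Text
decodeWire = decodeUtf16LEWith lenientDecode

-- Hypothetical helper: going back to bytes is an explicit encoding step too.
encodeWire :: T.Text -> BS.ByteString
encodeWire = encodeUtf16LE

main :: IO ()
main = do
  -- The four bytes of "Hi" in UTF-16LE, as they might arrive from a socket.
  let raw = BS.pack [0x48, 0x00, 0x69, 0x00]
      txt = decodeWire raw
  -- Data.Text functions only become applicable after decoding.
  TIO.putStrLn (T.toUpper txt)
  -- Byte count and character count are different quantities.
  print (BS.length raw, T.length txt)
```

The point of the sketch is that the types record which side of the boundary a value is on: T.toUpper will not accept raw, and nothing that expects bytes will accept txt, so the decoding happens in exactly one visible place. For lazy ByteStrings, Data.Text.Lazy.Encoding offers the corresponding decoders.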