Re: [Haskell-cafe] Copying Arrays

30 May 2008

"Johan Tibell" writes:
...
...
I guess this is where I don't follow: why would you need more short
strings for Unicode text than for ASCII or 8-bit latin text?
...
But ByteStrings are neither ASCII nor 8-bit Latin text! 
  [...] 
The intent of the not-yet-existing Unicode string is to represent
text not bytes.
Right, so this will replace the .Char8 modules as well?  What confused
me was my misunderstanding Duncan to mean that Unicode text would
somehow imply shorter strings than non-Unicode (i.e. 8-bit) text.
...
To give just one example, short (Unicode) strings are common as keys
in associative data structures like maps
I guess typically, you'd break things down to words, so strings of
length 4-10 or so.  BS uses three machine words of overhead and LBS
four (IIRC), so the cost of sharing typically outweighs the benefit.
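
A minimal sketch of the short-keys case, assuming the bytestring and
containers packages; the function name wordFreq is made up for
illustration. Each word becomes a key in a Map, and B.copy gives every
key its own small buffer, so a key does not keep the whole input chunk
alive through sharing:

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as M

-- Count how often each whitespace-separated word occurs.
-- B.words returns slices that share the input buffer; B.copy
-- trades that sharing for a private copy of each short key.
wordFreq :: B.ByteString -> M.Map B.ByteString Int
wordFreq = M.fromListWith (+) . map (\w -> (B.copy w, 1)) . B.words
```

Whether the copy pays off depends on how long the input would otherwise
be retained; for keys of 4-10 bytes the per-string bookkeeping is
comparable in size to the payload either way.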
...
Can I also here insert a plea for keeping lazy I/O out of the new
Unicode module?
I use ByteString.Lazy almost exclusively.  I realize there's a
penalty in time and space, but the ability to write applications that
stream over multi-GB files is essential.
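
As a rough illustration of that streaming style (a sketch only; the
names countLines and countLinesInFile are hypothetical), a lazy
ByteString read with L.readFile is consumed chunk by chunk, so a fold
such as a line count should run in roughly constant memory regardless
of file size:

```haskell
import qualified Data.ByteString.Lazy.Char8 as L
import Data.Int (Int64)

-- Count newline characters in a lazy ByteString.
-- Chunks already consumed can be garbage collected as the
-- count proceeds, provided nothing else holds on to the input.
countLines :: L.ByteString -> Int64
countLines = L.count '\n'

-- Read a file lazily and count its lines.
countLinesInFile :: FilePath -> IO Int64
countLinesInFile path = fmap countLines (L.readFile path)
```

The constant-memory behaviour holds only as long as the program keeps
no second reference to the start of the file contents.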

Of course, these applications couldn't care less about Unicode, so
perhaps the usage is different.

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants

Ketil Malde