
On Sunday 11 April 2010 15:29:38, Stephen Tetley wrote:
Hi James
There's a paper describing the implementation of ByteStrings here:
http://www.cse.unsw.edu.au/~dons/papers/CSL06.html
http://www.cse.unsw.edu.au/~dons/papers/fusion.pdf
For my own work, I generally need short immutable strings and haven't found ByteStrings compelling,
ByteStrings shine for long strings. When you're working with long strings, ByteStrings are almost certainly *much* faster (UTF-8 ByteStrings are probably significantly slower, but should still beat [Char] comfortably).
For short strings, I've found ByteStrings better than [Char] only for a few things. As keys of Maps, ByteStrings tend to be better, at least if using them there doesn't introduce too much packing and unpacking. Things like edit distance are also faster on ByteStrings. In my measurements, UArray Int Char is slower than ByteString for these tasks, though not by much, and unlike ByteString it can also hold characters > toEnum 255. Other things [see below] were faster for short [Char] than for short ByteStrings. When dealing with short strings, in my experience there are rarely compelling reasons to choose one over the other.
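To make the Map-key case concrete, here is a minimal sketch. The helper names wordFreqBS and wordFreqStr are made up for illustration, and this is not a benchmark; it only shows that the two versions have the same shape and that the ByteString one needs a pack at the boundary if the input arrives as a String.

```haskell
import qualified Data.ByteString.Char8 as BC
import qualified Data.Map.Strict as M

-- Hypothetical helper: word frequencies keyed by strict ByteStrings.
-- Key comparisons inside the Map work on packed bytes.
wordFreqBS :: BC.ByteString -> M.Map BC.ByteString Int
wordFreqBS = M.fromListWith (+) . map (\w -> (w, 1)) . BC.words

-- The same thing keyed by [Char]; key comparisons walk list cells.
wordFreqStr :: String -> M.Map String Int
wordFreqStr = M.fromListWith (+) . map (\w -> (w, 1)) . words

main :: IO ()
main = do
  let text = "to be or not to be"
  -- BC.pack is the packing cost mentioned above; it only pays off
  -- if the keys are then used often enough.
  print (M.toList (wordFreqBS (BC.pack text)))
  print (M.toList (wordFreqStr text))
```

Whether the ByteString version wins for short keys depends on how much packing and unpacking the surrounding code forces, so it is worth measuring on your own data.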
though the results presented in the above suggest [Char] is better at nothing
[Char] is (far) better at sorting short Strings; it is often better for map and filter.
and worse at many things.
[Char] IO is abysmally slow in comparison, [Char] uses much more memory, and random access is horrible for lists.
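Of these three points, random access is the easiest to show in a few lines. A rough sketch, with charAtBS and charAtStr as made-up helper names: indexing a strict ByteString is a constant-time offset into the buffer, while the list version has to walk i cells first.

```haskell
import qualified Data.ByteString.Char8 as BC

-- Hypothetical helper: bounds-checked O(1) lookup in a strict ByteString.
charAtBS :: BC.ByteString -> Int -> Maybe Char
charAtBS bs i
  | i >= 0 && i < BC.length bs = Just (BC.index bs i)
  | otherwise                  = Nothing

-- Hypothetical helper: the list equivalent, which traverses i cells
-- before it can return anything.
charAtStr :: String -> Int -> Maybe Char
charAtStr s i
  | i < 0     = Nothing
  | otherwise = case drop i s of
      (c:_) -> Just c
      []    -> Nothing

main :: IO ()
main = do
  let s = "ByteString"
  print (charAtBS (BC.pack s) 4)
  print (charAtStr s 4)
```

For short strings the traversal is cheap in absolute terms, so this matters mostly when the strings get long or the lookups are frequent.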
[Hmm - insert emoticon here]
Best wishes
Stephen