Re: Haskell Platform Proposal: add the 'text' library

8 Sep 2010

      On Wed, Sep 8, 2010 at 12:21 AM, Duncan Coutts 
...
wrote:
...
On 7 September 2010 22:50, Ian Lynagh  wrote:> Are there
cases when Data.Text is significantly faster than
...
Data.Text.Lazy? Do we need both? (Presumably .Lazy is built on top of
Data.Text, but do we need the user to have a complete interface for
both?)
Mm, this is a fair question. In the case of bytestring we need both
because sometimes for dealing with foreign code or IO you need the
representation to be a contigious block of memory. For text the
representation is more abstract so that need does not arrise. One
might argue that if it is simply to control strictness then one could
use the lazy version and provide a deepseq instance.
Here's an alternative argument: suppose we change the representation
of strict text to be a tree of chunks (e.g. finger tree). We could
achieve effecient concatenation. This representation would be
impossible while preserving semantics of a lazy tail. A tree impl that
has any kind of balance needs to know the overall length so cannot
have a lazy tail.
The lazy version of Text uses one more word per value than the strict
version. This can be significant for small strings (e.g. ~8 characters)
where the overhead per character already is quite high. If I counted the
size of the BA# constructor correctly, a strict Text has a fixed overhead of
7 words and a lazy Text has an overhead of 8 words. This matters when you
e.g. want to use Texts as keys in a Map.

Btw, I see that the BA# constructor is not manually unpacked into the Array
data type. Is that done automatically since ByteArray# is unlifted or is
there some room for improvement here?

Cheers,
Johan

Re: Haskell Platform Proposal: add the 'text' library

Johan Tibell