Re: [Haskell-cafe] Efficient string construction

4 Jun 2010


Daniel Fischer writes:
...
...
So why is there a UTF8 implementation for bytestrings? Does that not
duplicate what Text is trying to do? If so, why the duplication?
...
I think Data.ByteString.UTF8 predates Data.Text.
One difference is that Data.Text uses UTF-16 internally, not UTF-8.
...
...
When is each library more appropriate?
Much data is overwhelmingly ASCII, but with an option for non-ASCII in
comments, labels, or similar.  E.g., for biological sequence data, files
can be large (the human genome is about 3 GB) and non-ASCII characters
can only occur in sequence headers, which constitute a minuscule fraction
of the total data.  So I use ByteString for this.
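
To make that concrete, here is a minimal sketch of the idea, not the code
I actually use. It assumes a FASTA-like layout where a line starting
with '>' is a header and the lines after it are sequence data. The names
Record and parseRecords are made up for illustration. Only the header is
decoded as UTF-8 into Text; the sequence stays as raw bytes, one byte
per character.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)

-- | One record (hypothetical type): a decoded header plus raw sequence bytes.
data Record = Record
  { recHeader :: T.Text        -- may contain non-ASCII, so decoded as UTF-8
  , recSeq    :: B.ByteString  -- ASCII residues, kept as undecoded bytes
  } deriving (Eq, Show)

-- | Split the input into records. A line starting with '>' opens a record;
-- the lines up to the next '>' line are concatenated into its sequence.
parseRecords :: B.ByteString -> [Record]
parseRecords = go . B.lines
  where
    go [] = []
    go (l:ls)
      | Just ('>', h) <- B.uncons l =
          let (body, rest) = break isHeader ls
              -- invalid UTF-8 in a header becomes U+FFFD instead of an error
              header = decodeUtf8With lenientDecode h
          in Record header (B.concat body) : go rest
      | otherwise = go ls  -- skip stray lines before the first header
    isHeader l = not (B.null l) && B.head l == '>'
```

Splitting on newline bytes is safe here because no byte of a multi-byte
UTF-8 sequence can equal the newline byte. For files of several GB one
would presumably want the lazy ByteString variant rather than reading
everything strictly, but the structure would be similar.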

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants

Ketil Malde