
4 Jun
2010
2:28 a.m.
Daniel Fischer
So why is there a UTF8 implementation for bytestrings? Does that not duplicate what Text is trying to do? If so, why the duplication?
I think Data.ByteString.UTF8 predates Data.Text.
One difference is that Data.Text uses UTF-16 internally, not UTF-8.
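To make the encoding difference concrete, here is a minimal sketch that compares how many bytes the same string takes when encoded as UTF-8 and as UTF-16. It uses only `encodeUtf8` and `encodeUtf16LE` from `Data.Text.Encoding`; the helper names `utf8Size` and `utf16Size` are made up for illustration. Note that this measures encoded output, not Data.Text's in-memory layout, so treat it as an approximation of the storage cost rather than an exact figure.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Hypothetical helper: number of bytes the text occupies as UTF-8.
utf8Size :: T.Text -> Int
utf8Size = B.length . TE.encodeUtf8

-- Hypothetical helper: number of bytes the text occupies as UTF-16
-- (little endian, no byte order mark).
utf16Size :: T.Text -> Int
utf16Size = B.length . TE.encodeUtf16LE

-- Pure ASCII sample, 8 characters.
asciiSample :: T.Text
asciiSample = T.pack "ACGTACGT"

-- 5 characters, the last one non-ASCII (U+00F6, written as an escape).
mixedSample :: T.Text
mixedSample = T.pack "Malm\246"
```

For pure ASCII, UTF-8 needs one byte per character and UTF-16 needs two, so `asciiSample` comes out at 8 versus 16 bytes. With one non-ASCII character, `mixedSample` is 6 bytes in UTF-8 and 10 in UTF-16.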
When is each library more appropriate?
Much data is overwhelmingly ASCII, but with an option for non-ASCII in comments, labels, or similar. E.g., for biological sequence data, files can be large (the human genome is about 3 GB) and non-ASCII characters can only occur in sequence headers, which constitute a minuscule fraction of the total data. So I use ByteString for this.

-k
--
If I haven't seen further, it is by standing in the footprints of giants