On Tue, Aug 17, 2010 at 12:39 PM, Bulat Ziganshin <bulat.ziganshin@gmail.com> wrote:
Hello Tom,

Tuesday, August 17, 2010, 2:09:09 PM, you wrote:

> In the first iteration of the Text package, UTF-16 was chosen because
> it had a nice balance of arithmetic overhead and space.  The
> arithmetic for UTF-8 started to have serious performance impacts in
> situations where the entire document was outside ASCII (i.e. a Russian
> or Arabic document), but UTF-16 was still relatively compact

I don't understand what you mean. Do you support all 2^20 code points
in the Data.Text package?

Yes, UTF-16 can represent all Unicode code points, using surrogate pairs.
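
To make the surrogate-pair arithmetic concrete, here is a minimal sketch.
It is for illustration only and is not the code used inside Data.Text;
the name encodeUtf16 is hypothetical, and the sketch does not treat lone
surrogate code points (U+D800 to U+DFFF) specially.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Char (ord)
import Data.Word (Word16)

-- | Encode a single Char as a list of UTF-16 code units.
-- Hypothetical helper for illustration, not part of Data.Text.
encodeUtf16 :: Char -> [Word16]
encodeUtf16 c
  | n < 0x10000 = [fromIntegral n]   -- BMP: one 16-bit code unit
  | otherwise   = [hi, lo]           -- supplementary: surrogate pair
  where
    n  = ord c
    m  = n - 0x10000                 -- 20-bit offset above the BMP
    hi = fromIntegral (0xD800 + (m `shiftR` 10))  -- top 10 bits
    lo = fromIntegral (0xDC00 + (m .&. 0x3FF))    -- bottom 10 bits
```

A Cyrillic letter such as U+0416 still takes a single code unit, while
U+1D11E becomes the pair 0xD834 0xDD1E. The 20-bit offset is where the
2^20 figure for code points above the BMP comes from.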

-- Johan