Hi Colin,

On Sun, Aug 15, 2010 at 9:34 AM, Colin Paul Adams <colin@colina.demon.co.uk> wrote:
> But UTF-16 (apart from being an abomination for creating a hole in the
> codepoint space and making it impossible to ever extend it) is slow to
> process compared with UTF-32 - you can't get the nth character in
> constant time, so it seems an odd choice to me.

Aside: Getting the nth character isn't very useful when working with Unicode text:

* Most text processing is linear.
* What we consider a character and what Unicode considers a character differ a bit, e.g. because Unicode uses combining characters: a single user-perceived character can consist of several code points.
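
To make the second point concrete, here is a small sketch in Python (chosen only for illustration; it uses nothing beyond the standard `unicodedata` module). The helper `count_base_characters` is a hypothetical name of mine, and skipping combining marks is only a rough stand-in for real grapheme cluster segmentation:

```python
import unicodedata

# "é" written as a base letter followed by a combining acute accent (U+0301).
decomposed = "e\u0301"
# The same user-perceived character as a single precomposed code point.
precomposed = "\u00e9"


def count_base_characters(text: str) -> int:
    """Count code points that are not combining marks.

    Hypothetical helper: a rough approximation of user-perceived
    characters, not full grapheme cluster segmentation.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


print(len(decomposed))                    # number of code points
print(len(precomposed))                   # number of code points
print(count_base_characters(decomposed))  # base characters only
# Indexing by code point separates the letter from its accent.
print(repr(decomposed[0]))
# Normalizing to NFC composes the pair into the precomposed form.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

Both strings display as the same character, yet they have different lengths in code points, so "the nth character" by code point index need not match what a reader would call the nth character, whatever the encoding.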

Cheers,
Johan