Re: Unicode support

30 Sep 2001


At 2001-09-30 07:29, Marcin 'Qrczak' Kowalczyk wrote:
...
> Some time ago the Unicode Consortium slowly began switching to the
> point of view that abstract characters are denoted by numbers in the
> range U+0000..10FFFF.

It's worth mentioning that these are 'code points', not 'characters'.
Sometimes a single displayed character is made up of more than one code
point: for instance, an 'a' with a dot above is one character that can be
formed from the code points LATIN SMALL LETTER A and COMBINING DOT ABOVE.
Perhaps this makes the UTF-16 'surrogate' problem a bit less serious,
since there never was a one-to-one correspondence between any kind of
n-bit unit and displayed characters.
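
To make the distinction concrete, here is a minimal sketch in Python 3
(my choice for illustration, not anything from the thread). It uses only
the standard unicodedata module; utf16_units is a hypothetical helper
name, and the G clef is just an arbitrary code point above U+FFFF.

```python
import unicodedata

# 'a' followed by COMBINING DOT ABOVE: two code points, one displayed character.
decomposed = "a\u0307"

# NFC normalization composes the pair into a single precomposed code point
# where one exists (here U+0227).
composed = unicodedata.normalize("NFC", decomposed)


def utf16_units(s):
    """Hypothetical helper: number of 16-bit units in the UTF-16 encoding of s."""
    return len(s.encode("utf-16-be")) // 2


# An arbitrary code point outside the 16-bit range (MUSICAL SYMBOL G CLEF).
gclef = "\U0001D11E"

if __name__ == "__main__":
    # Names of the two code points behind the one displayed character.
    print([unicodedata.name(c) for c in decomposed])
    # Code point counts before and after composition.
    print(len(decomposed), len(composed))
    # One code point, but a surrogate pair once encoded as UTF-16.
    print(len(gclef), utf16_units(gclef))
```

So the count differs at each level: displayed characters, code points,
and 16-bit units need not agree, with or without surrogates.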

-- 
Ashley Yakeley, Seattle WA