
21 Aug
2002
7:17 a.m.
Ashley Yakeley
That's not quite correct. Every code point is exactly one Char, but some characters may be composed of more than one code point. For instance, 'á' might be represented as
#00E1 [LATIN SMALL LETTER A WITH ACUTE]
or
#0061 [LATIN SMALL LETTER A] + #0301 [COMBINING ACUTE ACCENT]
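
To make the two representations concrete, here is a minimal sketch in Haskell using only the Prelude and Data.Char. The names precomposed, decomposed, codePoints and composeAcuteA are hypothetical and made up for illustration. In particular, composeAcuteA handles only this single pair of code points; it is not Unicode normalization, which would need a real library.

```haskell
import Data.Char (ord)

-- Precomposed form: the single code point #00E1.
precomposed :: String
precomposed = "\x00E1"

-- Decomposed form: #0061 followed by the combining accent #0301.
decomposed :: String
decomposed = "\x0061\x0301"

-- List the code points of a String, one Int per Char.
codePoints :: String -> [Int]
codePoints = map ord

-- Hypothetical toy helper (an assumption for illustration only):
-- rewrites 'a' + COMBINING ACUTE ACCENT into the precomposed 'á'.
-- It knows about this one pair and nothing else.
composeAcuteA :: String -> String
composeAcuteA ('\x0061' : '\x0301' : rest) = '\x00E1' : composeAcuteA rest
composeAcuteA (c : rest)                   = c : composeAcuteA rest
composeAcuteA []                           = []
```

With these definitions, length precomposed is 1 and length decomposed is 2, and precomposed == decomposed is False, because String equality compares Char by Char. Only after something like composeAcuteA decomposed do the two compare equal.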
I guess they must be treated the same, too? That is, the strings should have the same length, they should compare equal, etc. Or is it an option to just ignore the issue, and simply think of the latter as two characters?

-kzm
--
If I haven't seen further, it is by standing in the footprints of giants