
At 2002-08-21 00:17, Ketil Z. Malde wrote:
> \#00E1 [LATIN SMALL LETTER A WITH ACUTE]
> or
> \#0061 [LATIN SMALL LETTER A] + \#0301 [COMBINING ACUTE ACCENT]
> I guess they must be treated the same, too? That is, the length of the strings should be the same, they should compare equal, etc.

In my opinion, no. As far as String is concerned, since it is simply [Char], it should be treated as a plain list of codepoints without further interpretation. So 'length' and the Eq instance should behave exactly as they do for any other list.
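
To make the list-of-codepoints view concrete, here is a minimal sketch using only the Prelude. The names 'precomposed' and 'decomposed' are made up for illustration; the escapes are the codepoints quoted above.

```haskell
-- Two spellings of "a with acute accent", written with numeric escapes.
-- The names below are illustrative only.
precomposed :: String
precomposed = "\x00E1"        -- U+00E1 LATIN SMALL LETTER A WITH ACUTE

decomposed :: String
decomposed = "\x0061\x0301"   -- U+0061, then U+0301 COMBINING ACUTE ACCENT

main :: IO ()
main = do
  print (length precomposed)         -- 1: one codepoint
  print (length decomposed)          -- 2: two codepoints
  print (precomposed == decomposed)  -- False: compared element by element
```

Nothing here is special to text: 'length' and (==) are the ordinary list operations applied to Char elements.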

> Or is it an alternative to just ignore the issue, and simply think of the latter as two characters?

Consider the latter as two codepoints, and don't worry about characters. There should be separate functions for doing such things as decomposition and equivalence.

-- Ashley Yakeley, Seattle WA
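
As a rough sketch of what such separate functions might look like: 'decompose' and 'equivalent' below are hypothetical names, not an existing library API, and the decomposition table holds a single entry purely for illustration. A real version would be driven by the Unicode data tables and would also have to reorder combining marks.

```haskell
-- Hypothetical helper: a toy decomposition with a one-entry table.
-- Every codepoint other than U+00E1 is passed through unchanged.
decompose :: String -> String
decompose = concatMap step
  where
    step '\x00E1' = "\x0061\x0301"  -- a-acute -> a + combining acute
    step c        = [c]

-- Hypothetical helper: equivalence built on top of decomposition,
-- kept separate from the Eq instance of String.
equivalent :: String -> String -> Bool
equivalent xs ys = decompose xs == decompose ys

main :: IO ()
main = do
  print ("\x00E1" == "\x0061\x0301")          -- False: plain list equality
  print (equivalent "\x00E1" "\x0061\x0301")  -- True: explicit equivalence
```

The point of the design is that (==) on String keeps its ordinary list meaning, and anyone who wants character-level equivalence asks for it explicitly.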