
On Sun, Nov 07, 2010 at 08:19:00AM -0800, Bryan O'Sullivan wrote:
> On Sun, Nov 7, 2010 at 7:16 AM, Ian Lynagh wrote:
> > Can you give an example of such an operation please, which doesn't go
> > wrong when the argument is "c", the input contains "cx" and 'x' is a
> > combining character such that there is no composed codepoint for "cx"?
>
> I don't think there's been any contention that searching purely on Text
> values is enough to handle that. However, a Char→Bool predicate clearly
> isn't enough either :-)

Quite so, but it doesn't claim to be able to do so either.

Maybe I'm misunderstanding the issue, so my question was too specific.
AIUI the motivation for the current text API is:

    The design of the Text library encourages the use of substring
    operations because these are expected to be more commonly used and
    because correct handling of Unicode often requires substring
    operations (due to issues with combining characters).

so can someone please give an example of a function that correctly
handles Unicode by using a substring operation?


Thanks
Ian
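
For concreteness, the scenario in the question can be sketched as below. This is only an illustration under stated assumptions: it takes U+0308 (COMBINING DIAERESIS) as the combining character 'x', on the assumption that "c" followed by U+0308 has no precomposed codepoint, and the names `input` and `needle` are made up for the example. It uses `T.isInfixOf`, `T.any` and `T.length` from Data.Text.

```haskell
import qualified Data.Text as T

-- Assumption: 'c' followed by U+0308 COMBINING DIAERESIS has no
-- precomposed codepoint, so normalization cannot merge the two.
input :: T.Text
input = T.pack "ac\x0308z"

-- The search argument from the question: a bare "c".
needle :: T.Text
needle = T.pack "c"

main :: IO ()
main = do
  -- Substring search: reports a match, although the 'c' in the input
  -- is the base of a combining sequence rather than a bare "c".
  print (needle `T.isInfixOf` input)   -- True
  -- Char -> Bool predicate: matches the same codepoint.
  print (T.any (== 'c') input)         -- True
  -- Data.Text counts codepoints, so the combining mark is counted
  -- separately: 'a', 'c', U+0308, 'z'.
  print (T.length input)               -- 4
```

In this sketch the substring search and the Char predicate give the same answer, which is the case the question asks about: neither of them sees "c" plus the combining mark as a single unit.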