
On Mon, Mar 26, 2012 at 6:21 PM, Brandon Allbery wrote:
On Mon, Mar 26, 2012 at 13:12, Ian Lynagh wrote:
Maybe your point is that neither "take" function should be used with unicode strings, but I don't see how advocating the Text type is going to help with that.
I think we established earlier that the list-like operations on Text are a backward compatibility wart. Either they should go away, or they should be modified to operate on some other level than codepoints. Probably the way the ecosystem should work is that [Char] (or possibly a packed version thereof, sort of like lazy ByteStrings with Word32 instead of Word8 as the fundamental unit) is the codepoint view and Text is the grapheme view; both are necessary at various times, but the grapheme view is the more natural one for text /per se/.
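To make the codepoint/grapheme distinction concrete, here is a minimal sketch using only base and plain String. The names eAcute and graphemesApprox are made up for illustration, and the grouping rule (attach combining marks to the preceding character) is only a crude approximation, not real Unicode grapheme segmentation:

```haskell
import Data.Char (isMark)

-- 'e' followed by U+0301 COMBINING ACUTE ACCENT: one user-perceived
-- character, but two code points.
eAcute :: String
eAcute = "e\x0301"

-- Codepoint view: take works on Chars, so it splits the accent
-- off its base letter.
firstCodepoint :: String
firstCodepoint = take 1 eAcute

-- Approximate grapheme view (an assumption for illustration): each
-- character is grouped with the combining marks that follow it.
graphemesApprox :: String -> [String]
graphemesApprox []       = []
graphemesApprox (c : cs) = (c : marks) : graphemesApprox rest
  where
    (marks, rest) = span isMark cs

-- Taking one "grapheme" keeps the letter and its accent together.
firstGrapheme :: [String]
firstGrapheme = take 1 (graphemesApprox eAcute)
```

In GHCi, length eAcute is 2 while length (graphemesApprox eAcute) is 1, which is the mismatch being discussed: a codepoint-level take can cut a perceived character in half.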
Does this mean we've firmly established that there currently is *no* completely satisfactory method of dealing with Unicode in existence today? In that case, even if it /will be/ a good idea one day, can't we agree that it's not the right time to deprecate String = [Char]? The language has a good history, I understand, of not standardising that which is not implemented and in common use, so if we'd like to change Text before introducing it to the language, I say let's do that separately.

No-one's yet argued against OverloadedStrings. I think there /is/ an argument to be made, that it introduces ambiguity and could break existing programs (probably we can extend defaulting to take care of this, but I think there are people who'd be happier if we killed defaulting too). Too much polymorphism /can/ be a bad thing. But I think there's a serious chance we can make that happen, and make Text a bit more pleasant to work with.

A passing thought: nearly anything can be made an instance of IsString, via something read-like. This prospect upsets me a little :) Maybe it would make sense to introduce additional methods, as in Num, to make sure that some sense is maintained (perhaps toString?)
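To illustrate the worry about read-like IsString instances, here is a minimal sketch. The Count type and its instance are hypothetical, invented only for this example; IsString and fromString are the real class and method from Data.String:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

-- Hypothetical wrapper type, for illustration only.
newtype Count = Count Int
  deriving (Show, Eq)

-- A "read-like" instance: any string literal now type checks as a
-- Count, whether or not it actually parses as a number.
instance IsString Count where
  fromString = Count . read

-- Fine: the literal parses as an Int.
fine :: Count
fine = "42"

-- Also accepted by the type checker, but read fails with a runtime
-- error once the wrapped Int is forced.
broken :: Count
broken = "forty-two"
```

Nothing in the class as it stands rules out an instance like this, which is roughly the motivation for asking whether an extra method (something like the toString suggested above) should pin down what a sensible instance looks like.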