Re: String != [Char]

26 Mar 2012

On Mon, Mar 26, 2012 at 9:42 AM, Christian Siefkes wrote:
...
On 03/26/2012 05:50 PM, Johan Tibell wrote:
...
Normalization isn't quite enough unfortunately, as it doesn't solve e.g.
    upcase = map toUpper
You need all-at-once functions on strings (which we could add.) I'm
just pointing out that most (all?) list functions do the wrong thing
when used on Strings.
Hm, do you have any other examples besides toUpper/toLower?
length, cons, head, tail, filter, folds, anything that works on an
element-by-element basis.
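
To make that concrete, here is a small sketch using only base. The sample
string is my own choice for illustration (it is not from earlier in the
thread): "e" followed by U+0301 COMBINING ACUTE ACCENT, which a reader sees
as a single accented letter.

```haskell
import Data.Char (isAlpha)

-- "e" followed by U+0301 COMBINING ACUTE ACCENT: one user-perceived
-- character, but two code points (two Chars).
sample :: String
sample = "e\x0301"

-- length counts code points, so this is 2 rather than 1.
sampleLength :: Int
sampleLength = length sample

-- head keeps the base letter and silently drops the accent.
sampleHead :: Char
sampleHead = head sample

-- reverse puts the combining mark first, detaching it from the 'e'.
sampleReversed :: String
sampleReversed = reverse sample

-- filter works per code point too: the mark is not a letter, so the
-- accent disappears and only "e" is left.
sampleLetters :: String
sampleLetters = filter isAlpha sample
```

Loading this in GHCi and evaluating the four sample* values shows each
function treating the accent as a separate element.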
...
Also, that example is not really an argument against using list functions on
strings (which, by any reasonable definition, seem to be "sequences of
characters" -- whether that sequence is represented as a list, an array, or
something else, seems more like an implementation detail to me).
I agree on the second part. As someone pointed out earlier, we should
be careful in using the word "character", as a Unicode code point
doesn't correspond well to the commonly used concept of a character.
What we have today is really:

    type String = [CodePoint]

What you would normally think of as a character might consist of
several code points.
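
A rough sketch of what grouping code points into "characters" could look
like, using only base. The names CodePoint, clusters and clusterCount are
made up for illustration, and attaching combining marks to the preceding
code point is only an approximation; proper segmentation follows the
grapheme cluster rules in Unicode UAX #29, which this does not implement.

```haskell
import Data.Char (isMark)

-- Hypothetical alias for illustration: a Haskell Char is a code point.
type CodePoint = Char

-- Approximate user-perceived characters: a base code point together with
-- any combining marks that immediately follow it.
clusters :: [CodePoint] -> [[CodePoint]]
clusters []     = []
clusters (c:cs) = (c : marks) : clusters rest
  where
    (marks, rest) = span isMark cs

-- Counts approximate characters instead of code points.
clusterCount :: [CodePoint] -> Int
clusterCount = length . clusters
```

For the string "e\x0301x" (an accented e followed by x), length reports 3
code points while clusterCount reports 2.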
...
Rather, it
indicates the fact that Char.toUpper may have the wrong type. If its type was
Char -> String instead of Char -> Char, it could handle things like toUpper
'ß' == "SS" correctly. Then stuff like
       upcase = concatMap toUpper
would work fine.
Yes.
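
A minimal sketch of that idea, assuming a hypothetical toUpperFull (no
such function exists in base). It special-cases only 'ß'; a real version
would need the full Unicode special-casing table, and a per-code-point
function still cannot see context, which is why all-at-once functions on
whole strings are needed as well.

```haskell
import Data.Char (toUpper)

-- Hypothetical Char -> String variant of toUpper. Only 'ß' is handled
-- specially here; everything else falls back to Data.Char.toUpper.
toUpperFull :: Char -> String
toUpperFull 'ß' = "SS"
toUpperFull c   = [toUpper c]

-- With a Char -> String mapping the result is allowed to grow.
upcase :: String -> String
upcase = concatMap toUpperFull

-- For comparison: map with a Char -> Char function always returns a
-- string of the same length, so it can never produce "SS" from 'ß'.
upcaseNaive :: String -> String
upcaseNaive = map toUpper
```

Here upcase "straße" gives "STRASSE", while upcaseNaive "straße" is stuck
at six code points.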
...
As it is, the problem seems to be with Char, not with [Char].
[Char] is a semantically OK representation of a Unicode string; using
an array, like text does, is simply an optimization. However, using the
list functions defined by the Prelude is not a good idea if you want to
process a Unicode string correctly.

-- Johan