
On Mon, Mar 26, 2012 at 5:08 AM, Christian Siefkes wrote:
> On 03/26/2012 02:39 AM, Gabriel Dos Reis wrote:
>> True, but should the language definition default to a string type that is one of the most unsuited for text processing in the 21st century, where global multilingualism abounds? Even C has qualms about that. ... I have no trouble believing that if all the texts my students have to process are US ASCII, [Char] is more than sufficient. So, I have sympathy for your position. However, I doubt [Char] would be adequate if I asked them to share texts from their diverse cultures.
> Uh, while a C char is (usually) just a byte (8 bits of information, like Word8 in Haskell), a Haskell Char is a Unicode character (21 bits of information).
It is not the precision of Char or char that is at issue here. It has been clarified at several points in this thread that a Char is not a Unicode character, but a Unicode code point. Not every Unicode code point represents a Unicode character (a combining mark such as U+0301 is not a character by itself), and not every sequence of Unicode code points represents a character or a sequence of Unicode characters.
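To make that concrete, here is a small self-contained sketch using only the Prelude (the rendering remarks in the comments assume a terminal that composes combining marks):

    -- The user-perceived character "é" can be written as one code point
    -- (U+00E9) or as two (U+0065 'e' followed by the combining acute
    -- U+0301). Both render identically, yet [Char] reports different
    -- lengths and the two strings compare unequal.
    main :: IO ()
    main = do
      let precomposed = "\x00E9"          -- é as the single code point U+00E9
          decomposed  = "e\x0301"         -- 'e' plus combining acute U+0301
      print (length precomposed)          -- 1
      print (length decomposed)           -- 2
      print (precomposed == decomposed)   -- False, despite equal rendering

So a length count, a reverse, or an equality test over [Char] operates on code points, which is not the same thing as operating on characters.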
> A single C char cannot contain an arbitrary Unicode character, while a Haskell Char can, and does. Hence [Char] is (efficiency issues aside) perfectly adequate for dealing with texts written in arbitrary languages.
See above.

-- Gaby