Re: What is a punctuation character?

20 Mar 2012

On Tue, Mar 20, 2012 at 5:37 PM, Iavor Diatchki wrote:
...
Hello,
So I looked at what GHC does with Unicode and to me it seems quite
reasonable:
* The alphabet is Unicode code points, so a valid Haskell program is
simply a list of those.
* Combining characters are not allowed in identifiers, so no need for
complex normalization rules: programs should always use the "short"
version of a character, or be rejected.
* Combining characters may appear in string literals, and there they
are left "as is" without any modification (so some string literals may
be longer than what's displayed in a text editor.)
Perhaps this is simply what the report already states (I haven't
checked, for which I apologize) but, if not, perhaps we should clarify
things.
-Iavor
PS:  I don't think that there is any need to specify a particular
representation for the Unicode code points (e.g., UTF-8) in the
language standard.
Thanks, Iavor.

If the Report intended to talk about code points only (and indeed ruling
out normalization suggests that), then the Report needs to be
clarified.  As you know, there is a distinction between a Unicode code
point and a Unicode character:

    http://www.unicode.org/versions/Unicode6.0.0/ch02.pdf#G25564

Until I sent my original query, I had been reading the Report as meaning
Unicode characters (as the grammar seemed to suggest), but now it is
clear to me that only code points were intended.  That seemed to be
confirmed by your investigation of the GHC code base.
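
To make the code point reading concrete, here is a minimal sketch. It
assumes GHC's behaviour as Iavor describes it (a Char is one code point,
and string literals are not normalized) and uses only Data.Char from
base. The names precomposed, decomposed and combiningCount are made up
for illustration; they do not come from the Report or from GHC.

```haskell
module Main where

import Data.Char (generalCategory, isMark)

-- "é" written as the single precomposed code point U+00E9.
precomposed :: String
precomposed = "\x00E9"

-- "é" written as 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed :: String
decomposed = "e\x0301"

-- Count the combining marks (general categories Mn, Mc, Me) in a string.
combiningCount :: String -> Int
combiningCount = length . filter isMark

main :: IO ()
main = do
  print (length precomposed)         -- 1
  print (length decomposed)          -- 2
  print (precomposed == decomposed)  -- False: no normalization is applied
  print (generalCategory '\x0301')   -- NonSpacingMark
  print (combiningCount decomposed)  -- 1
```

Both literals display as the same character in most editors, yet they
are different lists of code points, which is the "longer than what's
displayed" effect mentioned above.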

-- Gaby

Gabriel Dos Reis