Re: [Haskell-cafe] Valid Haskell characters

On 26 Aug 2008, at 1:31 pm, Deborah Goldsmith wrote:
You can't determine Unicode character properties by analyzing the names of the characters.
However, the OP *does* have a copy of the UnicodeData...txt file, and you *can* determine the relevant Unicode character properties from that.
For example, consider the entry for space:

    0020;SPACE;Zs;0;WS;;;;;N;;;;;
               ^^

The Zs bit says it's a white space character (Zs: separator/space, Zl: separator/line, Zp: separator/paragraph).
Or look at capital A:

    0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
                                ^^

The Lu bit says it's a L(etter) that is u(pper case).
Upper case: Lu, lower case: Ll, title case: Lt, modifier letter: Lm, other letter: Lo, digit: Nd, ...
If memory serves me correctly, this is explained in the UnicodeData.html file, under a heading something like Normative Categories.
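To make the field layout concrete, here is a minimal Haskell sketch that pulls the general category out of one semicolon-separated UnicodeData entry. The names `splitOn`, `Entry`, `parseEntry` and `describe` are made up for this illustration and are not part of any library; the sketch only looks at the first three fields and only knows the category codes mentioned above.

```haskell
-- Split a string on a delimiter, keeping empty fields.
splitOn :: Char -> String -> [String]
splitOn d s = case break (== d) s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOn d rest

-- The first three fields of an entry: code point (hex), name, general category.
data Entry = Entry
  { codePoint :: String
  , charName  :: String
  , category  :: String
  } deriving (Eq, Show)

-- Parse one line of the data file; Nothing if it has fewer than three fields.
parseEntry :: String -> Maybe Entry
parseEntry line = case splitOn ';' line of
  (cp : name : cat : _) -> Just (Entry cp name cat)
  _                     -> Nothing

-- Spell out the two-letter codes listed in the message (not the full set).
describe :: String -> String
describe "Lu"  = "letter, upper case"
describe "Ll"  = "letter, lower case"
describe "Lt"  = "letter, title case"
describe "Lm"  = "letter, modifier"
describe "Lo"  = "letter, other"
describe "Nd"  = "number, decimal digit"
describe "Zs"  = "separator, space"
describe "Zl"  = "separator, line"
describe "Zp"  = "separator, paragraph"
describe other = "not covered here: " ++ other
```

Loaded into GHCi, `fmap category (parseEntry "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;")` evaluates to `Just "Lu"`.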

All characters with general category Lu have the property Uppercase, but the converse is not true.

Deborah

On Aug 25, 2008, at 8:27 PM, Richard A. O'Keefe wrote:
[...]
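The gap between general category Lu and the Uppercase property can be seen from Haskell itself. The sketch below uses `generalCategory` from `Data.Char` in base; `isLu`, `samples` and `report` are names invented for this illustration. It rests on the assumption, from the Unicode Character Database as I understand it, that the Uppercase property is Lu plus the characters listed as Other_Uppercase, and that the Roman numeral and the circled letter below are in that second group.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)

-- True only for characters whose general category is Lu.
isLu :: Char -> Bool
isLu c = generalCategory c == UppercaseLetter

-- Characters assumed here to carry the Uppercase property.
-- Only the first one has general category Lu.
samples :: [(String, Char)]
samples =
  [ ("LATIN CAPITAL LETTER A", 'A')
  , ("ROMAN NUMERAL ONE", '\x2160')
  , ("CIRCLED LATIN CAPITAL LETTER A", '\x24B6')
  ]

-- Name, general category, and whether a test on Lu alone accepts the character.
report :: [(String, GeneralCategory, Bool)]
report = [ (name, generalCategory c, isLu c) | (name, c) <- samples ]
```

Evaluating `report` in GHCi shows the general category of each sample, so a check on Lu alone rejects the last two even though they are uppercase in the broader sense.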

One may wonder which of them was used in Haskell compilers. Did they take only Lu characters, or were they careful to accept all Uppercase?

Maurício

Deborah Goldsmith wrote:
[...]

participants (3)
- Deborah Goldsmith
- Maurício
- Richard A. O'Keefe