
On Fri, Mar 16, 2012 at 1:18 PM, Brandon Allbery wrote:
On Fri, Mar 16, 2012 at 14:08, Gabriel Dos Reis wrote:
The lexical structure chapter defines the non-terminal uniSymbol as
uniSymbol ::= any Unicode symbol or punctuation
There is a slight ambiguity here: is that description supposed to be parsed as: (a) "Unicode (symbol or punctuation)", or (b) "(Unicode symbol) or punctuation"?
(a). I thought the report specified that the language's lexemes are defined in terms of Unicode properties, so (a) is the only meaningful interpretation. (b) is not particularly meaningful, as your own question demonstrates.
It is not clear what "the language's lexemes are defined in terms of Unicode properties" really means. If that were so, why would you need ascSmall (and similar ASCII character categories) when you already have uniSmall and its associates?

It is not clear that (b) is all that "not particularly meaningful". Have a look at the production <symbol>: it excludes the double quote (") and the apostrophe (') from uniSymbol.

-- Gaby
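
For concreteness, here is a minimal sketch of what reading (a) would look like as a predicate. It assumes that GHC's Data.Char classification (isSymbol, isPunctuation) is an acceptable stand-in for the Unicode general categories S* and P*. The names isUniSymbolA, isSpecial and isSymbolCharA are hypothetical, and the exclusion list is my reading of the symbol production in the Haskell 2010 Report (special, underscore, double quote, apostrophe). Reading (b) is not sketched, because the thread gives no definition of "punctuation" outside Unicode.

```haskell
import Data.Char (isPunctuation, isSymbol)

-- Reading (a): "Unicode (symbol or punctuation)", i.e. any character
-- whose general category is a symbol (S*) or punctuation (P*) category.
isUniSymbolA :: Char -> Bool
isUniSymbolA c = isSymbol c || isPunctuation c

-- The 'special' characters, assumed here to be ( ) , ; [ ] ` { }
isSpecial :: Char -> Bool
isSpecial c = c `elem` "(),;[]`{}"

-- The uniSymbol alternative of the symbol production under reading (a):
-- uniSymbol minus special, underscore, double quote and apostrophe.
-- (The ascSymbol alternative is not modelled separately.)
isSymbolCharA :: Char -> Bool
isSymbolCharA c = isUniSymbolA c && not (isSpecial c || c `elem` "_\"'")
```

Loaded into GHCi, isUniSymbolA '"' and isUniSymbolA '\'' can be compared with isSymbolCharA '"' and isSymbolCharA '\''. If both quote characters already count as Unicode punctuation, as they appear to, then their exclusion in the symbol production is needed under reading (a) as well, so that exclusion may not by itself settle the question between (a) and (b).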