
On Wed, Oct 18, 2006 at 04:42:00PM +0100, Simon Marlow wrote:
The other notorious part of the Haskell grammar that isn't LL/LR(1) is expressions vs. patterns. In a statement, if you see a variable, you don't know whether it is a pattern variable (apat) or an expression variable (aexpr). This is why Haskell grammars generally parse expressions and patterns using the same non-terminals.
It should be noted that all of Haskell (including the maximal munch rules and lexing, but not the layout rule without a little preprocessing, AFAIK) can easily be parsed by a PEG. I am going to switch jhc to using one eventually, as the maintainability advantages of a PEG grammar are persuasive (and I pull my hair out every time I have to modify the current Happy LALR parser). If you are writing something like a syntax highlighter for an editor, I'd strongly recommend checking them out as a basis. They are a straightforward generalization of regular expressions and can be made to deal gracefully with errors by simply choosing the "most plausible" choice from ambiguous or incomplete code, something that is extremely useful for an editor.
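To make the ordered-choice idea concrete, here is a minimal sketch in Haskell of a hand-rolled PEG-style parser for the pattern-vs-expression case in a statement. Everything in it (`Parser`, `</>`, `Stmt`, `parseStmt`, and the toy grammar where a pattern is a single variable and an expression is a run of identifiers) is made up for illustration; it is not how jhc or any existing PEG library does it.

```haskell
import Data.Char (isAlpha, isAlphaNum, isSpace)

-- A PEG-style parser: consumes a prefix of the input and returns a result
-- plus the remaining input, or fails with Nothing.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> case p s of
    Nothing      -> Nothing
    Just (a, s') -> Just (f a, s')

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Nothing      -> Nothing
    Just (f, s') -> case pa s' of
      Nothing       -> Nothing
      Just (a, s'') -> Just (f a, s'')

-- Ordered choice (the PEG "/"): try the left alternative first; only if it
-- fails, rewind to the same input position and try the right one.
(</>) :: Parser a -> Parser a -> Parser a
Parser p </> Parser q = Parser $ \s -> maybe (q s) Just (p s)
infixr 1 </>

-- Skip leading whitespace, then run a recogniser on the rest.
token :: (String -> Maybe (a, String)) -> Parser a
token f = Parser (f . dropWhile isSpace)

ident :: Parser String
ident = token $ \s -> case span isAlphaNum s of
  (w@(c:_), rest) | isAlpha c -> Just (w, rest)
  _                           -> Nothing

symbol :: String -> Parser ()
symbol sym = token $ \s -> case splitAt (length sym) s of
  (pre, rest) | pre == sym -> Just ((), rest)
  _                        -> Nothing

eof :: Parser ()
eof = token $ \s -> if null s then Just ((), s) else Nothing

-- One or more / zero or more repetitions, greedy as in a PEG.
many1, many0 :: Parser a -> Parser [a]
many1 p = (:) <$> p <*> many0 p
many0 p = many1 p </> pure []

-- Toy statement syntax (an assumption of this sketch).
data Stmt = Bind String [String]   -- x <- f y
          | Expr [String]          -- f x y
  deriving (Eq, Show)

-- The leading variable is committed to a role only after the parser has
-- seen (or failed to see) the "<-" that follows it.
stmt :: Parser Stmt
stmt =  Bind <$> ident <* symbol "<-" <*> expr <* eof
    </> Expr <$> expr <* eof
  where expr = many1 ident

parseStmt :: String -> Maybe Stmt
parseStmt = fmap fst . runParser stmt
```

With this, `parseStmt "x <- getLine"` takes the first alternative and yields a `Bind`, while `parseStmt "x y"` fails on the missing `<-`, backtracks, and yields an `Expr`. No unified pattern/expression non-terminal is needed, at the price of re-reading `x`.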
By LL(1) I really mean that the grammar for interactive editing needs to be adjusted so that the following invariant can be maintained: as code is entered from left to right, constructs and identifiers can be highlighted according to their grammatical role, and that highlighting (modulo incompleteness) must remain unchanged regardless of whatever is typed afterwards to the right. Otherwise it can become more of a liability than a help. Hence my hope that some future revision of the Haskell grammar might take the above points into account.
So you won't be able to colour patterns differently from expressions; that doesn't seem any worse than the context vs. type issue. Indeed, I'm not even sure you can colour types vs. values properly. Look at this:
data T = C [Int]
At this point, is C a constructor? What if I continue the declaration like this:
data T = C [Int] `F`
No problem for a PEG, as the infinite lookahead allows it to see the `F` no matter how far away it is. jhc may give horrible type errors, but by golly it's gonna give some good parse errors. :)

        John

-- 
John Meacham - ⑆repetae.net⑆john⑈
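As a rough illustration of the lookahead point for the `data T` example, the sketch below hand-codes the decision that a PEG would make with unbounded lookahead: the name after `=` is treated as a data constructor unless a backquoted infix constructor turns up later in the same alternative. The names `Role`, `isInfixCon`, `headRole` and `classify` are hypothetical, and the sketch assumes whitespace-separated tokens, which real Haskell source does not guarantee.

```haskell
import Data.Char (isUpper)

-- Highlighting role assigned to the first capitalised name after '='.
data Role = DataCon | TypeCon
  deriving (Eq, Show)

-- True for a backquoted infix constructor token such as `F`.
isInfixCon :: String -> Bool
isInfixCon ('`' : rest@(_ : _ : _)) = last rest == '`'
isInfixCon _                        = False

-- Decide the role of the head name in one alternative of a data declaration.
-- The scan looks ahead as far as the next '|' (or the end of the input),
-- however many tokens away that is.
headRole :: [String] -> Role
headRole alt
  | any isInfixCon (takeWhile (/= "|") alt) = TypeCon
  | otherwise                               = DataCon

-- Classify the name right after '=' in a whitespace-separated declaration.
classify :: String -> Maybe (String, Role)
classify src = case break (== "=") (words src) of
  (_, "=" : alt@(name@(c : _) : _)) | isUpper c -> Just (name, headRole alt)
  _                                             -> Nothing
```

Here `classify "data T = C [Int]"` reports `C` as a `DataCon`, while appending `` `F` `` to the same input flips it to `TypeCon`. That flip is exactly what an editor has to be prepared for if it colours `C` before the rest of the line has been typed.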