
On May 29, 2007, at 10:44 AM, apfelmus wrote:
Mark T.B. Carroll wrote:
I've been playing with Text.Parsers.Frisby to see how it stacks against other options and, while it's been great so far, I am finding that I can't encode a grammar where what's acceptable depends on what's already been parsed in some nontrivial way. [...] Is this supposed to not be possible in Frisby, or (quite likely) am I missing something that allows me to?
It's intentionally impossible. Frisby uses a dynamic programming approach that crucially depends on the fact that the grammar in question is context-free (actually something related, but the effect is the same). You're trying to parse a context-sensitive language.
Interestingly, Rats (a packrat-based parser generator for Java) permits you to insert arbitrary boolean conditions into the grammar; if the test fails, we simply record this as "parsing this nonterminal failed" in the memo table, I believe. So I believe it'd actually feasible to incorporate some of the checking you're looking for into Frisby. Of course, as others point out, you can always generate grammar fragments up front if you have a fixed set of things you're looking for in any given program run (something a parser tool like Rats isn't capable of---though with its parametric module system Rats can come *very* close, doing multiple compile-time instantiations of grammar fragments). Packrat parsing, by the way, has made it vastly easier to structure and maintain a grammar for a highly ambiguous, hard-to-parse language (Fortress). I recommend it.
Sometimes, you can avoid context-sensitivity if there's a way to parse the token in question regardless of whether it's valid. For example, Pascal is a context-sensitive language because you may not use a variable before it has been declared:
procedure Foo(x:Integer) begin y := 1; end;
This not a correct Pascal program, nevertheless the parse succeeds just fine. ...
I'm pretty sure predicates in the grammar would let you catch this error at parse time (if you maintained a symbol table and looked up expression occurrences in it as you parsed). That said, I wouldn't necessarily try to structure my parser that way. -Jan-Willem Maessen