
On Wed, 2 Mar 2011 14:14:02 +0100, you wrote:
Thank you all for the responses. Here's an example:
As I already said, I tried to parse the MMIXAL assembly language. Each instruction has up to three operands, looking like this:
@+4 (Jump four bytes forward)
"foo" (the string foo)
'0'>>(1+2)
etc. A string literal may contain anything but a newline (there are no escape codes or similar). But when I add a check for a newline, the string parser just fails and the next alternative is tried. This is undesired, as I want to return an error like "unexpected newline" instead. How is this handled in other parsers?

Tillman's reply is absolutely correct. If a particular sequence of characters is invalid according to your grammar, then _all_ of the alternatives in scope at that point should fail to parse that sequence. If that's not happening, then there's something wrong with the way you've expressed your grammar. I don't know how much experience you have with language grammars, but it might be helpful to try to write down MMIXAL's grammar using EBNF notation, as a starting point.

-Steve Schafer
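
To make the point about alternatives concrete, here is a minimal hand-written sketch in Python. It is not tied to any particular parser library and is not a real MMIXAL grammar: the names ParseError, parse_string, parse_number and parse_operand are made up for illustration, and only two operand forms are covered. The idea it shows is that an alternative fails softly (returns None) when its first character does not match, but raises an error once it has consumed the opening quote, so a newline inside a string is reported instead of silently falling through to the next alternative. If every alternative fails softly, the operand parser itself reports the error.

```python
class ParseError(Exception):
    """Input is definitely invalid; no further alternative should be tried."""


def parse_string(text, pos):
    """Parse a double-quoted string literal (no escape codes) at pos.

    Returns (value, new_pos), or None if no string literal starts here.
    """
    if pos >= len(text) or text[pos] != '"':
        return None  # soft failure: the caller may try another alternative
    end = pos + 1
    while end < len(text) and text[end] != '"':
        if text[end] == "\n":
            # The opening quote has been consumed, so this is a hard error.
            raise ParseError(f"unexpected newline in string literal at offset {end}")
        end += 1
    if end >= len(text):
        raise ParseError(f"unterminated string literal starting at offset {pos}")
    return text[pos + 1:end], end + 1


def parse_number(text, pos):
    """Parse a run of decimal digits; returns None if there are none."""
    end = pos
    while end < len(text) and text[end] in "0123456789":
        end += 1
    if end == pos:
        return None
    return int(text[pos:end]), end


def parse_operand(text, pos=0):
    """Try each alternative in turn; report an error if none of them matches."""
    for alternative in (parse_string, parse_number):
        result = alternative(text, pos)
        if result is not None:
            return result
    raise ParseError(f"expected an operand at offset {pos}")
```

With this layout, parse_operand('"foo"') yields the string and the position after it, while a string containing a newline raises ParseError with the "unexpected newline" message rather than being handed to parse_number.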