
On Sunday 20 July 2008, John Meacham wrote:
I do not believe that is the case, since the return type of runParser "Either ParseError a" means that before you can extract the result of the parse from the 'Right' branch, it must evaluate whether the result is 'Left' or 'Right' meaning it needs to parse the whole input in order to determine whether the parse was succesful.
This was the reason I made frisby's main parsing routine just be (roughly)
runPeg :: P a -> String -> a
so you have to do something explicit like
runPegMaybe :: P a -> String -> Maybe a runPegMaybe p s = runPeg (fmap Just p // return Nothing) s
to force strictness in the parsing.
Though, perhaps parsec is doing something more clever. I do know it uses the one token lookahead trick to determine which branch to take on alternation, but I don't think that solves the issue with parsing the entire file..
It doesn't stop it from parsing the entire file strictly. However, what it does do is prevent the parser from backtracking out of arbitrary amounts of lookahead. So, unless you use try (which allows for lookahead), when any token is consumed by the parser, it can be garbage collected (assuming the parser is the only thing pointing to the token stream). So, it consumes its input strictly, but with limited overhead (ideally using try only for some small bounded lookahead where it's needed). By contrast, a naive parser combinator of the form: p = foo <|> bar -- or p = try foo <|> bar in parsec Might read the entire file into memory parsing foo, without any of it being garbage collected until completion, in case foo fails and a backtrack to bar is required. Of course, this all assumes that the input to the parser can both be lazily generated, and discarded in pieces (so, not the case if reading an entire file into a strict byte string, for instance). -- Dan