
On Sun, Jul 20, 2008 at 09:55:15AM -0400, Isaac Dupree wrote:
> It doesn't stop it from parsing the entire file strictly. However, what it does do is prevent the parser from backtracking out of arbitrary amounts of lookahead. So, unless you use try (which allows for lookahead), when any token is consumed by the parser, it can be garbage collected (assuming the parser is the only thing pointing to the token stream). So, it consumes its input strictly, but with limited overhead (ideally using try only for some small bounded lookahead where it's needed).
> So with Parsec, you can keep the *input* from filling up memory, but if you do, the *result* will still take up space (e.g. Right (value)). For a simple transformation where the output is a string similar to the input, the output will be just as large, so not much space is actually saved (maybe a factor of two: keeping only the output, not also the input), it seems.
Yeah, this is my understanding. frisby combats this via 'irrefutable' parser combinators. An irrefutable combinator is one that always succeeds; a prime example is the 'many' combinator. Since 'many' consumes only as many repetitions of its argument as it can, and is perfectly happy consuming nothing, it inherently always succeeds, so the parser can begin returning results immediately (before consuming all of the input). Ironically, this means frisby often uses less space than other parsers, despite being based on PEGs, which are generally known for taking a lot of space. It is not too hard to ensure your parser is irrefutable; for instance, the parser for a simple language might be
many statement <> eof
However, the 'eof' makes the parser non-irrefutable. Irrefutability is easy to regain by writing
many statement <> (eof // pure (error "unexpected data"))
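To illustrate the idea without pulling in frisby itself, here is a minimal sketch in plain Haskell of what "irrefutable" buys you: a parser that always succeeds can yield its results lazily, so a consumer can take the first few results before the rest of the input is ever examined (or retained). The names `digits` and `statements` are hypothetical stand-ins for a real grammar, not frisby's API.

```haskell
module Main where

-- Parse a run of digits as one "statement"; always succeeds,
-- consuming zero or more characters (like frisby's 'many').
digits :: String -> (String, String)
digits = span (`elem` ['0' .. '9'])

-- "many statement": keeps parsing digit-runs separated by ';'
-- and yields them lazily, one at a time, as they are recognized.
statements :: String -> [String]
statements [] = []
statements s =
  let (d, rest) = digits s
  in d : case rest of
           (';':rest') -> statements rest'
           _           -> []

main :: IO ()
main = do
  -- The input is conceptually unbounded, but because the parser
  -- always succeeds and produces output incrementally, taking the
  -- first three results only forces a small prefix of the input.
  let input = concatMap (\n -> show n ++ ";") [1 :: Integer ..]
  print (take 3 (statements input))
```

A refutable parser (one that may fail, like anything sequenced with a bare 'eof') cannot do this: it must see the whole input before it knows whether to return a result at all, so the entire result is held in memory until the end.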
frisby's static analysis realizes that (irrefutable // ...) and (... // irrefutable) are themselves irrefutable.

John
--
John Meacham - ⑆repetae.net⑆john⑈