
On Sun, Jul 20, 2008 at 09:55:15AM -0400, Isaac Dupree wrote:
> It doesn't stop it from parsing the entire file strictly. However, what it does do is prevent the parser from backtracking out of arbitrary amounts of lookahead. So, unless you use try (which allows for lookahead), when any token is consumed by the parser, it can be garbage collected (assuming the parser is the only thing pointing to the token stream). So, it consumes its input strictly, but with limited overhead (ideally using try only for some small bounded lookahead where it's needed).
> So with Parsec, you can keep the *input* from filling up memory, but if you do, the *result* will still take up space (e.g. Right (value)). For a simple transformation where the output is a string similar to the input, the output will be just as large, so not much space is actually saved (maybe a factor of two: keeping only the output, not also the input), it seems.
Yeah, this is my understanding. frisby combats this via 'irrefutable' parser combinators. An irrefutable combinator is one that always succeeds; a prime example is the 'many' combinator. Since 'many' consumes only as many repetitions of its argument as it can, and is perfectly happy consuming nothing, it inherently always succeeds, so the parser can begin returning results immediately (before consuming all of the input). Ironically, this means frisby often uses less space than other parsers, despite being based on PEGs, which are generally known for taking a lot of space. It is not too hard to ensure your parser is irrefutable; for instance, the parser for a simple language might be
many statement <> eof
However, the 'eof' makes the parser non-irrefutable. Irrefutability is easy to regain by writing
many statement <> (eof // pure (error "unexpected data"))
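To illustrate the idea without pulling in frisby itself, here is a minimal sketch in plain Haskell of what "irrefutable" buys you: a parser that always succeeds can yield its results lazily, so a consumer can take the first few results before the rest of the input is ever examined (or retained). The names `digits` and `statements` are hypothetical stand-ins for a real grammar, not frisby's API.

```haskell
module Main where

-- Parse a run of digits as one "statement"; always succeeds,
-- consuming zero or more characters (like frisby's 'many').
digits :: String -> (String, String)
digits = span (`elem` ['0' .. '9'])

-- "many statement": keeps parsing digit-runs separated by ';'
-- and yields them lazily, one at a time, as they are recognized.
statements :: String -> [String]
statements [] = []
statements s =
  let (d, rest) = digits s
  in d : case rest of
           (';':rest') -> statements rest'
           _           -> []

main :: IO ()
main = do
  -- The input is conceptually unbounded, but because the parser
  -- always succeeds and produces output incrementally, taking the
  -- first three results only forces a small prefix of the input.
  let input = concatMap (\n -> show n ++ ";") [1 :: Integer ..]
  print (take 3 (statements input))
```

A refutable parser (one that may fail, like anything sequenced with a bare 'eof') cannot do this: it must see the whole input before it knows whether to return a result at all, so the entire result is held in memory until the end.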
frisby's static analysis realizes that (irrefutable // ...) and (... // irrefutable) are themselves irrefutable.

John
--
John Meacham - ⑆repetae.net⑆john⑈