Impact of "try" on Parsec performance

The Parsec documentation says that Parsec performs best on predictive grammars, and Parsec does not backtrack by default to improve performance (though this also improves error messages). On the other hand, I notice that attoparsec and polyparse backtrack by default, and attoparsec claims to be faster than Parsec (I can't remember if polyparse makes this claim).

The Parsec documentation says there is a penalty for using try and suggests left-factoring grammars when possible. Real World Haskell says that excessive use of try can degrade Parsec performance quite significantly. On the other hand, some of the combinators provided with Parsec, such as notFollowedBy, use try.

So my question is: what is the practical impact of using try? I ask because as a novice I have a simple grammar that could, perhaps, be factored to avoid use of try, but it's quite a brain puzzle for me and I wonder if it would ever be worth it. Does it matter how many characters "try" would have to store, or how often it would run?

Omari Norman wrote:
So my question is: what is the practical impact of using try?
My experience is based mostly on trying to improve the error location reporting in Ben Lippmeier's DDC compiler. What I found was that removing Parsec.try's almost always resulted in the parser generating better location information.

Erik de Castro Lopo
http://www.mega-nerd.com/

On Fri, Mar 2, 2012 at 10:35 PM, Erik de Castro Lopo wrote:
Omari Norman wrote:
So my question is: what is the practical impact of using try?
My experience is based mostly on trying to improve the error location reporting in Ben Lippmeier's DDC compiler. What I found was that removing Parsec.try's almost always resulted in the parser generating better location information.
Roman Cheplyaka made some tweaks to 'try' error reporting in parsec 3.1.2, so this might be different now. Antoine

On Fri, Mar 2, 2012 at 10:30 PM, Omari Norman wrote:
The Parsec documentation says that Parsec performs best on predictive grammars, and Parsec does not backtrack by default to improve performance (though this also improves error messages).
A big thing that parsec strives to do is to "forget" consumed portions of the input as quickly as possible, allowing it to be collected. Using 'try' forces the program to hold on to input for longer than would be required if the grammar were factored to avoid 'try'.
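As a rough sketch of what that factoring looks like (the keywords here are invented, not from the grammar in question), compare a backtracking alternation with its left-factored equivalent:

```haskell
-- A minimal sketch, assuming parsec-3: two ways to parse the keywords
-- "private" and "print".  The first backtracks with 'try', so the shared
-- prefix must be retained until the branch succeeds or fails; the second
-- is left-factored, so consumed input can be dropped immediately.
import Text.Parsec
import Text.Parsec.String (Parser)

keywordTry :: Parser String
keywordTry = try (string "private") <|> string "print"

keywordFactored :: Parser String
keywordFactored = do
  pre  <- string "pri"                    -- common prefix, consumed once
  rest <- string "vate" <|> string "nt"   -- alternatives no longer overlap
  return (pre ++ rest)
```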
The Parsec documentation says there is a penalty for using try and suggests left-factoring grammars when possible. Real World Haskell says that excessive use of try can degrade Parsec performance quite significantly. On the other hand, some of the combinators provided with Parsec, such as notFollowedBy, use try.
The assumption is that 'notFollowedBy' is primarily used with 'short' parsers, which means the amount of input we need to preserve is less (also, I think using 'try' is the only way to implement 'notFollowedBy').
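For example (my own toy parser, not one from Parsec's documentation), the look-ahead parser passed to 'notFollowedBy' is typically a single token:

```haskell
-- A minimal sketch, assuming parsec-3: 'notFollowedBy' with a one-character
-- look-ahead parser, so very little pending input needs to be retained.
import Text.Parsec
import Text.Parsec.String (Parser)

-- Accept the keyword "let", but not as a prefix of a longer
-- identifier such as "letter".
keywordLet :: Parser String
keywordLet = do
  kw <- string "let"
  notFollowedBy alphaNum
  return kw
```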
So my question is: what is the practical impact of using try? I ask because as a novice I have a simple grammar that could, perhaps, be factored to avoid use of try, but it's quite a brain puzzle for me and I wonder if it would ever be worth it. Does it matter how many characters "try" would have to store, or how often it would run?
I guess you could test running a parser which always fails without consuming input both underneath a 'try' and not, to see what non-storage overhead 'try' has. I'm guessing the overhead would be minimal, but I have not measured it. Antoine
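Something along these lines could serve as that test (a rough sketch using criterion; the parser, names, and input are made up):

```haskell
-- A rough benchmark sketch, assuming parsec-3 and criterion: the only
-- difference between the two benchmarks is the 'try' wrapper around a
-- parser that always fails without consuming input.
import Criterion.Main (bench, defaultMain, nf)
import Text.Parsec
import Text.Parsec.String (Parser)

failNoConsume :: Parser Char
failNoConsume = unexpected "always fails"

-- Run the failing branch before a trivial fallback, once per input character.
runMany :: Parser Char -> String -> Either ParseError String
runMany p = parse (many (p <|> anyChar)) "<bench>"

main :: IO ()
main = defaultMain
  [ bench "without try" $ nf (show . runMany failNoConsume) input
  , bench "with try"    $ nf (show . runMany (try failNoConsume)) input
  ]
  where
    input = replicate 100000 'x'
```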

I'd suggest `try` can make parsers hard to debug and fragile when refactoring. `try` is very useful for char parsers, where you don't really want to be sharing a prefix amongst different parsers. For regular parsers you have equivalents of the EBNF operators plus the `sepBy` etc. parsers, so the "obsessive" left-recursion removal you see in textbooks isn't so burdensome: you have combinators to help you do it rather than having to rely on introducing more productions.
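For instance (a made-up example, not one of Stephen's), the textbook left-recursive rule `list ::= list ',' int | int` collapses to a single `sepBy1`:

```haskell
-- A minimal sketch, assuming parsec-3: no grammar rewriting needed,
-- the combinator says directly what the production means.
import Text.Parsec
import Text.Parsec.String (Parser)

integer :: Parser Integer
integer = read <$> many1 digit

-- parse intList "" "1,2,3"  ==>  Right [1,2,3]
intList :: Parser [Integer]
intList = integer `sepBy1` char ','
```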

Omari Norman wrote:
The Parsec documentation says that Parsec performs best on predictive grammars, and Parsec does not backtrack by default to improve performance (though this also improves error messages).
On the other hand, I notice that attoparsec and polyparse backtrack by default, and attoparsec claims to be faster than Parsec (I can't remember if polyparse makes this claim).
Attoparsec's goal is to be fast in practice. It achieves this goal by making some compromises, e.g. being potentially asymptotically slower and admitting potential memory leaks, because of the unrestricted backtracking. Often this doesn't happen or isn't significant, but I believe users should be aware of this.

I think the speed benefit of not restricting backtracking comes just from allocating fewer continuations. Bryan, can you confirm?

Although Polyparse permits backtracking by default, it is still closer to Parsec in that it allows controlled backtracking.

Roman I. Cheplyaka :: http://ro-che.info/
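To make the contrast concrete (my own example, not from the thread), the keyword alternation sketched earlier needs no 'try' at all in attoparsec, because '<|>' always rewinds to where the failed branch started; the trade-off is that the input covered by the failed branch cannot be released early:

```haskell
{-# LANGUAGE OverloadedStrings #-}
-- A minimal sketch, assuming attoparsec: '<|>' backtracks by default.
import Control.Applicative ((<|>))
import Data.Attoparsec.ByteString.Char8 (Parser, parseOnly, string)
import Data.ByteString (ByteString)

keyword :: Parser ByteString
keyword = string "private" <|> string "print"

main :: IO ()
main = print (parseOnly keyword "print")   -- Right "print"
```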

On 3 Mar 2012, at 04:30, Omari Norman wrote:
On the other hand, I notice that attoparsec and polyparse backtrack by default, and attoparsec claims to be faster than Parsec (I can't remember if polyparse makes this claim).
In my benchmarks, polyparse has about the same performance as Parsec, when using the monadic style (possibly a very tiny bit faster). But polyparse is hugely, asymptotically, faster than Parsec when your parser is written in applicative style, your input text is large, and you consume the parse results lazily. Regards, Malcolm

On Mar 3, 2012, at 12:45, Malcolm Wallace wrote:
On 3 Mar 2012, at 04:30, Omari Norman wrote:
On the other hand, I notice that attoparsec and polyparse backtrack by default, and attoparsec claims to be faster than Parsec (I can't remember if polyparse makes this claim).
In my benchmarks, polyparse has about the same performance as Parsec, when using the monadic style (possibly a very tiny bit faster). But polyparse is hugely, asymptotically, faster than Parsec when your parser is written in applicative style, your input text is large, and you consume the parse results lazily.
As is the case for uu-parsinglib ;-}

Doaitse
participants (7)
- Antoine Latter
- Erik de Castro Lopo
- Malcolm Wallace
- Omari Norman
- Roman Cheplyaka
- S D Swierstra
- Stephen Tetley