Re: [Haskell-cafe] attoparsec and backtracking

16 Mar 2013


      On 16 March 2013 12:54, Erik de Castro Lopo  wrote:
...
Evan Laforge wrote:
...
The first is that it's hard to get the right error msg out.  For
instance, I have a parser that tries to parse a number with an
optional type suffix.  It's an error if the suffix is unrecognized:
p_num :: A.Parser Score.TypedVal
p_num = do
    num <- p_untyped_num
    suffix <- A.option "" ((:"") <$> A.letter_ascii)
    case Score.code_to_type suffix of
        Nothing -> fail $ "p_num expected suffix in [cdsr]: " ++ show suffix
        Just typ -> return $ Score.Typed typ num
I think the mistake here is to parse something and then decide if
its it valid. It should be the parser which decides whether its
valid. So rather than:
suffix <- A.option "" ((:"") <$> A.letter_ascii)
try:
typ <- A.choice [ {- list or valid suffix parsers -} ]
     return $ Score.Typed typ num
...
However, which error msg shows up depends on the order of the (<|>)
alternatives, and in general the global structure of the entire
parser, because I think it just backtracks and then picks the last
failing backtrack.
I'm not sure if what I've offered will help, but its worth a try.
...
Even after carefully rearranging all the parsers
it seems impossible to get this particular error to bubble up to the
top.
Yes, I've found it impossible to force attoparsec to fail a parse.
I think that is intended as a feature.
I don't know about a "feature", but I tried adding polyparse-style
commit semantics to attoparsec and couldn't do so without making it
rather noticeably slower.
...
...
The thing is, as soon as I see an unexpected suffix I know I can
fail entirely right there, with exactly that error msg, but since
there's no way to turn off backtracking I think there's no way to do
that.
Yes, that's my impression.
<snip>
...
I
originally used parsec, but parsing speed is my main bottleneck, so I
don't want to give ground there.
We you using Parsec as a token parser or as a Char parser. Obviously
the second is going to be slow in comparison to the first.
...
I've heard some good things about traditional alex+happy...
of course it would mean a complete rewrite but might be interesting.
I've user Alex with both Parsec and Happy, where speed was strong
secondary goal. Personally I much prefer Parsec; IMO its easier to debug
and extend and modify that Happy based parsers. I also know some people
prefer Happy.
...
Has anyone compared the performance of attoparsec vs. alex+happy?
I haven't, nor have I compared those two with alex+parsec. It would
be an interesting experiment.
Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
-- 
Ivan Lazar Miljenovic
Ivan.Miljenovic@gmail.com
http://IvanMiljenovic.wordpress.com