
On 17.03.2017 at 15:27, Mario Blažević wrote:
On 2017-03-16 05:26 PM, Ben Franksen wrote:
I am glad my remark didn't scare you off; in retrospect my wording was perhaps a bit strong. Yes, John did a lot to make things as lazy as possible to avoid excessive memory consumption (cool to say that, isn't it). There is also some ugly type casting (unsafeCoerce) going on, since the parser keeps the alternatives in an array (remember that this is a packrat parser).
I should be able to replace the array with a user-defined record; I submitted a paper to this year's ICFP demonstrating this.
That sounds interesting. Looking forward to reading that.
The only problem would be backward compatibility, but if there are no current users there's no problem.
Unfortunately I can't spare the time to work on this ATM. But I would be glad if you would revive the project. PEGs offer some unique advantages for day-to-day parsing tasks, where you can't be bothered to write a separate lexer or mess around with 'try' until your harmless-looking grammar actually accepts the source language. A fair portion of these can nowadays be handled nicely with regex-applicative (many file formats are actually regular), but now and again there is one where you need the power of a CFG.
We're on the same page here. I have a solution in mind that would allow one to choose a parsing algorithm, from Parsec-style to Packrat to parallel-parsing CFGs, and apply it to a single grammar specification written with little syntactic overhead compared to Parsec. Some of it is written up, some half-implemented.
My gut feeling would be to say that this can't work because they all build on a different (though /almost/ the same) set of primitives. For instance, IIRC the semantics of 'many' differs in subtle ways between implementations (greedy vs. maximum munch -- but don't ask me about the details, it's been a while since I studied these things).
Cheers
Ben
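To make the point about 'many' concrete, here is a minimal sketch of one well-known difference of this kind; it may or may not be the exact one alluded to above. The two toy parser types and all the names (Backtrack, Peg, charB, manyB, thenB and so on) are made up for illustration and are not taken from any of the libraries discussed. A PEG-style 'many' commits to the longest match, while a backtracking "list of successes" 'many' can still hand input back to whatever follows it.

```haskell
module Main where

-- Hypothetical toy parsers, written as plain functions to keep the sketch short.
type Backtrack a = String -> [(a, String)]      -- every way to parse a prefix
type Peg a       = String -> Maybe (a, String)  -- at most one, committed result

charB :: Char -> Backtrack Char
charB c (x:xs) | x == c = [(x, xs)]
charB _ _ = []

charP :: Char -> Peg Char
charP c (x:xs) | x == c = Just (x, xs)
charP _ _ = Nothing

-- Backtracking 'many': longest match first, shorter matches stay available.
-- (Assumes the argument parser consumes input, otherwise this loops.)
manyB :: Backtrack a -> Backtrack [a]
manyB p s =
  [ (x : xs, s'') | (x, s') <- p s, (xs, s'') <- manyB p s' ] ++ [([], s)]

-- PEG-style 'many': takes as much as it can and never gives anything back.
manyP :: Peg a -> Peg [a]
manyP p s = case p s of
  Nothing      -> Just ([], s)
  Just (x, s') -> do
    (xs, s'') <- manyP p s'
    Just (x : xs, s'')

-- Sequencing for both parser types.
thenB :: Backtrack a -> Backtrack b -> Backtrack (a, b)
thenB p q s = [ ((x, y), s'') | (x, s') <- p s, (y, s'') <- q s' ]

thenP :: Peg a -> Peg b -> Peg (a, b)
thenP p q s = do
  (x, s')  <- p s
  (y, s'') <- q s'
  Just ((x, y), s'')

main :: IO ()
main = do
  -- Grammar: many 'a' followed by one more 'a', run on "aaa".
  -- Backtracking succeeds, because 'many' can give one 'a' back.
  print (manyB (charB 'a') `thenB` charB 'a' $ "aaa")
  -- The PEG version fails: 'many' has already eaten every 'a'.
  print (manyP (charP 'a') `thenP` charP 'a' $ "aaa")
```

The same grammar text therefore accepts "aaa" under one interpretation and rejects it under the other, which is the sort of mismatch a single shared grammar specification would have to pin down.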