New subject: Fwd: Parsing in Practice

19 Oct 2005

      [Sending it again to haskell-cafe]

On 10/18/05, Tom Hawkins wrote:
...
I am writing a parser for a big, ugly, standard language and I need to
decide between using either Happy or Parsec.
I wrote a parser for a big, ugly, non-standard language - Transact-SQL
from MSSQL.
...
I currently have a preliminary LALR(1) grammar, so a port to Happy would
be relatively easy.
In such a situation I would try Happy first, but with some caution.

I also started with LALR(1) parsing in ocamlyacc, but I didn't manage
to cover the whole language with this approach. The more productions
I added the more conflicts I had. Of course I tried to remove the conflicts,
but often when I added new productions I had to repeat the work. It is
possible that my understanding of LALR parsing was insufficient, but the
problem is that the set of LALR grammars is not closed under composition
(as I've read in some paper on GLR parsing).

Do you know that Happy supports GLR parsing?
...
But, I'm wondering if life would be easier if I chose Parsec's combinator
parsing instead.
...
From my experience it was indeed quite nice. However, there were some
parts of the grammar that I didn't know how to handle without using "try",
which could result in worse efficiency and error reporting.
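
A minimal sketch of the trade-off, assuming Parsec 3 (Text.Parsec). The
two-statement grammar and all names below are invented for illustration
and are not taken from the T-SQL parser. Both alternatives begin with an
identifier, so one can either wrap the first in "try" or left-factor the
common prefix:

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical grammar, invented for illustration:
--   stmt ::= ident "=" number | ident "(" ")"
data Stmt = Assign String Integer | Call String
  deriving (Eq, Show)

ident :: Parser String
ident = many1 letter <* spaces

number :: Parser Integer
number = read <$> many1 digit <* spaces

symbol :: Char -> Parser ()
symbol c = char c *> spaces

-- Version 1: both alternatives start with ident, so the first one needs
-- try to rewind the input when it fails after consuming something.
stmtTry :: Parser Stmt
stmtTry = try assign <|> call
  where
    assign = Assign <$> ident <* symbol '=' <*> number
    call   = Call <$> ident <* symbol '(' <* symbol ')'

-- Version 2: left-factored. The shared ident is parsed once and the
-- choice is made on the following token, so no try is needed.
stmtFactored :: Parser Stmt
stmtFactored = do
  name <- ident
  (Assign name <$> (symbol '=' *> number))
    <|> (Call name <$ (symbol '(' *> symbol ')'))
```

Both versions accept the same inputs. On a malformed input such as
"x = y", the "try" version rewinds and reports the failure of the second
alternative at the "=" sign, while the factored version reports the
error at the "y", where it actually occurred.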
...
Its error reporting seems to be top notch
So long as you can avoid using "try" for non-terminals.
...
and its "optional", "many", and "sepBy1" combinators are very elegant.
Indeed they are, as they correspond nicely to E in EBNF ;-)
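
To make the correspondence concrete, here is a small sketch assuming
Parsec 3. The EBNF rule and every name in the code are made up for the
example. It uses "optionMaybe" rather than "optional" because the
optional part's result is needed; "optional" itself discards it.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical EBNF, invented for illustration:
--   block ::= { decl }
--   decl  ::= "var" ident { "," ident } [ ":" ident ] ";"
data Decl = Decl [String] (Maybe String)
  deriving (Eq, Show)

-- Run a parser, then skip trailing white space.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

ident :: Parser String
ident = lexeme (many1 letter)

-- try is used here only on a terminal, so that "variable" is not
-- half-consumed as the keyword "var".
keyword :: String -> Parser ()
keyword s = lexeme (try (string s *> notFollowedBy letter))

symbol :: Char -> Parser ()
symbol c = lexeme (() <$ char c)

decl :: Parser Decl
decl = do
  keyword "var"
  names <- ident `sepBy1` symbol ','          -- ident { "," ident }
  ty    <- optionMaybe (symbol ':' *> ident)  -- [ ":" ident ]
  symbol ';'
  return (Decl names ty)

block :: Parser [Decl]
block = spaces *> many decl <* eof            -- { decl }
```

For instance, parse block "" "var a, b : int; var c;" should yield two
declarations, the first with a type and the second without.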
...
However, I have a few concerns with Parsec.  First is performance; what
factor of slow-down should I expect?
I only have a Parsec parser, so I can't compare.  I dimly remember that
my parser had average efficiency, i.e. it could parse about 100 kB
per second on a P3 850.

There is a parser-combinator version of a parser for Haskell which you
could compare to the one from GHC.
...
Second is bug prevention.  I don't have much experience writing LL(n)
grammars, so how easy is it to introduce bugs in a Parsec grammar?
I always feared that this would be a problem, but it wasn't. I used a
sample of about 1MB of application code to test, and I didn't get to
using QuickCheck.
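
A minimal sketch of one kind of bug that only shows up when testing on
sample input, assuming Parsec 3. The keyword set is made up and is not
taken from the T-SQL parser. Parsec's "<|>" commits to the first
alternative that succeeds, and nothing checks statically that a later
alternative is still reachable:

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical keyword set, invented for illustration.
-- Bug: "in" is a prefix of "int" and is listed first, so on the input
-- "int" the first alternative succeeds and leaves "t" unconsumed.
-- The second alternative can never be chosen, and nothing warns about it.
kwBad :: Parser String
kwBad = try (string "in") <|> try (string "int")

-- One possible fix: list the longer keyword first.
kwGood :: Parser String
kwGood = try (string "int") <|> try (string "in")
```

Here parse (kwBad <* eof) "" "int" fails at the trailing "t", while
kwGood accepts both "in" and "int".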
...
Even though I hate debugging LALR(1) parsing ambiguities,
it prevents problems.
Yes, when you fight the conflicts, you get static guarantees in return.
I wonder how it is with GLR?

Best regards
Tomasz

Tomasz Zielonka