Re: [Haskell-cafe] Parsec to parse tree structures?

17 Mar 2010

      Thanks for the links!

I didn't realize that Iteratee can also do random access. I looked at
Oleg's slides a while back, but I haven't wrapped my head around it.

On Mon, 2010-03-15 at 08:57 -0500, John Lato wrote:
...
Hi Dave,
...
From: david fries 
Hello Café
Some time ago I wrote a parser for a project of one our customers. The
format was proprietary and binary. The data was structured as a tree
with tables pointing to sub tables farther in the file. (Well actually
there was one or two cases where branches joined together, so I guess it
was a directed graph.) Also it had no defined endianess, some tables
were bigendian others were little endian and some were in their very own
in-house-endianess.
All in all, the typical binary data structure that has been in use and
continuously extended for the last 15 years. You know the kind.
My parser was written in C# and wasn't particularly elegant, but it
worked reliably. I was wondering how you would parse tree-like
structures with Parsec (or other functional parsers)? Up to know, all
examples I've seen were of sequential data. To parse trees, you'd
essentially need to be able to follow pointers, parse whatever is there
and then jump back. I guess I'd have to mess around with the internal
state of the parser to do that, which is something I'd rather avoid.
Have you seen Edward Kmett's Parsing Trifecta talk/slides, available
at http://comonad.com/reader/2009/iteratees-parsec-and-monoid/ ?  I
think a similar approach would allow you to parse this data structure.
 The trick is to create a data source that can read from arbitrary
locations.
Also see http://okmij.org/ftp/Haskell/Iteratee/Tiff.hs for an
iteratee-based TIFF reader.  TIFF files have a similar structure, so
this code could apply pretty directly.  The parsing is quite
low-level, but you could use iteratee-parsec [1] or
attoparsec-iteratee [2] to use parser combinators with this technique.
[1] http://hackage.haskell.org/package/iteratee-parsec
[2] http://hackage.haskell.org/package/attoparsec-iteratee