Data.Binary.IncrementalGet remake

Hello cafe. Haskell wiki told me about continuation-based parser Data.Binary.IncrementalGet [1] from binary-strict package. I found the idea very useful and tried to use it. Original library by Lennart Kolmodin raises some questions. The lib's main data structures are: data IResult a = IFailed S String | IFinished S a | IPartial (B.ByteString -> IResult a) newtype Get r a = Get { unGet :: S -> (a -> S -> IResult r) -> IResult r } instance Monad (Get r) where return a = Get (\s -> \k -> k a s) m >>= k = Get (\s -> \cont -> unGet m s (\a -> \s' -> unGet (k a) s' cont)) fail err = Get (\s -> const $ IFailed s err) Here, "S" is parser's state. It works well, but currently doesn't support lookAhead. I tried to add such function and gave up to do it having current implementation, but write simpler one instead. Please see IncrementalGet2.hs (link [2] below). Remake is briefly tested, has no ghc-specific optimizations, but allows user to peek data from stream. What bothering me is the fact that I could actually miss idea (or part of idea) of the original. If it is so, please let me know :) For example, what is the point of using separate result type r in original Get r a? [1] - Original IncrementalGet lib. http://www.haskell.org/haskellwiki/DealingWithBinaryData#Incremental_parsing [2] - Main file of remake https://github.com/ierton/binary-incremental-get2/blob/a9f9afaa0cbc0435a8ace... [3] - whole github project https://github.com/ierton/binary-incremental-get2 -- Sergey sorry for weak English

On Wed, Apr 20, 2011 at 1:03 PM, Sergey Mironov
Hello cafe.
Haskell wiki told me about continuation-based parser Data.Binary.IncrementalGet [1] from binary-strict package. I found the idea very useful and tried to use it. Original library by Lennart Kolmodin raises some questions. The lib's main data structures are:
Lennart Kolmodin has a branch of Binary with incremental get which supports lookAhead: https://github.com/kolmodin/binary/tree/cps I don't have performance measurements, but if you look-ahead too far it obviously isn't good for memory consumption. Antoine
data IResult a = IFailed S String | IFinished S a | IPartial (B.ByteString -> IResult a)
newtype Get r a = Get { unGet :: S -> (a -> S -> IResult r) -> IResult r }
instance Monad (Get r) where return a = Get (\s -> \k -> k a s) m >>= k = Get (\s -> \cont -> unGet m s (\a -> \s' -> unGet (k a) s' cont)) fail err = Get (\s -> const $ IFailed s err)
Here, "S" is parser's state. It works well, but currently doesn't support lookAhead. I tried to add such function and gave up to do it having current implementation, but write simpler one instead. Please see IncrementalGet2.hs (link [2] below). Remake is briefly tested, has no ghc-specific optimizations, but allows user to peek data from stream.
What bothering me is the fact that I could actually miss idea (or part of idea) of the original. If it is so, please let me know :) For example, what is the point of using separate result type r in original Get r a?
[1] - Original IncrementalGet lib. http://www.haskell.org/haskellwiki/DealingWithBinaryData#Incremental_parsing [2] - Main file of remake https://github.com/ierton/binary-incremental-get2/blob/a9f9afaa0cbc0435a8ace... [3] - whole github project https://github.com/ierton/binary-incremental-get2 -- Sergey
sorry for weak English
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Hi,
On Wed, Apr 20, 2011 at 8:35 PM, Antoine Latter
On Wed, Apr 20, 2011 at 1:03 PM, Sergey Mironov
wrote: Hello cafe.
Haskell wiki told me about continuation-based parser Data.Binary.IncrementalGet [1] from binary-strict package. I found the idea very useful and tried to use it. Original library by Lennart Kolmodin raises some questions. The lib's main data structures are:
Lennart Kolmodin has a branch of Binary with incremental get which supports lookAhead:
Thanks Antonine for noting this.
I don't have performance measurements, but if you look-ahead too far it obviously isn't good for memory consumption.
Indeed you trade off memory consumption if you're not careful with lookAhead, as you would do with any depth-first parser allowing choice. Although I haven't benchmarked, I think it has a more memory efficient implementation than the increasingly popular text parsing library attoparsec (foundation of haskell's fastest json library, aeson).
Antoine
data IResult a = IFailed S String | IFinished S a | IPartial (B.ByteString -> IResult a)
newtype Get r a = Get { unGet :: S -> (a -> S -> IResult r) -> IResult r }
instance Monad (Get r) where return a = Get (\s -> \k -> k a s) m >>= k = Get (\s -> \cont -> unGet m s (\a -> \s' -> unGet (k a) s' cont)) fail err = Get (\s -> const $ IFailed s err)
Here, "S" is parser's state. It works well, but currently doesn't support lookAhead. I tried to add such function and gave up to do it having current implementation, but write simpler one instead. Please see IncrementalGet2.hs (link [2] below). Remake is briefly tested, has no ghc-specific optimizations, but allows user to peek data from stream.
What bothering me is the fact that I could actually miss idea (or part of idea) of the original. If it is so, please let me know :) For example, what is the point of using separate result type r in original Get r a?
In the case of parsing with binary, I don't think there are any serious limitations with not being able to specify your own "r" in "Get r a". I ran into something during the implementation, but I'm afraid it wasn't noteworthy enough to remember :) Possibly it had to do something with saving continuations for later use, and how flexible you could be when using them? Hmm.. Anyway, not a problem in binary. Work has been done lately by Johan Tibell to make the builder (part of the Put monad) efficient, and with very good progress. I've worked on the CPS based parsing. Our aim is to provide the easiest to use, and at the same time, the fastest, binary parsing library out there. Expect much internal change, yet backwards compatibility, with the next release :) Unfortunately I've been extremely busy lately, so don't expect a release for a few more months. I invite everybody to try out the new code and try to benchmark and break it :) Cheers, Lennart Kolmodin
participants (3)
-
Antoine Latter
-
Lennart Kolmodin
-
Sergey Mironov