
Summary: I have made the simplified version Duncan speculates about...

Duncan Coutts wrote:
> Yep, definitely interested. Sounds like we could make something that would satisfy the needs of existing users of the binary and binary-strict packages.
I see this as incremental development of incremental get. The comments in binary-strict's code point to one additional useful feature: when the parser suspends and asks for more input, it could also return a list or Sequence of the output so far. I believe this can be added to my existing code (even the simplified version below), so there will be an even fancier version eventually. This would make the parser a controllable part of a pipeline.
> We'll have to look closely at the performance costs of the new features but my intuition is that a non-transformer but continuation based version that has error handling (and plus/alternative) and can request more input should have minimal cost.
>
> Duncan
Paring "MyGet.hs" down to produce that type is a matter of using the delete key (which I have done, see below). My only machine for running Haskell is an Apple PowerPC/G4 laptop; given that, and the lack of either need or available time, I will not be making performance measurements. I have written for the language shootout before and I had binary's Get.hs to look at, so I claim my code has no show-stopping performance killers in it. A bit of !strictness and even more INLINE pragmas should be all it needs.

And so I took the delete key to MyGet.hs to make MyGetSimplified.hs. It is at

  http://darcs.haskell.org/packages/protocol-buffers/Text/ProtocolBuffers/

with the other files. The simplified form of the data definitions mostly fits in this email:

    newtype Get a = Get {
      unGet :: forall b.       -- the forall hides the CPS style
               Success b a     -- main continuation
            -> S               -- parser state
            -> FrameStack b    -- error handler stack
            -> Result b        -- operation
      }

    type Success b a = (a -> S -> FrameStack b -> Result b)

    data S = S { top      :: {-# UNPACK #-} !S.ByteString
               , current  :: {-# UNPACK #-} !L.ByteString
               , consumed :: {-# UNPACK #-} !Int64
               } deriving Show

    data FrameStack b = [snip details]

    data Result a
      = Failed {-# UNPACK #-} !Int64 String
          -- the Int64 is amount consumed successfully
      | Finished {-# UNPACK #-} !L.ByteString {-# UNPACK #-} !Int64 a
          -- the bytestring is the unconsumed part, the Int64 is amount consumed
      | Partial (Maybe L.ByteString -> Result a)
          -- passing Nothing indicates that there will never be more input

This could be streamlined further by making (FrameStack b) a field of S. The MonadError/Plus/Alternative instances should still work, and the parser will still suspend and resume. A few more short functions and this will have the same exported signatures as binary's Get.hs.

Cheers,
  Chris
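
P.S. To illustrate how a continuation-based Get can suspend on missing input and resume later, here is a minimal, self-contained sketch. It is not the code from MyGetSimplified.hs. It assumes a cut-down design: the FrameStack (and with it all error-handler and MonadPlus/Alternative support) is dropped, S keeps only the unconsumed lazy ByteString and a byte counter, and the strictness and UNPACK pragmas are omitted. The names getWord8, runGet, feed, describe and twoBytes are illustrative choices for this sketch only.

```haskell
{-# LANGUAGE RankNTypes #-}
import qualified Data.ByteString.Lazy as L
import Data.Int (Int64)
import Data.Word (Word8)

-- Same three outcomes as in the email, without the pragmas.
data Result a
  = Failed Int64 String                       -- bytes consumed, error message
  | Finished L.ByteString Int64 a             -- unconsumed input, bytes consumed, value
  | Partial (Maybe L.ByteString -> Result a)  -- Nothing means no more input will come

-- Assumption: state reduced to the unconsumed input and a counter.
data S = S { current :: L.ByteString, consumed :: !Int64 }

-- Assumption: no FrameStack argument, so there is no error-handler stack.
newtype Get a = Get { unGet :: forall b. (a -> S -> Result b) -> S -> Result b }

instance Functor Get where
  fmap f (Get g) = Get (\k -> g (k . f))

instance Applicative Get where
  pure a = Get (\k -> k a)
  Get f <*> Get g = Get (\k -> f (\h -> g (k . h)))

instance Monad Get where
  Get g >>= f = Get (\k -> g (\a -> unGet (f a) k))

-- Read one byte. On empty input, return Partial holding the continuation,
-- so the caller can supply another chunk (or Nothing to give up).
getWord8 :: Get Word8
getWord8 = Get go
  where
    go k s = case L.uncons (current s) of
      Just (w, rest) -> k w (S rest (consumed s + 1))
      Nothing        -> Partial resume
        where
          resume Nothing     = Failed (consumed s) "getWord8: no more input"
          resume (Just more) = go k s { current = more }

-- Start a parser on some initial input.
runGet :: Get a -> L.ByteString -> Result a
runGet (Get g) input =
  g (\a s -> Finished (current s) (consumed s) a) (S input 0)

-- Supply more input to a suspended parser; other results pass through.
feed :: Result a -> Maybe L.ByteString -> Result a
feed (Partial k) chunk = k chunk
feed done        _     = done

-- Render a Result as text (Partial holds a function, so no Show instance).
describe :: Show a => Result a -> String
describe (Failed n msg)      = "Failed after " ++ show n ++ " bytes: " ++ msg
describe (Finished rest n a) =
  "Finished " ++ show a ++ ", consumed " ++ show n ++ ", left " ++ show (L.unpack rest)
describe (Partial _)         = "Partial"

twoBytes :: Get (Word8, Word8)
twoBytes = do
  a <- getWord8
  b <- getWord8
  return (a, b)

main :: IO ()
main = do
  let r0 = runGet twoBytes (L.pack [1])               -- only one byte available
  putStrLn (describe r0)                              -- parser is suspended
  putStrLn (describe (feed r0 (Just (L.pack [2, 3])))) -- resumed with more input
  putStrLn (describe (feed r0 Nothing))               -- told there is no more input
```

Because a Partial value is an ordinary function, the same suspended parser (r0 above) can be resumed more than once with different inputs. A version that keeps the FrameStack would presumably thread it through the continuation in the same way as S.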