
Hello again, I apologize for replying to myself, but since no one else is talking to me, I suppose I have no choice. :) Anyhow, in case some people were intrigued, but simply didn't speak up (and because I was interested in seeing how easily it could be done), I took the liberty of implementing a version of the parser inverter that mimics the OCaml semantics pretty closely (I think). As I mentioned, this involves making a list data type that incorporates monads, so that it can be lazy in the side effects used to produce it. In short it looks like this: data MList' m a = MNil | MCons a (MList m a) type MList m a = m (MList' m a) So, each list tail (including the entire list) is associated with a side effect, which has the ultimate effect that you can build lists in ways such as: toMList :: Monad m => m (Maybe t) -> MList m t toMList gen = gen >>= maybe nil (`cons` toMList gen) This is the MList analogue of the toList function from the previous list (slightly modified here to demonstrate the similarity): toList :: Monad m => m (Maybe a) -> m [a] toList gen = gen >>= maybe (return []) (\c -> liftM (c:) $ toList gen) However, toList uses liftM, which will strictly sequence the effects (the recursive toList call has to complete before the whole list is returned), whereas toMList simply adds the *monadic action* to produce the rest of the list as the tail, and so the side effects it entails don't actually occur until a consumer asks to see that part of the list. So, the proof is in the output. The sample program (source included as an attachment) demonstrates normal lexing (where the underlying monad is just IO) and inverted lexing (which uses delimited continuations layered over IO). The 'lexing' is just the 'words' function adapted to MLists (I thought about doing a full-on parser, but I think that'd require making the parser a monad transformer (essentially) over the base monad, which would be complex, to say the least). The relevant parts look like so: normalLex :: IO () normalLex = printTokens (wordsML (liftList "The quick brown fox jumps over the lazy dog")) reqLex :: CCT ans IO () reqLex = do p1 <- begin p2 <- provideSome "The quick brown " p1 pStrLn "Break 1" p3 <- provideSome "fox jumps over " p2 pStrLn "Break 2" p4 <- provideSome "the laz" p3 pStrLn "Break 3" provideSome "y dog" p4 >>= finish pStrLn "Rollback" provideSome "iest dog" p4 >>= finish return () Which main invokes appropriately. Output looks like so: Normal Lexing ------------- The quick brown fox jumps over the lazy dog ------------- Inverted Lexing --------------- The quick brown Break 1 fox jumps over Break 2 the Break 3 lazy dog Rollback laziest dog --------------- So, success! Tokens are printed out as soon as the lexer is able to recognize them, properly interleaved with other IO side effects, and resuming from an intermediate parse does not cause duplication of output. So, that wasn't really that hard to hack up. However, I should mention that it wasn't trivial, either. When converting list functions to MList functions, you have to be very careful not to perform side effects twice. For instance, my first pass gave output like: ... he uick rown Break 1 ox ... Although it worked fine with the normal lexer. The culprit? I had written nullML like so: nullML :: Monad m => MList m a -> m Bool nullML m = isNothing `liftM` uncons m But in that version, testing for null, and then using the list performs side effects twice, and due to the way the delimited continuations produce MLists, characters were getting dropped! The correct version is: nullML :: Monad m => MList m a -> m (Bool, MList m a) nullML m = uncons m >>= maybe (return (True, nil)) (\(a,m') -> return (False, a `cons` m')) Which returns both whether the list is null, and a new list that won't perform a duplicate side effect. So, I guess what I'm saying is that reasoning about code with lots of embedded side effects can be difficult. :) As a final aside, it should be noted that to get the desired effect (that is, laziness with interleaved side effects), it's important to make use of the monadic data structures as much as possible. For instance, wordsML produces not an (m [MList m a]) or MList m [a] or anything like that (although the latter may work), but an MList m (MList m a), which is important for the effects to be able to get a hold over printTokens. However, if you want to produce something that's not a list, say, a tree, you'll have to write an MTree, or, in general, one lazy-effectful data structure for each corresponding pure structure you'd want to use. What a pain! However, there may be a way to alleviate that if you write all your structures in terms of shape functors. For instance: data ListShape a x = LNil | LCons a x newtype Fix f = In { out :: f (Fix f) } -- I think this is right type List a = Fix (ListShape a) And in general, many recursive data structures can be expressed as the fixed-point of shape functors. The kicker is, you can get the monadic version for free: newtype MShape m f x = M (f (m x)) type MList m a = m (Fix (MShape m (ListShape a))) -- = m (MShape m (ListShape a) (Fix (MShape m (ListShape a)))) -- = m (ListShape a (m (Fix (MShape m (ListShape a))))) -- = m (ListShape a (MList m a)) -- = m (LNil | LCons a (MList m a)) -- same as our manual definition -- I think the above substitutions are right, but I may have -- misstepped Of course, I haven't investigated this avenue, so I don't know if it helps in actually writing functions that *use* such data structures (and it might kill your ability to deforest/use an optimized representation underneath). However, I thought it was a cute use of the sort of thing you're likely to see in papers that apply category theory to Haskell, but typically not in practice. Anyhow, I hope that was of some interest to at least someone out there. If you have questions or comments, feel free to respond. Cheers Dan Doel