
Now take decodeRLEb and feed its output to some nontrivial parser, and then feed the remainder of the input, unmodified, into another parser:
so the code as posted didn't exhibit a full use case, and that specification is still a bit vague. assuming p1: decodeRLE, p2: nontrivial parser, and p3: another parser, it could be interpreted simply as parser combination:

    do { output <- p1; x <- p2 output; y <- p3; return (x,y) }

or perhaps you meant to run p2 over the output of p1 in a separate parser chain, with the remaining input left by p1, not by p2, being fed into p3?

    do { output <- p1; Just x <- return $ evalStateT p2 output; y <- p3; return (x,y) }

then we'd have something like

    p2 `stack` p1 = do { out <- p1; Just x <- return $ evalStateT p2 out; return x }
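to make that second reading concrete, here is a minimal, self-contained sketch; the list-of-tokens parser type and the 'item' parser are my own illustrative choices, not your code:

```haskell
import Control.Monad.State

-- a simple parser: state-threaded token list, failure via Maybe
type P t a = StateT [t] Maybe a

-- hypothetical item parser: consume one token
item :: P t t
item = StateT uncons'
  where uncons' []       = Nothing
        uncons' (c : cs) = Just (c, cs)

-- run p2 in a separate chain over p1's output; the remaining
-- input is whatever p1 left, and p2's leftovers are discarded
stack :: P u a -> P t [u] -> P t a
stack p2 p1 = do
  out    <- p1
  Just x <- return (evalStateT p2 out)
  return x
```

for example, stacking `item` on a p1 that consumes two tokens, run over "abc", yields 'a' and leaves "c" for a following p3.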
Since I don't know how much data other_stuff is going to consume - let alone how much of the raw data you have to feed to decodeRLEb to make that much data - we arrive at the structure shown.
ah, that suggests yet another specification, a variation of the second version above, where the parser in control is not p1 itself, but p2, with p1 acting as an input transformation for p2, and p3 resuming where p1 left off. the difference being that p2's demand is supposed to drive p1's input processing.

which is a bit of a problem. parsers are usually data- and grammar-driven, not demand-driven, ie the input consumed by p1 does not usually depend on the demands on p1's output. one could let p1 generate results of increasing length, and let p2 pick a result that fits, but that would involve rerunning p2 on the complete prefix of too-short results, backtracking into p1 until it produces an output useable by p2 - not exactly elegant or efficient, but it would fit the second variant above (one would have to ensure that p1 backtracked only over the length of input consumed, eg, an outermost 'many', and that the shortest alternative was produced first).

looking a little bit more closely, however, p1 is used more as a piecewise input transformation for p2 than as a separate parser. so it makes more sense to integrate p1 into p2 (or rather: parts of p1 - if p1 is 'many group', then we should integrate only 'group'; in other words, we'd like to run p1 repeatedly, in minimal-munch mode, rather than the more typical once, in maximal-munch mode), so that the latter uses some part of p1 as its item parser (which, in turn, assumes that p2 has a single, identifiable item parser - called 'fetch' here - and no other way to access the parse state).

that seems to be what you have in mind with your stacked approach, where the state is read exclusively through the fetch method in the Source interface, and a Source can be either a plain list or a buffered item parser stacked on top of a Source (fetch is served from the buffer, which is replenished by running the item parser over the inner Source). btw, are unused buffer contents discarded when p2 finishes? they can't be transformed back into p1/p3's input format..

instead of using a type class, one could also parameterise p2 by its item parser, 'fetch'. that might make it easier to see that this stacking is a kind of parser composition. unlike the standard function and monad compositions, this one relies on the compositional nature of combinator parsers: there's an item parser, which accesses the input and produces output, and there is a coordination framework (the combinatorial grammar) specifying how the item parser is to be used.

function composition allows us to abstract over each part of the composed function, including the inner function in a 'stack' of functions:

    \x -> f (g x)
    ==> -- abstract over g
    (f .)

we can try to view parsers as composed from a grammar and an item parser, where the latter is the 'inner' part of this composition:

    \s -> (item >> item) s `mplus` item s
    ==> -- abstract over item
    \item s -> (item >> item) s `mplus` item s

turning item/fetch into a type class method is just another way of composing the grammar with an item parser.

i had to implement it myself to understand what you were trying to do, and how.. if indeed i have understood?-)

hth,
claus
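ps. since i had to implement a version of it anyway, here is a minimal, self-contained sketch of the parameterised-fetch variant; the RLE-style 'group' step, the buffered fetch, and the toy 'pair' grammar are illustrative names of mine, not your code:

```haskell
import Control.Monad.State

-- parser over input s, failure via Maybe
type P s a = StateT s Maybe a

-- one minimal-munch step of an RLE-style decoder ('group');
-- hypothetical raw format: a list of (count, element) pairs
group :: P [(Int, a)] [a]
group = StateT step
  where step ((n, x) : rest) = Just (replicate n x, rest)
        step []              = Nothing

-- fetch served from a buffer, replenished on demand by running
-- 'group' over the undecoded input: p2's demand drives the decoding
fetchBuffered :: P ([a], [(Int, a)]) a
fetchBuffered = StateT go
  where go (x : buf, raw) = Just (x, (buf, raw))
        go ([],      raw) = runStateT group raw >>= go

-- a grammar that touches the parse state only through its fetch
-- parameter - the type-class Source/fetch version composes the
-- same two parts by other means
pair :: Monad m => m a -> m (a, a)
pair fetch = do { x <- fetch; y <- fetch; return (x, y) }
```

running `pair fetchBuffered` from `([], [(3,'a'),(2,'b')])` decodes only the first group, returns ('a','a'), and leaves the undecoded [(2,'b')] plus a leftover buffer "a" - which is exactly the unused buffer content that can't be turned back into the raw format.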
(This makes it, what, the 5th time I've explained this? LOL...)
with problem specifications, it isn't quantity that counts. the more ambiguous the specification, the more likely it is that replies interpret it in ways that do not answer the question. the fewer replies seem to address the question, the more likely it is that the specification needs to be clearer. on a high-volume list where readers might dip into and out of long threads at any point, repetition in the form of concise summaries can be helpful, even to those readers who might follow every post in every thread.