
From: John Millikin
Here's my (uneducated, half-baked) two cents:
There's really no need for an "Iteratee" type at all, aside from the utility of defining Functor/Monad/etc instances for it. The core type is "step", which one can define (ignoring errors) as:
data Step a b = Continue (a -> Step a b) | Yield b [a]
Input chunking is simply an implementation detail, but it's important that the "yield" case be allowed to contain (>= 0) inputs. This allows steps to consume multiple values before deciding what to generate.
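As a rough illustration of why the leftover list matters, here is a minimal sketch using the Step type above. The names spanStep and feed are hypothetical helpers invented for this example, not part of any library.

```haskell
-- Step type as given in the message (errors ignored).
data Step a b
  = Continue (a -> Step a b)  -- needs one more input
  | Yield b [a]               -- result plus inputs read but not used

-- Hypothetical example step: collect inputs while they satisfy p.
-- The first input that fails p has already been read, so it is
-- handed back as a leftover inside Yield.
spanStep :: (a -> Bool) -> Step a [a]
spanStep p = go []
  where
    go acc = Continue $ \x ->
      if p x
        then go (x : acc)
        else Yield (reverse acc) [x]

-- Hypothetical helper: feed a list into a step, one element at a time.
-- Returns Nothing if the list runs out while the step still wants input;
-- otherwise returns the result and everything that was not consumed.
feed :: Step a b -> [a] -> Maybe (b, [a])
feed (Yield b extra) xs     = Just (b, extra ++ xs)
feed (Continue _)    []     = Nothing
feed (Continue k)    (x:xs) = feed (k x) xs
```

With these definitions, feed (spanStep even) [2,4,5,6] gives Just ([2,4],[5,6]): the 5 had to be read to decide where to stop, and the leftover list is what lets the step give it back.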
In this representation, enumerators are functions from a Continue to a Step.
type Enumerator a b = (a -> Step a b) -> Step a b
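A small sketch of what such an enumerator might look like, assuming a plain list as the data source. The names enumList, applyEnum, sum2 and result are hypothetical and exist only for this example.

```haskell
data Step a b
  = Continue (a -> Step a b)
  | Yield b [a]

type Enumerator a b = (a -> Step a b) -> Step a b

-- Hypothetical enumerator: feed list elements to the continuation.
-- If the list runs out first, the step is returned still wanting input.
-- If the step finishes early, the rest of the list is simply not read.
enumList :: [a] -> Enumerator a b
enumList []     k = Continue k
enumList (x:xs) k = case k x of
  Continue k' -> enumList xs k'
  done        -> done

-- Hypothetical helper: run an enumerator against a step, doing nothing
-- if the step has already yielded.
applyEnum :: Enumerator a b -> Step a b -> Step a b
applyEnum enum (Continue k) = enum k
applyEnum _    done         = done

-- Example step: add up exactly two inputs.
sum2 :: Step Int Int
sum2 = Continue $ \x -> Continue $ \y -> Yield (x + y) []

-- Extract the result, if the step has finished.
result :: Step a b -> Maybe b
result (Yield b _)  = Just b
result (Continue _) = Nothing
```

Because an exhausted enumerator hands back a Continue, enumerators chain: result (applyEnum (enumList [5]) (applyEnum (enumList [1]) sum2)) is Just 6, while a single enumList [1] leaves the step unfinished.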
I'll leave off discussion of enumeratees, since they're just a specialised type of enumerator.
-------------
Things become a bit more complicated when error handling is added. Specifically, steps must have some response to EOF:
data Step a b = Continue (a -> Step a b) (Result a b) | Result a b
data Result a b = Yield b [a] | Error String
In this representation, "Continue" has two branches. One for receiving more data, and another to be returned if there is no more input. This avoids the "divergent iteratee" problem, since it's not possible for Continue to be returned in response to EOF.
Is this really true? Consider iteratees that don't have a sensible default value (e.g. head) and an empty stream. You could argue that they should really return a Maybe, but then they wouldn't be divergent in other formulations either.
Although I do find it interesting that EOF is no longer part of the stream at all. That may open up some possibilities.
Also, I found this confusing because you're using Result as a data constructor for the Step type, but also as a separate type constructor. I expect this could lead to very confusing error messages ("What do you mean 'Result b a' doesn't have type 'Result'?").
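To make the head example concrete, here is a minimal sketch under a few assumptions: the Step constructor that wraps a Result is renamed to Done (a name made up here to sidestep the clash mentioned above), and headStep, headMay and runList are hypothetical helpers.

```haskell
data Result a b = Yield b [a] | Error String
  deriving (Eq, Show)

data Step a b
  = Continue (a -> Step a b) (Result a b)  -- more input, or the answer at EOF
  | Done (Result a b)                      -- assumed name; "Result" in the message

-- head has no sensible default, so its EOF branch has to be an Error...
headStep :: Step a a
headStep = Continue (\x -> Done (Yield x [])) (Error "head: empty stream")

-- ...or the result type has to become a Maybe.
headMay :: Step a (Maybe a)
headMay = Continue (\x -> Done (Yield (Just x) [])) (Yield Nothing [])

-- Hypothetical driver: feed a list, then take the EOF branch when the
-- list is empty. Unread list elements are ignored for simplicity.
runList :: [a] -> Step a b -> Result a b
runList _      (Done r)           = r
runList []     (Continue _ atEof) = atEof
runList (x:xs) (Continue k _)     = runList xs (k x)
```

Here runList [] headStep evaluates to Error "head: empty stream" and runList [] headMay to Yield Nothing []. The type forces some Result at EOF, but it does not by itself supply a meaningful one.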
Enumerators are similarly modified, except they are allowed to return "Continue" when their inner data source runs out. Therefore, both the "continue" and "eof" parameters are Step.
type Enumerator a b = (a -> Step a b) -> Step a b -> Step a b
I find this unclear as well, because you've unpacked the continue parameter but not the eof. I would prefer to see this as:
type Enumerator a b = (a -> Step a b) -> Result a b -> Step a b
However, is it useful to do so? That is, would there ever be a case where you would want to use branches from separate iteratees? If not, then why bother unpacking, instead of just using:
type Enumerator a b = Step a b -> Step a b
John