Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

Wren Thornton wrote:
This is often conflated with the iteratee throwing an error/exception, which is wrong because we should distinguish between bad program states and argument passing.
I guess this is a matter of different points of view on exceptions. I am a fan of the model of (effectful) computation proposed by Cartwright and Felleisen a while ago:
http://okmij.org/ftp/Computation/monads.html#ExtensibleDS
In their model, all the computation is done by throwing resumable exceptions -- including the pure computation such as arithmetic and CBV/CBN applications. The similarity of ExtensibleDS.hs with Iteratee should be quite noticeable, especially regarding the part about throwing `errors.' The relation with co-routines is hard to miss: in fact, Iteratees are built upon co-routines, which are resumed by the enumerator. The error message is the additional piece of data associated with `yield', telling the enumerator to do something extra rather than merely getting the next piece of data and resuming the co-routine. The co-routine is the simplest part of the iteratee; it is the plumbing that takes a long time to engineer.
John A. De Goes:
2. Error recovery is ill-defined because errors do not describe what portion of the input they have already consumed;
I'm confused about this complaint: if an iteratee encounters an unusual condition or just has a special request, it sends a message that eventually propagates to the responsible enumerator. That enumerator knows how much data it has sent down to iteratees. The RandomIO module
http://okmij.org/ftp/Haskell/Iteratee/RandomIO.hs
is a good illustration: when the enumerator receives the seek request, it checks whether the desired stream offset corresponds to data already in the current IO buffer. If so, no IO is performed and the iteratee is resumed with the existing buffer data. The tests in the file check for that. The iteratee knows nothing about the buffer, or even whether there is one.
Wren Thornton wrote:
In an ideal framework the producers, transformers, and consumers of stream data would have a type parameter indicating the up-stream communication they support or require (in addition to the type parameters for stream type, result type, and side-effect type).
Very true. Currently the design of Iteratees quite resembles that of Control.Exception: everything can throw SomeException. Ideally one would like to be more precise, and specify what exceptions or sorts of exceptions could be thrown -- by Iteratees, and by ordinary Haskell functions. The design of a good effect system is still the topic of active research, although there are some encouraging results.
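The seek exchange described for RandomIO earlier in this message can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in RandomIO.hs: the names `Iter`, `bindIter`, `enumBuffered`, `peekAt`, and `twoChars` are hypothetical, the "file" is an in-memory string, and a counter of simulated reads stands in for real IO. It only tries to show that the iteratee attaches a message (a seek offset) to its yield, while the enumerator alone decides whether the buffer has to be refilled.

```haskell
-- Hypothetical types for illustration; not the API of any iteratee package.
data Stream = Chunk String | EOF

data Iter a
  = Done a                     -- finished with a result
  | Cont (Stream -> Iter a)    -- plain yield: resume me with more data
  | Seek Int (Iter a)          -- yield plus a message: reposition, then resume

-- Sequencing. This simple Done carries no leftover input, so each read
-- below is preceded by its own seek.
bindIter :: Iter a -> (a -> Iter b) -> Iter b
bindIter (Done a)     f = f a
bindIter (Cont k)     f = Cont (\s -> bindIter (k s) f)
bindIter (Seek off k) f = Seek off (bindIter k f)

-- Enumerate an in-memory "file" through a buffer of n characters.
-- Returns the result (if any) and the number of simulated reads.
enumBuffered :: Int -> String -> Iter a -> (Maybe a, Int)
enumBuffered n src = go 0 "" 0 0
  where
    -- start, buf: offset and contents of the current buffer
    -- pos: next offset to deliver; nReads: buffer refills so far
    go _ _ _ nReads (Done a) = (Just a, nReads)
    go start buf _ nReads (Seek off k)
      | off < 0   = (Nothing, nReads)          -- request cannot be honoured
      | otherwise = go start buf off nReads k  -- no IO yet, just move the cursor
    go start buf pos nReads (Cont k)
      | inBuffer   = go start buf bufEnd nReads (k (Chunk (drop (pos - start) buf)))
      | null fresh = (finish (k EOF), nReads + 1)
      | otherwise  = go pos fresh (pos + length fresh) (nReads + 1) (k (Chunk fresh))
      where
        bufEnd   = start + length buf
        inBuffer = pos >= start && pos < bufEnd
        fresh    = take n (drop pos src)       -- simulated read at offset pos
    finish (Done a) = Just a
    finish _        = Nothing

-- Ask for a seek, then read one character.
peekAt :: Int -> Iter (Maybe Char)
peekAt off = Seek off (Cont step)
  where
    step (Chunk (c:_)) = Done (Just c)
    step (Chunk [])    = Cont step
    step EOF           = Done Nothing

-- Read the characters at two offsets, one after the other.
twoChars :: Int -> Int -> Iter (Maybe Char, Maybe Char)
twoChars a b =
  peekAt a `bindIter` \x ->
  peekAt b `bindIter` \y ->
  Done (x, y)
```

With an 8-character buffer, `enumBuffered 8 "hello world!" (twoChars 1 4)` performs one simulated read, because offset 4 is already in the buffer; `twoChars 1 10` performs two. Neither `peekAt` nor `twoChars` mentions the buffer.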
John A. De Goes:
3. Iteratees sometimes need to manage resources, but they're not designed to do so which leads to hideous workarounds;
Gregory Collins:
The thing which I find is missing the most from enumerator as it stands is not this -- it's the fact that Iteratees sometimes need to allocate resources which need explicit manual deallocation (i.e. sockets, file descriptors, mmaps, etc), but because Enumerators are running the show, there is no "local" way to ensure that the cleanup/bracket routines get run on error.
I used to think that processing several inputs at different paces was indeed a stumbling block. It seemed that an iteratee needed to open a separate file, which it is indeed ill-equipped to do. Fortunately, that difficulty has been overcome, in a surprisingly natural way and with no changes to the library:
http://okmij.org/ftp/Streams.html#2enum1iter
The pleasant surprise is that we can iterate (no pun intended) Iteratee monad transformers, just as we did with the (IORT s) monad transformer in the Lightweight Monadic Regions. Thus we maintain the region-like discipline of managing resources. I'm keen to hear of examples that seem to require an Iteratee to allocate additional resources. I'd really like to see if any such cases can be cast in terms of regions, implemented via iterated Iteratee transformers.
John A. De Goes:
1. It does not make sense in general to bind with an iteratee that has already consumed input, but there's no type-level difference between a "virgin" iteratee and one that has already consumed input;
I'm not sure I follow. Why should there be a difference between a virgin iteratee and one that has consumed some input? One should think of the Iteratee as the two arguments of fold (f and z) bundled together. Why should the function being folded over care how many times it has been applied to input data? It is a pure function, transforming state plus input to a new state. The useful laws of fold hold precisely because the function f is pure and doesn't care. A detailed example showing why you think you need this distinction would be appreciated.
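The fold analogy can be made concrete with a small sketch. The names `Fold`, `feed`, and `result` are hypothetical and chosen only for illustration; a real iteratee also signals completion and handles EOF, which this omits. It is meant to show that resuming a partially fed fold yields the same state as feeding a fresh one all the input at once, because the step function is pure.

```haskell
import Data.List (foldl')

-- A step function f and a current state z, bundled together.
data Fold i s = Fold (s -> i -> s) s

-- Feed a chunk of input; the result is again a Fold, ready for more.
feed :: Fold i s -> [i] -> Fold i s
feed (Fold f z) xs = Fold f (foldl' f z xs)

-- Read out the state accumulated so far.
result :: Fold i s -> s
result (Fold _ z) = z

sumFold :: Fold Int Int
sumFold = Fold (+) 0

-- A "virgin" fold fed everything at once ...
fresh :: Int
fresh = result (feed sumFold [1, 2, 3, 4])

-- ... and one that had already consumed some input before being resumed.
resumed :: Int
resumed = result (feed (feed sumFold [1, 2]) [3, 4])
```

Both `fresh` and `resumed` evaluate to 10, an instance of the law foldl f z (xs ++ ys) = foldl f (foldl f z xs) ys.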
John A. De Goes:
4. Iteratees cannot incrementally produce output, it's all or nothing, which makes them terrible for many real world problems that require both incremental input and incremental output.
Again, a detailed example, a use case if you will, describing the desired behavior would be appreciated. By detailed I mean an example that describes the desired input-output behavior, preferably including sample input or output data. For instance, the question posed by Evgeniy Permjacov last December was precise and helpful: (incrementally) merge two sorted streams. I can't promise a prompt reply this month or in April, I'm afraid.
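For the two-input case mentioned earlier in this message (one iteratee fed by two enumerators at different paces), the layering of iteratee transformers might look roughly like this. It is a sketch under stated assumptions and not the code at the linked page: `IterT`, `Step`, `liftI`, `next`, `enumList`, `takeN`, `groups`, and `runBoth` are hypothetical names, streams are delivered one element at a time with `Nothing` as EOF, and no resources are involved. The outer layer reads group sizes from one stream; the layer beneath it reads the elements to be grouped from another.

```haskell
import Data.Functor.Identity (Identity, runIdentity)

-- Hypothetical element-wise iteratee transformer; Nothing signals EOF.
newtype IterT e m a = IterT { runIterT :: m (Step e m a) }
data Step e m a = Done a | Cont (Maybe e -> IterT e m a)

instance Monad m => Functor (IterT e m) where
  fmap f (IterT m) = IterT (fmap go m)
    where
      go (Done a) = Done (f a)
      go (Cont k) = Cont (fmap f . k)

instance Monad m => Applicative (IterT e m) where
  pure = IterT . pure . Done
  mf <*> mx = mf >>= \f -> fmap f mx

instance Monad m => Monad (IterT e m) where
  IterT m >>= f = IterT (m >>= step)
    where
      step (Done a) = runIterT (f a)
      step (Cont k) = pure (Cont (\i -> k i >>= f))

-- Run an action of the monad below (here: the other stream's iteratee).
liftI :: Monad m => m a -> IterT e m a
liftI = IterT . fmap Done

-- Ask for the next element of this layer's stream.
next :: Monad m => IterT e m (Maybe e)
next = IterT (pure (Cont pure))

-- Feed a list to the outermost layer, leaving the result in the monad below.
enumList :: Monad m => [e] -> IterT e m a -> m (Maybe a)
enumList xs it = runIterT it >>= step
  where
    step (Done a) = pure (Just a)
    step (Cont k) = case xs of
      y : ys -> enumList ys (k (Just y))
      []     -> runIterT (k Nothing) >>= finish
    finish (Done a) = pure (Just a)
    finish (Cont _) = pure Nothing  -- still wants input after EOF

-- Take up to n elements from this layer's stream.
takeN :: Monad m => Int -> IterT e m [e]
takeN n
  | n <= 0    = pure []
  | otherwise = next >>= maybe (pure []) (\x -> (x :) <$> takeN (n - 1))

-- Outer stream: group sizes. Inner stream: the elements to be grouped.
groups :: Monad m => IterT Int (IterT c m) [[c]]
groups = next >>= \mn -> case mn of
  Nothing -> pure []
  Just n  -> do
    g  <- liftI (takeN n)
    gs <- groups
    pure (g : gs)

-- One enumerator per layer: sizes feed the outer, elems feed the inner.
runBoth :: [Int] -> [c] -> Maybe (Maybe [[c]])
runBoth sizes elems = runIdentity (enumList elems (enumList sizes groups))
```

Here `runBoth [1, 2] "abcde"` gives `Just (Just ["a","bc"])`: the two streams are consumed at different rates, and each enumerator only ever sees the requests of its own layer.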

On 3/29/11 4:40 AM, oleg@okmij.org wrote:
Wren Thornton wrote:
This is often conflated with the iteratee throwing an error/exception, which is wrong because we should distinguish between bad program states and argument passing.
I guess this is a matter of different points of view on exceptions.
The problem is not so much the exceptions per se (one goto is about as good as any other), it has more to do with the fact that important things are being left out of the types. One of the great things about Haskell is that you can lean so heavily on the type system to protect yourself when refactoring, designing by contract, etc. However, if there's an unspoken code of communication between specific enumerators and iteratees, it's very easy to break things. This is why the communication should be captured in the types, regardless of the control-flow mechanism used to implement that communication. I'd like the static guarantee that whatever special requests my iteratee could make, its enumerator is in a position to fulfill those requests (or die trying). Allowing for the iteratee to be paired with an enumerator which is incapable of handling its requests is a type error and should be treated as such.
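One way such a guarantee could be expressed, as a sketch only: give the iteratee an extra type parameter for the requests it may issue, and let each enumerator state which request type it can serve. All names here (`Iter`, `Request`, `Seek`, `firstChar`, `charAt`, `enumPlain`, `enumSeekable`) are hypothetical and do not come from the enumerator or iteratee packages.

```haskell
import Data.Void (Void, absurd)

data Stream = Chunk String | EOF

-- r is the type of up-stream requests this iteratee may issue.
data Iter r a
  = Done a
  | Cont (Stream -> Iter r a)
  | Request r (Iter r a)

-- The only request in this sketch: reposition the stream.
newtype Seek = Seek Int

-- Issues no requests, so it is polymorphic in r and runs under any enumerator.
firstChar :: Iter r (Maybe Char)
firstChar = Cont step
  where
    step (Chunk (c:_)) = Done (Just c)
    step (Chunk [])    = Cont step
    step EOF           = Done Nothing

-- Needs an enumerator that understands Seek; the type says so.
charAt :: Int -> Iter Seek (Maybe Char)
charAt off = Request (Seek off) firstChar

-- An enumerator with no up-stream channel: Void has no values, so no
-- request can ever reach it.
enumPlain :: String -> Iter Void a -> Maybe a
enumPlain src = go [Chunk src, EOF]
  where
    go _        (Done a)      = Just a
    go _        (Request r _) = absurd r
    go (c : cs) (Cont k)      = go cs (k c)
    go []       (Cont _)      = Nothing

-- An enumerator that can honour Seek (or die trying).
enumSeekable :: String -> Iter Seek a -> Maybe a
enumSeekable src = go 0 False
  where
    go _ _ (Done a) = Just a
    go _ _ (Request (Seek off) k)
      | off < 0 || off > length src = Nothing
      | otherwise                   = go off False k
    go pos eofSent (Cont k)
      | eofSent           = Nothing
      | pos >= length src = go pos True (k EOF)
      | otherwise         = go (length src) False (k (Chunk (drop pos src)))
```

Under these definitions `enumSeekable "abc" (charAt 1)` yields `Just (Just 'b')`, while `enumPlain "abc" (charAt 1)` is rejected by the type checker because `Iter Seek` does not match `Iter Void`. The control-flow mechanism underneath is unchanged; only the pairing is now checked statically.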
Wren Thornton wrote:
In an ideal framework the producers, transformers, and consumers of stream data would have a type parameter indicating the up-stream communication they support or require (in addition to the type parameters for stream type, result type, and side-effect type).
Very true. Currently the design of Iteratees quite resembles that of Control.Exception: everything can throw SomeException. Ideally one would like to be more precise, and specify what exceptions or sorts of exceptions could be thrown -- by Iteratees, and by ordinary Haskell functions. The design of a good effect system is still the topic of active research, although there are some encouraging results.
Yeah, I'm not a big fan of extensible exceptions either. Don't get me wrong, it's an awesome hack and it's far cleaner than the Java approach; but it still goes against my sensibilities. I think a big part of the problem is that we don't have a good type theory for coroutines. The idea of functions that "never return" just doesn't cut it. And conflating legitimate control-flow manipulation with bottom doesn't either. But, as of yet, that's all we've got.
--
Live well,
~wren
participants (2)
- oleg@okmij.org
- wren ng thornton