
On 20 August 2010 06:29, wren ng thornton wrote:
John Millikin wrote:
On Wed, Aug 18, 2010 at 23:33, Jason Dagit wrote:
The main reason I would use iteratees is performance. To help me, as a potential consumer of your library, could you please provide benchmarks comparing the performance of enumerator with, say, a) iteratee, b) lazy/strict bytestring, and c) Prelude functions? I'm interested in both max memory consumption and run-times. Using criterion and/or progression to get the run-times would be icing on an already delicious cake!
Oleg has some benchmarks of his implementation at <http://okmij.org/ftp/Haskell/Iteratee/Lazy-vs-correct.txt>, which clock iteratees at about twice as fast as lazy IO. He also compares them to a native "wc", but his comparison is flawed, because he's comparing a String iteratee vs byte-based wc.
I was under the impression Jason was asking about the performance of the iteratee package vs the enumerator package. I'd certainly be interested in seeing that. Right now I'm using attoparsec-iteratee, but if I could implement an attoparsec-enumerator which has the same/better performance, then I might switch over.
So far I've been very pleased with John Lato's work, quality-wise. Reducing dependencies is nice, but my main concern is the lack of documentation. I know the ideas behind iteratee and have read numerous tutorials on various people's simplified versions. However, because the iteratee package uses somewhat different terminology and types, it's not always clear exactly how to translate my knowledge into being able to use the library effectively. The enumerator package seems to have fixed this :)
To be fair, John Lato's in-development branch of iteratee also fixes the naming problem (i.e. it is closer to Oleg's original naming for Iteratees, Enumerators and Enumeratees).
I've been developing applications using iteratee for the past few weeks. Considering documentation, I don't think there is a lack of published characters on the topic. Oleg's series of emails introducing Iteratee and John Lato's article in the Monad.Reader were useful. John Millikin's documentation for enumerator is a welcome addition.
However, there is a deeper issue: Iteratees are semantically complex, and that complexity is not really addressed by the existing documentation, which mostly covers the various APIs, the design motivation (an extension of the left-fold enumerator), and evangelism (comparisons to lazy IO). I found it difficult to grok the reasons for the types and what the operational control flow is (e.g. how and why EOF gets propagated, how a seek request is communicated, etc.).
In general there seems to be a lot of interest in Iteratees recently as a way of dealing with resource management in IO. It's great to have a few different implementations to compare, but once performance is benchmarked and semantics are denotated it would be nice to converge on a single implementation and build a platform of libraries on it (for compression etc.), as was done for Lazy ByteString.
cheers,
Conrad.
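As an illustration of the control-flow question raised above (how EOF reaches an iteratee), here is a minimal sketch of a simplified, pure iteratee in the style of the tutorial versions. It is not the API of either the iteratee or the enumerator package: the names Stream, Iteratee, count, headI, enumList and run are all chosen here for illustration, and seek requests are not modelled at all. The assumption made in this sketch is that enumerators only feed chunks, and that EOF is sent exactly once, by run, which is what lets enumerators be composed.

```haskell
module Main where

-- Input handed to an iteratee: a chunk of elements, or end-of-stream.
-- (Illustrative type; the real packages define their own.)
data Stream a = Chunk [a] | EOF
  deriving (Show, Eq)

-- An iteratee is either finished (result plus unconsumed input)
-- or waiting for the next piece of the stream.
data Iteratee a b
  = Done b (Stream a)
  | Cont (Stream a -> Iteratee a b)

-- Count elements; only finishes when it is handed EOF.
count :: Iteratee a Int
count = go 0
  where
    go n = Cont (step n)
    step n (Chunk xs) = go (n + length xs)
    step n EOF        = Done n EOF

-- Return the first element; finishes early and keeps the leftover chunk.
headI :: Iteratee a (Maybe a)
headI = Cont step
  where
    step (Chunk [])       = headI
    step (Chunk (x : xs)) = Done (Just x) (Chunk xs)
    step EOF              = Done Nothing EOF

-- Feed a list to an iteratee in chunks of size k.
-- It never sends EOF, so several enumerators can be chained.
enumList :: Int -> [a] -> Iteratee a b -> Iteratee a b
enumList _ [] it          = it
enumList k xs (Cont f)    =
  let (h, t) = splitAt (max 1 k) xs
  in  enumList k t (f (Chunk h))
enumList _ _ it@(Done _ _) = it  -- iteratee finished early: stop feeding

-- Signal end-of-stream and extract the result, if the iteratee yields one.
run :: Iteratee a b -> Maybe b
run (Done b _) = Just b
run (Cont f)   = case f EOF of
  Done b _ -> Just b
  Cont _   -> Nothing  -- iteratee still wants input after EOF

main :: IO ()
main = do
  -- "hello" is fed first, then " world"; EOF arrives only from run.
  print (run (enumList 3 " world" (enumList 2 "hello" count)))
  -- headI is Done after the first chunk, so the rest is never fed.
  print (run (enumList 2 "hello" headI))
```

In this sketch count cannot produce a result until run hands it EOF, while headI stops the enumerator early by becoming Done. The real libraries add a monad parameter and error handling on top of this shape, which is where much of the complexity mentioned above comes from.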