On 12/15/2010 05:48 PM, John Lato wrote:

From: Permjacov Evgeniy <permeakra@gmail.com>

current links

https://github.com/permeakra/Rank2Iteratee
https://github.com/permeakra/PassiveIteratee

The main difference from the 'original' iteratees I have read about is that
neither uses 'chunks'; both pass data one element at a time. So what I wrote
may be slower, but it should be easier to maintain and more transparent to
GHC's optimizer. I wanted code as clean and simple as possible,
but it is still very, very messy in places and I want it cleaner.
Any suggestions? I also want to check how well GHC copes with
these messy modules. They may become interesting benchmarks.
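
For readers unfamiliar with the design, here is a minimal sketch of what an iteratee that takes its input one element at a time (rather than in chunks) can look like. All names here (Iter, countI, feedList, finish) are hypothetical and are not taken from Rank2Iteratee or PassiveIteratee; the real libraries may be structured quite differently.

```haskell
-- Hypothetical sketch: these names are illustrative only and are NOT
-- the definitions used in Rank2Iteratee or PassiveIteratee.

-- An iteratee that consumes its input one element at a time.
data Iter e a
  = Done a                      -- finished with a result
  | Next (Maybe e -> Iter e a)  -- wants one more element; Nothing = end of input

-- Count the elements seen before end of input.
countI :: Iter e Int
countI = go 0
  where
    go :: Int -> Iter e Int
    go n = n `seq` Next step    -- force the counter to avoid building thunks
      where
        step Nothing  = Done n
        step (Just _) = go (n + 1)

-- Feed a list to an iteratee element by element (no end-of-input signal).
feedList :: [e] -> Iter e a -> Iter e a
feedList _      (Done a) = Done a
feedList []     it       = it
feedList (x:xs) (Next k) = feedList xs (k (Just x))

-- Signal end of input and extract the result, if the iteratee finishes.
finish :: Iter e a -> Maybe a
finish (Done a) = Just a
finish (Next k) = case k Nothing of
  Done a -> Just a
  Next _ -> Nothing
```

In GHCi, finish (feedList "hello" countI) should evaluate to Just 5. Each element costs one constructor match and one function call, which is the per-element overhead that chunked designs amortize.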

Have you tried comparing it to either iteratee or enumerator (which had mostly comparable performance last time I checked, with a slight edge to iteratee)? Or to Oleg's library? Try writing test cases, a simple byte-counting application, or similar, so you can compare the performance with the other versions. Both enumerator and iteratee include demo programs that you could use as a starting point.
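
As a starting point for such a comparison, a plain chunked byte counter can serve as a baseline. This is only a sketch using the bytestring package directly; it is not one of the demo programs shipped with iteratee or enumerator, and the names countHandle and countFile as well as the 64 KiB chunk size are assumptions made for illustration.

```haskell
import qualified Data.ByteString as B
import System.IO (Handle, IOMode (ReadMode), withBinaryFile)

-- Chunk size is an arbitrary assumption for this sketch.
chunkSize :: Int
chunkSize = 64 * 1024

-- Count bytes from a handle by reading strict chunks until end of file.
countHandle :: Handle -> IO Int
countHandle h = go 0
  where
    go n = n `seq` do
      chunk <- B.hGetSome h chunkSize  -- returns an empty chunk at EOF
      if B.null chunk
        then return n
        else go (n + B.length chunk)

-- Count the bytes in a file (hypothetical helper name).
countFile :: FilePath -> IO Int
countFile path = withBinaryFile path ReadMode countHandle
```

Calling countFile on a large file and timing the run gives a figure to set against an iteratee-based counter applied to the same file.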

OK, I tested with ByteString chunks and got roughly the same performance (less than 5% difference) as with Data.Iteratee (as expected, since the monad is not a bottleneck when using chunks). However, with Word8 streams it slows down to the point of being six times slower than lazy IO. This may still be acceptable if IO actions have to be performed while doing nontrivial list fusion, but in general it is a failure.

Disappointing, but I'm not surprised. I think getting good performance is possible in principle, but currently there's something missing from the implementations. Whether work needs to be done on GHC's optimizer or the iteratee code, I can't say. Honestly, I'm not too interested in pursuing this myself right now, but if somebody else wants to, it could be fruitful.

John