
On 12/15/2010 05:48 PM, John Lato wrote:
From: Permjacov Evgeniy <permeakra@gmail.com>
Current links:
https://github.com/permeakra/Rank2Iteratee https://github.com/permeakra/PassiveIteratee
The main difference from the 'original' iteratees I have read about is that neither uses 'chunks'; both pass data one element at a time. So what I wrote may be slower, but it should be easier to maintain and more transparent to GHC's optimiser. I wanted the code to be as clean and simple as possible, but it is still very messy in places and I want it cleaner. Any suggestions? I would also like to check how well GHC copes with these messy modules. They may become interesting benchmarks.
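To make the "one element at a time" idea concrete, here is a minimal sketch of a per-element iteratee in plain Haskell 98 style. It is not taken from either repository above; every name in it (Input, Iter, countI, enumList, run, countBytes) is hypothetical and chosen only for illustration.

```haskell
-- Hypothetical per-element iteratee; not the API of Rank2Iteratee,
-- PassiveIteratee, iteratee or enumerator.
import Data.Word (Word8)

-- One input step: a single element or end of stream (no chunks).
data Input el = El el | EOF

-- An iteratee is either finished or waiting for exactly one more input.
data Iter el a
  = Done a
  | Cont (Input el -> Iter el a)

-- Count the elements seen before end of stream, forcing the counter
-- at each step so no chain of thunks builds up.
countI :: Iter el Int
countI = go 0
  where
    go n = n `seq` Cont (step n)
    step n (El _) = go (n + 1)
    step n EOF    = Done n

-- Feed a list to an iteratee one element at a time.
enumList :: [el] -> Iter el a -> Iter el a
enumList (x:xs) (Cont k) = enumList xs (k (El x))
enumList _      it       = it

-- Signal end of stream and extract the result, if there is one.
run :: Iter el a -> Maybe a
run (Done a) = Just a
run (Cont k) = case k EOF of
  Done a -> Just a
  Cont _ -> Nothing

-- A byte counter over a pure list of Word8 values.
countBytes :: [Word8] -> Maybe Int
countBytes ws = run (enumList ws countI)
```

For example, `countBytes [1, 2, 3]` evaluates to `Just 3`. A chunked design would carry a ByteString in the input constructor instead of a single element, which is the variant the existing libraries use.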
Have you tried comparing it to either iteratee or enumerator (which had mostly comparable performance last time I checked, with a slight edge to iteratee)? Or to Oleg's library? Try writing test cases, a simple byte-counting application, or similar, so you can compare the performance with the other versions. Both enumerator and iteratee include demo programs that you could use as a starting point.
OK, I tested with ByteString chunks and got roughly the same performance (less than 5% difference) as with Data.Iteratee (as expected, since the monad is not a bottleneck when using chunks). However, with Word8 streams it slows down to the point of being six times slower than lazy IO. This may still be acceptable if IO actions have to be performed while doing nontrivial list fusion, but in general it is a failure. Well, GHC has another complicated case for compiler optimisation tests. CPS style with rank-2 types gives a performance boost, but with chunks the boost is insignificant, so a Haskell 98 version of iteratees can be used with no worries.
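For readers unfamiliar with the encoding, a CPS-style per-element iteratee with a rank-2 type might look roughly like the sketch below. This is an assumption about the general technique, not the code from Rank2Iteratee; all names (Iter, done, cont, countI, enumList, run) are hypothetical.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Hypothetical CPS-encoded per-element iteratee. Instead of building a
-- data constructor at each step, the iteratee calls one of two
-- continuations supplied by whoever runs it.
newtype Iter el a = Iter
  { runIter :: forall r.
               (a -> r)                        -- finished with a result
            -> ((Maybe el -> Iter el a) -> r)  -- wants input; Nothing = EOF
            -> r
  }

-- Finish immediately with a result.
done :: a -> Iter el a
done a = Iter (\onDone _ -> onDone a)

-- Ask for one more input step.
cont :: (Maybe el -> Iter el a) -> Iter el a
cont k = Iter (\_ onCont -> onCont k)

-- Count the elements seen before end of stream.
countI :: Iter el Int
countI = go 0
  where
    go :: Int -> Iter el Int
    go n = n `seq` cont (\i -> case i of
                            Nothing -> done n
                            Just _  -> go (n + 1))

-- Feed a list to an iteratee one element at a time.
enumList :: [el] -> Iter el a -> Iter el a
enumList []     it = it
enumList (x:xs) it = runIter it done (\k -> enumList xs (k (Just x)))

-- Signal end of stream and extract the result, if there is one.
run :: Iter el a -> Maybe a
run it = runIter it Just (\k -> runIter (k Nothing) Just (const Nothing))
```

With these definitions, `run (enumList "abc" countI)` evaluates to `Just 3`. The RankNTypes extension is what takes this outside Haskell 98; the data-constructor encoding needs no extensions.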
I agree that iteratees which work on a per-element level are very clean and should be amenable to optimization by GHC. It also shows a very clear relationship with stream-fusion techniques. Unfortunately when I last tried it I couldn't get acceptable performance. I was using ghc-6.12.1 IIRC, so it could be different now.
John