Re: [Haskell-cafe] ANNOUNCE: enumerator, an alternative iteratee package

From: John Millikin
On Sat, Aug 21, 2010 at 15:35, Paulo Tanimoto
wrote: Apologies if I'm asking you to repeat yourself, but I couldn't find the explanation. What was the reason why you went with IterateeM instead of IterateeMCPS? Simplicity?
Iteratees are difficult enough to understand already -- requiring prospective users to learn and understand CPS would just be another roadblock. The CPS implementation is also slower -- I performed some basic benchmarking of IterateeM.hs and IterateeMCPS.hs, and CPS is only faster without optimizations. At -O, they are equal, and at -O2, IterateeM is faster.
Apologies if this discussion has already moved on; I'm just catching up on weekend email but wanted to respond to this directly.
It's not necessary to understand CPS to use CPS-based iteratees. The CPS implementation generally simplifies the types and removes the necessity for special combinators like ($$) and (>>==), so I strongly suspect newcomers will find it easier to use than other variants (although unfortunately I can no longer say this from personal experience). It incorporates the best features of Oleg's two implementations in IterateeM.hs. The only drawback is the added thought overhead of CPS, but users need not be aware of this for the most part.
For those who do want a thorough understanding of the implementation, I think that the CPS variant is usually more understandable than the alternatives. The "take" family and stream converters (maps, convStream) are all simpler than their alternative definitions. This isn't always true; enumPair is a counterexample. But I think it helps in many common cases, and enumPair is tricky in any implementation.
It might be true that many programmers find CPS difficult in the abstract (I certainly do), but when it occurs in a specific implementation the concepts are usually much more tractable. At least for iteratees, there's a very direct correspondence between the CPS style and the IterateeM style, which greatly eases understanding.
Also, while the IterateeM implementations may be faster than CPS for certain operations, they are also slower for others, sometimes significantly so. My tests (all with -O2, with various other compiler options tried as well) prior to switching to the CPS implementation showed that it is competitive with the fastest implementation in all cases, when it is not the fastest itself. Most importantly, I didn't find any comparatively slow operations, which wasn't true of either IterateeM implementation.
I think that a CPS implementation of iteratees is the best of all current alternatives for ease of use, and possibly the best-performing implementation, depending on exactly which operations are being performed. Even if it's not the absolute fastest, it should be close enough that its other benefits outweigh any performance difference. Cheers, John
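The direct correspondence between the two styles mentioned above can be made concrete with a small sketch. The types below are simplified assumptions written for illustration (the names Stream, Step, IterD, IterC, toCPS, fromCPS, sumC and runList are all hypothetical); they are not the actual definitions from the iteratee or enumerator packages or from IterateeM.hs.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Input stream: a chunk of elements, or end of input.
data Stream a = Chunk [a] | EOF

-- Direct style (hypothetical, loosely IterateeM-like): running the iteratee
-- yields a data constructor that the caller pattern-matches on.
data Step a m b
  = Done b (Stream a)               -- finished, with leftover input
  | Cont (Stream a -> IterD a m b)  -- wants more input

newtype IterD a m b = IterD { runIterD :: m (Step a m b) }

-- CPS style (hypothetical): instead of returning a constructor, the iteratee
-- is handed one continuation per case and calls the one that applies.
newtype IterC a m b = IterC
  { runIterC :: forall r.
         (b -> Stream a -> m r)                 -- "done" continuation
      -> ((Stream a -> IterC a m b) -> m r)     -- "need more input" continuation
      -> m r
  }

-- Each constructor of Step maps to exactly one continuation, and back.
toCPS :: Monad m => IterD a m b -> IterC a m b
toCPS (IterD m) = IterC (\onDone onCont -> do
  step <- m
  case step of
    Done b s -> onDone b s
    Cont k   -> onCont (toCPS . k))

fromCPS :: Monad m => IterC a m b -> IterD a m b
fromCPS i = IterD (runIterC i
  (\b s -> return (Done b s))
  (\k   -> return (Cont (fromCPS . k))))

-- A CPS iteratee that sums its input. Note that it needs no Monad constraint.
sumC :: IterC Int m Int
sumC = go 0
  where
    go :: Int -> IterC Int m Int
    go acc = IterC (\_ onCont -> onCont (step acc))

    step :: Int -> Stream Int -> IterC Int m Int
    step acc (Chunk xs) = go (acc + sum xs)
    step acc EOF        = IterC (\onDone _ -> onDone acc EOF)

-- Feed a list of chunks, then EOF. Returns Nothing if the iteratee still
-- asks for input after it has been given EOF.
runList :: Monad m => [[a]] -> IterC a m b -> m (Maybe b)
runList chunks i = runIterC i (\b _ -> return (Just b)) onCont
  where
    onCont k = case chunks of
      (c:cs) -> runList cs (k (Chunk c))
      []     -> runIterC (k EOF) (\b _ -> return (Just b)) (\_ -> return Nothing)
```

With these definitions, `runList [[1,2,3],[4,5]] sumC` yields `Just 15` in any monad, and round-tripping `sumC` through `fromCPS` and `toCPS` gives the same result. The sketch says nothing about relative performance; it only illustrates why the two encodings carry the same information.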

On Mon, Aug 23, 2010 at 6:16 AM, John Lato
It's not necessary to understand CPS to use CPS-based iteratees. The CPS implementation generally simplifies the types and removes the necessity for special combinators like ($$) and (>>==), so I strongly suspect newcomers will find it easier to use than other variants (although unfortunately I can no longer say this from personal experience). It incorporates the best features of Oleg's two implementations in IterateeM.hs. The only drawback is the added thought overhead of CPS, but users need not be aware of this for the most part.
I agree with you, John. Personally, I find the CPS version easier to use, that's why I asked. But since people have different styles, I guess it's not a bad thing that the two packages use a different implementation. When I was reimplementing Iteratees I also didn't find any noticeable slowdown with CPS, but my benchmarks were very simple -- unlike yours. You are comparing the darcs branch to the version on Hackage, right? Paulo

On Mon, Aug 23, 2010 at 4:24 PM, Paulo Tanimoto
I agree with you, John. Personally, I find the CPS version easier to use, that's why I asked. But since people have different styles, I guess it's not a bad thing that the two packages use a different implementation.
When I was reimplementing Iteratees I also didn't find any noticeable slowdown with CPS, but my benchmarks were very simple -- unlike yours. You are comparing the darcs branch to the version on Hackage, right?
I'm actually referring to benchmarks from about 8-10 months ago. I did have them on the website, but it looks like I took them down. I haven't run any comparisons recently, except for a few to determine where INLINE pragmas are beneficial. I'll put together a current set and post the results when they're ready. John