Re: [Haskell-cafe] safe lazy IO or Iteratee?

Downside: iteratees are very hard to understand. I wrote a decently-sized article about them trying to figure out how to make them useful, and some comments in one of Oleg's implementations suggest that the "iteratee" package is subtly wrong. Oleg has written at least three versions (non-monadic, monadic, monadic CPS) and I've no idea why or whether their differences are important. Even dons says he didn't understand them until after writing his own iteratee-based IO layer.
More significant than, and orthogonal to, the differences between non-monadic and monadic are the two primary implementations Oleg has written. They are[1]:
Design 1:
    newtype Iteratee el m a = Iteratee{runIter:: Stream el -> m (IterV el m a)}
    data IterV el m a = IE_done a (Stream el)
                      | IE_cont (Iteratee el m a) (Maybe ErrMsg)
Design 2:
    newtype Iteratee el m a = Iteratee{runIter:: m (IterV el m a)}
    data IterV el m a = IE_done a (Stream el)
                      | IE_cont (Stream el -> Iteratee el m a) (Maybe ErrMsg)
With the first design, it's impossible to get the state of an iteratee without feeding it a chunk. There are other consequences too. The second design seems to require some specialized combinators, namely (>>==) and ($$), which are not required for the first version. Neither situation is ideal. The CPS version appears to remedy both flaws, but at the expense of introducing CPS at a low level (this can be hidden from the end user in many cases). I already think of iteratees as holding continuations, so the so-called "CPS version" is, to me, a double CPS.
Both designs appear to offer similar performance in aggregate, although there are differences for particular functions. I haven't yet had a chance to test the performance of the CPS variant, although Oleg has indicated he expects it will be higher.
The monadic/non-monadic issue is related. Non-monadic iteratees are iteratees that can't perform monadic effects when they're running (although they can still be fed from a monadic enumerator). Essentially it's the difference between "fold" and "foldM". They are simpler and more efficient because of this, but also much less powerful. Any iteratee design can support both non-monadic and monadic, but *I* don't want to support both. At least, I don't want to have double modules for everything for nearly identical functions, and polymorphic code that can handle non-monadic and monadic iteratees is non-trivial[2].
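The "fold" versus "foldM" comparison can be made concrete with plain lists. This is only a sketch of the distinction, not iteratee code: sumPure and sumChecked are hypothetical names, and the per-element action is supplied by the caller.

```haskell
import Control.Monad (foldM)
import Data.List (foldl')

-- Pure consumer: combines the elements but cannot perform any effect.
sumPure :: [Int] -> Int
sumPure = foldl' (+) 0

-- Monadic consumer: computes the same sum, but runs a caller-supplied
-- action in the monad m for every element before adding it.
-- (sumChecked and onElem are hypothetical names for this sketch.)
sumChecked :: Monad m => (Int -> m ()) -> [Int] -> m Int
sumChecked onElem = foldM step 0
  where
    step acc x = do
      onElem x          -- the effect a non-monadic consumer cannot have
      pure (acc + x)
```

For instance, sumChecked print [1, 2, 3] prints each element as it is consumed and returns 6 in IO, while in the Maybe monad the action can abort the whole fold by returning Nothing. sumPure has no way to do either, which is the loss of power the message refers to.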
Much of my recent work has been on the consequences of these various design considerations for the next version of the iteratee library. I'm currently undecided, although I'm leaning towards CPS. It seems to solve a lot of problems, and the implementation details are generally cleaner too.
Cheers, John
[1] Both taken from http://okmij.org/ftp/Haskell/Iteratee/IterateeM.hs. Design 1 is commented out on that page.
[2] At least for me. Maybe others can provide a better solution.
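To illustrate the point about inspecting an iteratee's state, here is a minimal sketch built on the Design 2 types quoted in the message. The Stream and ErrMsg definitions are simplified assumptions (the message does not show them), and isDone, done and headI are hypothetical helpers written for this example, not functions from the iteratee package.

```haskell
import Data.Functor.Identity (Identity (..))

-- Assumed, simplified stream type: a chunk of elements or end of stream.
data Stream el = Chunk [el] | EOF

-- Assumed error representation.
type ErrMsg = String

-- Design 2, as quoted in the message.
newtype Iteratee el m a = Iteratee { runIter :: m (IterV el m a) }

data IterV el m a
  = IE_done a (Stream el)
  | IE_cont (Stream el -> Iteratee el m a) (Maybe ErrMsg)

-- Hypothetical helper: ask whether an iteratee has finished.
-- Only the monadic action is run; no chunk has to be supplied.
isDone :: Monad m => Iteratee el m a -> m Bool
isDone it = do
  v <- runIter it
  pure $ case v of
    IE_done _ _ -> True
    IE_cont _ _ -> False

-- Hypothetical iteratee that is already finished with a result.
done :: Monad m => a -> Iteratee el m a
done x = Iteratee (pure (IE_done x (Chunk [])))

-- Hypothetical iteratee that returns the first element it is fed.
headI :: Monad m => Iteratee el m (Maybe el)
headI = Iteratee (pure (IE_cont step Nothing))
  where
    step (Chunk [])       = headI                      -- empty chunk: keep waiting
    step (Chunk (x : xs)) = finish (Just x) (Chunk xs) -- keep the leftover input
    step EOF              = finish Nothing EOF
    finish a s = Iteratee (pure (IE_done a s))
```

With Design 1, runIter has type Stream el -> m (IterV el m a), so a helper like isDone could not be written without choosing some chunk to pass in. That is the trade-off described in the message.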

I didn't count the commented-out designs in Oleg's code, only those which are "live".
Both designs appear to offer similar performance in aggregate, although there are differences for particular functions. I haven't yet had a chance to test the performance of the CPS variant, although Oleg has indicated he expects it will be higher.
I wrote some criterion benchmarks for IterateeM vs IterateeCPS, and
the CPS version was notably slower. I don't understand enough about
CPS to diagnose why, but the additional runtime was present in even
simple cases (reading from a file, writing back out).

John Lato wrote:
Both designs appear to offer similar performance in aggregate, although there are differences for particular functions. I haven't yet had a chance to test the performance of the CPS variant, although Oleg has indicated he expects it will be higher.
@jwlato:
Do you mind creating `IterateeCPS' tree in
http://inmachina.net/~jwlato/haskell/iteratee/src/Data/, so we can
start writing CPS performance testing code?
AFAICS, you have benchmarks for IterateeM-driven code already:
http://inmachina.net/~jwlato/haskell/iteratee/tests/benchmarks.hs
John Millikin wrote:
I wrote some criterion benchmarks for IterateeM vs IterateeCPS, and the CPS version was notably slower. I don't understand enough about CPS to diagnose why, but the additional runtime was present in even simple cases (reading from a file, writing back out).
@jmillikin: Could you please publish those benchmarks? Thanks.
-- vvv

On Fri, Feb 5, 2010 at 4:31 PM, Valery V. Vorotyntsev wrote:
John Lato wrote:
Both designs appear to offer similar performance in aggregate, although there are differences for particular functions. I haven't yet had a chance to test the performance of the CPS variant, although Oleg has indicated he expects it will be higher.
@jwlato: Do you mind creating `IterateeCPS' tree in http://inmachina.net/~jwlato/haskell/iteratee/src/Data/, so we can start writing CPS performance testing code?
I'm working on the CPS version and will make it public when it's done. It may take a week or so; this term started at 90 and has picked up. I have several benchmark sources that aren't public yet, but I can put them online for your perusal.
AFAICS, you have benchmarks for IterateeM-driven code already: http://inmachina.net/~jwlato/haskell/iteratee/tests/benchmarks.hs
Those will make more sense when I've added the context of the codebases in use. There are several more sets of output that I simply haven't published yet, including bytestring-based variants.
John Millikin wrote:
I wrote some criterion benchmarks for IterateeM vs IterateeCPS, and the CPS version was notably slower. I don't understand enough about CPS to diagnose why, but the additional runtime was present in even simple cases (reading from a file, writing back out).
That's very interesting. I wonder if I'll see the same, and if I'd be able to figure it out myself...
Did you benchmark any cases without doing IO? Sometimes the cost of the IO can overwhelm any other measurable differences, and also disk caching can affect results. Criterion should highlight any major outliers, but I still like to avoid IO when benchmarking unless strictly necessary.
@jmillikin: Could you please publish those benchmarks?
+1
John

Benchmark attached. It just enumerates a list until EOF is reached.
An interesting thing I've noticed is that IterateeMCPS performs better
with no optimization, but -O2 gives IterateeM the advantage. Their
relative performance depends heavily on the chunk size -- for example,
CPS is much faster at chunk size 1, but slower with 100-element
chunks.
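The attached benchmark itself is not reproduced in this thread. As a rough illustration of what "enumerating a list until EOF" with a configurable chunk size means, here is a sketch using a simplified non-monadic iteratee. All of the types and names (Iter, enumListChunked, run, lengthI) are assumptions made for this example, not the code that was benchmarked.

```haskell
-- Assumed, simplified stream type: a chunk of elements or end of stream.
data Stream el = Chunk [el] | EOF

-- Assumed, simplified non-monadic iteratee: finished, or waiting for input.
data Iter el a
  = Done a
  | Cont (Stream el -> Iter el a)

-- Hypothetical enumerator: feed a list to an iteratee in chunks of n
-- elements, stopping early if the iteratee finishes. It does not send EOF.
enumListChunked :: Int -> [el] -> Iter el a -> Iter el a
enumListChunked n xs it
  | n <= 0    = it               -- a chunk size of 0 would never make progress
  | otherwise = go xs it
  where
    go [] i          = i
    go _  i@(Done _) = i
    go ys (Cont k)   =
      let (chunk, rest) = splitAt n ys
      in go rest (k (Chunk chunk))

-- Signal end of stream and extract the result, if the iteratee yields one.
run :: Iter el a -> Maybe a
run (Done a) = Just a
run (Cont k) = case k EOF of
  Done a -> Just a
  Cont _ -> Nothing

-- Iteratee that counts elements until EOF.
lengthI :: Iter el Int
lengthI = go 0
  where
    go acc = Cont (step acc)
    step acc (Chunk xs) = let acc' = acc + length xs in acc' `seq` go acc'
    step acc EOF        = Done acc
```

In this sketch, run (enumListChunked 1 xs lengthI) and run (enumListChunked 100 xs lengthI) give the same count, but the first invokes the continuation once per element and the second once per 100 elements. The chunk size therefore changes how much of the total time is spent in the iteratee plumbing, which may be one reason the relative results shift with it.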

I've put my benchmarking code online at:
http://inmachina.net/~jwlato/haskell/research-iteratee.tar.bz2
Unpack it so you have this directory structure:
./iteratee
./research-iteratee/
Also download my criterionProcessor programs. The darcs repo is at
http://inmachina.net/~jwlato/haskell/criterionProcessor/
To use it, go into the criterionProcessor directory, edit the
testrunner.hs script for your environment, and run it. This runs all
the benchmarks. Then you can use the CritProc program (build with
cabal) to generate pictures. I'm pretty sure you need Chart HEAD in
order to build CritProc (I hacked my Chart install, but I think the
only important change has been applied to HEAD).
I make no guarantees that these will all build properly; it's
basically a work-in-progress dump.
John
participants (3)
- John Lato
- John Millikin
- Valery V. Vorotyntsev