Re: [Haskell-cafe] ANNOUNCE: enumerator, an alternative iteratee package

19 Aug 2010

Hi John,

Thanks for creating a competitor to the iteratee library.  I think iteratees
are an important abstraction, but there are some things about the iteratee
library that I'm not fond of, despite John Lato doing a great job.  I think
having a bit of healthy competition to explore the design space is
excellent.

I have questions for you below.

On Wed, Aug 18, 2010 at 9:31 PM, John Millikin wrote:
...
> Most of you have probably read Oleg's essays on using left-fold
> enumerators for incremental IO. In short, by encapsulating monadic
> left-folds in an "Iteratee" type, incremental pure processing is
> possible without using lazy IO. Sources to read:
[snip]
...
> While I appreciate Mr. Lato's development of the package, I find it
> far too large, and its documentation too sparse, to effectively use.
> To correct this, I've written the "enumerator" package. It is also
> derived from Oleg's IterateeM.hs, but with a simplified API and
> significantly reduced dependency list.
I don't mind the dependency list, but I was mildly concerned that iteratee
appears to work only on Unix and that its API is a bit rough.
...
> Hackage entry: http://hackage.haskell.org/package/enumerator
> Haddock docs: http://ianen.org/haskell/enumerator/api-docs/
> Source code (literate PDF):
> http://ianen.org/haskell/enumerator/enumerator.pdf
> darcs get http://ianen.org/haskell/enumerator/
> Additionally, I've included examples of using enumerators to implement
> simplified versions of the "cat" and "wc" utilities. These should
> serve as a useful starting point for anybody who wants to use
> enumerators in their own code:
> http://patch-tag.com/r/jmillikin/enumerator/snapshot/current/content/pretty/...
> http://patch-tag.com/r/jmillikin/enumerator/snapshot/current/content/pretty/...
The main reason I would use iteratees is performance.  To help me, as a
potential consumer of your library, could you please provide benchmarks
comparing the performance of enumerator with, say, (a) iteratee,
(b) lazy/strict bytestring, and (c) Prelude functions?

I'm interested in both max memory consumption and run-times.  Using
criterion and/or progression to get the run-times would be icing on an
already delicious cake!
...
> There are already a few libraries using the existing "iteratee"
> package (snap, attoparsec-iteratee, hexpat-iteratee); I am very
> interested in advice from the authors of these libraries. In
> particular, are any of the removed features (ListLike,
> WrappedByteString, seeking) something your libraries depend on? Are
> there any useful combinators you'd like to see included?
The only reason iteratee provides WrappedByteString is that the type class
used to abstract over the stream type requires something with kind
* -> *, and ByteString has kind *.  The extra wrapping just adds an ignored
phantom type to bytestrings.  So if you don't require specific kinds, I don't
think you'd need to provide a WrappedByteString.
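
To make the kind issue concrete, here is a minimal sketch.  The class
StreamLike and the newtype Wrapped are hypothetical names made up for
illustration; they are not the actual definitions from iteratee, only the
same shape of problem and workaround.

```haskell
import qualified Data.ByteString.Char8 as B

-- Hypothetical class: it abstracts over containers 'c' of kind * -> *,
-- because 'c' is applied to an element type 'el'.
class StreamLike c where
  chunkLength :: c el -> Int

-- Lists already have kind * -> *, so they fit directly.
instance StreamLike [] where
  chunkLength = length

-- ByteString has kind *, so it cannot be an instance as-is.
-- The wrapper adds a phantom parameter 'el' that is never used,
-- which gives the wrapped type kind * -> *.
newtype Wrapped el = Wrapped B.ByteString

instance StreamLike Wrapped where
  chunkLength (Wrapped b) = B.length b
```

With these definitions, chunkLength "abc" and
chunkLength (Wrapped (B.pack "hello")) both typecheck.  If the class were
instead parameterised over a type of kind *, the wrapper would be
unnecessary.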

ListLike is possibly nice, but in the type-indexed iteratee implementation
that I started (but could not finish due to some issues with the type
indexing) I didn't use it.  ListLike doesn't support type-threaded lists at
all.  On a side note, in my type-threaded iteratee library, I initially
elided StreamChunk but later added something similar back in because I found
it useful.  I can't recall off the top of my head what the reasoning was, but
I could dig deeper if it interests you.  I was also following a fairly
faithful re-implementation of John Lato's implementation, just with type
indexing.  I should probably post my partial library regardless.  Perhaps
others can find ways around the bits I was stuck on.

I can see seeking becoming important as your library moves into new domains
of use, particularly when reading large binary streams in which the data of
interest is sparse.
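
As a rough illustration of what I mean, here is a sketch using plain
handle IO rather than any iteratee API.  The helper readAt is a made-up
name; the point is only that hSeek lets you jump to a known offset without
consuming the bytes before it.

```haskell
import qualified Data.ByteString.Char8 as B
import System.IO (IOMode (ReadMode), SeekMode (AbsoluteSeek), hSeek, withBinaryFile)

-- Hypothetical helper: read 'len' bytes starting at byte offset 'off'.
-- B.hGet is strict, so the bytes are read before the handle is closed.
readAt :: FilePath -> Integer -> Int -> IO B.ByteString
readAt path off len =
  withBinaryFile path ReadMode $ \h -> do
    hSeek h AbsoluteSeek off  -- skip directly to the record of interest
    B.hGet h len
```

A stream abstraction without seeking would have to read and discard
everything up to the offset to get the same result.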

Thanks and congrats!
Jason

Jason Dagit