
Hi Ketil,
> By the way, what is the advantage of using iteratees here? For my testing, I just used:

My initial move to iteratees was more of a clutch call, made when I was still using bytestring-trie and having immense memory consumption problems. bytestring-trie uses strict ByteStrings as its index, and since I was only getting lazy ByteStrings, the only way to make them strict was (S.concat . L.toChunks) (L and S being the lazy and strict ByteString imports), which felt *wrong*.

In short, I thought iteratee would give me enough magic fairy dust to have decent control over how much data I'm holding in RAM at any given point. That was not the case: I didn't know about the pointer mechanic of strict ByteStrings (a substring shares the buffer of the string it was taken from, which keeps the whole buffer alive), and so I was oblivious to the bad impact that would have on garbage collection performance.

Even so, I think I can still justify using iteratees in the current design:

a) I don't like lazy IO (conceptually),
b) I'm gonna write a left fold somewhere anyway, so I might as well use a decent infrastructure for it,
c) I can strictly control the chunk size, and I won't get any bad effects from accidental eager evaluation somewhere down the pipe.

c) is the only "legitimate" reason (though the reason for a) is c)): adjusting the chunk size might actually yield noticeable performance differences when reading through files that are well into the realm of gigabytes. And the chunk size "limit" will protect me from an accidental strict fold or the like that would leave me with a 4GB file in memory.

About a): lazy IO just doesn't "feel" right to me. I want my pure computations to actually be pure. If I put a ' on one of my functions *within* my pure code (that is, switch to a strict variant such as foldl'), this might have *side effects*: instead of reading in only part of the file, it will now demand the *whole* file, and that is *quite* a side effect! So suddenly I have to worry about side effects in my pure code. Ugh.

That's why I'm going to continue using iteratees. I don't know if that's the right justification, but it's a "hey, it works for me!" justification I can comfortably live with.

Besides, I don't think the iteratee interface is all that opaque. I found arrows in HXT, for example, much more difficult to deal with conceptually. (That said, I'm still using HDBC over Takusen, because the latter's API just didn't make sense to me.)

Regards,
Aleks
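
P.S. To make the chunk-size point a bit more concrete, here is a minimal sketch of a left fold that never holds more than one chunk of the file at a time. It is hand-rolled on plain Data.ByteString and is not the iteratee package's API; the names toStrictCopy, foldChunks and countNewlines are made up for illustration only, and the chunk size is whatever the caller picks.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import System.IO (Handle, IOMode (ReadMode), withBinaryFile)

-- The lazy-to-strict conversion from the mail: it copies all chunks
-- into one contiguous strict buffer, so the whole input ends up in RAM.
toStrictCopy :: L.ByteString -> S.ByteString
toStrictCopy = S.concat . L.toChunks

-- Hypothetical helper: a strict left fold over a handle that reads at
-- most chunkSize bytes per step. The accumulator is forced after every
-- chunk, so no chain of thunks (and no old chunk) is kept around.
foldChunks :: Int -> (a -> S.ByteString -> a) -> a -> Handle -> IO a
foldChunks chunkSize step z h = go z
  where
    go acc = do
      chunk <- S.hGetSome h chunkSize
      if S.null chunk
        then return acc                       -- empty chunk means EOF
        else let acc' = step acc chunk
             in acc' `seq` go acc'

-- Example consumer: count newline bytes (ASCII 10) in a file.
countNewlines :: Int -> FilePath -> IO Int
countNewlines chunkSize path =
  withBinaryFile path ReadMode $
    foldChunks chunkSize (\n chunk -> n + S.count 10 chunk) 0
```

Something like countNewlines 65536 "big.log" would then read the file 64 KiB at a time (the file name is just a placeholder). If the step function keeps a small slice of a chunk in its accumulator, wrapping that slice in S.copy avoids pinning the whole chunk in memory.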