
Valery V. Vorotyntsev wrote:
The following pattern appears quite often in my code:
results <- map someConversion `liftM` replicateM nbytes Iter.head
The meaning is: take `nbytes' from stream, apply `someConversion' to every byte and return the list of `results'. But there's more than one way to do it:
i1, i2, i3 :: Monad m => Int -> IterateeG [] Word8 m [String]
i1 n = map conv `liftM` replicateM n Iter.head
i2 n = map conv `liftM` joinI (Iter.take n stream2list)
i3 n = joinI $ Iter.take n $ joinI $ mapStream conv stream2list
Of those i1, i2, i3 which one is "better" and why? Or is there another - preferable - way of applying iteratees to this task?
My naive guess is that i1 will have worse performance with big n's. It looks like `i1' is reading bytes one by one, while `i2' takes whole chunks of data... I'm not sure though.
You are correct: i2 and i3 can process a chunk of elements at a time, if an enumerator supplies it. That means an iteratee like i2 or i3 can do more work per invocation -- which is always good.

Since you have to get the results as a list, you pretty much have to use stream2list. It should be noted that stream2list isn't very efficient: it returns the accumulated list only when it is done -- which happens when the stream is terminated, normally or abnormally. So stream2list has terrible latency and is useful only at the last stage of processing. I found it most useful for testing (to see the resulting stream) and for writing unit tests (to compare the produced results with the expected ones). For incremental processing, it is better to stay within Iteratees.

Although I think i2 and i3 should be close in performance (only benchmarking can tell for sure, of course), i3 is more extensible because stream2list is at the end of the chain. If further processing is required later on (or the latency imposed by stream2list becomes noticeable), the chain can be easily extended. A further advantage of the arrangement of i3 is that if some Iteratee further down the chain decides that it has had enough elements, Iter.take can quickly skip the remaining elements without the need to convert them.
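
To make the difference between element-at-a-time and chunk-at-a-time consumption concrete, here is a minimal, self-contained sketch. It does not use the iteratee library: the names Iter, headI, takeI, feed, result, i1toy and i2toy are all hypothetical toy stand-ins, and end-of-stream and error handling are deliberately left out. In this model headI yields one element per step, whereas takeI slices what it needs from each chunk with a single splitAt. Both produce the same list; only the amount of work done per invocation differs.

```haskell
module Main where

import Control.Monad (ap, liftM, replicateM)

-- Toy iteratee (an assumption for illustration, not the library's type):
-- either finished with a result and the unconsumed input,
-- or waiting for the next chunk.
data Iter a r
  = Done r [a]
  | Cont ([a] -> Iter a r)

instance Functor (Iter a) where
  fmap = liftM

instance Applicative (Iter a) where
  pure x = Done x []
  (<*>)  = ap

instance Monad (Iter a) where
  -- Pass the leftover input on to the next iteratee.
  Done x rest >>= f = case f x of
    Done y _ -> Done y rest
    Cont k   -> k rest
  Cont k >>= f = Cont (\chunk -> k chunk >>= f)

-- Consume exactly one element; an empty chunk means "keep waiting".
headI :: Iter a a
headI = Cont step
  where
    step []     = Cont step
    step (x:xs) = Done x xs

-- Consume n elements, slicing each incoming chunk with one splitAt.
takeI :: Int -> Iter a [a]
takeI n0 = go n0 id
  where
    go n acc
      | n <= 0    = Done (acc []) []
      | otherwise = Cont $ \chunk ->
          let (now, rest) = splitAt n chunk
              got         = length now
          in if got == n
               then Done (acc now) rest
               else go (n - got) (acc . (now ++))

-- Toy enumerator: feed chunks until the iteratee is done or input runs out.
feed :: [[a]] -> Iter a r -> Iter a r
feed (c:cs) (Cont k) = feed cs (k c)
feed _      it       = it

-- Extract the result if the iteratee has finished.
result :: Iter a r -> Maybe r
result (Done r _) = Just r
result (Cont _)   = Nothing

-- Stand-in for the conversion function of the question.
conv :: Int -> String
conv = show

-- Toy analogues of i1 and i2 from the question.
i1toy, i2toy :: Int -> Iter Int [String]
i1toy n = map conv `liftM` replicateM n headI  -- one element per step
i2toy n = map conv `liftM` takeI n             -- whole chunks per step

main :: IO ()
main = do
  let chunks = [[1, 2, 3], [4, 5], [6, 7, 8, 9]] :: [[Int]]
  print (result (feed chunks (i1toy 4)))  -- Just ["1","2","3","4"]
  print (result (feed chunks (i2toy 4)))  -- Just ["1","2","3","4"]
```

With the real library the relative cost depends on the chunk sizes the enumerator delivers, so this only illustrates the shape of the argument; benchmarking remains the way to settle it.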