
On Thu, Aug 19, 2010 at 14:29, wren ng thornton wrote:
I was under the impression Jason was asking about the performance of the iteratee package vs the enumerator package. I'd certainly be interested in seeing that. Right now I'm using attoparsec-iteratee, but if I could implement an attoparsec-enumerator which has the same/better performance, then I might switch over.
Oh, sorry -- both packages have the same performance. At least, if there is a difference, it's less than the margin of error on my benchmark (counting lines in the ubuntu 10.04 ISO, with cleared filesystem caches).

Here's my Iteratee benchmark. I think this is the proper way to implement "wc -l", but if you see any errors in it which could cause poor performance, please let me know.

--------------------------------------------------------------------------------
import qualified Data.ByteString.Char8 as B

iterLines :: Monad m => IterateeG WrappedByteString Word8 m Integer
iterLines = IterateeG (step 0) where
    step acc s@(EOF _) = return $ Done acc s
    step acc (Chunk wrapped) = return $ Cont (IterateeG (step acc')) Nothing where
        acc' = acc + countChar '\n' (unWrap wrapped)

countChar :: Char -> B.ByteString -> Integer
countChar c = B.foldl (\acc c' -> if c' == c then acc + 1 else acc) 0
--------------------------------------------------------------------------------

And here's typical times for various implementations -- numbers are real / user / sys, as reported by "time". They're mostly as expected, except (to my surprise) lazy bytestrings are as fast as strict bytestrings:

wc -l
====================
5.451 / 0.030 / 0.190
5.426 / 0.060 / 0.150
5.466 / 0.130 / 0.200

enumerator
====================
8.235 / 5.270 / 1.010
8.278 / 5.270 / 0.880
8.264 / 5.370 / 0.860

iteratee
====================
8.239 / 5.270 / 0.980
8.255 / 5.320 / 0.790
8.265 / 5.140 / 0.900

strict bytestrings
====================
5.425 / 2.030 / 0.360
5.402 / 2.180 / 0.330
5.446 / 2.240 / 0.400

lazy bytestrings
====================
5.467 / 1.910 / 0.260
5.428 / 1.990 / 0.280
5.433 / 2.140 / 0.190
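The sources for the "strict bytestrings" and "lazy bytestrings" runs are not included in this message, so for anyone wanting to reproduce the comparison, here is a rough sketch of what such baselines might look like. It is an assumption, not the code that produced the timings: the names countNewlines, countNewlinesLazy and countHandle, the 64 KiB chunk size, and the command-line interface are all made up for illustration. It uses the library's own count instead of the hand-written foldl.

```haskell
-- Hypothetical baselines for the "strict" and "lazy" rows; NOT the code
-- that produced the timings in this message.
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as BL
import Data.Int (Int64)
import System.Environment (getArgs)
import System.IO (Handle, IOMode (ReadMode), withBinaryFile)

-- Count newline characters in one strict chunk.
countNewlines :: B.ByteString -> Int
countNewlines = B.count '\n'

-- Count newline characters in a lazy bytestring, chunk by chunk.
countNewlinesLazy :: BL.ByteString -> Int64
countNewlinesLazy = BL.count '\n'

-- Strict variant: read fixed-size chunks from a handle until EOF.
-- The 64 KiB chunk size is an arbitrary assumption.
countHandle :: Handle -> IO Int
countHandle h = go 0
  where
    go acc = do
        chunk <- B.hGetSome h 65536
        if B.null chunk
            then return acc
            else go $! acc + countNewlines chunk

main :: IO ()
main = do
    args <- getArgs
    case args of
        ["strict", path] -> withBinaryFile path ReadMode countHandle >>= print
        ["lazy", path]   -> BL.readFile path >>= print . countNewlinesLazy
        _                -> putStrLn "usage: count-lines (strict|lazy) FILE"
```

Both variants keep only one chunk in memory at a time, which is one plausible reason the lazy version would not be slower than the strict one on a large file.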
So far I've been very pleased with John Lato's work, quality-wise. Reducing dependencies is nice, but my main concern is the lack of documentation. I know the ideas behind iteratee and have read numerous tutorials on various people's simplified versions. However, because the iteratee package uses somewhat different terminology and types, it's not always clear exactly how to translate my knowledge into being able to use the library effectively. The enumerator package seems to have fixed this :)
Glad to hear it. My goal is not to supplant "iteratee", but to supplement it -- if enumerator becomes the simple/learning version, and most major packages use "iteratee", that's fine.