
Gregory,
Thank you, your code helps, now my it runs in the speed of lazy
bytestring test but uses less memory with it.
I've only added to your code more strictness in the recursion, my
version is below.
BTW, I think it is more useful to let user set the chunk size for
reading, so I'd like to see this possibility in the iteratee package.
{-# LANGUAGE RankNTypes, ScopedTypeVariables #-}
import qualified Data.Iteratee.IO.Fd as I
import qualified Data.Iteratee as I
import qualified Data.Iteratee.WrappedByteString as I
import qualified Data.ByteString.Char8 as S
import System.Environment
import System.IO
count :: FilePath -> IO Int
count s = I.fileDriverFd cnt s
cnt :: (Monad m) => I.IterateeG I.WrappedByteString Char m Int
cnt = go 0
where
go n = I.IterateeG $ \ch ->
case ch of
(I.EOF Nothing) -> return $ I.Done n ch
(I.EOF (Just e)) -> return $ I.Cont cnt (Just e)
(I.Chunk (I.WrapBS s)) -> do
let n' = n + S.count '\n' s
return $ n' `seq` I.Cont (go n') Nothing
main :: IO ()
main = do
[f] <- getArgs
print =<< count f
2010/3/17 Gregory Collins
Vasyl Pasternak
writes: Hi Cafe,
Yesterday I played with iteratee package, and wanted to check its performance. I tried to count lines in a file, as Oleg in his famous lazy_vs_correct[1] article. The results somewhat disappointed me.
eris:benchmark greg$ time ./IterateeTest /usr/share/dict/words 234936
real 0m0.027s user 0m0.010s sys 0m0.015s eris:benchmark greg$ time ./ByteStringTest /usr/share/dict/words 234936
real 0m0.024s user 0m0.015s sys 0m0.007s
Note also that the Bytestring I/O functions use a 32KB buffer and the iteratee "enumFd" function uses a 4KB buffer; if the buffers were the same the performance would be comparable. Here is my code:
------------------------------------------------------------------------
{-# LANGUAGE RankNTypes, ScopedTypeVariables #-}
import qualified Data.Iteratee.IO.Fd as I import qualified Data.Iteratee as I import qualified Data.Iteratee.WrappedByteString as I import qualified Data.ByteString.Char8 as S
import System.Environment import System.IO
count :: FilePath -> IO Int count s = I.fileDriverFd cnt s
cnt :: (Monad m) => I.IterateeG I.WrappedByteString Char m Int cnt = go 0 where go n = I.IterateeG $ \ch -> case ch of (I.EOF Nothing) -> return $ I.Done n ch (I.EOF (Just e)) -> return $ I.Cont cnt (Just e) (I.Chunk (I.WrapBS s)) -> do let n' = n + S.count '\n' s return $ I.Cont (go n') Nothing
main :: IO () main = do [f] <- getArgs print =<< count f
-- Gregory Collins