
I am experimenting with iteratee, writing a parser for squid log files, and found that my code does not run in constant space when it processes compressed log files. I simplified the code down to this snippet:

    import Data.ByteString (ByteString)
    import Data.Iteratee as I
    import Data.Iteratee.Char
    import Data.Iteratee.ZLib
    import System

    main = do
        args <- getArgs
        let fname = args !! 0
        let blockSize = read $ args !! 1
        fileDriver (leak blockSize) fname >>= print

    leak :: Int -> Iteratee ByteString IO ()
    leak blockSize = joinIM $ enumInflate GZip defaultDecompressParams chunkedRead
      where
        consChunk :: Iteratee ByteString IO String
        consChunk = (joinI $ I.take blockSize I.length) >>= return . show

        chunkedRead :: Iteratee ByteString IO ()
        chunkedRead = joinI $ convStream consChunk printLines

The first argument is the file name (/var/log/messages.1.gz will do); the second is the size of the blocks in which the input is consumed. With a small block size (10 bytes) it leaks very fast; with larger blocks (~10000) it runs almost without leaks.

So: is this a bug in my code, or should iteratee-compress behave differently?

On Fri, 2011-02-18 at 17:27 +0300, Michael A Baikov wrote:
So: is this a bug in my code, or should iteratee-compress behave differently?
It may be a bug - I'll look into it.

Regards

PS. Please CC me and/or just send e-mail to me directly - I may miss mails to the cafe list, but I won't miss anything that is sent to me (or rather, it is several orders of magnitude less likely).

On Fri, 2011-02-18 at 17:27 +0300, Michael A Baikov wrote:
So: is this a bug in my code, or should iteratee-compress behave differently?
After looking into the problem (or rather at your code), I believe it has nothing to do with iteratee-compress. I get similar behaviour and results when I replace "joinIM $ enumInflate GZip defaultDecompressParams chunkedRead" with chunkedRead. (The memory usage is smaller, but that is due to decompression, not the iteratee's fault.)

Regards
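For readers who want to reproduce the comparison described in the reply, the substitution can be sketched as below. This is only a sketch, assuming the same iteratee release the original snippet was written against; the name leakWithoutInflate is hypothetical and does not appear in the thread, and System.Environment is used in place of the older System module. Because the decompression step is gone, the input should be an uncompressed file.

```haskell
import Data.ByteString (ByteString)
import Data.Iteratee as I
import Data.Iteratee.Char
import System.Environment (getArgs)

main :: IO ()
main = do
    args <- getArgs
    let fname     = args !! 0          -- path to an uncompressed input file
        blockSize = read (args !! 1)   -- bytes consumed per chunk
    fileDriver (leakWithoutInflate blockSize) fname >>= print

-- Hypothetical name: the original 'leak' with the
-- 'joinIM $ enumInflate GZip defaultDecompressParams' wrapper removed,
-- so that only 'chunkedRead' remains.
leakWithoutInflate :: Int -> Iteratee ByteString IO ()
leakWithoutInflate blockSize = chunkedRead
  where
    -- Consume up to blockSize bytes and render the count as a String.
    consChunk :: Iteratee ByteString IO String
    consChunk = (joinI $ I.take blockSize I.length) >>= return . show

    -- Feed the stream of rendered counts to printLines.
    chunkedRead :: Iteratee ByteString IO ()
    chunkedRead = joinI $ convStream consChunk printLines
```

If memory still grows with a small block size here, that would be consistent with the observation that the growth does not come from iteratee-compress.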
participants (2)
- Maciej Piechotka
- Michael A Baikov