
Am Dienstag, 10. Januar 2006 17:44 schrieb Ian Lynagh:
Hi all,
I am having space issues with some decompression code; I've attached a much simplified version as Test1.hs.
At the bottom (foo/bar) is the equivalent of deflate. This should be a standalone module which doesn't know about the rest.
In the middle (readChunks) is the equivalent of gunzip. It repeatedly calls foo until there is no more input left.
At the top is a simple main function that calls them.
If I do
dd if=/dev/zero of=data bs=1000 count=3000" # making data around 3MB ghc --make Test1 -o Test1 -O -Wall ./Test1
then in top I see Test1 increasing memory usage to around 150MB. I think this is because the "let (ys, zs) = foo xs" means zs holds on to xs (it's hard to be sure as compiling for profiling is too happy to change the behaviour).
I had 72 Mb space usage for a 3Mb file. I believe, it's the 'put zs' that's consuming the memory. I changed it to readChunks = do xs <- get if null xs then return [] else do let (ys, zs) = foo xs rest = evalState readChunks zs return (ys ++ rest) and got much smaller memory usage (10Mb) -- not sure, how sensible that would be for real work and why it reduces memory. If bar can start returning before it's finished, then the same holds for the modified readChunks, but the original would have to wait for the completion of bar (via foo) until zs can be put, so the complete ys would have to be in memory at once. Just checked, modified version also runs in 10Mb for a 12mb data file, so indeed bar starts returning before finishing and it seems the above is right. ./test4 +RTS -sstderr True 1,496,184,404 bytes allocated in the heap 987,852,924 bytes copied during GC 3,226,492 bytes maximum residency (162 sample(s)) 5707 collections in generation 0 ( 12.66s) 162 collections in generation 1 ( 14.82s) 10 Mb total memory in use INIT time 0.00s ( 0.01s elapsed) MUT time 6.88s ( 15.06s elapsed) GC time 27.48s ( 55.93s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 34.36s ( 71.00s elapsed) %GC time 80.0% (78.8% elapsed) Alloc rate 217,468,663 bytes per MUT second Productivity 20.0% of total user, 9.7% of total elapsed
I tried (Test2) changing foo to be a monad transformer over the calling monad, so the caller's remaining input was updated as we went along, but (as well as memory usage not obviously being fixed) this is giving me a stack overflow.
Has anyone got any suggestions for making a constant space, constant stack version?
Thanks Ian
Hope that helps, Daniel