
Hello,

I came across the following space leak problem today. How can I fix this? (Tested on Mac OS X 10.5.8, GHC 6.10.3)

-- test.hs
{-# LANGUAGE BangPatterns #-}
module Main where

import System
import qualified Data.ByteString.Lazy.Char8 as L

main = do
  args <- getArgs
  let n = read $ args !! 0
  cs <- L.getContents
  let !a = L.take n cs
  mapM_ (print . L.length) $ L.lines cs
  print a

-- gen.hs
module Main where

main = do
  putStrLn $ take 1000000 $ cycle "foo"
  main

These are compiled with the following options:

$ ghc --make -O2 test
$ ghc --make -O2 gen

The memory usage seems to depend on the argument passed (here 17000). On my MacBook (Core2 Duo 2.0GHz), 16000 works fine.

$ ./gen | head -1000 | ./test 17000 +RTS -sstderr
...
  3,793,673,564 bytes allocated in the heap
      9,901,516 bytes copied during GC
    635,576,092 bytes maximum residency (11 sample(s))
    248,725,136 bytes maximum slop
           1759 MB total memory in use (562 MB lost due to fragmentation)

  Generation 0:  6941 collections,  0 parallel, 16.91s, 18.15s elapsed
  Generation 1:    11 collections,  0 parallel,  0.03s,  0.03s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    3.04s  ( 27.64s elapsed)
  GC    time   16.94s  ( 18.18s elapsed)
  EXIT  time    0.00s  (  0.01s elapsed)
  Total time   19.99s  ( 45.82s elapsed)

  %GC time      84.8%  (39.7% elapsed)

  Alloc rate    1,245,766,300 bytes per MUT second

  Productivity  15.2% of total user, 6.6% of total elapsed

$ ./gen | head -1000 | ./test 16000 +RTS -sstderr
...
  4,000,652,128 bytes allocated in the heap
      7,428,180 bytes copied during GC
      1,057,588 bytes maximum residency (1001 sample(s))
        525,092 bytes maximum slop
              5 MB total memory in use (1 MB lost due to fragmentation)

  Generation 0:  6362 collections,  0 parallel,  0.10s,  0.12s elapsed
  Generation 1:  1001 collections,  0 parallel,  0.09s,  0.09s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    2.59s  ( 23.26s elapsed)
  GC    time    0.18s  (  0.22s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    2.77s  ( 23.47s elapsed)

  %GC time       6.6%  (0.9% elapsed)

  Alloc rate    1,545,246,968 bytes per MUT second

  Productivity  93.4% of total user, 11.0% of total elapsed

--
Regards,
Yasuo Uchida

On Monday, 15 February 2010, 16:44:51, Uchida Yasuo wrote:
> Hello,
>
> I came across the following space leak problem today. How can I fix this?
> (Tested on Mac OS X 10.5.8, GHC 6.10.3)
>
> -- test.hs
> module Main where
>
> import System
> import qualified Data.ByteString.Lazy.Char8 as L
>
> main = do
>   args <- getArgs
>   let n = read $ args !! 0
>   cs <- L.getContents
>   let !a = L.take n cs
The problem is this: the bang pattern does less than you probably think. The definition of lazy ByteStrings is

  data ByteString = Empty
                  | Chunk {-# UNPACK #-} !S.ByteString ByteString

so when you write

  let !a = L.take n cs

you only force the outermost constructor (Empty if cs is empty, otherwise Chunk first rest). Since cs is not empty, it is a Chunk, and that forces the first strict chunk of the ByteString, which will be as long as the prefix that stdin delivers immediately, but at most the default chunk size (normally 32K or 64K, minus two words of bookkeeping). If n is larger than (a) the default chunk size or (b) what L.getContents got immediately [*], then a holds on to the (almost) entire input and you have a bad memory leak.

Fix: force a to be completely evaluated, e.g.

  let !a = L.take n cs
      !l = L.length a

By evaluating the length, a no longer keeps a reference into cs, and everything behind the taken prefix can be garbage collected.

[*] How long the first chunk is depends, in this pipeline, on scheduling, the number of available cores/CPUs, and the OS buffer size.
>   mapM_ (print . L.length) $ L.lines cs
>   print a
>
> -- gen.hs
> module Main where
>
> main = do
>   putStrLn $ take 1000000 $ cycle "foo"
>   main
>
> These are compiled with the following options:
>
> $ ghc --make -O2 test
> $ ghc --make -O2 gen
>
> The memory usage seems to depend on the argument passed (here 17000). On my
> MacBook (Core2 Duo 2.0GHz), 16000 works fine.
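Put together as a whole program, the fix above looks like this (a minimal sketch of the original test.hs with the extra length binding; it assumes the BangPatterns extension, and the file name test-fixed.hs is only illustrative):

  -- test-fixed.hs
  {-# LANGUAGE BangPatterns #-}
  module Main where

  import System
  import qualified Data.ByteString.Lazy.Char8 as L

  main = do
    args <- getArgs
    let n = read $ args !! 0
    cs <- L.getContents
    let !a = L.take n cs
        !l = L.length a   -- forces the whole spine of a, so a no longer
                          -- holds a thunk referring to the rest of cs
    mapM_ (print . L.length) $ L.lines cs
    print a

Compiled the same way (ghc --make -O2), the 17000 case should then run in small constant space, just like the 16000 case.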

Oh, what a relief! Thank you for your clear explanation!
--
Regards,
Yasuo Uchida

On Tue, Feb 16, 2010 at 12:44:51AM +0900, Uchida Yasuo wrote:
> The memory usage seems to depend on the argument passed (here 17000). On my
> MacBook (Core2 Duo 2.0GHz), 16000 works fine.
The default chunk size for bytestring-0.9.1.5 is 32k [1]. I don't know why 16k is the magic number here. However, note that you force the take strictly: when the number of bytes you take is available in the first chunk, everything is done, but if you need more than one chunk, the bang pattern will force only the *first* chunk, and the others will be taken lazily. You can try to verify this by using L.toChunks and noting how many chunks are being created.

[1] http://hackage.haskell.org/packages/archive/bytestring/0.9.1.5/doc/html/src/...

HTH,

--
Felipe.
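For the exact figure on a given build, the constant can simply be printed (a small sketch; in bytestring 0.9.x defaultChunkSize is exported from Data.ByteString.Lazy.Internal, and the file name chunksize.hs is only illustrative):

  -- chunksize.hs
  module Main where

  import Data.ByteString.Lazy.Internal (defaultChunkSize)

  main :: IO ()
  main = print defaultChunkSize
    -- 32k minus a couple of words of bookkeeping,
    -- e.g. 32760 on a 32-bit GHC, 32752 on a 64-bit GHC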

That's right! You solved the mystery of the magic number!

--- Felipe Lessa wrote:
> You can try to verify this by using L.toChunks and noting how many chunks
> are being created.
-- test.hs
{-# LANGUAGE BangPatterns #-}
module Main where

import System
import System.IO
import qualified Data.ByteString.Lazy.Char8 as L
import qualified Data.ByteString.Char8 as S

main = do
  args <- getArgs
  let n = read $ args !! 0
  cs <- L.getContents
  let !a = L.take n cs
      !l = L.length a
  hPutStrLn stderr $ "first chunk : " ++ show (S.length $ head $ L.toChunks cs)
  mapM_ (print . L.length) $ L.lines cs
  print a

$ ./gen | head -1000 | ./test 17000 +RTS -sstderr > /dev/null
./test 17000 +RTS -sstderr
first chunk : 16384
gen: <stdout>: commitAndReleaseBuffer: resource vanished (Broken pipe)
  4,085,671,308 bytes allocated in the heap
      7,659,656 bytes copied during GC
      1,091,072 bytes maximum residency (1001 sample(s))
        540,552 bytes maximum slop
              5 MB total memory in use (1 MB lost due to fragmentation)

  Generation 0:  6507 collections,  0 parallel,  0.10s,  0.13s elapsed
  Generation 1:  1001 collections,  0 parallel,  0.09s,  0.10s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    2.62s  ( 23.11s elapsed)
  GC    time    0.19s  (  0.23s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    2.81s  ( 23.34s elapsed)

  %GC time       6.8%  (1.0% elapsed)

  Alloc rate    1,558,678,834 bytes per MUT second

  Productivity  93.2% of total user, 11.2% of total elapsed

--
Regards,
Yasuo Uchida

On Monday, 15 February 2010, 23:42:03, Uchida Yasuo wrote:
> That's right! You solved the mystery of the magic number!
>
> --- Felipe Lessa wrote:
>
> > You can try to verify this by using L.toChunks and noting how many chunks
> > are being created.
>
> $ ./gen | head -1000 | ./test 17000 +RTS -sstderr > /dev/null
> ./test 17000 +RTS -sstderr
> first chunk : 16384
It's not so easy; I get varying values: 4096, 8192, 32760, 28658, 16376, 12288, ... It may be different on other machines, but on my box there's no hard "magic number", just a tendency to get 1x the buffer size, 2x the buffer size, or defaultChunkSize; the other values are rarer.
> --
> Regards,
> Yasuo Uchida
participants (3):
- Daniel Fischer
- Felipe Lessa
- Uchida Yasuo