Re: [Haskell-cafe] Understanding GC time

10 Mar 2012


On Sat, Mar 10, 2012 at 4:21 PM, Thiago Negri wrote:
...
c:\tmp\hs>par +RTS -s -N1
par +RTS -s -N1
20000000
    803,186,152 bytes allocated in the heap
    859,916,960 bytes copied during GC
    233,465,740 bytes maximum residency (10 sample(s))
     30,065,860 bytes maximum slop
            483 MB total memory in use (0 MB lost due to fragmentation)
 Generation 0:  1523 collections,     0 parallel,  0.80s,  0.75s elapsed
 Generation 1:    10 collections,     0 parallel,  0.83s,  0.99s elapsed
 Parallel GC work balance: nan (0 / 0, ideal 1)
...
c:\tmp\hs>par +RTS -s -N2
par +RTS -s -N2
20000000
  1,606,279,644 bytes allocated in the heap
         74,924 bytes copied during GC
         28,340 bytes maximum residency (1 sample(s))
         29,004 bytes maximum slop
              2 MB total memory in use (0 MB lost due to fragmentation)
 Generation 0:  1566 collections,  1565 parallel,  0.00s,  0.01s elapsed
 Generation 1:     1 collections,     1 parallel,  0.00s,  0.00s elapsed
 Parallel GC work balance: 1.78 (15495 / 8703, ideal 2)
An important part of what happened is explained by this:
-N1
...
483 MB total memory in use (0 MB lost due to fragmentation)
-N2
...
2 MB total memory in use (0 MB lost due to fragmentation)
The thing is, in the first version the whole list had to be present in
memory: there were two traversals, one after the other, so the head of
the list was retained during the first traversal so that the second
could work on the same list. In the version where both traversals ran
in parallel, the list was produced and consumed in constant memory,
since both folds could progress simultaneously. Memory use was
therefore much simpler and smaller (2 MB instead of 483 MB), which must
explain in part why the collections were so much faster: with almost
nothing live, the GC copied 74,924 bytes instead of 859,916,960
(although there was apparently still 0.01s elapsed for the generation 0
collections).
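
To make the difference concrete, here is a minimal sketch of the two
shapes of program I have in mind. It is not the original code, which is
not quoted in this message: the names count, total, sequentialStats and
concurrentStats are made up for illustration, and I use forkIO from base
as a stand-in for whatever parallel mechanism the original program used.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)
import Data.List (foldl')

-- Two strict folds; hypothetical stand-ins for the original traversals.
count :: [Int] -> Int
count = foldl' (\n _ -> n + 1) 0

total :: [Int] -> Int
total = foldl' (+) 0

-- One traversal after the other: while `count` runs, the pending `total`
-- still points at the head of xs, so every cell already forced stays live.
sequentialStats :: [Int] -> (Int, Int)
sequentialStats xs =
  let c = count xs
      t = total xs
  in c `seq` t `seq` (c, t)

-- Both traversals at once: if the two threads advance at a similar pace,
-- the cells behind both of them become garbage almost immediately.
concurrentStats :: [Int] -> IO (Int, Int)
concurrentStats xs = do
  done <- newEmptyMVar
  _ <- forkIO (evaluate (total xs) >>= putMVar done)
  c <- evaluate (count xs)
  t <- takeMVar done
  return (c, t)

main :: IO ()
main = do
  let n = 1000000 :: Int
  print (sequentialStats [1 .. n])
  concurrentStats [1 .. n] >>= print
```

To compare the two, one could build this with ghc -O2 -threaded -rtsopts
and run it with +RTS -s -N2, looking at the maximum residency line.
Whether the concurrent version really stays in constant memory depends
on how evenly the two threads progress, so treat it as an illustration
of the retention argument rather than a guarantee.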

-- 
Jedaï

Chaddaï Fouché