
Am Donnerstag 24 Dezember 2009 02:14:51 schrieb Michael Lesniak:
Hello haskell-cafe (and merry christmas!),
I have a strange problem with the garbage collector / memory which I'm unable to find a solution for. I think the source of my problems has to do with lazy evaluation, but currently I'm unable to find it.
Using the attached program and threadscope I see that the GC is using a lot of time and the program comes (more or less) to a halt (see exa-1.png). When I increase the heap the program takes much longer and the GC is running more or less all the time (see exa-2.png).
Some more detailled information:
* I can see the described behaviour under both GHC 10.4 and GHC 12.1 * Linux kernel 2.6.31-16 on a dualcore * Program compiled with ghc --make -O2 -threaded -eventlog Example.hs -o exa * Started with exa +RTS -ls and one of { -N, -N1, -N2 }
Any help (pointing into the right direction, mention possibly helpful blog articles or paper, constructive critic in general) is appreciated!
I can't reproduce that (ghc-6.12.1, Linux linux-mkk1 2.6.27.39-0.2-pae #1 SMP 2009-11-23 12:57:38 +0100 i686 i686 i386 GNU/Linux, dual core): $ time ./exa +RTS -ls -N2 -sstderr ./exa +RTS -ls -N2 -sstderr 1 3 0 0 9 3 6 9 1 8 8 9 2 5 5 8 6 5 7 8 4 72,499,126,908 bytes allocated in the heap 45,280,708 bytes copied during GC 177,504 bytes maximum residency (10 sample(s)) 183,844 bytes maximum slop 4 MB total memory in use (1 MB lost due to fragmentation) Generation 0: 131527 collections, 131526 parallel, 7.18s, 3.88s elapsed Generation 1: 10 collections, 10 parallel, 0.00s, 0.00s elapsed Parallel GC work balance: 1.16 (10931266 / 9433437, ideal 2) MUT time (elapsed) GC time (elapsed) Task 0 (worker) : 115.10s (126.56s) 3.34s ( 1.84s) Task 1 (worker) : 124.21s (126.56s) 3.84s ( 2.04s) Task 2 (worker) : 0.09s (126.56s) 0.00s ( 0.00s) Task 3 (worker) : 0.00s (126.56s) 0.00s ( 0.00s) SPARKS: 21 (21 converted, 0 pruned) INIT time 0.00s ( 0.13s elapsed) MUT time 238.05s (126.56s elapsed) GC time 7.19s ( 3.88s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 244.46s (130.57s elapsed) %GC time 2.9% (3.0% elapsed) Alloc rate 305,559,453 bytes per MUT second Productivity 97.1% of total user, 181.7% of total elapsed gc_alloc_block_sync: 151252 whitehole_spin: 0 gen[0].steps[0].sync_large_objects: 75620 gen[0].steps[1].sync_large_objects: 9662 gen[1].steps[0].sync_large_objects: 0 244.45user 2.06system 2:10.58elapsed 188%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+35736outputs (0major+2426minor)pagefaults 0swaps Garbage collection isn't even visible in the threadscope profile. With -N1: 71,999,280,108 bytes allocated in the heap 20,729,380 bytes copied during GC 92,492 bytes maximum residency (2 sample(s)) 190,872 bytes maximum slop 3 MB total memory in use (0 MB lost due to fragmentation) Generation 0: 130901 collections, 0 parallel, 2.64s, 2.68s elapsed Generation 1: 2 collections, 0 parallel, 0.00s, 0.00s elapsed Parallel GC work balance: nan (0 / 0, ideal 1) MUT time (elapsed) GC time (elapsed) Task 0 (worker) : 184.86s (206.14s) 2.38s ( 2.40s) Task 1 (worker) : 0.00s (216.71s) 0.00s ( 0.00s) Task 2 (worker) : 20.90s (216.71s) 0.26s ( 0.28s) SPARKS: 21 (19 converted, 2 pruned) INIT time 0.00s ( 0.05s elapsed) MUT time 215.33s (216.71s elapsed) GC time 2.64s ( 2.68s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 217.96s (219.44s elapsed) %GC time 1.2% (1.2% elapsed) Alloc rate 334,374,215 bytes per MUT second Productivity 98.8% of total user, 98.1% of total elapsed gc_alloc_block_sync: 0 whitehole_spin: 0 gen[0].steps[0].sync_large_objects: 0 gen[0].steps[1].sync_large_objects: 0 gen[1].steps[0].sync_large_objects: 0 217.96user 0.88system 3:39.52elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 1576inputs+19552outputs (5major+1686minor)pagefaults 0swaps
Best wishes, Michael