
Hi all,

I played around with a parallel algorithm and tried to get the GC time
down by specifying the RTS option '-A'. But if I specify '-A', then the
usage of the CPU cores also seems to change. Without '-A' I get a total
CPU-core usage of '4.44s'; with '-A' I get only '1.67s'.

Any ideas?

Greetings,
Daniel


Without '-A':
-------------

dan@machine ~> ghc-mod-dev find showWindows +RTS -s -N4
XMonad.Util.XUtils
     980,770,296 bytes allocated in the heap
     552,122,168 bytes copied during GC
     163,683,152 bytes maximum residency (11 sample(s))
       4,369,800 bytes maximum slop
             280 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      1690 colls,  1690 par    2.45s    0.66s     0.0004s    0.0024s
  Gen  1        11 colls,    10 par    0.98s    0.26s     0.0239s    0.1022s

  Parallel GC work balance: 18.06% (serial 0%, perfect 100%)

  TASKS: 6 (1 bound, 5 peak workers (5 total), using -N4)

  SPARKS: 1025 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 1025 fizzled)

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.99s  (  0.83s elapsed)
  GC      time    3.43s  (  0.93s elapsed)
  EXIT    time    0.02s  (  0.02s elapsed)
  Total   time    4.44s  (  1.78s elapsed)

  Alloc rate    995,571,064 bytes per MUT second

  Productivity  22.7% of total user, 56.5% of total elapsed

gc_alloc_block_sync: 53552
whitehole_spin: 0
gen[0].sync: 75
gen[1].sync: 23678


With '-A':
----------

dan@machine ~> ghc-mod-dev find showWindows +RTS -s -N4 -A500m
XMonad.Util.XUtils
     979,761,872 bytes allocated in the heap
      94,043,424 bytes copied during GC
     118,609,784 bytes maximum residency (2 sample(s))
       2,844,808 bytes maximum slop
            2196 MB total memory in use (0 MB lost due to fragmentati

                                    Tot time (elapsed)  Avg pause  Ma
  Gen  0         0 colls,     0 par    0.00s    0.00s     0.0000s
  Gen  1         2 colls,     1 par    0.47s    0.18s     0.0878s

  Parallel GC work balance: 57.02% (serial 0%, perfect 100%)

  TASKS: 6 (1 bound, 5 peak workers (5 total), using -N4)

  SPARKS: 1025 (616 converted, 0 overflowed, 0 dud, 0 GC'd, 409 fizzl

  INIT    time    0.02s  (  0.02s elapsed)
  MUT     time    1.16s  (  1.16s elapsed)
  GC      time    0.47s  (  0.18s elapsed)
  EXIT    time    0.02s  (  0.02s elapsed)
  Total   time    1.67s  (  1.38s elapsed)

  Alloc rate    844,914,346 bytes per MUT second

  Productivity  70.8% of total user, 85.9% of total elapsed

gc_alloc_block_sync: 33458
whitehole_spin: 0
gen[0].sync: 5372
gen[1].sync: 0
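

To compare the two summaries phase by phase, here is a minimal sketch of the arithmetic. It only restates the CPU-time figures quoted in the output above; the `Summary` type and the function names are hypothetical, introduced purely for illustration. It suggests that the drop in total CPU time lines up with the drop in GC time rather than with any change in mutator time, though it says nothing about why.

```haskell
-- CPU-time figures in seconds, copied from the two `+RTS -s` summaries above.
-- The Summary type and all names here are illustrative assumptions.
data Summary = Summary
  { initT, mutT, gcT, exitT, totalT :: Double }

withoutA, withA :: Summary
withoutA = Summary 0.00 0.99 3.43 0.02 4.44
withA    = Summary 0.02 1.16 0.47 0.02 1.67

-- Fraction of the total CPU time that was spent in the garbage collector.
gcShare :: Summary -> Double
gcShare s = gcT s / totalT s

-- Change in CPU time per phase when going from the first run to the second
-- (negative means the second run spent less time in that phase).
phaseDeltas :: Summary -> Summary -> [(String, Double)]
phaseDeltas a b =
  [ ("INIT", initT b - initT a)
  , ("MUT",  mutT  b - mutT  a)
  , ("GC",   gcT   b - gcT   a)
  , ("EXIT", exitT b - exitT a)
  ]

-- Change in the reported total CPU time between the two runs.
totalDelta :: Summary -> Summary -> Double
totalDelta a b = totalT b - totalT a
```

Loaded into GHCi, `gcShare withoutA` comes out at roughly 0.77 and `gcShare withA` at roughly 0.28, and the entries of `phaseDeltas withoutA withA` add up to `totalDelta withoutA withA` (about -2.77s), with the GC entry (about -2.96s) accounting for the whole decrease.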