
Brandon S. Allbery KF8NH wrote:
On 2009 Mar 3, at 12:54, mwinter@brocku.ca wrote:
I am using GHC 6.8.3. The -O2 option made both runs faster, but the 2-core run is still much slower than the 1-core version. Will switching to 6.10 make the difference?
If GC contention is the issue, it should.
I just tried it with GHC 6.10.1. Two capabilities is still slower. (See attachments. Compiled with -O2 -threaded.) In both cases, GC time is minuscule.

 [("GHC RTS", "Yes")
 ,("GHC version", "6.10.1")
 ,("RTS way", "rts_thr")
 ,("Host platform", "i386-unknown-mingw32")
 ,("Build platform", "i386-unknown-mingw32")
 ,("Target platform", "i386-unknown-mingw32")
 ,("Compiler unregisterised", "NO")
 ,("Tables next to code", "YES")
 ]

Cores1 +RTS -N1 -s

      16,918,324 bytes allocated in the heap
       1,055,836 bytes copied during GC
       1,005,356 bytes maximum residency (1 sample(s))
          29,760 bytes maximum slop
            1260 MB total memory in use (112 MB lost due to fragmentation)

  Generation 0:    32 collections,     0 parallel,  0.03s,  0.03s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  Task  0 (worker) :  MUT time:   2.53s  (  5.11s elapsed)
                      GC  time:   0.02s  (  0.02s elapsed)
  Task  1 (worker) :  MUT time:   0.00s  (  5.11s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)
  Task  2 (worker) :  MUT time:   2.30s  (  5.11s elapsed)
                      GC  time:   0.02s  (  0.02s elapsed)

  INIT  time    0.02s  (  0.00s elapsed)
  MUT   time    4.83s  (  5.11s elapsed)
  GC    time    0.03s  (  0.03s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    4.88s  (  5.14s elapsed)

  %GC time       0.6%  (0.6% elapsed)

  Alloc rate    3,492,815 bytes per MUT second

  Productivity  99.0% of total user, 93.9% of total elapsed

  recordMutableGen_sync: 0
  gc_alloc_block_sync: 0
  whitehole_spin: 0
  gen[0].steps[0].sync_todo: 0
  gen[0].steps[0].sync_large_objects: 0
  gen[0].steps[1].sync_todo: 0
  gen[0].steps[1].sync_large_objects: 0
  gen[1].steps[0].sync_todo: 0
  gen[1].steps[0].sync_large_objects: 0

Cores1 +RTS -N2 -s

      16,926,532 bytes allocated in the heap
       1,243,560 bytes copied during GC
         794,980 bytes maximum residency (2 sample(s))
          12,012 bytes maximum slop
            1927 MB total memory in use (160 MB lost due to fragmentation)

  Generation 0:    23 collections,     8 parallel,  0.00s,  0.00s elapsed
  Generation 1:     2 collections,     0 parallel,  0.02s,  0.02s elapsed

  Parallel GC work balance: 1.00 (1267 / 1267, ideal 2)

  Task  0 (worker) :  MUT time:   0.00s  (  0.00s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)
  Task  1 (worker) :  MUT time:   3.63s  (  4.67s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)
  Task  2 (worker) :  MUT time:   0.00s  (  4.67s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)
  Task  3 (worker) :  MUT time:   3.42s  (  4.67s elapsed)
                      GC  time:   0.02s  (  0.02s elapsed)
  Task  4 (worker) :  MUT time:   0.00s  (  4.67s elapsed)
                      GC  time:   0.00s  (  0.00s elapsed)

  INIT  time    0.02s  (  0.00s elapsed)
  MUT   time    7.05s  (  4.67s elapsed)
  GC    time    0.02s  (  0.02s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    7.08s  (  4.69s elapsed)

  %GC time       0.2%  (0.3% elapsed)

  Alloc rate    2,396,677 bytes per MUT second

  Productivity  99.6% of total user, 150.3% of total elapsed

  recordMutableGen_sync: 0
  gc_alloc_block_sync: 0
  whitehole_spin: 0
  gen[0].steps[0].sync_todo: 0
  gen[0].steps[0].sync_large_objects: 0
  gen[0].steps[1].sync_todo: 0
  gen[0].steps[1].sync_large_objects: 0
  gen[1].steps[0].sync_todo: 0
  gen[1].steps[0].sync_large_objects: 0
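
For reading the two reports side by side, here is a small Haskell sketch that works out the wall-clock speedup, the parallel efficiency, and the CPU-time ratio from the "Total time" lines. The figures are copied from the -s output above. The Run type and the function names are made up for illustration and are not part of GHC or its RTS.

```haskell
module Main where

import Text.Printf (printf)

-- | Totals from one "+RTS -s" report, in seconds.
-- (Hypothetical type, introduced only for this sketch.)
data Run = Run
  { cpuTotal     :: Double  -- "Total time", first figure (CPU seconds)
  , elapsedTotal :: Double  -- "Total time", figure in parentheses (wall clock)
  }

-- Figures copied from the -N1 and -N2 reports.
runN1, runN2 :: Run
runN1 = Run { cpuTotal = 4.88, elapsedTotal = 5.14 }
runN2 = Run { cpuTotal = 7.08, elapsedTotal = 4.69 }

-- | Wall-clock speedup of the second run relative to the first.
-- A value above 1 means the second run finished sooner.
speedup :: Run -> Run -> Double
speedup base other = elapsedTotal base / elapsedTotal other

-- | Speedup divided by the number of capabilities used.
efficiency :: Int -> Run -> Run -> Double
efficiency caps base other = speedup base other / fromIntegral caps

-- | Ratio of CPU time used by the second run to that of the first.
cpuRatio :: Run -> Run -> Double
cpuRatio base other = cpuTotal other / cpuTotal base

main :: IO ()
main = do
  printf "elapsed speedup (-N2 vs -N1):      %.2f\n" (speedup runN1 runN2)
  printf "efficiency on 2 capabilities:      %.2f\n" (efficiency 2 runN1 runN2)
  printf "CPU time ratio (-N2 / -N1):        %.2f\n" (cpuRatio runN1 runN2)
```

This only does arithmetic on the reported totals. It separates the elapsed figures from the CPU figures, but it says nothing about where the extra CPU time in the -N2 run is going.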