
On Tuesday, 17 June 2008 at 18:32, Dan Doel wrote:
On Tuesday 17 June 2008, Simon Marlow wrote:
So I tried your examples and the Addr# version looks slower than the MBA# version:
Hmm...
I tried with 6.8.2 and 6.8.3, using -O2 in both cases. I tried the Ptr version with and without -fvia-C -optc-O2, no difference.
I had forgotten about the -fvia-C in the pragma when I sent it, but I've tested it both via C and with the new backend (and triple-checked since your message), and I always come away with the Ptr version being faster. -fvia-C doesn't seem to affect the speed of the Addr# version much, while it improves the speed of the MBA# version. However, even with the improved speed, Addr# seems to edge it out here.
With the new backend, I get the results I sent in my initial mail. The ByteArray version takes 11 to 12 seconds to reverse a size-10 array 250 million times, whereas the Addr# version takes around 7 seconds.
I've experimented a bit and found that Ptr is faster for small arrays (only very slightly so if compiled with -fvia-C -optc-O3), but ByteArr performs much better for larger arrays:

dafis@linux:~/Documents/haskell/move> ./PtrC +RTS -sstderr -RTS 20 10000000
./PtrC 20 10000000 +RTS -sstderr
Done.
481,596,836 bytes allocated in the heap
257,665,360 bytes copied during GC (scavenged)
171,919,440 bytes copied during GC (not scavenged)
117,149,696 bytes maximum residency (8 sample(s))

        919 collections in generation 0 (  3.44s)
          8 collections in generation 1 ( 24.99s)

        226 Mb total memory in use

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    8.16s  (  9.06s elapsed)
  GC    time   28.43s  ( 30.11s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time   36.59s  ( 39.16s elapsed)

  %GC time      77.7%  (76.9% elapsed)

  Alloc rate    59,019,220 bytes per MUT second

  Productivity  22.3% of total user, 20.8% of total elapsed

dafis@linux:~/Documents/haskell/move> ./ByteArrC +RTS -sstderr -RTS 20 10000000
./ByteArrC 20 10000000 +RTS -sstderr
Done.
 40,041,976 bytes allocated in the heap
      1,272 bytes copied during GC (scavenged)
          0 bytes copied during GC (not scavenged)
     16,384 bytes maximum residency (1 sample(s))

          2 collections in generation 0 (  0.00s)
          1 collections in generation 1 (  0.00s)

         40 Mb total memory in use

  INIT  time    0.00s  (  0.02s elapsed)
  MUT   time    5.03s  (  5.32s elapsed)
  GC    time    0.00s  (  0.01s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    5.03s  (  5.35s elapsed)

  %GC time       0.0%  (0.3% elapsed)

  Alloc rate    7,960,631 bytes per MUT second

  Productivity 100.0% of total user, 94.0% of total elapsed

Using GHC 6.8.2.

The GC time for the Addr# version is frightening.
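
For readers without the original attachments, here is a minimal sketch of the kind of benchmark being compared: reversing a small array in place many times, once through a Ptr (Addr# underneath) and once through an unboxed mutable array (MutableByteArray# underneath). This is not the code from the thread. The names reversePtr, reverseArr, runPtrReversals and runArrReversals are hypothetical, and it uses the bounds-checked readArray/writeArray rather than raw primops, so its timings should not be expected to match the figures quoted above.

```haskell
-- Hypothetical sketch; NOT the benchmark code discussed in this thread.
import Control.Monad (replicateM_)
import Data.Array.IO (IOUArray, getElems, newListArray, readArray, writeArray)
import Foreign.Marshal.Array (allocaArray, peekArray, pokeArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff, pokeElemOff)

-- Reverse the first n Ints behind a pointer, in place.
reversePtr :: Ptr Int -> Int -> IO ()
reversePtr p n = go 0 (n - 1)
  where
    go i j
      | i >= j    = return ()
      | otherwise = do
          x <- peekElemOff p i
          y <- peekElemOff p j
          pokeElemOff p i y
          pokeElemOff p j x
          go (i + 1) (j - 1)

-- Reverse an unboxed mutable array with indices 0 .. n-1, in place.
reverseArr :: IOUArray Int Int -> Int -> IO ()
reverseArr arr n = go 0 (n - 1)
  where
    go i j
      | i >= j    = return ()
      | otherwise = do
          x <- readArray arr i
          y <- readArray arr j
          writeArray arr i y
          writeArray arr j x
          go (i + 1) (j - 1)

-- Fill an n-element buffer with [0 .. n-1], reverse it k times,
-- and return the final contents as a list.
runPtrReversals :: Int -> Int -> IO [Int]
runPtrReversals n k = allocaArray n $ \p -> do
  pokeArray p [0 .. n - 1]
  replicateM_ k (reversePtr p n)
  peekArray n p

-- The same experiment on an IOUArray.
runArrReversals :: Int -> Int -> IO [Int]
runArrReversals n k = do
  arr <- newListArray (0, n - 1) [0 .. n - 1]
  replicateM_ k (reverseArr arr n)
  getElems arr

main :: IO ()
main = do
  runPtrReversals 10 3 >>= print  -- odd number of reversals: [9,8,7,6,5,4,3,2,1,0]
  runArrReversals 10 3 >>= print  -- same result via the array version
```

To compare the two variants the way the runs above do, one would raise the iteration count, compile with -O2, and run with +RTS -sstderr to get the GC and mutator breakdown.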