Re: Low-level array performance

17 Jun 2008

I see that Dan Doel's post favoring Ptr/Addr# reports the same allocation
amounts (from +RTS -sstderr) for Ptr/Addr# and for MutableByteArray#.

Everyone else sees more allocation for Ptr/Addr# than for MBA#, and sees MBA#
as faster in these cases.

I myself (on a G4) see more allocation for Ptr/Addr# [just like Simon Marlow]
and find it slower.  If I boost the allocation area size with "-A100m" then Ptr
still allocates more, but the timing difference becomes quite small:

The Ptr/Addr# code now runs in:

pamac-cek10:tmp chrisk$ time ./addr 100 1000000 +RTS -sstderr -A100m
./a 100 1000000 +RTS -sstderr -A100m
Done.
  48,182,068 bytes allocated in the heap
         276 bytes copied during GC (scavenged)
           0 bytes copied during GC (not scavenged)
      20,480 bytes maximum residency (1 sample(s))

           1 collections in generation 0 (  0.00s)
           1 collections in generation 1 (  0.00s)

          97 Mb total memory in use

   INIT  time    0.00s  (  0.01s elapsed)
   MUT   time    1.54s  (  2.43s elapsed)
   GC    time    0.00s  (  0.00s elapsed)
   EXIT  time    0.00s  (  0.00s elapsed)
   Total time    1.55s  (  2.44s elapsed)

   %GC time       0.2%  (0.1% elapsed)

   Alloc rate    31,205,254 bytes per MUT second

   Productivity  99.6% of total user, 63.1% of total elapsed

real	0m2.728s
user	0m1.548s
sys	0m0.207s

And the MutableByteArray# code now runs in:

pamac-cek10:tmp chrisk$ time ./mba 100 1000000 +RTS -sstderr -A100m
./m 100 1000000 +RTS -sstderr -A100m
Done.
   4,023,784 bytes allocated in the heap
         276 bytes copied during GC (scavenged)
           0 bytes copied during GC (not scavenged)
      20,480 bytes maximum residency (1 sample(s))

           1 collections in generation 0 (  0.00s)
           1 collections in generation 1 (  0.00s)

         101 Mb total memory in use

   INIT  time    0.00s  (  0.01s elapsed)
   MUT   time    1.50s  (  2.30s elapsed)
   GC    time    0.00s  (  0.00s elapsed)
   EXIT  time    0.00s  (  0.00s elapsed)
   Total time    1.51s  (  2.32s elapsed)

   %GC time       0.3%  (0.2% elapsed)

   Alloc rate    2,668,201 bytes per MUT second

   Productivity  99.6% of total user, 65.0% of total elapsed

real	0m2.335s
user	0m1.513s
sys	0m0.049s
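
For anyone reading this without the benchmark sources at hand, here is a minimal sketch of the two access styles being compared. It is not the code that produced the numbers above; the names viaPtr and viaMBA are made up for illustration, and both functions assume n >= 1. It only shows the same fill-and-read loop written once against a raw Ptr and once against a MutableByteArray# using GHC primops.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main (main) where

import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff, pokeElemOff, sizeOf)
import GHC.Exts (Int (I#), isTrue#, newByteArray#, readIntArray#,
                 writeIntArray#, (*#), (+#), (-#), (>=#))
import GHC.IO (IO (..))

-- Hypothetical Ptr version: fill a temporary buffer of n Ints through a
-- raw pointer, then read back the last element. Assumes n >= 1.
viaPtr :: Int -> IO Int
viaPtr n = allocaBytes (n * sizeOf (0 :: Int)) $ \p -> do
  let buf = p :: Ptr Int
  mapM_ (\i -> pokeElemOff buf i i) [0 .. n - 1]
  peekElemOff buf (n - 1)

-- Hypothetical MutableByteArray# version: the same fill-and-read, written
-- directly against the primops with explicit state threading. Assumes n >= 1.
viaMBA :: Int -> IO Int
viaMBA (I# n) =
  case sizeOf (0 :: Int) of
    I# w -> IO (\s0 ->
      case newByteArray# (n *# w) s0 of
        (# s1, mba #) ->
          let -- write i at index i for every i in [0, n)
              fill i s
                | isTrue# (i >=# n) = s
                | otherwise = fill (i +# 1#) (writeIntArray# mba i i s)
          in case readIntArray# mba (n -# 1#) (fill 0# s1) of
               (# s2, x #) -> (# s2, I# x #))

main :: IO ()
main = do
  a <- viaPtr 10
  b <- viaMBA 10
  print (a, b)  -- prints (9,9)
```

Compiling each variant separately with -O2 and running it with +RTS -sstderr gives the same kind of report as the ones quoted above, so the "bytes allocated in the heap" lines can be compared directly.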


haskell＠list.mightyreason.com