Re: Low-level array performance

17 Jun 2008

Dan Doel wrote:
> ...
> Issue 2: Reading from/writing to a MutableByteArray# is slower than an Addr#
>
> This is, I think, the crux of the issue. The main content of the benchmark is
> reversing/shifting items in an array. To get a somewhat easier look at the
> core, I boiled things down to a benchmark that just reverses a small array
> many times. In the interest of further reducing things, I wrote a version of
> the benchmark that uses raw Addr#s, and a version that uses raw
> MutableByteArray#s. I've attached both versions.

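For reference, the kind of inner loop described above might look roughly like the sketch below. The attachments are not reproduced here, so this is not Dan's benchmark code: the names reversedInts, fillIota, reverseInPlace and toList are hypothetical, and the code assumes a current GHC (where comparison primops return Int# and need isTrue#), not the 6.8 series discussed in this thread.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import Foreign.Storable (sizeOf)
import GHC.Exts
import GHC.ST (ST (..), runST)

-- Hypothetical helper: write 0..n-1 into the array, one Int per slot.
fillIota :: MutableByteArray# s -> Int# -> Int# -> State# s -> State# s
fillIota mba n i s
  | isTrue# (i >=# n) = s
  | otherwise =
      case writeIntArray# mba i i s of
        s' -> fillIota mba n (i +# 1#) s'

-- Reverse the Int elements between indices i and j (inclusive) in place
-- by swapping the two ends and moving inwards.
reverseInPlace :: MutableByteArray# s -> Int# -> Int# -> State# s -> State# s
reverseInPlace mba i j s
  | isTrue# (i >=# j) = s
  | otherwise =
      case readIntArray# mba i s of
        (# s1, x #) -> case readIntArray# mba j s1 of
          (# s2, y #) -> case writeIntArray# mba i y s2 of
            s3 -> case writeIntArray# mba j x s3 of
              s4 -> reverseInPlace mba (i +# 1#) (j -# 1#) s4

-- Read the first n elements back out as an ordinary list.
toList :: MutableByteArray# s -> Int# -> Int# -> State# s -> (# State# s, [Int] #)
toList mba n i s
  | isTrue# (i >=# n) = (# s, [] #)
  | otherwise =
      case readIntArray# mba i s of
        (# s1, x #) -> case toList mba n (i +# 1#) s1 of
          (# s2, xs #) -> (# s2, I# x : xs #)

-- Allocate an n-element array, fill it with 0..n-1, reverse it, read it back.
reversedInts :: Int -> [Int]
reversedInts (I# n#) =
  case sizeOf (0 :: Int) of
    I# sz# -> runST (ST (\s0 ->
      case newByteArray# (n# *# sz#) s0 of
        (# s1, mba #) ->
          case fillIota mba n# 0# s1 of
            s2 -> case reverseInPlace mba 0# (n# -# 1#) s2 of
              s3 -> toList mba n# 0# s3))

main :: IO ()
main = print (reversedInts 10)
```

An Addr#-based variant would presumably swap readIntArray#/writeIntArray# for readIntOffAddr#/writeIntOffAddr# on memory obtained outside the Haskell heap; the loop structure stays the same, which is why the two versions are a useful like-for-like comparison.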
So I tried your examples and the Addr# version looks slower than the MBA# 
version:

$ ./Ptr 100 1000000 +RTS -sstderr
Done.
  48,196,560 bytes allocated in the heap
  27,381,764 bytes copied during GC (scavenged)
  18,260,784 bytes copied during GC (not scavenged)
  14,389,248 bytes maximum residency (5 sample(s))

          92 collections in generation 0 (  0.09s)
           5 collections in generation 1 (  0.13s)

          28 Mb total memory in use

   INIT  time    0.00s  (  0.00s elapsed)
   MUT   time    0.68s  (  0.69s elapsed)
   GC    time    0.22s  (  0.28s elapsed)
   EXIT  time    0.00s  (  0.00s elapsed)
   Total time    0.90s  (  0.97s elapsed)

$ ./ByteArr 100 1000000 +RTS -sstderr
Done.
   4,042,700 bytes allocated in the heap
       1,272 bytes copied during GC (scavenged)
           0 bytes copied during GC (not scavenged)
      16,384 bytes maximum residency (1 sample(s))

           2 collections in generation 0 (  0.00s)
           1 collections in generation 1 (  0.00s)

           5 Mb total memory in use

   INIT  time    0.00s  (  0.00s elapsed)
   MUT   time    0.53s  (  0.54s elapsed)
   GC    time    0.00s  (  0.00s elapsed)
   EXIT  time    0.00s  (  0.00s elapsed)
   Total time    0.53s  (  0.54s elapsed)

I tried with GHC 6.8.2 and 6.8.3, using -O2 in both cases.  I also tried the
Ptr version with and without -fvia-C -optc-O2; it made no difference.

Are these exactly the same programs you measured?  What parameters did you use?

Cheers,
	Simon

Simon Marlow