Re: Low-level array performance

18 Jun 2008

On Wednesday 18 June 2008, Daniel Fischer wrote:
...
On Tuesday, 17 June 2008 22:37, Dan Doel wrote:
...
I'll attach new, hopefully bug-free versions of the benchmark to this
message.
With -O2 -fvia-C -optc-O3, the difference is small (less than 1%), but
today, ByteArr is faster more often.

Hmm, well, I'm a bit flummoxed. I still get Addr# outperforming MBA#
(MutableByteArray#) by perhaps 10-15%, even with -fvia-C -optc-O3 (and
before the slight speedup described below). Perhaps gcc's optimizer isn't
doing as good a job for me for some reason.

In any case, I've entered a bug for this on the GHC trac:

  http://hackage.haskell.org/trac/ghc/ticket/2374

It contains a Ptr benchmark that performs slightly faster on very small
arrays (under, say, 40 elements). I noticed such runs were taking more time
than those with larger arrays and correspondingly fewer iterations, so I
eliminated the replicateM_ in favor of an explicit loop. That gains a little
time on the small arrays, but not enough to match the performance on the
larger arrays, so I guess there are yet more factors. :) In any case, it
makes the code closer to being the same as ByteArr.
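
For anyone without the ticket at hand, the replicateM_ change amounts to
something like the sketch below. This is not the benchmark attached to the
ticket: bump, runReplicate, runLoop and runOn are hypothetical names, and
the per-iteration work (adding 1 to every element through a Ptr) is only a
stand-in for the real inner loop.

```haskell
import Control.Monad (replicateM_)
import Foreign.Marshal.Array (allocaArray, peekArray, pokeArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff, pokeElemOff)

-- Hypothetical per-iteration work: add 1 to each of the first n elements.
bump :: Ptr Int -> Int -> IO ()
bump p n = go 0
  where
    go i
      | i >= n    = return ()
      | otherwise = do
          x <- peekElemOff p i
          pokeElemOff p i (x + 1)
          go (i + 1)

-- Repetition driven by replicateM_.
runReplicate :: Int -> Ptr Int -> Int -> IO ()
runReplicate iters p n = replicateM_ iters (bump p n)

-- The same repetition written as an explicit counting loop.
runLoop :: Int -> Ptr Int -> Int -> IO ()
runLoop iters p n = go iters
  where
    go k
      | k <= 0    = return ()
      | otherwise = bump p n >> go (k - 1)

-- Hypothetical helper: copy a list into a temporary array, run one of the
-- drivers above on it, and read the elements back.
runOn :: (Int -> Ptr Int -> Int -> IO ()) -> Int -> [Int] -> IO [Int]
runOn run iters xs = allocaArray n $ \p -> do
    pokeArray p xs
    run iters p n
    peekArray n p
  where
    n = length xs
```

Both drivers should leave the array in the same state, so timing
runOn runReplicate against runOn runLoop on a small list with a large
iteration count is one way to look for the overhead described above.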

The bug is filed against the native code generator, since it shows up more 
clearly there. I haven't gotten to looking at C-- or assembly yet, but 
hopefully I will in the near future. I'll try to do further followup on the 
bug report, since that's probably easier for the developers to keep track of.

Cheers,
-- Dan