
On Wednesday 18 June 2008, Daniel Fischer wrote:
> On Tuesday, 17 June 2008 22:37, Dan Doel wrote:
> I'll attach new, hopefully bug-free versions of the benchmark to this message.
>
> With -O2 -fvia-C -optc-O3, the difference is small (less than 1%), but today, ByteArr is faster more often.

Hmm, well, I'm a bit flummoxed. I still get Addr# outperforming MBA# by perhaps 10% - 15%, even with -fvia-C -optc-O3 (and before the slight speedup below). Perhaps gcc's optimizer isn't doing as good a job for me for some reason.

In any case, I've entered a bug for this on the GHC trac:

    http://hackage.haskell.org/trac/ghc/ticket/2374

It contains a Ptr benchmark that performs slightly faster on very small arrays (under, say, 40 elements). I noticed such runs were taking more time than those with larger arrays and correspondingly fewer iterations, so I eliminated the replicateM_ in favor of an explicit loop. It gains a little time on the small arrays, but not enough to match the performance on the larger arrays, so I guess there are yet more factors. :) In any case, the change brings it closer to being the same code as ByteArr.

The bug is filed against the native code generator, since it shows up more clearly there. I haven't gotten to looking at C-- or assembly yet, but hopefully I will in the near future. I'll try to do further follow-up on the bug report, since that's probably easier for the developers to keep track of.

Cheers,
-- Dan
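
For readers following along, here is a minimal sketch of the kind of change described above: repeating a pass over a Ptr-backed array with replicateM_ versus with an explicit counting loop. It is not the benchmark attached to the ticket. The names bumpAll, runReplicate and runLoop, the element type, and the per-pass work are all made up for illustration, and it uses only the base library.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Control.Monad (replicateM_)
import Foreign.Marshal.Array (allocaArray, peekArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff, pokeElemOff)

-- Hypothetical per-pass work: add 1 to each of the n elements.
bumpAll :: Ptr Int -> Int -> IO ()
bumpAll p n = go 0
  where
    go !i
      | i >= n    = return ()
      | otherwise = do
          x <- peekElemOff p i
          pokeElemOff p i (x + 1)
          go (i + 1)

-- Zero-fill the first n elements.
zeroFill :: Ptr Int -> Int -> IO ()
zeroFill p n = mapM_ (\i -> pokeElemOff p i 0) [0 .. n - 1]

-- Variant 1: repeat the pass using replicateM_.
runReplicate :: Int -> Int -> IO [Int]
runReplicate iters n = allocaArray n $ \p -> do
  zeroFill p n
  replicateM_ iters (bumpAll p n)
  peekArray n p

-- Variant 2: repeat the pass using an explicit counting loop.
runLoop :: Int -> Int -> IO [Int]
runLoop iters n = allocaArray n $ \p -> do
  zeroFill p n
  loop p iters
  peekArray n p
  where
    loop p !k
      | k <= 0    = return ()
      | otherwise = bumpAll p n >> loop p (k - 1)

main :: IO ()
main = do
  a <- runReplicate 1000 8
  b <- runLoop 1000 8
  -- Both variants should leave the array in the same state.
  print (a == b)
```

The two variants compute the same result; any timing difference between them would depend on the GHC version and flags, so this sketch makes no claim about which is faster.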