Re: Low-level array performance

17 Jun 2008

On Tuesday 17 June 2008, haskell@list.mightyreason.com wrote:
> ...
> I see that Dan Doel's post favoring Ptr/Addr# has the same allocation
> amounts (from +RTS -sstderr) for Ptr/Addr# and MutableByteArray#.
> Everyone else sees more allocation for Ptr/Addr# than MBA# and sees MBA#
> as faster in these cases.
> I myself (on G4) see more allocation [just like Simon Marlow] for Ptr/Addr#
> and find it slower.  If I boost the initial memory with "-A 100m" then Ptr
> still allocates more, but the timing difference becomes quite small:

Pardon my noise, but is this still with the version of Ptr.hs that would (in 
your case) allocate a 1 million element list and traverse it twice, or the 
revision that fills the array in a loop with an Int#?

If it's the former, and Addr# is tying MutableByteArray# even with operations 
on a 40-some megabyte list (if the allocation is any indication), then the 
actual Addr# operations are probably faster for you, too. :)
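
To make the distinction concrete, here is a minimal sketch of the two ways of
filling a malloc'd array. It is not the actual Ptr.hs; the names fillFromList
and fillByLoop are made up for illustration, and the loop uses a boxed Int
counter (which GHC will usually unbox under -O) rather than an explicit Int#.

```haskell
module Main (main) where

import Foreign.Marshal.Alloc (free)
import Foreign.Marshal.Array (mallocArray, peekArray, pokeArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (pokeElemOff)

-- Hypothetical list-based fill: the list is shared between 'length' and
-- 'pokeArray', so it is materialised on the heap and traversed twice.
fillFromList :: Int -> IO (Ptr Int)
fillFromList n = do
  let xs = [1 .. n]
  p <- mallocArray (length xs)  -- first traversal
  pokeArray p xs                -- second traversal
  return p

-- Hypothetical loop-based fill: no intermediate list, just a counter.
fillByLoop :: Int -> IO (Ptr Int)
fillByLoop n = do
  p <- mallocArray n
  let go i
        | i >= n    = return ()
        | otherwise = pokeElemOff p i (i + 1) >> go (i + 1)
  go 0
  return p

main :: IO ()
main = do
  let n = 10
  a <- fillFromList n
  b <- fillByLoop n
  xs <- peekArray n a
  ys <- peekArray n b
  print (xs == ys)  -- both arrays hold 1..n, so this prints True
  free a
  free b
```

With n = 1000000 the first version pays for the list cells in the +RTS
-sstderr allocation figure, while the second should allocate next to nothing
on the GHC heap.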

I'll attach new, hopefully bug-free versions of the benchmark to this message.

Without the list overhead, the ByteArr version appears to allocate much more
than Ptr for large arrays, because the n*w-byte array (n elements of w bytes
each) is counted in the heap allocation figure, whereas the malloc'd memory
is not. None of this should be a factor in the actual fannkuch benchmark,
which only allocates 3 arrays of size 11.
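
A rough way to see that accounting difference, assuming a reasonably recent
GHC (getAllocationCounter from System.Mem needs base >= 4.8, so this is not
something the 2008 benchmark could have used). The MBA wrapper and the
allocatedBy helper are names invented for this sketch, and the exact numbers
will vary with GHC version and platform.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main (main) where

import Data.Int (Int64)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (sizeOf)
import GHC.Exts (Int (I#), MutableByteArray#, RealWorld, newByteArray#)
import GHC.IO (IO (..))
import System.Mem (getAllocationCounter)

-- Hypothetical boxed wrapper so the unlifted array can be returned from IO.
data MBA = MBA (MutableByteArray# RealWorld)

-- Allocate a byte array of the given size in bytes on the GHC heap.
newMBA :: Int -> IO MBA
newMBA (I# n) = IO (\s -> case newByteArray# n s of
  (# s', arr #) -> (# s', MBA arr #))

-- Bytes the current thread allocated on the GHC heap while running the
-- action. The per-thread counter counts down, hence before - after.
allocatedBy :: IO a -> IO Int64
allocatedBy act = do
  before <- getAllocationCounter
  _ <- act
  after <- getAllocationCounter
  return (before - after)

main :: IO ()
main = do
  let n     = 1000000            -- number of elements
      w     = sizeOf (0 :: Int)  -- bytes per element
      bytes = n * w
  heapCount <- allocatedBy (newMBA bytes)
  cCount    <- allocatedBy (do
    p <- mallocBytes bytes :: IO (Ptr Int)
    free p)
  putStrLn ("newByteArray#: " ++ show heapCount ++ " bytes counted")
  putStrLn ("mallocBytes:   " ++ show cCount ++ " bytes counted")
```

I would expect the first figure to come out at roughly n*w bytes and the
second to be tiny by comparison, since malloc never touches the GHC heap.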

Cheers,
-- Dan