readArray is faster than unsafeRead

29 May 2007

Hello,

I'm writing some matrix multiplication and inversion functions for
small matrices (3x3 and 4x4 mostly, for 3D graphics, modeling,
simulation, etc.). I noticed that the matrix multiplication was a
bottleneck, so I set out to optimize it and found that using unsafeRead
instead of (!) (or readArray in stateful code) helped a lot. So then I
went to optimize my Gaussian elimination function and found just the
opposite: unsafeRead is slower than readArray. This struck me as very
odd considering that readArray calls unsafeRead.
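
As a minimal sketch of the kind of swap described above (this is not the
attached test program), the code below multiplies two small matrices stored
row-major in an STUArray, once with readArray and once with unsafeRead.
The names n, fromList, mulKernel, mulSafe and mulUnsafe are hypothetical
and chosen only for illustration. The sketch assumes Int indices with a
lower bound of 0, so the index given to readArray and the raw offset given
to unsafeRead coincide.

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (ST)
import Data.Array.Base (unsafeRead)
import Data.Array.ST (STUArray, newArray, newListArray, readArray,
                      runSTUArray, writeArray)
import Data.Array.Unboxed (elems)

-- Hypothetical matrix dimension (3x3), stored row-major in a flat array.
n :: Int
n = 3

-- Copy a list of n*n elements into a fresh mutable unboxed array.
fromList :: [Double] -> ST s (STUArray s Int Double)
fromList = newListArray (0, n * n - 1)

-- Multiplication kernel, parameterised over the element reader so the
-- same loop can run with readArray (bounds-checked) or unsafeRead.
-- Passing the reader as an argument keeps the sketch short; it is an
-- illustrative choice and may itself influence what the optimizer does.
mulKernel :: (STUArray s Int Double -> Int -> ST s Double)
          -> STUArray s Int Double
          -> STUArray s Int Double
          -> ST s (STUArray s Int Double)
mulKernel rd a b = do
  c <- newArray (0, n * n - 1) 0
  forM_ [0 .. n - 1] $ \i ->
    forM_ [0 .. n - 1] $ \j -> do
      -- c[i][j] = sum over k of a[i][k] * b[k][j]
      terms <- mapM (\k -> (*) <$> rd a (i * n + k) <*> rd b (k * n + j))
                    [0 .. n - 1]
      writeArray c (i * n + j) (sum terms)
  return c

-- Bounds-checked variant.
mulSafe :: [Double] -> [Double] -> [Double]
mulSafe xs ys = elems (runSTUArray (do
  a <- fromList xs
  b <- fromList ys
  mulKernel readArray a b))

-- Unchecked variant: unsafeRead takes a zero-based offset and skips
-- the bounds check.
mulUnsafe :: [Double] -> [Double] -> [Double]
mulUnsafe xs ys = elems (runSTUArray (do
  a <- fromList xs
  b <- fromList ys
  mulKernel unsafeRead a b))

main :: IO ()
main = do
  let a        = [1, 2, 3, 4, 5, 6, 7, 8, 9]
      identity = [1, 0, 0, 0, 1, 0, 0, 0, 1]
  print (mulSafe a identity == a)            -- A * I compared with A
  print (mulSafe a a == mulUnsafe a a)       -- both readers should agree
```

The two variants differ only in the reader, so any timing difference
between them comes down to how the compiler treats that one call.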

If there is a "good" reason why the compiler optimized readArray
better than unsafeRead, I'd like to know what it is so that I can make
all my array code safe as well as fast. (By "good" reason I mean
something deterministic and repeatable, not just luck.)

On the other hand, if this is a fluke, I'm inclined to think that it's
not the safe code which is freakishly fast, but the unsafe code which
is needlessly slow. That is, something about my program is hindering
optimization of the unsafe code. What is it?

Attached are the profiling results and a test program with a handful of
matrix multiplication and Gaussian elimination functions to illustrate
what I've seen. This happens on both AMD64 and Intel Core
architectures.

Thanks for any insight,
Scott

Scott Dillard