
On Fri, 27 Jun 2003 09:36:36 +0100
"Simon Peyton-Jones"
Several comments
1. In Haskell "mod" is a lot more expensive than "rem", because it involves careful jiggery pokery with the sign of the result. That's why your boxed version was slower (nothing to do with boxing).
Ah. I guess I should've checked the core closer (or more than a quick glimpse) before making wild assumptions. The (relatively) few other times I've unboxed by hand, I've found GHC had already done it. I did think mod might have been a problem, but I didn't expect it to be -that- much of a difference, hence assuming it was boxing related (I didn't see a modInt# so I figured remInt# was more or less it, it didn't know how much less).
2. GHC does indeed optimise away the array access if you aren't careful, and does in Derek's code. Below is a version that avoids doing so, in the same way as the C version, by returning one of the values.
Yeah, I was having a very hard time believing it, as the code was comparable to gcc optimising out the array access (which took ~.6 seconds on my computer).
3. Derek correctly points out that gcc does not do array-bound checks. GHC should eliminate them, but it's a non-trivial analysis and GHC does not currently attempt it. You can always use unsafeAt, as Derek does.
I was thinking that that would be nice, but was not surprised that it wasn't there. Are there any concoctions of flags/pragmas/techniques to coax GHC into unrolling a function (or rather an IO computation in this case) a couple of iterations. There was another (admittedly also noddy) program that I could get to near gcc, but only by unrolling by hand (though Ian's TH stuff likely could have done it).