
On Sat, Jun 23, 2007 at 07:07:49PM +0100, Philip Armstrong wrote:
On Sat, Jun 23, 2007 at 12:05:01PM +0100, Claus Reinke wrote:
try making ray_sphere and intersect' local to intersect, then drop their constant ray parameter. saves me 25%. claus
also try replacing that (foldl' intersect') with (foldr (flip intersect'))!
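For readers without the source to hand, the first suggestion (making the helpers local so they close over the ray instead of taking it as an argument) might look roughly like the sketch below. The types `Vec`, `Ray` and `Sphere` and the names `raySphereTop`, `nearestTop` and `nearestLocal` are hypothetical stand-ins, not the ray tracer's actual definitions, and the scene is flattened to a plain list of spheres.

```haskell
import Data.List (foldl')

-- Hypothetical, simplified types; the real ray tracer's differ.
data Vec = Vec !Double !Double !Double
data Ray = Ray !Vec !Vec          -- origin, unit direction
data Sphere = Sphere !Vec !Double -- centre, radius

sub :: Vec -> Vec -> Vec
sub (Vec a b c) (Vec x y z) = Vec (a - x) (b - y) (c - z)

dot :: Vec -> Vec -> Double
dot (Vec a b c) (Vec x y z) = a * x + b * y + c * z

infinity :: Double
infinity = 1 / 0

-- Before: a top-level helper that receives the ray on every call.
raySphereTop :: Ray -> Sphere -> Double
raySphereTop (Ray o d) (Sphere c r)
  | disc < 0  = infinity   -- ray misses the sphere
  | t2 < 0    = infinity   -- sphere lies behind the ray
  | t1 > 0    = t1         -- nearer intersection is in front
  | otherwise = t2         -- ray starts inside the sphere
  where
    v    = sub c o
    b    = dot v d
    disc = b * b - dot v v + r * r
    s    = sqrt disc
    t1   = b - s
    t2   = b + s

nearestTop :: Ray -> [Sphere] -> Double
nearestTop ray = foldl' step infinity
  where
    step best sph = min best (raySphereTop ray sph)

-- After: the helper is local, so the ray's origin and direction are
-- captured once and no longer passed as a parameter.
nearestLocal :: Ray -> [Sphere] -> Double
nearestLocal (Ray o d) = foldl' step infinity
  where
    step best sph = min best (raySphere sph)
    raySphere (Sphere c r)
      | disc < 0  = infinity
      | t2 < 0    = infinity
      | t1 > 0    = t1
      | otherwise = t2
      where
        v    = sub c o
        b    = dot v d
        disc = b * b - dot v v + r * r
        s    = sqrt disc
        t1   = b - s
        t2   = b + s
```

Both versions compute the same distances; whether the local form is actually faster depends on the compiler version and flags, so the 25% figure above should be read as one measurement rather than a general rule.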
Thanks guys, this is exactly the kind of advice I was seeking.
OK, next question: Given that I'm using all the results from intersect', why is the lazy version better than the strict one? Is ghc managing to do some loop fusion?
Incidentally, replacing foldl' (intersect ray) hit scene with foldr (flip (intersect ray)) hit scene makes the current version (without the lifting of ray out of intersect & ray_scene) almost exactly as fast as the OCaml version on my hardware. That's almost a 40% speedup!
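A minimal sketch of that swap, using a hypothetical step function `closer` in place of `intersect ray` and a list of distances in place of the scene. It only illustrates that the two folds visit the list in opposite orders yet agree whenever the step function does not care about order; it says nothing about why one compiles to faster code.

```haskell
import Data.List (foldl')

-- Hypothetical stand-in for (intersect ray): keep the smaller distance.
closer :: Double -> Double -> Double
closer best d = if d < best then d else best

-- Strict left fold, as in the original code.
nearestL :: [Double] -> Double
nearestL = foldl' closer (1 / 0)

-- Right fold with the arguments flipped, as suggested in the thread.
nearestR :: [Double] -> Double
nearestR = foldr (flip closer) (1 / 0)

-- Record the order in which each fold feeds elements to the step function.
visit :: [Int] -> Int -> [Int]
visit seen x = seen ++ [x]

orderL :: [Int] -> [Int]
orderL = foldl' visit []

orderR :: [Int] -> [Int]
orderR = foldr (flip visit) []
```

Here `orderL [1,2,3]` gives `[1,2,3]` while `orderR [1,2,3]` gives `[3,2,1]`, so the two folds are interchangeable only when the combining step is insensitive to order (as taking a minimum is, ties aside).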
using a recent ghc head instead of ghc-6.6.1 also seems to make a drastic difference (wild guess, seeing the unroll 1000 for ocaml: has there been a change to default unrolling in ghc?).
Um. I tried ghc head on the current version and it was about 15% *slower* than 6.6.1.
Perhaps it does better on the (slightly) optimised version?
Nope, just tried it on the foldr version. It's still slower than 6.6.1 (although not by as much: 26s vs 24s for the 6.6.1 binary). This is ghc head from this Friday.

Jon is probably correct that hoisting ray out is a code-level transformation that makes the Haskell version different from the OCaml one, but personally I'd suggest that replacing foldl with foldr is not: the end result is the same, and both have to walk the entire list in order to get a result.

So, on my hardware at least, the Haskell and OCaml versions have equivalent performance. I think that's pretty impressive. Getting close to the C++ would probably require rather more effort!

Phil

--
http://www.kantaka.co.uk/ .oOo. public key: http://www.kantaka.co.uk/gpg.txt