
2009/2/24 Claus Reinke
In the recently burried haskell-cafe thread "speed: ghc vs gcc", Bulat pointed out some of the optimizations that GHC doesn't do, such as loop unrolling. I suggested a way of experimenting with loop unrolling, using template haskell to bypass GHC's blindspot (it usually doesn't unfold recursive definitions http://www.haskell.org/pipermail/glasgow-haskell-users/2007-July/012936.html , but if we unfold a loop combinator at compile time, GHC's normal optimizations can take over from there):
http://www.haskell.org/pipermail/haskell-cafe/2009-February/056241.html
Just a note - there is a solution that doesn't require Template Haskell which I use in my own code. Here is a sketch: fact = fix4 fact_worker {-# INLINE fact_worker #-} fact_worker recurse n | n <= 0 = 1 | otherwise = n * recurse (n - 1) {-# INLINE fix4 #-} fix4 f = f1 where f1 = f f2 f2 = f f3 f3 = f f4 f4 = f f1 There is probably a way to generalise this to arbitrary levels of unrolling by using instances of a typeclass on type level numerals. Cheers, Max