
2009/3/1 Claus Reinke
It might be useful to point out that the interaction goes both ways. Not only are fused loops candidates for unrolling, but unrolling can also enable fusion, giving one example of why Core-level unrolling (in addition to backend-level loop restructuring) would be useful.
Yes - this is why my use of a kind of unrolling fixes concatMap for streams, because GHC is able to specialise the "unrolled" function body on a particular lambda abstraction. However, this is really a somewhat seperate issue than plain unrolling, as we just want to be able to /specialise/ recursive functions on particular arguments rather than reduce loop overhead / reassociate arithmetic over several iterations. This is why the static argument transformation is such a big win (as I've mentioned before, 12% decrease in nofib runtime if you use it) - because it finds instances of recursive definitions where it's a REALLY GOOD idea to specialise on a particular argument (since that argument is actually /invariant/) and gives GHC the opportunity to specialise on it by creating a nonrecursive wrapper around the recursive worker loop. In general, the compiler wants to be able to determine the structure of the argument of a loop body in a more fine grained way than just "invariant vs non-invariant" as SAT does. A particularly tempting example of an optimisation you could do if we dealt with recursive functions better is strength reduction. This is part of what I'm looking at implementing for GHC currently. Cheers, Max