
On May 18, 2009, at 20:54, Claus Reinke wrote:
As I said, I don't get the fusion if I just add the function above to the original Dist.hs, export it and compile the module with '-c -O2 -ddump-simpl': I can't reproduce this.
Interesting. I'm using ghc 6.11.20090320 (windows), uvector-0.1.0.3. I attach the modified Dist.hs and its simpl output, created via:
ghc -c Dist.hs -O2 -ddump-tc -ddump-simpl-stats -ddump-simpl > Dist.dumps
Perhaps others can confirm the effect? Note that the 'dist_fast' in the same module does get fused, so it is not likely an options issue. I still suspect that the inlining of the 'Dist.zipWith' wrapper in the 'dist_fast_inlined' '__inline_me' has some significance - it is odd to see inlined code in an '__inline_me' and the fusion rule won't trigger on 'Dist.sumU . Dist.$wzipWithU', right?
As far as I can tell, the dist_fast_inlined doesn't get fused, i.e. I'm seeing zipWithU and sumU being used in it, which is not the case in dist_fast. This is on OS X/PowerPC, using GHC 6.10.1.
Does the complete program fragment I posted earlier yield the desired result?
Yes. Note that the original poster also reported slowdown from use of 'dist_fast_inlined'.
Don, you were defining dist inside the main module, while in our case the dist functions are defined in a separate Dist.hs module... Would that matter?

K.

--
Kenneth Hoste
Paris research group - ELIS - Ghent University, Belgium
email: kenneth.hoste@elis.ugent.be
website: http://www.elis.ugent.be/~kehoste
blog: http://boegel.kejo.be