
On Tuesday, 15 December 2009, 21:43:46, Johann Höchtl wrote:
Please explain to me, as a beginner, why there _is_ a difference:
1. Does len (x:xs) l = l `seq` len xs (l+1) vs. len xs $! (l+1) expand into something different?
Yes. How different depends on the optimisation level and the compiler version.

Without optimisations, seq is inlined, giving one self-contained loop, while ($!) is not inlined, so every iteration of the loop makes a call to ($!) - bad for performance. In that setting, ghc-6.10.3 and ghc-6.12.1 produce nearly identical core for each of the functions.

With optimisations (-O2), both functions get compiled into a self-contained loop. 6.12.1 produces nearly identical core for the two functions; the core for the second contains one case-expression that the core for the first doesn't, but they should produce the same assembly. One improvement over the unoptimised core is that plusInteger is called instead of GHC.Num.+.

6.10.3 produces very different core with -O2. The core for the second variant is close to what 6.12.1 produces; I don't know enough about core to see how that would influence performance. For the first variant, 6.10.3 produces different core, with special casing for small and large Integers, which proves to be more efficient. Again, I'm not specialist enough to know a) why it produces such different core, b) why that core is so much faster.
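In case you want to reproduce this yourself, here is a minimal, self-contained sketch of the two variants as I read them from the thread (the names lenSeq and lenBang are mine); compiling it with and without -O2 and looking at the -ddump-simpl output shows the inlining difference described above:

    module Main (main) where

    -- Variant 1: force the old accumulator with seq, pass (l+1) as a thunk.
    lenSeq :: [a] -> Integer -> Integer
    lenSeq []     l = l
    lenSeq (_:xs) l = l `seq` lenSeq xs (l + 1)

    -- Variant 2: force the new accumulator with ($!) before the recursive call.
    lenBang :: [a] -> Integer -> Integer
    lenBang []     l = l
    lenBang (_:xs) l = lenBang xs $! (l + 1)

    main :: IO ()
    main = do
        print (lenSeq  [1 .. 1000000 :: Int] 0)
        print (lenBang [1 .. 1000000 :: Int] 0)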
2. Do I understand correctly that the first expression "should" actually be slower, but (for whatever reason) isn't in the unoptimised case?
No. In principle, with

    len (x:xs) l = l `seq` len xs (l+1)

the evaluation of l lags one step behind, so you'd have the reduction

    len [1,2,3] 0 ~> len [2,3] (0+1) ~> len [3] (1+1) ~> len [] (2+1) ~> (2+1) ~> 3

while

    len (x:xs) l = let l' = l+1 in l' `seq` len xs l'

gives

    len [1,2,3] 0 ~> len [2,3] 1 ~> len [3] 2 ~> len [] 3 ~> 3

but that difference isn't measurable (if they produce different machine instructions at all, the difference is at most a handful of clock cycles). *If* len xs $! (l+1) were expanded into the latter, both would be - for all practical purposes at least - equally fast.
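To make the connection explicit: ($!) is, if I remember correctly, defined in the Prelude as f $! x = x `seq` f x, so len xs $! (l+1) corresponds semantically to the second shape above. A self-contained sketch of that shape (the name lenNow is mine):

    -- Force the new accumulator before the recursive call,
    -- so evaluation does not lag one step behind.
    lenNow :: [a] -> Integer -> Integer
    lenNow []     l = l
    lenNow (_:xs) l = let l' = l + 1 in l' `seq` lenNow xs l'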
3. The function is annotated with Integer. Why does Int suddenly matter?
Thomas DuBuisson tried Int to investigate. It's always interesting to see what changes when you change the types.
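For completeness, the Int-annotated variant would look like the sketch below (I don't know exactly what Thomas DuBuisson ran, so take the details as a guess). With Int, GHC's strictness analysis can unbox the accumulator under -O, which is one reason changing the type can change the picture:

    -- Same accumulator loop, but with Int instead of Integer
    -- (my guess at the variant tried in the thread).
    lenInt :: [a] -> Int -> Int
    lenInt []     l = l
    lenInt (_:xs) l = l `seq` lenInt xs (l + 1)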
(4. When optimizing is switched on, the second expression executes faster; as such I assume that there is a difference between these two statements.)
Not here. With 6.12.1 and -O(2), both are equally fast; with 6.10.3, the first is faster. I would rather expect 6.10.4 to behave more like 6.10.3. It may, of course, be a hardware/OS issue which code is faster. Could you run

    ghc-6.10.4 -O2 -fforce-recomp -ddump-simpl --make Whatever.hs > Whatever.core

so I can see what core that produces?
Thank you!