
G'day all.
Quoting Simon Peyton-Jones:
So the best way to transform f depends on how it is used. When it's used locally and just once, GHC inlines it at the call site and all is good. But when it's exported or called many times, GHC never "floats" a let *between* two lambdas. So it won't transform f into f_opt. On the other hand, if you write f_opt, GHC will keep it that way.
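
For anyone who missed the earlier message, here is a minimal sketch of the kind of pair being discussed. The original definitions of f and f_opt are not reproduced in this mail, so the bodies below, and the helper "expensive", are assumptions made up for illustration. The point is only where the let sits relative to the two lambdas.

```haskell
module Main where

import Data.List (foldl')

-- Hypothetical stand-in for some costly computation that depends only on x.
expensive :: Int -> Int
expensive x = foldl' (+) 0 [1 .. x]

-- The let sits under both lambdas. Under plain lazy evaluation, a partial
-- application (f x) recomputes "expensive x" each time it is given a y.
f :: Int -> Int -> Int
f x y = let r = expensive x in r + y

-- The let sits between the two lambdas. The closure returned by (f_opt x)
-- shares a single thunk for r across all later applications to a y.
f_opt :: Int -> Int -> Int
f_opt x = let r = expensive x in \y -> r + y

main :: IO ()
main = do
  let g = f_opt 10      -- r is shared by every call to g
  print (map g [1, 2, 3])
```

Both functions return the same results. The difference is only in how often "expensive x" may be evaluated when the partial application is reused, which is the sharing that the quoted transformation would (or would not) introduce.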

While this reasoning makes perfect sense, it does tend to violate the principle of least surprise. I would expect a Haskell implementation either to provide full laziness or not (possibly with a compiler switch). This looks more like a quirk of the STG Core. An intermediate language which treated each lambda separately would have far fewer qualms about applying let-floating in this case. Actually, it might be mildly amusing to see if Gofer runs this code faster than Hugs or GHCi.

(As an aside, this issue bit me not so long ago. I was trying to unroll a recursive function at run-time. It took quite a bit of eta-conversion to get it right, and it was only in reading this mail that I finally worked out what was going wrong.)

Perhaps Haskell' might like to look into this. H98 demands laziness, not full laziness, but it seems to me that this is exactly the sort of thing that a programmer might unconsciously rely on, and that then becomes a hard-to-track-down performance bug when switching implementations.

Cheers,
Andrew Bromage