
Suppose I want a foldl which is evaluated many times on the same list but with different folding functions. I would write something like this, to perform pattern-matching on the list only once: module F where myFoldl :: [a] -> (b -> a -> b) -> b -> b myFoldl [] = \f a -> a myFoldl (x:xs) = let y = myFoldl xs in \f a -> y f (f a x) However, for some reason GHC eta-expands it back. Here's what I see in the core: % ghc -O2 -ddump-simpl -fforce-recomp -dsuppress-module-prefixes \ -dsuppress-uniques -dsuppress-coercions F.hs ==================== Tidy Core ==================== Rec { myFoldl [Occ=LoopBreaker] :: forall a b. [a] -> (b -> a -> b) -> b -> b [GblId, Arity=3, Caf=NoCafRefs, Str=DmdType SLL] myFoldl = \ (@ a) (@ b) (ds :: [a]) (eta :: b -> a -> b) (eta1 :: b) -> case ds of _ { [] -> eta1; : x xs -> myFoldl @ a @ b xs eta (eta eta1 x) } end Rec } Why does it happen and can it be suppressed? This is GHC 7.0.4. -- Roman I. Cheplyaka :: http://ro-che.info/

Combining lambdas makes a big difference in GHC. For example f = \x. let y = E in \z. BODY The function f takes one argument, and returns a heap-allocated lambda. If E is cheap (say just a constructor) it might well be more efficient to transform to f = \xz. let y = E in BODY Pattern matching is another example, and GHC indeed eta expands through that by default, if it makes two lambdas into one. To switch it off try -fno-do-lambda-eta-expansion. Simon | -----Original Message----- | From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users- | bounces@haskell.org] On Behalf Of Roman Cheplyaka | Sent: 04 October 2011 07:40 | To: glasgow-haskell-users@haskell.org | Subject: Unwanted eta-expansion | | Suppose I want a foldl which is evaluated many times on the same | list but with different folding functions. | | I would write something like this, to perform pattern-matching on the | list only once: | | module F where | myFoldl :: [a] -> (b -> a -> b) -> b -> b | myFoldl [] = \f a -> a | myFoldl (x:xs) = let y = myFoldl xs in \f a -> y f (f a x) | | However, for some reason GHC eta-expands it back. Here's what I see in | the core: | | % ghc -O2 -ddump-simpl -fforce-recomp -dsuppress-module-prefixes \ | -dsuppress-uniques -dsuppress-coercions F.hs | | ==================== Tidy Core ==================== | Rec { | myFoldl [Occ=LoopBreaker] | :: forall a b. [a] -> (b -> a -> b) -> b -> b | [GblId, Arity=3, Caf=NoCafRefs, Str=DmdType SLL] | myFoldl = | \ (@ a) (@ b) (ds :: [a]) (eta :: b -> a -> b) (eta1 :: b) -> | case ds of _ { | [] -> eta1; : x xs -> myFoldl @ a @ b xs eta (eta eta1 x) | } | end Rec } | | Why does it happen and can it be suppressed? | | This is GHC 7.0.4. | | | -- | Roman I. Cheplyaka :: http://ro-che.info/ | | _______________________________________________ | Glasgow-haskell-users mailing list | Glasgow-haskell-users@haskell.org | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

On Tue, Oct 4, 2011 at 2:39 AM, Roman Cheplyaka
Suppose I want a foldl which is evaluated many times on the same list but with different folding functions.
I would write something like this, to perform pattern-matching on the list only once:
module F where myFoldl :: [a] -> (b -> a -> b) -> b -> b myFoldl [] = \f a -> a myFoldl (x:xs) = let y = myFoldl xs in \f a -> y f (f a x)
However, for some reason GHC eta-expands it back.
It seems to be a common misconception that eta-abstracting your functions in this way will speed up or otherwise improve your code. Simon PJ has already provided a good explanation of why GHC eta expands. Let me take another tack and describe why the code you wrote without eta expansion probably doesn't provide you with any actual benefit. Roughly speaking, you're creating a chain of closures whose contents exactly describe the contents of your list (ie you've created something that's isomorphic to your original list structure), and so you should expect no benefit at all. Let's consider a specific concrete call to myFoldl: myFoldl (1:2:3:4:[]) = r1 where rt = \ f a -> a x4 = 4 r4 = \ f a -> rt f (f a x4) x3 = 3 r3 = \ f a -> r4 f (f a x3) x2 = 2 r2 = \ f a -> r3 f (f a x2) x1 = 1 r1 = \ f a -> r2 f (f a x1) Each of the bindings above is a separate heap-allocated object. Now consider the data representation for the function closures r1..r4. Each such closure has two free variables: the previous closure (r2..r4 or rt) and the value of the current list element (x1..x4). If you write that out schematically in box-and-pointer notation, you'll quickly see that the result has the exact same memory structure as your original list. Now, you'll probably want to point out that transforming your list into a string of closures like this eliminates the need to pattern match on the list structure. This is true, but that's because you've replaced that pattern match with a call to an unknown closure. That's because in most circumstances we'll get just one copy of the code for r1..r4. GHC will actually generate code a bit like the following [*]: myFoldl (1:2:3:4:[]) = r1 where closure x r = \ f a -> r f (f a x) rt = \ f a -> a x4 = 4 r4 = closure x4 rt x3 = 3 r3 = closure x3 r4 x2 = 2 r2 = closure x2 r3 x1 = 1 r1 = closure x1 r2 So there are two different kinds of closure that get passed to r in closure x r: 1) rt 2) a call to closure xn r(n+1) Distinguishing these two cases (even if just by branching to a closure code pointer) leads to overheads comparable to (and generally larger than) those of pattern matching. GHC used to distinguish (:) and [] by branching to an unknown function pointer (exactly as is happening here) and switched to pointer tagging instead because it was faster. All this assumes everything has been completely evaluated already. Laziness drowns out most of the relative advantages and disadvantages here (and there's an even-more-involved explanation of why you might lose strictness information in your eta-abstracted function, making things worse still). It also assumes you are not able to specialize your code to a particular higher-order function. Any time you can do that, it's potentially very beneficial. For example, the following *might* be a worthwhile definition of foldl: {-# INLINE foldl #-} foldl f = loop where loop a [] = a loop a (x:xs) = loop (f a x) xs -Jan-Willem Maessen [*] Note that GHC actually treats the free variables of the closure (x and r in this case) a little bit specially, so the code isn't necessarily *literally* equivalent to what I've shown here, but it's pretty close.

* Jan-Willem Maessen
It seems to be a common misconception that eta-abstracting your functions in this way will speed up or otherwise improve your code.
Simon PJ has already provided a good explanation of why GHC eta expands. Let me take another tack and describe why the code you wrote without eta expansion probably doesn't provide you with any actual benefit. Roughly speaking, you're creating a chain of closures whose contents exactly describe the contents of your list (ie you've created something that's isomorphic to your original list structure), and so you should expect no benefit at all.
Thanks for the analysis. I used myFoldl as a minimal example to ask the question. Here's an example of real code where this does make a difference: https://github.com/feuerbach/regex-applicative/tree/03ca9c852f06bf9a9d515056... You can run the benchmark (on a POSIX system) using cd benchmark && ./runbenchmark.sh With the current version I get 2.62 seconds. If I remove -fno-do-lambda-eta-expansion for Text/Regex/Applicative/Compile.hs, the benchmark takes 2.74 seconds. What's even more interesting (and puzzling!), if remove -fno-do-lambda-eta-expansion for Text/Regex/Applicative/Types.hs, the benchmark takes 2.82 seconds. I appreciate any thoughts about this. -- Roman I. Cheplyaka :: http://ro-che.info/

On Sun, Oct 9, 2011 at 10:54 AM, Roman Cheplyaka
* Jan-Willem Maessen
[2011-10-08 12:32:18-0400] It seems to be a common misconception that eta-abstracting your functions in this way will speed up or otherwise improve your code.
Simon PJ has already provided a good explanation of why GHC eta expands. Let me take another tack and describe why the code you wrote without eta expansion probably doesn't provide you with any actual benefit. Roughly speaking, you're creating a chain of closures whose contents exactly describe the contents of your list (ie you've created something that's isomorphic to your original list structure), and so you should expect no benefit at all.
Thanks for the analysis.
I used myFoldl as a minimal example to ask the question.
Here's an example of real code where this does make a difference: https://github.com/feuerbach/regex-applicative/tree/03ca9c852f06bf9a9d515056...
Ah, now things get more complicated! :-) I suspect here you're actually entering the regexp closures, and compiling it down is actually saving you a teensy bit of interpretive overhead.
... What's even more interesting (and puzzling!), if remove -fno-do-lambda-eta-expansion for Text/Regex/Applicative/Types.hs, the benchmark takes 2.82 seconds.
That *Is* odd. The only obvious code generated here would be the newtype instances, for which this should surely be irrelevant? Can someone at GHC HQ explain this one? -Jan

Roman, Jan-Willem
I'm maxed out at the moment, and will be so for at least a week.
If you think there is something mysterious and J-W agrees, could you create a ticket, with the smallest example you can, and instructions to reproduce? That'd be brilliant.
Of course, Jan-Willem, if you have a moment to diagnose a bit more about WHY it's going slower that would be even better!
Thanks
Simon
| -----Original Message-----
| From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users-
| bounces@haskell.org] On Behalf Of Jan-Willem Maessen
| Sent: 10 October 2011 02:51
| To: Roman Cheplyaka
| Cc: glasgow-haskell-users@haskell.org
| Subject: Re: Unwanted eta-expansion
|
| On Sun, Oct 9, 2011 at 10:54 AM, Roman Cheplyaka

The only difference in generated code is in the showsPrec function for a
newtype-wrapped Int (ThreadId).
The code actually never (explicitly) uses Show instance -- it's only
needed as a dependency for Num instance.
Could it be some cache effect?
I'll see if I can reduce that code and then will file a bug.
* Simon Peyton-Jones
Roman, Jan-Willem
I'm maxed out at the moment, and will be so for at least a week.
If you think there is something mysterious and J-W agrees, could you create a ticket, with the smallest example you can, and instructions to reproduce? That'd be brilliant.
Of course, Jan-Willem, if you have a moment to diagnose a bit more about WHY it's going slower that would be even better!
Thanks
Simon
| -----Original Message----- | From: glasgow-haskell-users-bounces@haskell.org [mailto:glasgow-haskell-users- | bounces@haskell.org] On Behalf Of Jan-Willem Maessen | Sent: 10 October 2011 02:51 | To: Roman Cheplyaka | Cc: glasgow-haskell-users@haskell.org | Subject: Re: Unwanted eta-expansion | | On Sun, Oct 9, 2011 at 10:54 AM, Roman Cheplyaka
wrote: | > * Jan-Willem Maessen [2011-10-08 12:32:18-0400] | >> It seems to be a common misconception that eta-abstracting your | >> functions in this way will speed up or otherwise improve your code. | >> | >> Simon PJ has already provided a good explanation of why GHC eta | >> expands. Let me take another tack and describe why the code you wrote | >> without eta expansion probably doesn't provide you with any actual | >> benefit. Roughly speaking, you're creating a chain of closures whose | >> contents exactly describe the contents of your list (ie you've created | >> something that's isomorphic to your original list structure), and so | >> you should expect no benefit at all. | > | > Thanks for the analysis. | > | > I used myFoldl as a minimal example to ask the question. | > | > Here's an example of real code where this does make a difference: | > https://github.com/feuerbach/regex- | applicative/tree/03ca9c852f06bf9a9d51505640b8b72f07291c7d | | Ah, now things get more complicated! :-) I suspect here you're | actually entering the regexp closures, and compiling it down is | actually saving you a teensy bit of interpretive overhead. | | > ... | > What's even more interesting (and puzzling!), if remove | > -fno-do-lambda-eta-expansion for Text/Regex/Applicative/Types.hs, | > the benchmark takes 2.82 seconds. | | That *Is* odd. The only obvious code generated here would be the | newtype instances, for which this should surely be irrelevant? Can | someone at GHC HQ explain this one? | | -Jan | | _______________________________________________ | Glasgow-haskell-users mailing list | Glasgow-haskell-users@haskell.org | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
-- Roman I. Cheplyaka :: http://ro-che.info/

Hi, Am Dienstag, den 04.10.2011, 09:39 +0300 schrieb Roman Cheplyaka:
Suppose I want a foldl which is evaluated many times on the same list but with different folding functions.
I used this pattern successfully in SAT-Britney, where I generate a huge list quite quickly, and I don’t want this list to stay in memory. I had to pay a lot of attention to sharing, e.g. by making sure the parameters to the function that generate the left fold are only passed when the folding functions are also given, see in http://git.nomeata.de/?p=sat-britney.git;a=commitdiff;h=e8a1eea156b76d76729a... the changes to TransRules.hs. Otherwise, I’d get a huge thunk of closures representing the list, as Jan-Willem predicted. Note that I want to achieve something differently than you, unless I am mistaken: In my case, I want to make sure the list can be fused or at least immediately garbage-collected with every use, even if the code that calculates the list has to be run multiple times. Greetings, Joachim -- Joachim "nomeata" Breitner mail@joachim-breitner.de | nomeata@debian.org | GPG: 0x4743206C xmpp: nomeata@joachim-breitner.de | http://www.joachim-breitner.de/
participants (4)
-
Jan-Willem Maessen
-
Joachim Breitner
-
Roman Cheplyaka
-
Simon Peyton-Jones