Eta-expansion destroys memoization?

Hi all,
See below for this message from one of my students which has me
stumped. Just when you think you understand Haskell... ;)
I've cc'ed him on this message; please include him on any replies as I
don't think he is subscribed to -cafe.
-Brent
----- Forwarded message from Yue Wang

On Thu, Oct 7, 2010 at 6:17 AM, Brent Yorgey
The source code seems to be easy to read, but I don't think I understand that. For me I think if I change the first line from fib = ((map fib' [0 ..]) !!) to fib x = ((map fib' [0 ..]) !!) x It should do the same thing since I think the previous version is just an abbreviation for the second one.
Semantically, yes. And it's possible that ghc -O is clever enough to notice that. But at least under ghci's naive evaluation strategy, lambdas determine the lifetime of expressions. Any expression within a lambda will be re-evaluated each time the lambda is expanded. Thus: fib = (map fib' [0..] !!) -- fast fib = \x -> map fib' [0..] !! x -- slow fib = let memo = map fib' [0..] in \x -> memo !! x -- fast The section works because "(a %^&)" (for some operator %^&) is short for "(%^&) a" and "(%^& a)" is short for "flip (%^&) a". Sections don't expand into lambdas. In other words, in the middle expression, there is a new "map fib' [0..]" for each x, whereas in the others, it is shared between invocations. Does that make sense? Luke

On Thu, Oct 7, 2010 at 8:44 AM, Luke Palmer
On Thu, Oct 7, 2010 at 6:17 AM, Brent Yorgey
wrote: The source code seems to be easy to read, but I don't think I understand that. For me I think if I change the first line from fib = ((map fib' [0 ..]) !!) to fib x = ((map fib' [0 ..]) !!) x It should do the same thing since I think the previous version is just an abbreviation for the second one.
Semantically, yes. And it's possible that ghc -O is clever enough to notice that. But at least under ghci's naive evaluation strategy, lambdas determine the lifetime of expressions. Any expression within a lambda will be re-evaluated each time the lambda is expanded. Thus:
fib = (map fib' [0..] !!) -- fast fib = \x -> map fib' [0..] !! x -- slow fib = let memo = map fib' [0..] in \x -> memo !! x -- fast
The section works because "(a %^&)" (for some operator %^&) is short for "(%^&) a" and "(%^& a)" is short for "flip (%^&) a". Sections don't expand into lambdas.
In other words, in the middle expression, there is a new "map fib' [0..]" for each x, whereas in the others, it is shared between invocations.
Does that make sense?
In general, f is not semantically equivalent to \x -> f x in Haskell. However, that is not what Brent said. The Report -defines- (m !!) as \x -> m !! x. GHC simply does not follow the Report here. You can witness this via: (() `undefined`) `seq` 0. By the Report this should evaluate to 0, in GHC it evaluates to undefined. As for the rest... The operational behavior of the above is implementation dependent, but GHC, and I imagine most implementations, more or less do the natural thing. The Report gives no way to control sharing behavior, but being able to control it is rather important, so predictable behavior here is desirable.

On 07/10/2010 14:03, Derek Elkins wrote:
On Thu, Oct 7, 2010 at 8:44 AM, Luke Palmer
wrote: On Thu, Oct 7, 2010 at 6:17 AM, Brent Yorgey
wrote: The source code seems to be easy to read, but I don't think I understand that. For me I think if I change the first line from fib = ((map fib' [0 ..]) !!) to fib x = ((map fib' [0 ..]) !!) x It should do the same thing since I think the previous version is just an abbreviation for the second one.
Semantically, yes. And it's possible that ghc -O is clever enough to notice that. But at least under ghci's naive evaluation strategy, lambdas determine the lifetime of expressions. Any expression within a lambda will be re-evaluated each time the lambda is expanded. Thus:
fib = (map fib' [0..] !!) -- fast fib = \x -> map fib' [0..] !! x -- slow fib = let memo = map fib' [0..] in \x -> memo !! x -- fast
The section works because "(a %^&)" (for some operator %^&) is short for "(%^&) a" and "(%^& a)" is short for "flip (%^&) a". Sections don't expand into lambdas.
In other words, in the middle expression, there is a new "map fib' [0..]" for each x, whereas in the others, it is shared between invocations.
Does that make sense?
In general, f is not semantically equivalent to \x -> f x in Haskell. However, that is not what Brent said. The Report -defines- (m !!) as \x -> m !! x. GHC simply does not follow the Report here. You can witness this via: (() `undefined`) `seq` 0. By the Report this should evaluate to 0, in GHC it evaluates to undefined.
Interesting. You're absolutely right, GHC doesn't respect the report, on something as basic as sections! The translation we use is (e op) ==> (op) e once upon a time, when the translation in the report was originally written (before seq was added) this would have been exactly identical to \x -> e op x, so the definition in the report was probably used for consistency with left sections. We could make GHC respect the report, but we'd have to use (e op) ==> let z = e in \x -> z op x to retain sharing without relying on full laziness. This might be a good idea in fact - all other things being equal, having lambdas be more visible to the compiler is a good thing. Cheers, Simon

On 8 October 2010 10:54, Simon Marlow
We could make GHC respect the report, but we'd have to use
(e op) ==> let z = e in \x -> z op x
to retain sharing without relying on full laziness.
This might be a good idea in fact - all other things being equal, having lambdas be more visible to the compiler is a good thing.
Of course, this change could cause a performance regression to existing code. Personally, I think that this is a report bug: there is no syntactic lambda in (`op` e) or (`op` e) so I would expect the evaluation of e to be shared. Max

Simon Marlow wrote:
Interesting. You're absolutely right, GHC doesn't respect the report, on something as basic as sections! The translation we use is
(e op) ==> (op) e
once upon a time, when the translation in the report was originally written (before seq was added) this would have been exactly identical to \x -> e op x, so the definition in the report was probably used for consistency with left sections.
We could make GHC respect the report, but we'd have to use
(e op) ==> let z = e in \x -> z op x
to retain sharing without relying on full laziness.
We should keep in mind that this was changed deliberately in ghc 6.6, in order to support "postfix" operators. http://www.haskell.org/ghc/docs/6.6/html/users_guide/release-6-6.html The motivating example was the factorial operator which can currently be written as (n !) in ghc-Haskell. Cheers, Bertram

On Tue, Oct 12, 2010 at 4:34 AM, Bertram Felgenhauer
Simon Marlow wrote:
Interesting. You're absolutely right, GHC doesn't respect the report, on something as basic as sections! The translation we use is
(e op) ==> (op) e
once upon a time, when the translation in the report was originally written (before seq was added) this would have been exactly identical to \x -> e op x, so the definition in the report was probably used for consistency with left sections.
We could make GHC respect the report, but we'd have to use
(e op) ==> let z = e in \x -> z op x
to retain sharing without relying on full laziness.
We should keep in mind that this was changed deliberately in ghc 6.6, in order to support "postfix" operators.
http://www.haskell.org/ghc/docs/6.6/html/users_guide/release-6-6.html
The motivating example was the factorial operator which can currently be written as (n !) in ghc-Haskell.
From http://www.haskell.org/ghc/docs/6.6/html/users_guide/syntax-extns.html#postf... "Since this extension goes beyond Haskell 98, it should really be enabled by a flag; but in fact it is enabled all the time. (No Haskell 98 programs change their behaviour, of course.) "
Which is not true, but is probably true enough. Of course, there is now a flag http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/syntax-extns.html#po... but it seems that the non-standard interpretation of (e !) is still kept even without it. Without the flag, it type checks as if you had written \x -> e ! x but it still behaves as if you had written (!) e.

On Thu, Oct 7, 2010 at 1:44 PM, Luke Palmer
The section works because "(a %^&)" (for some operator %^&) is short for "(%^&) a" and "(%^& a)" is short for "flip (%^&) a". Sections don't expand into lambdas.
According to the report they do: http://haskell.org/onlinereport/exps.html#sections http://haskell.org/onlinereport/haskell2010/haskellch3.html#x8-300003.5 but GHC is different, I think: http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/syntax-extns.html#po... I'm not sure if the significance of this difference is explored anywhere, but notice that: ghci> (() `undefined`) `seq` () *** Exception: Prelude.undefined ghci> (`undefined` ()) `seq` () ()

On Thursday 07 October 2010 14:17:18, Brent Yorgey wrote:
Hi all,
See below for this message from one of my students which has me stumped. Just when you think you understand Haskell... ;)
I've cc'ed him on this message; please include him on any replies as I don't think he is subscribed to -cafe.
-Brent
cf. http://www.haskell.org/pipermail/haskell-cafe/2009-October/067811.html If fib is defined without a type signature, the monomorphism restriction also kicks in, when fib is bound via a function binding, fib x = ... fib has polymorphic type (Num a) => Int -> a and that prevents sharing the list (since there are lists for all Num types). If bound via a simple pattern binding, fib = (map fib' [0 .. ] !!) where ... and the MR is not disabled, fib gets the monomorphic type Int -> Integer and that allows the list to be shared. If you give fib an explicit monomorphic type fib :: Int -> Integer the list will still not be shared if it's bound via a function binding because fib' and the list are bound inside the lambda: fib = \k -> let fib' ... in (map fib' [0 .. ] !!) k fib' is not a constant, so it's not floated out of the lambda, so it's not shared (and a fortiori map fib' [0 .. ] isn't shared). if fib is bound via a simple pattern binding (and no explicit lambda is given), fib' and the list are bound outside the lambda and hence shared: fib = let fib' = ...; lst = map fib' [0 .. ] in \k -> lst !! k Note however that using the list as "map fib' [0 .. ]" is more brittle than giving it a name and binding it explicitly in the where clause: fib :: Int -> Integer fib = (fibList !!) where fibList = map fib' [0 .. ] fib' 0 = 0 ... For the time being, GHC treats both the same, but the latter is less likely to break.
----- Forwarded message from Yue Wang
----- From: Yue Wang
Date: Tue, 5 Oct 2010 21:23:57 -0400 To: Brent Yorgey Subject: store evaluated values Hi, Anthony (or Brent):
Last time (in Anthony's TA office hour) we talked about how to store evaluated values in the program for later used. I googled for a while and wrote some code. But I still encountered two problems. Can you take a look? Thanks.
First, let's write the naive fib function:
naive_fib 0 = 0 naive_fib 1 = 1 naive_fib n = trace(show(n)) naive_fib (n - 1) + naive_fib (n - 2)
this works good except it tries to calculate the same expression many times. So when we evaluate it ghci will show the following: *Main> naive_fib 5 5 4 3 2 2 3 2 5
Then there is a clever way to do that on haskell wiki:
fib = ((map fib' [0 ..]) !!) where fib' 0 = 0 fib' 1 = 1 fib' n =trace(show(n)) fib (n - 1) + fib (n - 2)
When we evaluate the same expression we can see it does not evaluate the redundant expression over and over: *Main> fib 5 5 4 3 2 5
The source code seems to be easy to read, but I don't think I understand that. For me I think if I change the first line from fib = ((map fib' [0 ..]) !!) to fib x = ((map fib' [0 ..]) !!) x It should do the same thing since I think the previous version is just an abbreviation for the second one. But After I change that, *Main> fib 5 5 4 3 2 2 3 2 5
So why does the x here matters?

What people seem to be missing here is that the location of the
where-binding with respect to the lambda changes in each case. As a
result, I think the forgoing explanations were rather confusing;
there's no magic going on here.
On Thu, Oct 7, 2010 at 8:17 AM, Brent Yorgey
----- Forwarded message from Yue Wang
----- From: Yue Wang
Then there is a clever way to do that on haskell wiki: fib = ((map fib' [0 ..]) !!) where fib' 0 = 0 fib' 1 = 1 fib' n =trace(show(n)) fib (n - 1) + fib (n - 2)
This is indeed equivalent to: fib = let fib' 0 = 0 fib' 1 = 1 fib' n = fib (n-1) + fib (n-2) in (map fib' [0..] !!) But adding the argument embeds the let inside the function call: fib x = let fib' 0 = 0 fib' 1 = 1 fib' n = fib (n-1) + fib (n-2) in (map fib' [0..] !!) Now we create a new fib' for each invocation of fib. Not efficient at all! (Much *less* efficient the the recursive fib). There's no evaluation magic here---all that's happening is GHC is executing the program exactly as written. It can't float the list out of the function, as that can lead to unexpected space leaks (if you didn't intend to keep the list of fibs around forever). -Jan-Willem Maessen

On Thu, Oct 7, 2010 at 9:33 AM, Jan-Willem Maessen
There's no evaluation magic here---all that's happening is GHC is executing the program exactly as written. It can't float the list out of the function, as that can lead to unexpected space leaks (if you didn't intend to keep the list of fibs around forever).
To echo Jan-Willem a bit, the impact of let floating can be seen by
compiling the eta expanded version (i.e. fib x = map fib' [0..] !! x)
with different options.
$ ghc --make -O -fforce-recomp FibMemo.hs
[1 of 1] Compiling Main ( FibMemo.hs, FibMemo.o )
Linking FibMemo ...
$ ./FibMemo
5
3
2
4
5
$ ghc --make -O -fforce-recomp FibMemo.hs -fno-full-laziness
[1 of 1] Compiling Main ( FibMemo.hs, FibMemo.o )
Linking FibMemo ...
$ ./FibMemo
5
3
2
4
2
3
2
5
My understanding is that this is just a case where GHCi is able to
float a binding in the partial application formulation because the
dependency analysis is trivial due to there being no name for the
binding.
Looking at the core for the optimized version of the eta expanded
code, there is a a top-level function for building a list of fibonacci
numbers, a top-level value of type [Type.Integer] set to an
application of the builder function to zero, and finally the fib
function that indexes into that list.
The version with -fno-full-laziness leaves the let binding under the
lambda as expected.
So the optimizer is clever and can see through the lambda, but we make
the interpreter's job easier by not putting a lambda over its eyes to
begin with.
Anthony
On Thu, Oct 7, 2010 at 9:33 AM, Jan-Willem Maessen
What people seem to be missing here is that the location of the where-binding with respect to the lambda changes in each case. As a result, I think the forgoing explanations were rather confusing; there's no magic going on here.
On Thu, Oct 7, 2010 at 8:17 AM, Brent Yorgey
wrote: ----- Forwarded message from Yue Wang
----- From: Yue Wang
Then there is a clever way to do that on haskell wiki: fib = ((map fib' [0 ..]) !!) where fib' 0 = 0 fib' 1 = 1 fib' n =trace(show(n)) fib (n - 1) + fib (n - 2)
This is indeed equivalent to: fib = let fib' 0 = 0 fib' 1 = 1 fib' n = fib (n-1) + fib (n-2) in (map fib' [0..] !!)
But adding the argument embeds the let inside the function call: fib x = let fib' 0 = 0 fib' 1 = 1 fib' n = fib (n-1) + fib (n-2) in (map fib' [0..] !!)
Now we create a new fib' for each invocation of fib. Not efficient at all! (Much *less* efficient the the recursive fib).
There's no evaluation magic here---all that's happening is GHC is executing the program exactly as written. It can't float the list out of the function, as that can lead to unexpected space leaks (if you didn't intend to keep the list of fibs around forever).
-Jan-Willem Maessen _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Brent & Yeu, I recently ran into the same question. You can see the thread[1] which includes lots of references to papers that describe the behavior you're seeing along with examples. Implementations of call-by-name lambda calculus all tend to have the same runtime behavior that you're describing. It's usually called sharing. You can look at how ghc reduces expressions like this directly(see [2] and [3]). However, this is really low level and doesn't help us understand the way other Haskell implementations work. John Launchbury[4] came along and said "hey, these denotational definitions of call-by-name lambda calculus don't help us understand the runtime of current implementations". He developed a high level operational semantics for call-by-name that correctly, and simply, accounts for sharing across all implementations. I found the paper accessible and very helpful when analyzing the sometimes mysterious runtime behavior of Haskell. Ariola et. all[5] improved upon on Launchbury's work and made an even more high level operational semantics for call-by-name which they called, confusingly enough, "call-by-need". This paper is not as accessible as Launchbury, but presents a clever way to describe how sharing works without using a heap construct. If you read carefully through the thread I mentioned in the beginning and Launchbury's paper, you should be equipped to write down reductions by hand that illustrate the different runtime behaviors of your semantically equivalent expressions. David [1] http://comments.gmane.org/gmane.comp.lang.haskell.cafe/81029 [2] http://www.haskell.org/haskellwiki/Ministg [3] http://research.microsoft.com/en-us/um/people/simonpj/Papers/eval-apply/inde... [4] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.35.2016 [5] http://homepages.inf.ed.ac.uk/wadler/papers/need/need.ps -- David Sankel Sankel Software www.sankelsoftware.com 585 617 4748 (Office)
participants (11)
-
Anthony Cowley
-
Ben Millwood
-
Bertram Felgenhauer
-
Brent Yorgey
-
Daniel Fischer
-
David Sankel
-
Derek Elkins
-
Jan-Willem Maessen
-
Luke Palmer
-
Max Bolingbroke
-
Simon Marlow