Does the compiler optimize repeated calls?

Hi,

I have a question about coding and compilers. Suppose that a function is invoked with the same parameters inside another function declaration, e.g.

-- this example does nothing particularly meaningful
g a b c = let something1 = f a b
              something2 = externalsomething (f a b) 42
              something3 = externalsomething2 (137 * (f a b))
          in ...

Does it help (performance-wise) to have

g a b c = let resultoff = f a b
              something2 = externalsomething resultoff 42
              something3 = externalsomething2 (137 * resultoff)
          in ...

or does the compiler perform this optimization? More generally, if a function is invoked with the same parameters again (and it doesn't involve anything like monads), does it make sense (performance-wise) to "store" the result somewhere?

Thank you,

Tamas

PS: I realize that I am asking a lot of newbie questions, and I would like to thank everybody for their patience and elucidating replies.

Hallo,
On 9/6/06, Tamas K Papp wrote:
PS: I realize that I am asking a lot of newbie questions, and I would like to thank everybody for their patience and elucidating replies.
I am a newbie myself (second week of learning Haskell), but I'll give it a shot: Since functions have no side effects, the compiler executes the function only once.

-- 
-alex
http://www.ventonegro.org/

[Warnings: newbie and not having tested or read the generated assembly code.]
On Wed, Sep 06, 2006 at 09:32:07AM -0300, Alex Queiroz wrote a message of 18 lines which said:
I am a newbie myself (second week of learning Haskell), but I'll give it a shot: Since functions have no side effects, the compiler executes the function only once.
Sure, "pure" programming languages *allow* the compiler to perform such optimizations but it does not mean it is *actually* done (promises are cheap but writing compiler code is expensive).

Furthermore, doing that optimization (common subexpression elimination) can lead to space leaks. So you should not count on the compiler doing it. Besides, I often find it more readable and less error-prone to name a common subexpression; there is only one place to change when you need to change the call.

-- Lennart

On Sep 6, 2006, at 08:38, Stephane Bortzmeyer wrote:
[Warnings: newbie and not having tested or read the generated assembly code.]
On Wed, Sep 06, 2006 at 09:32:07AM -0300, Alex Queiroz wrote a message of 18 lines which said:

I am a newbie myself (second week of learning Haskell), but I'll give it a shot: Since functions have no side effects, the compiler executes the function only once.
Sure, "pure" programming languages *allow* the compiler to perform such optimizations but it does not mean it is *actually* done (promises are cheap but writing compiler code is expensive).
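To illustrate the space leak Lennart mentions, here is a minimal sketch (the names noCse and withCse are made up; whether GHC actually performs this CSE depends on version and flags):

-- Two separate occurrences of [1 .. n]: each list is produced and
-- consumed lazily, so both traversals run in constant space.
noCse :: Int -> (Int, Int)
noCse n = (sum [1 .. n], length [1 .. n])

-- If the two occurrences are shared (by the compiler's CSE or by
-- hand), the whole list must stay in memory until both traversals
-- are finished: O(n) residency instead of O(1).
withCse :: Int -> (Int, Int)
withCse n = (sum xs, length xs)
  where xs = [1 .. n]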

Hallo,
On 9/6/06, Lennart Augustsson wrote:
Furthermore, doing that optimization (common subexpression elimination) can lead to space leaks. So you should not count on the compiler doing it. Besides, I often find it more readable and less error-prone to name a common subexpression; there is only one place to change when you need to change the call.
So there are no benefits from the "functions have no side effects" property, performance-wise. Pity.

-- 
-alex
http://www.ventonegro.org/

On 9/6/06, Alex Queiroz wrote:
Hallo,
On 9/6/06, Lennart Augustsson wrote:

Furthermore, doing that optimization (common subexpression elimination) can lead to space leaks. So you should not count on the compiler doing it. Besides, I often find it more readable and less error-prone to name a common subexpression; there is only one place to change when you need to change the call.
So there are no benefits from the "functions have no side effects" property, performance-wise. Pity.
I think most compilers actually do CSE (to varying extents), but it's not a good idea to rely on it since nobody will ever *guarantee* that it's done.

Purely functional does give you some performance benefits, though. Such as getting pass-by-reference performance "for free" (i.e. without actually having to worry about pass-by-reference semantics -- send a huge data structure "by value" to a function and it will really only be a pointer that gets sent). You can also do parallel and other multithreading stuff much more easily, which should surely count as a major performance benefit.

/S

-- 
Sebastian Sylvan
+46(0)736-818655
UIN: 44640862
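To sketch the parallelism point, assuming the par and pseq combinators from Control.Parallel in the parallel package (the function name `both` is made up):

import Control.Parallel (par, pseq)

-- Because f is pure, evaluating its two applications in parallel
-- cannot change the result, only (hopefully) the running time.
-- par sparks fx for parallel evaluation (to weak head normal form)
-- while pseq forces fy on the current thread.
both :: (a -> b) -> a -> a -> (b, b)
both f x y = fx `par` (fy `pseq` (fx, fy))
  where
    fx = f x
    fy = f y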

On Wed, Sep 06, 2006 at 02:12:28PM +0100, Sebastian Sylvan wrote a message of 36 lines which said:
I think most compilers actually do CSE
And automatic memoization? Common subexpressions can be difficult to recognize statically, but at run time it is much easier to recognize a function call that has already been made. Is there any common compiler / interpreter which automatically memoizes? In theory, it is also a huge advantage of "pure" functional programming languages.

On 9/6/06, Stephane Bortzmeyer wrote:
On Wed, Sep 06, 2006 at 02:12:28PM +0100, Sebastian Sylvan wrote a message of 36 lines which said:

I think most compilers actually do CSE
And automatic memoization? Common subexpressions can be difficult to recognize statically, but at run time it is much easier to recognize a function call that has already been made. Is there any common compiler / interpreter which automatically memoizes? In theory, it is also a huge advantage of "pure" functional programming languages.
Not that I'm aware of. I don't think it's very easy to do well. The compiler would have to keep a buffer of input/output pairs for each function (even ones which never get called with the same input twice), which eats up memory - it's probably very difficult to statically say anything about the frequency of certain inputs to functions, so you'd have to store caches for every function in the program.

If you have a function with a certain range which is used often, it's very easy to do memoization yourself in Haskell (untested!):

import Data.Array

fac n | n >= 1 && n <= 100 = facArr ! n
      | otherwise          = fac' n
  where
    fac' x = product [1..x]
    facArr = listArray (1,100) (map fac' [1..100])

Basically this stores the first 100 fac numbers in an array (note that the contents of the array are lazily evaluated; each entry gets evaluated the first time it is required) and checks against that first, reverting to the regular fac function for all other inputs.

/S

-- 
Sebastian Sylvan
+46(0)736-818655
UIN: 44640862

On 9/6/06, Sebastian Sylvan wrote:
On 9/6/06, Stephane Bortzmeyer wrote:

On Wed, Sep 06, 2006 at 02:12:28PM +0100, Sebastian Sylvan wrote a message of 36 lines which said:

I think most compilers actually do CSE
And automatic memoization? Common subexpressions can be difficult to recognize statically, but at run time it is much easier to recognize a function call that has already been made. Is there any common compiler / interpreter which automatically memoizes? In theory, it is also a huge advantage of "pure" functional programming languages.
Not that I'm aware of. I don't think it's very easy to do well. The compiler would have to keep a buffer of input/output pairs for each function (even ones which never get called with the same input twice), which eats up memory - it's probably very difficult to statically say anything about the frequency of certain inputs to functions, so you'd have to store caches for every function in the program.
If you have a function with a certain range which is used often, it's very easy to do memoization yourself in Haskell (untested!):
import Data.Array
fac n | n >= 1 && n <= 100 = facArr ! n
      | otherwise          = fac' n
  where
    fac' x = product [1..x]
    facArr = listArray (1,100) (map fac' [1..100])
Sorry, facArr needs to be top-level, otherwise it probably won't retain its values. So something like:

module Fac (fac) where  -- we don't export fac', only fac

import Data.Array

-- The table covers 1..100; anything outside that range falls
-- through to the unmemoized fac'.
fac n | n >= 1 && n <= 100 = facArr ! n
      | otherwise          = fac' n

fac' x = product [1..x]

facArr = listArray (1,100) (map fac' [1..100])

-- 
Sebastian Sylvan
+46(0)736-818655
UIN: 44640862
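For example, a hypothetical Main using the Fac module above might look like this (sketch only):

import Fac (fac)

main :: IO ()
main = do
  print (fac 50)    -- first call forces the thunk at facArr ! 50
  print (fac 50)    -- second call reuses the already evaluated entry
  print (fac 200)   -- outside the table; computed by fac' each time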

On Wed, Sep 06, 2006 at 03:26:29PM +0200, Stephane Bortzmeyer wrote:
On Wed, Sep 06, 2006 at 02:12:28PM +0100, Sebastian Sylvan wrote a message of 36 lines which said:

I think most compilers actually do CSE
And automatic memoization? Common subexpressions can be difficult to recognize statically, but at run time it is much easier to recognize a function call that has already been made. Is there any common compiler / interpreter which automatically memoizes? In theory, it is also a huge advantage of "pure" functional programming languages.
Have you even considered the space costs of this? For almost any non-trivial bit of code I've written, automatic memoization would result in a completely useless binary, as it almost guarantees horrific space leaks. In order to do this, you'd need some rather sophisticated analysis to figure out the memory costs of both the output of a function and its input, both of which would need to be stored in order to memoize the thing.

Before something like this were implemented, I'd rather see automatic strictifying for avoidance of stack overflows (i.e. when the stack starts filling up, start evaluating things strictly), which is most likely another insanely difficult bit of compiler code, since it would also involve figuring out which code paths would save memory and which would increase memory use.

-- 
David Roundy

On Wed, Sep 06, 2006 at 09:44:05AM -0400, David Roundy wrote:
Have you even considered the space costs of this? For almost any non-trivial bit of code I've written, automatic memoization would result in a completely useless binary, as it almost guarantees horrific space leaks.
And a compile-time option, where the programmer explains which functions should be memoized?

ghc -fmemoize=foo,bar darcs.hs

On 06/09/06, David Roundy wrote:
Before something like this were implemented, I'd rather see automatic strictifying for avoidance of stack overflows (i.e. when the stack starts filling up, start evaluating things strictly), which is most likely another insanely difficult bit of compiler code, since it would also involve figuring out which code paths would save memory and which would increase memory use.
Keep in mind that strict and lazy semantics aren't interchangeable; that is, automatic strictifying could change program behaviour unless the compiler knew which bits need to be lazy. If it knew that, we might as well adopt 'strict where possible, lazy when necessary' semantics.

-- 
-David House, dmhouse@gmail.com
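A minimal illustration of the point (names made up): both definitions below rely on lazy semantics, and would crash or diverge under blind strict evaluation.

-- Under lazy semantics this is 1: const never looks at its second
-- argument, so undefined is never evaluated.  A strict evaluator
-- would force undefined first and crash.
lazyOK :: Int
lazyOK = const 1 undefined

-- An infinite structure that only works because of laziness; a
-- strict evaluator would loop forever building the list.
firstThree :: [Int]
firstThree = take 3 (let ones = 1 : ones in ones)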

Purely functional does give you some performance benefits, though.
Nice theory. People don't use functional languages for their performance, mind you. All the optimizations that are supposedly made possible by having a pure functional language tend to be either not quite doable (because of non-termination making the language a bit less pure) or simply too hard: the difficulty being to decide when/where the optimization is indeed going to improve performance rather than worsen it.

It's much too difficult for a compiler to figure out which functions might benefit from memoization (and with which specific form of memoization), especially compared with how easy it is for the programmer to do the memoization explicitly.

Stefan
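As a sketch of the explicit memoization Stefan describes, here is a common Haskell idiom (memoFib is a made-up name): the memo table is an ordinary lazy list, so each entry is computed at most once, on first demand.

-- Memoized Fibonacci via a lazily evaluated list.  Each element of
-- table is a thunk that is forced at most once; the recursive calls
-- go through memoFib, so they hit the table too.
memoFib :: Int -> Integer
memoFib = (table !!)
  where
    table = map fib [0 ..]
    fib 0 = 0
    fib 1 = 1
    fib n = memoFib (n - 1) + memoFib (n - 2)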

Hi
All the optimizations that are supposedly made possible by having a pure functional language tend to be either not quite doable (because of non-termination making the language a bit less pure) or simply too hard: the difficulty being to decide when/where the optimization is indeed going to improve performance rather than worsen it.
There are an absolute ton of optimisations that aren't possible in a non-pure language. In a strict language, for example, inlining is not always valid (it can change how often, and even whether, an expression gets evaluated), while in a lazy language it's always semantics-preserving. I suggest you take a look at GRIN.
It's much too difficult for a compiler to figure out which functions might benefit from memoization (and with which specific form of memoization), especially compared with how easy it is for the programmer to do the memoization explicitly.
The reason is probably more that the number of functions which benefit from memoization is pretty low, and the number where it harms performance is pretty high.

Thanks

Neil
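A small sketch of why inlining is meaning-preserving in a pure lazy language (definitions made up for illustration): the two forms below may differ in how much work they do, but purity guarantees they denote the same value, so a compiler may rewrite one into the other freely.

-- The shared form evaluates the sum once...
shared :: Integer
shared = let x = sum [1 .. 1000000] in x + x

-- ...the inlined form may evaluate it twice, yet the result is
-- guaranteed identical, because the expression has no side effects.
inlined :: Integer
inlined = sum [1 .. 1000000] + sum [1 .. 1000000]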

Hi,
On 9/6/06, Tamas K Papp wrote:
or does the compiler perform this optimization? More generally, if a function is invoked with the same parameters again (and it doesn't involve anything like monads), does it make sense (performance-wise) to "store" the result somewhere?
I was wondering something like this too, and then I found this email:

http://www.haskell.org/pipermail/glasgow-haskell-bugs/2004-December/004530.h...

So I guess it is as Stephane said: theoretically possible but not actually done?

--eric the perpetual newbie

-- 
Eric Kow
http://www.loria.fr/~kow
PGP Key ID: 08AC04F9
Please correct my French.

tpapp:
Hi,
I have a question about coding and compilers. Suppose that a function is invoked with the same parameters inside another function declaration, e.g.
-- this example does nothing particularly meaningful
g a b c = let something1 = f a b
              something2 = externalsomething (f a b) 42
              something3 = externalsomething2 (137 * (f a b))
          in ...
Does it help (performancewise) to have
g a b c = let resultoff = f a b
              something2 = externalsomething resultoff 42
              something3 = externalsomething2 (137 * resultoff)
          in ...
or does the compiler perform this optimization? More generally, if a function is invoked with the same parameters again (and it doesn't involve anything like monads), does it make sense (performance-wise) to "store" the result somewhere?
This is covered on the wiki:

http://www.haskell.org/haskellwiki/Performance/GHC#Common_subexpressions

-- Don
participants (11)
- Alex Queiroz
- David House
- David Roundy
- dons@cse.unsw.edu.au
- Eric Kow
- Lennart Augustsson
- Neil Mitchell
- Sebastian Sylvan
- Stefan Monnier
- Stephane Bortzmeyer
- Tamas K Papp