
Hello One thing is magic for me: how GHC can know what function results to remember and what results can be forgotten ? Is it just a stupid buffer algorithm or is there some mathematics truths behind this ? I'm very happy about Haskell, it's so great to put some smart ideas in a computer. thanks -- Emmanuel Chantréau

On Thu, 3 Dec 2009, Emmanuel CHANTREAU wrote:
Hello
One thing is magic for me: how GHC can know what function results to remember and what results can be forgotten ?
Is it just a stupid buffer algorithm or is there some mathematics truths behind this ?
Although it is not required by the Haskell 98 report, the 'let' expression usually stores variable values (sharing) and top-level constants are also stored. I wonder how much of currently existing Haskell code would still work, if this would be changed, since this behavior is essential for memory usage and speed. If you want to cache function results, see: http://www.haskell.org/haskellwiki/Memoization

Hi.
There is actually no magic at all going on. Haskell has a reasonably
well-defined evaluation model; you can approximate it, at least not
taking IO into account, with lazy graph reduction (look that up on
google). Probably that is the "mathematical truth" you're looking for.
Actually, if Haskell had any magic, it would be bad, because magic
can't be reliable (if you can't describe it, you can't rely on it) and
your magically-efficient program would suddenly and unexplainably
break at random changes in the source or in the compiler version.
I remember being slightly shocked when I discovered that even in
Prolog no magic is going on and it has a well-defined evaluation
model, too (before that, when I only heard of Prolog but hadn't read
anything serious about it, I thought that it is a bunch of unthinkable
theorem proving wizardry).
2009/12/3 Emmanuel CHANTREAU
Hello
One thing is magic for me: how GHC can know what function results to remember and what results can be forgotten ?
Is it just a stupid buffer algorithm or is there some mathematics truths behind this ?
I'm very happy about Haskell, it's so great to put some smart ideas in a computer.
thanks
-- Emmanuel Chantréau _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- Eugene Kirpichov Web IR developer, market.yandex.ru

Does this really mean that you want to know how the garbage collector works? Emmanuel CHANTREAU wrote:
Hello
One thing is magic for me: how GHC can know what function results to remember and what results can be forgotten ?
Is it just a stupid buffer algorithm or is there some mathematics truths behind this ?
I'm very happy about Haskell, it's so great to put some smart ideas in a computer.
thanks

Hello
First, thank you for all answers.
Le Thu, 03 Dec 2009 18:58:33 +0300,
Miguel Mitrofanov
Does this really mean that you want to know how the garbage collector works?
Well, I try to understand your question... I will take an example: f x y= x+y The program ask the user to enter two numbers and print the sum. If the user enter "1 2" "f 1 2=3" is stored and a gargage collector is used to remove this dandling expression later ? If the user enter again "1 2", ghc search in dandling results to try to find the result without computing it again ? For Eugene: I use "magic" to mean beautiful and difficult to understand. I believe in magic: I believe everything can be understood with time. -- Emmanuel Chantréau

I will take an example:
f x y= x+y
The program ask the user to enter two numbers and print the sum. If the user enter "1 2" "f 1 2=3" is stored and a gargage collector is used to remove this dandling expression later ? If the user enter again "1 2", ghc search in dandling results to try to find the result without computing it again ?
If you do 'let x = f 1 2' then 'x' will be computed once it's demanded, and at that point the thunk will be overwritten with a number. Further references just return the number. But when 'x' is out of scope it gets GCed. So it depends if you keep 'x' in scope, just like strict languages. The interesting thing is CAFs, which at the top level will never be out of scope and hence live forever. However, it's my vague understanding that due to optimizations, a non-global looking CAF can wind up at the top level. For example, given these: expensive arg = trace "exp" arg f args key = Map.lookup key fm where fm = Map.fromList (expensive args) main = let g = f [('a', 1), ('b', 2)] in print (g 'a', g 'c') g args = (v, args) where v = expensive 42 main = print (g 12, g 90) 'trace' implies that expensive is only called once in both cases, but I have to pass -O2, otherwise it gets called twice. That implies inner definitions with no free variables are only promoted (I believe it's called "let floating"?) with -O2. I think the 'f' example is impressive. The GHC optimizer is pretty neat! http://www.haskell.org/haskellwiki/Constant_applicative_form

2009/12/4 Evan Laforge
The interesting thing is CAFs, which at the top level will never be out of scope and hence live forever.
Untrue! CAFs can be garbage collected as well. See: http://www.haskell.org/pipermail/glasgow-haskell-users/2005-September/009051...

I will take an example:
f x y= x+y
The program ask the user to enter two numbers and print the sum. If the user enter "1 2" "f 1 2=3" is stored and a gargage collector is used to remove this dandling expression later ?
It's not stored in any way.
If the user enter again "1 2", ghc search in dandling results to try to find the result without computing it again ?
No, it wouldn't. It would calculate it another time.

Emmanuel CHANTREAU wrote:
I will take an example:
f x y= x+y
The program ask the user to enter two numbers and print the sum. If the user enter "1 2" "f 1 2=3" is stored and a gargage collector is used to remove this dandling expression later ? If the user enter again "1 2", ghc search in dandling results to try to find the result without computing it again ?
Hi, I think what you're asking is how Haskell knows at run-time for which expressions it can re-use the results. The answer is: it doesn't, it works it out at compile-time. So if you have: f x y = x + y And at some point in your program you call f 1 2, and later on from a totally separate function you call f 1 2, the function will be evaluated twice (assuming 1 and 2 weren't known constants at compile-time). But let's say you have: g x y = f x y * f x y Now the compiler (i.e. at compile-time) can do some magic. It can spot the common expression and know the result of f x y must be the same both times, so it can convert to: g x y = let z = f x y in z * z Now, the Haskell run-time will evaluate f x y once, store the result in z, and use it twice. That's how it can use commonalities in your code and avoid multiple evaluations of the same function call, which I *think* was your question. Thanks, Neil.

On Fri, Dec 4, 2009 at 3:36 AM, Neil Brown
But let's say you have:
g x y = f x y * f x y
Now the compiler (i.e. at compile-time) can do some magic. It can spot the common expression and know the result of f x y must be the same both times, so it can convert to:
g x y = let z = f x y in z * z
GHC does *not* do this by default, quite intentionally, even when optimizations are enabled. The reason is because it can cause major changes in the space complexity of a program. Eg. x = sum [1..10^6] + product [1..10^6] x' = let l = [1..10^6] in sum l + product l x runs in constant space, but x' keeps the whole list in memory. The CSE here has actually wasted both time and space, since it is harder to save [1..10^6] than to recompute it! (Memory vs. arithmetic ops) So GHC leaves it to the user to specify sharing. If you want an expression shared, let bind it and reuse. Luke

On Dec 4, 2009, at 2:43 AM, Luke Palmer wrote:
So GHC leaves it to the user to specify sharing. If you want an expression shared, let bind it and reuse.
Does GHC treat where and let the same in this regard? Or in code, are these treated the same?
x'' = sum l + product l where l = [1..10^6]
x' = let l = [1..10^6] in sum l + product l
I couldn't tell if the report implies that or not. - Mark Mark Lentczner http://www.ozonehouse.com/mark/ mark@glyphic.com

On Fri, Dec 4, 2009 at 9:44 AM, Mark Lentczner
On Dec 4, 2009, at 2:43 AM, Luke Palmer wrote:
So GHC leaves it to the user to specify sharing. If you want an expression shared, let bind it and reuse.
Does GHC treat where and let the same in this regard? Or in code, are these treated the same?
where is just sugar for let.
x'' = sum l + product l where l = [1..10^6]
x' = let l = [1..10^6] in sum l + product l
I couldn't tell if the report implies that or not.
- Mark
Mark Lentczner http://www.ozonehouse.com/mark/ mark@glyphic.com
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Luke Palmer wrote:
On Fri, Dec 4, 2009 at 9:44 AM, Mark Lentczner
wrote: On Dec 4, 2009, at 2:43 AM, Luke Palmer wrote:
So GHC leaves it to the user to specify sharing. If you want an expression shared, let bind it and reuse. Does GHC treat where and let the same in this regard? Or in code, are these treated the same?
where is just sugar for let.
Well, they have different scoping properties because they're in different syntactic classes in the parser (let is an expression whereas where is a modifier on statements), but other than that yes they're the same. -- Live well, ~wren

Although, in Luke's example,
x = sum [1..10^6] + product [1..10^6] x' = let l = [1..10^6] in sum l + product l
We can do much much better, if we're sufficiently smart. -- Define: bar m n = foo (enumFromTo m n) foo xs = sum xs + prod xs -- We're given: sum = foldl (+) 0 product = foldl (*) 1 foldl f z xs = case xs of [] -> [] x:xs -> foldl f (f z x) xs enumFromTo m n = case m < n of True -> [] False -> m : enumFromTo (m+1) n -- The fused loop becomes: foo xs = go0 0 1 xs where go0 a b xs = case xs of [] -> a+b x:xs -> go0 (a+x) (b*x) xs -- Now inline foo in bar: bar m n = go2 0 1 m n where go2 = go0 a b (go1 m n) go0 a b xs = case xs of [] -> a+b x:xs -> go0 (a+x) (b*x) xs go1 m n = case m < n of True -> [] False -> m : go1 (m+1) n -- considering go2 go2 = go0 a b (go1 m n) ==> case (go1 m n) of [] -> a+b x:xs -> go0 (a+x) (b*x) xs ==> case (case m < n of True -> [] False -> m : go1 (m+1) n) of [] -> a+b x:xs -> go0 (a+x) (b*x) xs ==> case m < n of True -> case [] of [] -> a+b x:xs -> go0 (a+x) (b*x) xs False -> case (m : go1 (m+1) n) of [] -> a+b x:xs -> go0 (a+x) (b*x) xs ==> case m < n of True -> a+b False -> go0 (a+m) (b*m) (go1 (m+1) n) -- So, go2 = case m < n of True -> a+b False -> go0 (a+m) (b*m) (go1 (m+1) n) -- And by the original def of go2 go2 = go0 a b (go1 m n) -- We get go2 = case m < n of True -> a+b False -> go2 (a+m) (b*m) (m+1) n -- go0 and go1 and now dead in bar bar m n = go2 0 1 m n where go2 = case m < n of True -> a+b False -> go2 (a+m) (b*m) (m+1) n -- (furthermore, if (+) here is for Int/Double etc, -- we can reduce go2 further to operate on machine -- ints/doubles and be a register-only non-allocating loop) -- So now finally returning to our original code:
x = sum [1..10^6] + product [1..10^6] x' = let l = [1..10^6] in sum l + product l
-- We get:
x' = bar 1 (10^6)
And the intermediate list never exists at all.
Matt
On 12/4/09, Luke Palmer
On Fri, Dec 4, 2009 at 3:36 AM, Neil Brown
wrote: But let's say you have:
g x y = f x y * f x y
Now the compiler (i.e. at compile-time) can do some magic. It can spot the common expression and know the result of f x y must be the same both times, so it can convert to:
g x y = let z = f x y in z * z
GHC does *not* do this by default, quite intentionally, even when optimizations are enabled. The reason is because it can cause major changes in the space complexity of a program. Eg.
x = sum [1..10^6] + product [1..10^6] x' = let l = [1..10^6] in sum l + product l
x runs in constant space, but x' keeps the whole list in memory. The CSE here has actually wasted both time and space, since it is harder to save [1..10^6] than to recompute it! (Memory vs. arithmetic ops)
So GHC leaves it to the user to specify sharing. If you want an expression shared, let bind it and reuse.
Luke _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Fixing my errors:
x = sum [1..10^6] + product [1..10^6] x' = let l = [1..10^6] in sum l + product l
-- Define: bar m n = foo (enumFromTo m n) foo xs = sum xs + prod xs -- We're given: sum = foldl (+) 0 product = foldl (*) 1 foldl f z xs = case xs of [] -> [] x:xs -> foldl f (f z x) xs enumFromTo m n = case n < m of True -> [] False -> m : enumFromTo (m+1) n -- The fused loop becomes: foo xs = go0 0 1 xs where go0 a b xs = case xs of [] -> a+b x:xs -> go0 (a+x) (b*x) xs -- Now inline foo in bar: bar m n = go2 0 1 m n where go2 a b m n = go0 a b (go1 m n) go0 a b xs = case xs of [] -> a+b x:xs -> go0 (a+x) (b*x) xs go1 m n = case m < n of True -> [] False -> m : go1 (m+1) n -- considering go2 go2 a b m n = go0 a b (go1 m n) ==> case (go1 m n) of [] -> a+b x:xs -> go0 (a+x) (b*x) xs ==> case (case n < m of True -> [] False -> m : go1 (m+1) n) of [] -> a+b x:xs -> go0 (a+x) (b*x) xs ==> case n < m of True -> case [] of [] -> a+b x:xs -> go0 (a+x) (b*x) xs False -> case (m : go1 (m+1) n) of [] -> a+b x:xs -> go0 (a+x) (b*x) xs ==> case n < m of True -> a+b False -> go0 (a+m) (b*m) (go1 (m+1) n) -- So, go2 a b m n = case n < m of True -> a+b False -> go0 (a+m) (b*m) (go1 (m+1) n) -- And by the original def of go2 go2 a b m n = go0 a b (go1 m n) -- We get go2 a b m n = case m < n of True -> a+b False -> go2 (a+m) (b*m) (m+1) n -- go0 and go1 and now dead in bar bar m n = go2 0 1 m n where go2 a b m n = case n < m of True -> a+b False -> go2 (a+m) (b*m) (m+1) n -- (furthermore, if (+) here is for Int/Double etc, -- we can reduce go2 further to operate on machine -- ints/doubles and be a register-only non-allocating loop) -- So now finally returning to our original code:
x = sum [1..10^6] + product [1..10^6] x' = let l = [1..10^6] in sum l + product l
-- We get:
x' = bar 1 (10^6)
Matt
On 12/4/09, Matt Morrow
Although, in Luke's example,
x = sum [1..10^6] + product [1..10^6] x' = let l = [1..10^6] in sum l + product l
We can do much much better, if we're sufficiently smart.
-- Define: bar m n = foo (enumFromTo m n) foo xs = sum xs + prod xs
-- We're given: sum = foldl (+) 0 product = foldl (*) 1 foldl f z xs = case xs of [] -> [] x:xs -> foldl f (f z x) xs enumFromTo m n = case m < n of True -> [] False -> m : enumFromTo (m+1) n
-- The fused loop becomes: foo xs = go0 0 1 xs where go0 a b xs = case xs of [] -> a+b x:xs -> go0 (a+x) (b*x) xs
-- Now inline foo in bar: bar m n = go2 0 1 m n where go2 = go0 a b (go1 m n) go0 a b xs = case xs of [] -> a+b x:xs -> go0 (a+x) (b*x) xs go1 m n = case m < n of True -> [] False -> m : go1 (m+1) n
-- considering go2 go2 = go0 a b (go1 m n)
==> case (go1 m n) of [] -> a+b x:xs -> go0 (a+x) (b*x) xs
==> case (case m < n of True -> [] False -> m : go1 (m+1) n) of [] -> a+b x:xs -> go0 (a+x) (b*x) xs
==> case m < n of True -> case [] of [] -> a+b x:xs -> go0 (a+x) (b*x) xs
False -> case (m : go1 (m+1) n) of [] -> a+b x:xs -> go0 (a+x) (b*x) xs
==> case m < n of True -> a+b False -> go0 (a+m) (b*m) (go1 (m+1) n)
-- So, go2 = case m < n of True -> a+b False -> go0 (a+m) (b*m) (go1 (m+1) n)
-- And by the original def of go2 go2 = go0 a b (go1 m n)
-- We get go2 = case m < n of True -> a+b False -> go2 (a+m) (b*m) (m+1) n
-- go0 and go1 and now dead in bar bar m n = go2 0 1 m n where go2 = case m < n of True -> a+b False -> go2 (a+m) (b*m) (m+1) n
-- (furthermore, if (+) here is for Int/Double etc, -- we can reduce go2 further to operate on machine -- ints/doubles and be a register-only non-allocating loop)
-- So now finally returning to our original code:
x = sum [1..10^6] + product [1..10^6] x' = let l = [1..10^6] in sum l + product l
-- We get: x' = bar 1 (10^6)
And the intermediate list never exists at all.
Matt
On 12/4/09, Luke Palmer
wrote: On Fri, Dec 4, 2009 at 3:36 AM, Neil Brown
wrote: But let's say you have:
g x y = f x y * f x y
Now the compiler (i.e. at compile-time) can do some magic. It can spot the common expression and know the result of f x y must be the same both times, so it can convert to:
g x y = let z = f x y in z * z
GHC does *not* do this by default, quite intentionally, even when optimizations are enabled. The reason is because it can cause major changes in the space complexity of a program. Eg.
x = sum [1..10^6] + product [1..10^6] x' = let l = [1..10^6] in sum l + product l
x runs in constant space, but x' keeps the whole list in memory. The CSE here has actually wasted both time and space, since it is harder to save [1..10^6] than to recompute it! (Memory vs. arithmetic ops)
So GHC leaves it to the user to specify sharing. If you want an expression shared, let bind it and reuse.
Luke _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Hi, Am Freitag, den 04.12.2009, 10:36 +0000 schrieb Neil Brown:
But let's say you have:
g x y = f x y * f x y
Now the compiler (i.e. at compile-time) can do some magic. It can spot the common expression and know the result of f x y must be the same both times, so it can convert to:
g x y = let z = f x y in z * z
Now, the Haskell run-time will evaluate f x y once, store the result in z, and use it twice. That's how it can use commonalities in your code and avoid multiple evaluations of the same function call, which I *think* was your question.
Note that although the compiler _could_ do this transformation, it does not actually do it because of some unwanted subtleties: http://www.haskell.org/haskellwiki/GHC:FAQ#Does_GHC_do_common_subexpression_... (I was a bit disappointed when I found out about this, after first hearing how much great optimization a haskell compiler _could_ do, but that’s reality.) Greetings, Joachim -- Joachim "nomeata" Breitner mail: mail@joachim-breitner.de | ICQ# 74513189 | GPG-Key: 4743206C JID: nomeata@joachim-breitner.de | http://www.joachim-breitner.de/ Debian Developer: nomeata@debian.org
participants (12)
-
Emmanuel CHANTREAU
-
Eugene Kirpichov
-
Evan Laforge
-
George Pollard
-
Henning Thielemann
-
Joachim Breitner
-
Luke Palmer
-
Mark Lentczner
-
Matt Morrow
-
Miguel Mitrofanov
-
Neil Brown
-
wren ng thornton