[Haskell-cafe] Reduction Sequence of simple Fibonacci sequence implementation

Hello everyone,

As Haskell uses the lazy evaluation reduction policy, also known as outermost graph reduction, I was wondering whether the following simple implementation of the Fibonacci sequence would result in linear runtime behaviour.

fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n - 2) + fib (n - 1)

Is the following reduction sequence realistic, or am I attributing too much intelligence to the Haskell compiler?

http://www.bilder-hochladen.net/files/bxlk-6-jpg.html

I hope someone can help...

Hello SimonK77,

Thursday, August 27, 2009, 11:24:14 PM, you wrote:

For linear timing it needs memoization of previous results. Haskell compilers don't do this automatically, since it may change the space properties of the program. A well-known example is:

main = print (sum [1..1000] + product [1..1000])

In order to use memoization you should declare fib as an array:

fib = listArray (1,999) (1 : 1 : map (\i -> fib!(i-1) + fib!(i-2)) [3..999])
Hello everyone, as Haskell uses the lazy evaluation reduction policy, also known as outermost graph reduction, I was wondering whether the following simple implementation of the Fibonacci sequence would result in linear runtime behaviour.

fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n - 2) + fib (n - 1)

Is the following reduction sequence realistic, or am I attributing too much intelligence to the Haskell compiler? I hope someone can help...
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
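For reference, a complete, compilable version of the array-based memoization above might look like this (a minimal sketch; the 999 bound and the names are just illustrative):

import Data.Array

-- The array's elements refer back to the array itself, so each
-- entry is computed at most once and then shared.
fibA :: Array Int Integer
fibA = listArray (1, 999) (1 : 1 : map (\i -> fibA ! (i - 1) + fibA ! (i - 2)) [3 .. 999])

fib :: Int -> Integer
fib n = fibA ! n

main :: IO ()
main = print (fib 100)  -- linear in n, unlike the naive doubly recursive version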

On Thu, Aug 27, 2009 at 12:32 PM, Bulat Ziganshin wrote:
Hello SimonK77,
Thursday, August 27, 2009, 11:24:14 PM, you wrote:
For linear timing it needs memoization of previous results. Haskell compilers don't do this automatically, since it may change the space properties of the program. A well-known example is:

main = print (sum [1..1000] + product [1..1000])

In order to use memoization you should declare fib as an array:

fib = listArray (1,999) (1 : 1 : map (\i -> fib!(i-1) + fib!(i-2)) [3..999])
Unless I'm mistaken, this one is also memoized and simpler:

fibs = 1 : 1 : zipWith (+) fibs (tail fibs)

Then to get a specific fib (zero-indexed) you do:

fib n = fibs !! n

Jason

On Thursday 27 August 2009 21:41:49, Jason Dagit wrote:
On Thu, Aug 27, 2009 at 12:32 PM, Bulat Ziganshin wrote:
Hello SimonK77,
Thursday, August 27, 2009, 11:24:14 PM, you wrote:
For linear timing it needs memoization of previous results. Haskell compilers don't do this automatically, since it may change the space properties of the program. A well-known example is:

main = print (sum [1..1000] + product [1..1000])

In order to use memoization you should declare fib as an array:

fib = listArray (1,999) (1 : 1 : map (\i -> fib!(i-1) + fib!(i-2)) [3..999])
Unless I'm mistaken, this one is also memoized and simpler:

fibs = 1 : 1 : zipWith (+) fibs (tail fibs)
Should be fibs = 0 : 1 : zipWith (+) fibs (tail fibs), of course.
Then to get a specific fib (zero-indexed) you do: fib n = fibs !! n
Jason
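Putting the correction together, a complete runnable sketch of this list-based version (the names and the test value are just illustrative) would be:

fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

-- Each element of fibs is computed once and shared, so indexing
-- costs one list traversal rather than exponential recomputation.
fib :: Int -> Integer
fib n = fibs !! n

main :: IO ()
main = print (fib 30)  -- 832040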

SimonK77 wrote:
as Haskell uses the lazy evaluation reduction policy, also known as outermost graph reduction, I was wondering whether the following simple implementation of the Fibonacci sequence would result in linear runtime behaviour.
fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n - 2) + fib (n - 1)
Is the following reduction sequence realistic, or am I attributing too much intelligence to the Haskell compiler?
You are, as is easily seen when you try to calculate fib 30. A version that works:

import Data.MemoTrie

fib :: Integer -> Integer
fib = memo $ \n -> case n of
  0 -> 0
  1 -> 1
  _ -> fib (n-1) + fib (n-2)
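A quick way to check the difference, assuming the HasTrie instance for Integer that the memotrie package provides:

main :: IO ()
main = print (fib 100)  -- returns promptly; the naive version would still be running

The memo trie caches results per argument, so the doubly recursive calls collapse to linear work.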

Thanks for the memo trick! Now I understand that the Haskell compiler cannot memoize functions of integers, because it could change the space behaviour. However, I think it could memoize everything else, because all types that are data objects sitting in memory (so the argument is essentially a reference) can be memoized without changing the space properties (except for overall constants). Does Haskell do this? And if it doesn't, can you turn it on?

Cheers, Gerben

Hello staafmeister,

Friday, August 28, 2009, 1:54:38 PM, you wrote:

It just works the other way. Imagine a whole Haskell program as a graph: the expression 1+2, for example, forms a node with (+) at its root and 1 and 2 as its subtrees. Computation of the program is a series of graph reductions, replacing nodes with results, e.g. 1+2 => 3.

This graph can share computations in only one way - when you give a shared node a name and use this name twice. For example, the following

sum [1..1000] + product [1..1000]

doesn't share anything, but this

let list = [1..1000] in sum list + product list

shares the list. Performing sharing via explicitly naming common subexpressions is the only way to memoize results.

You imagine something highly inefficient, like storing the results of every computation ever done. Do you think that really makes sense?

Sometimes Haskell compilers may deduce that some computation is used twice. If the result of this computation definitely requires less memory than the computation itself, the compiler may perform an optimization by storing its result. This is called Common Subexpression Elimination (CSE), but it's not guaranteed, and AFAIK it is pretty limited in GHC.
Thanks for the memo trick! Now I understand that the Haskell compiler cannot memoize functions of integers, because it could change the space behaviour. However, I think it could memoize everything else, because all types that are data objects sitting in memory (so the argument is essentially a reference) can be memoized without changing the space properties (except for overall constants). Does Haskell do this? And if it doesn't, can you turn it on?
Cheers, Gerben
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
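To make the role of naming concrete, here is a small experiment (a sketch, not from the thread; Debug.Trace is used only to make evaluation visible, and GHC's optimizer may introduce extra sharing at -O, so try it unoptimized):

import Debug.Trace

expensive :: Integer
expensive = trace "computing..." (sum [1 .. 1000000])

main :: IO ()
main = do
  -- 'expensive' is a named value, so its thunk is shared:
  -- "computing..." is printed only once.
  print (expensive + expensive)
  -- Unnamed, structurally identical expressions are built and
  -- reduced separately; nothing is shared here.
  print (sum [1 .. 10 :: Integer] + sum [1 .. 10])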

Bulat Ziganshin-2 wrote:
this graph can share computations in only one way - when you give a shared node a name and use this name twice. For example, the following

sum [1..1000] + product [1..1000]

doesn't share anything, but this

let list = [1..1000] in sum list + product list

shares the list. Performing sharing via explicitly naming common subexpressions is the only way to memoize results.

You imagine something highly inefficient, like storing the results of every computation ever done. Do you think that really makes sense?

Well, if I call (product list) again, it could look up the reference and see that this computation has already been performed, and just look up the answer. If `list' becomes garbage collected, the GC should also remove the corresponding entry from the cache. The overhead is O(1) per call, because the function only needs to check whether the computation has already been performed. And the space overhead is not so big, because for every data object in memory there are only a couple of references to be stored in lookup tables. So although there is space overhead, it is not excessive.

Hello staafmeister, Friday, August 28, 2009, 2:34:07 PM, you wrote:
Well, if I call (product list) again, it could look up the reference and see
So it should keep a list of all values ever computed in the program, together with their expressions? :) Do you like the idea of a product [1..10^6] computation taking 10 MB of memory?

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Bulat Ziganshin-2 wrote:
Hello staafmeister,
Friday, August 28, 2009, 2:34:07 PM, you wrote:
Well, if I call (product list) again, it could look up the reference and see
So it should keep a list of all values ever computed in the program, together with their expressions? :) Do you like the idea of a product [1..10^6] computation taking 10 MB of memory?
The list you give product is also 10 MB, so it is not a terribly inefficient program. It's a factor of 2. Well, Haskell also often has a factor of 2 overhead with respect to C, and people are not terribly concerned about that.

On Fri, Aug 28, 2009 at 1:03 PM, staafmeister wrote:
The list you give product is also 10 MB, so it is not a terribly inefficient program.
That list takes memory only if it is forced. If it is passed to a lazy function, the whole list need not be in memory all at once.

david48 wrote:
On Fri, Aug 28, 2009 at 1:03 PM, staafmeister wrote:
The list you give product is also 10 MB, so it is not a terribly inefficient program.
That list takes memory only if it is forced. If it is passed to a lazy function, the whole list need not be in memory all at once.
In that case the GC has cleaned up the whole list, and while cleaning up it should also remove the references in the cache lookup table. So then there is no space overhead either.

Hello staafmeister, Friday, August 28, 2009, 3:03:13 PM, you wrote:
So it should keep a list of all values ever computed in the program, together with their expressions? :) Do you like the idea of a product [1..10^6] computation taking 10 MB of memory?
The list you give product is also 10 MB, so it is not a terribly inefficient program.
No, it's produced on demand, so it needs only O(1) memory.
It's a factor of 2. Well, Haskell also often has a factor of 2 overhead with respect to C, and people are not terribly concerned about that.
A factor of 2 compared to ALL VALUES ever produced when evaluating the program. Since a computer performs about 10^9 computations per second, you are going to store 10^9 values each second, and some of those may be multi-megabyte by themselves.

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Bulat Ziganshin-2 wrote:
Hello staafmeister,
Friday, August 28, 2009, 3:03:13 PM, you wrote:
So it should keep a list of all values ever computed in the program, together with their expressions? :) Do you like the idea of a product [1..10^6] computation taking 10 MB of memory?
The list you give product is also 10 MB, so it is not a terribly inefficient program.
No, it's produced on demand, so it needs only O(1) memory.
It's a factor of 2. Well, Haskell also often has a factor of 2 overhead with respect to C, and people are not terribly concerned about that.
A factor of 2 compared to ALL VALUES ever produced when evaluating the program. Since a computer performs about 10^9 computations per second, you are going to store 10^9 values each second, and some of those may be multi-megabyte by themselves.
Hi Bulat,

All the values that are computed will also be GCed (and they will be; 10^9 bytes is the memory limit). If the GC removes a value, then all references in the cache to that value can also be removed.

Gerben

Hello staafmeister, Friday, August 28, 2009, 3:31:13 PM, you wrote:
All the values that are computed but are also GCed (and they will be, 10^9 bytes is the mem limit). If the GC removes a value then all references in cache to those values can also be removed.
It looks like a cache of values computed since the last GC, because on GC all those intermediate results will be collected. I think it's not very useful outside of the fib example, which does exactly that - reusing recently computed values.

-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On Fri, Aug 28, 2009 at 6:04 AM, Bulat Ziganshin wrote:
Hello staafmeister,
Friday, August 28, 2009, 3:31:13 PM, you wrote:
All the values that are computed will also be GCed (and they will be; 10^9 bytes is the memory limit). If the GC removes a value, then all references in the cache to that value can also be removed.
It looks like a cache of values computed since the last GC, because on GC all those intermediate results will be collected. I think it's not very useful outside of the fib example, which does exactly that - reusing recently computed values.
I wouldn't be so quick to call it useless. This caching idea, when combined with HNF instead of WHNF reduction, leads to a strategy which is capable of automatically specializing away layers of interpretation. That is to say, it turns all interpreters into compilers.

Now, there are some disadvantages too, some of which have to do with memory performance. But it's not clear that the disadvantages always outweigh the advantages; rather, I suspect you get a strategy which implies a whole different set of trade-offs for engineering efficient programs.

Luke

Luke Palmer wrote:
On Fri, Aug 28, 2009 at 6:04 AM, Bulat Ziganshin wrote:
It looks like a cache of values computed since the last GC, because on GC all those intermediate results will be collected. I think it's not very useful outside of the fib example, which does exactly that - reusing recently computed values.
I wouldn't be so quick to call it useless. This caching idea, when combined with HNF instead of WHNF reduction, leads to a strategy which is capable of automatically specializing away layers of interpretation. That is to say, it turns all interpreters into compilers.
Now, there are some disadvantages too, some of which have to do with memory performance. But it's not clear that the disadvantages always outweigh the advantages; rather, I suspect you get a strategy which implies a whole different set of trade-offs for engineering efficient programs.
Indeed. The fibs example highlights the biggest place where memoization wins: dynamic programming. DP can dramatically reduce the asymptotic complexity of all sorts of recursive algorithms. The idea of automatic pervasive memoization (done right) is nothing short of teaching the compiler to do DP internally.

The analogy between laziness and memoization is so apt because laziness a la GHC is one restricted version of memoization. Both can improve asymptotic complexity over naive code, or can introduce undesired bloating. Both can improve clarity by saying what you mean, or can lead to inscrutable program performance depending on the whims of the Sufficiently Smart Compiler and the programmer's knowledge of its guts. Many of the laziness quirks about how to order traversals show up in spades with DP. (In fact, converting an algorithm into a DP version often amounts to 'just' reordering the traversals.)

In Dyna we were working on automatic pervasive memoization from the perspective of Prolog and inductive databases. That approach works particularly well for a lot of natural language processing tasks, but it's biting off a lot (and inductive databases went the way of artificial intelligence because of that complexity). I'd be interested in seeing a language like Luke describes, which considers APM more from the direction of laziness and partial compilation. Putting the two together would make for a whole new era in programming (or at least an awesome language for academe).

-- Live well, ~wren

staafmeister wrote:
The list you give product is also 10 MB, so it is not a terribly inefficient program. It's a factor of 2.
No, it is not. The expression

product [1..1000] + sum [1..1000]

will first evaluate product [1..1000], generating values from 1 to 1000 lazily and accumulating the product incrementally, allowing the used values to be GC'ed. Then the sum will be evaluated similarly, and the two results will be added. It will run in constant - and very small - space.

If you do common subexpression elimination (CSE) and write

let ls = [1..1000] in product ls + sum ls

the product will force ls to be evaluated, but since there is another reference to this value (for the sum), it cannot be garbage collected. Thus, the difference is more like a factor of 1000, which is a big deal.

-k
-- If I haven't seen further, it is by standing in the footprints of giants
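To see this difference yourself, here is a sketch (not from the thread; compile with ghc -O0 and -rtsopts, run with +RTS -s, and try each variant on its own, since full laziness at -O can change the sharing):

main :: IO ()
main = do
  -- Variant 1: no shared name, so each traversal streams the
  -- list and runs in small constant space.
  print (product [1 .. 10000 :: Integer] + sum [1 .. 10000])
  -- Variant 2: the named list ls is shared between the two
  -- traversals, so it stays live in full until sum finishes.
  let ls = [1 .. 10000 :: Integer]
  print (product ls + sum ls)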

staafmeister wrote:
The overhead is O(1) per call, because the function only needs to check whether the computation has already been performed. And the space overhead is not so big, because for every data object in memory there are only a couple of references to be stored in lookup tables. So although there is space overhead, it is not excessive.
Actually, having worked on designing the storage layer of a language which does automatic pervasive memoization (Dyna2), the space overhead for such languages can be incredibly large. Determining whether you have a memo can take much longer than O(1) when you have so many of them. Determining when it's worthwhile to memoize a result vs recalculating it should it be needed again; determining which of the memos should be flushed when memory pressure gets too high; determining the most efficient storage representation for a particular memotable; all of these are extremely difficult questions, and still open questions in the community. The databases community is still alive and kicking, and they only have to deal with the easier parts of these problems.

To think about why it takes so much space, consider this: for some value, v, how many functions can accept v as an argument? This is the upper limit for the number of memos you need to keep while v is live. And since we're referentially transparent, a value (not a variable) is live so long as its type is live (the result of f on this v will be the same as on every other v, and as long as the type of v lives, we can produce another v, so we need to keep the memo).

Memoization in the small is easy. There are a number of different ways to go about it, and they all get the job done. But once you make it pervasive, the story changes. It's like managing memory: in the small you can do it manually, but once the idea of pervasive allocation shows up (either from OOP or from closures), doing it manually is a good way to ensure your project fails. And once you make it automatic, the story changes again.

Here, a better analogy is with pervasive laziness. Having pervasive laziness and making it automatic/invisible to the coder is easy. Making it efficient enough to be invisible to the user is another matter entirely. The key to automating pervasive laziness is to find out when you can get away with not doing it; and the key to automating memoization is the same. Of course, memoization has the additional problems that, when you can't get rid of it, you need to choose from one of many different models and data structures for storing your memos; and those choices circle back to influence the answer to whether you should even bother memoizing.

-- Live well, ~wren

On Fri, Aug 28, 2009 at 3:54 AM, staafmeister wrote:
Thanks for the memo trick! Now I understand that the Haskell compiler cannot memoize functions of integers, because it could change the space behaviour. However, I think it could memoize everything else, because all types that are data objects sitting in memory (so the argument is essentially a reference) can be memoized without changing the space properties (except for overall constants). Does Haskell do this? And if it doesn't, can you turn it on?
Integers are nothing special. Consider functions on:

data Nat = Zero | Succ Nat

Now, perhaps you mean memoizing specific Nat *references* (a meaningless question for Haskell, only for specific implementations) rather than the *values*. E.g., for some f:

let x = Succ Zero in f x + f x

would memoize the result of f, but:

f (Succ Zero) + f (Succ Zero)

would not. GHC does not do this. However, I am working in my free time on an experimental graph reducer which does. It implements a semantics called "complete laziness" [1]. I'm experimenting to see how that changes the engineering trade-offs (I have a blog post about what I expect those to be: http://lukepalmer.wordpress.com/2009/07/07/emphasizing-specialization/). For programs that do not benefit from the extra sharing, I should be lucky to run them only 50 times slower. This is an indication of why GHC doesn't do this. Complete laziness is a fairly young research field. Maybe someday we'll get a smokin' fast completely lazy reducer.

[1] http://www.lsv.ens-cachan.fr/Publis/PAPERS/PDF/sinot-wrs07.pdf
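To see what GHC actually shares here, a small sketch (not from the thread; f is a made-up function, and Debug.Trace only makes evaluation visible - compile without -O, since GHC's optional CSE pass could merge the calls):

import Debug.Trace

data Nat = Zero | Succ Nat

f :: Nat -> Int
f n = trace "f evaluated" (count n)
  where
    count Zero     = 0
    count (Succ m) = 1 + count m

main :: IO ()
main = do
  -- GHC shares the value x, but not the application f x:
  -- "f evaluated" prints twice. A reference-memoizing
  -- implementation, as proposed above, would print it once.
  let x = Succ Zero
  print (f x + f x)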

Hello SimonK77,

Thursday, August 27, 2009, 11:24:14 PM, you wrote:

A list-based memoization for fibs should look like:

fibs = 1 : 1 : map (sum . take 2) (tails fibs)
Hello everyone, as Haskell uses the lazy evaluation reduction policy, also known as outermost graph reduction, I was wondering whether the following simple implementation of the Fibonacci sequence would result in linear runtime behaviour.

fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n - 2) + fib (n - 1)

Is the following reduction sequence realistic, or am I attributing too much intelligence to the Haskell compiler? I hope someone can help...
-- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
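A self-contained sketch of that version (tails comes from Data.List; the take 10 test is just illustrative):

import Data.List (tails)

-- Each element sums the two preceding ones via the shared tails view.
fibs :: [Integer]
fibs = 1 : 1 : map (sum . take 2) (tails fibs)

main :: IO ()
main = print (take 10 fibs)  -- [1,1,2,3,5,8,13,21,34,55]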
participants (10):
- Bulat Ziganshin
- Daniel Fischer
- david48
- Jason Dagit
- Ketil Malde
- Luke Palmer
- Rohan Lean
- SimonK77
- staafmeister
- wren ng thornton