Question about memory usage

I was reading this article: http://scienceblogs.com/goodmath/2009/11/writing_basic_functions_in_has.php and came to the part where it shows:
fiblist = 0 : 1 : (zipWith (+) fiblist (tail fiblist))
Very interesting stuff for somebody who comes from an imperative world of course. But then I read that "Once it's been referenced, then the list up to where you looked is concrete - the computations *won't* be repeated." and I started wondering how that works. Because this seems to mean that functions could have unknown (to the caller) memory requirements. How does one, programming in Haskell, keep that in check? And when does that memory get reclaimed? Cheers, -Tako

Tako Schotanus wrote:
fiblist = 0 : 1 : (zipWith (+) fiblist (tail fiblist))
Very interesting stuff for somebody who comes from an imperative world of course.
Oh yes, that old chestnut. There's a page on the wiki somewhere with a whole collection of these cryptic one-liners - Pascal's triangle, the prime numbers, the Fibonacci numbers, etc.
But then I read that "Once it's been referenced, then the list up to where you looked is concrete - the computations /won't/ be repeated." and I started wondering how that works.
fiblist is a global variable. It points to a list calculation expression. When a list cell is calculated, the calculation node is overwritten with the result of the calculation. When you access the list, you look at each cell; if it's already done, you just use it. If it's not done, execute it and then use it. (How it actually works below that is a problem for the runtime engine to figure out, and varies by Haskell implementation. For example, GHC uses the spineless tagless G-machine.)
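One way to watch this overwrite-with-the-result behaviour happen (a sketch; the `trace` call is just instrumentation and not part of the real definition):

```haskell
import Debug.Trace (trace)

-- Same fiblist, but each addition announces itself on stderr.
fiblist :: [Integer]
fiblist = 0 : 1 : zipWith (\a b -> trace "computing a cell" (a + b))
                          fiblist (tail fiblist)

main :: IO ()
main = do
  print (take 6 fiblist)  -- forces four additions, each traced once
  print (take 6 fiblist)  -- silent: the cells now hold plain values
</imports>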
Because this seems to mean that functions could have unknown (to the caller) memory requirements.
Yes.
How does one, programming in Haskell, keep that in check?
...and here begins an entire textbook that nobody has published yet.
And when does that memory get reclaimed?
It's called "garbage collection". The memory is reclaimed when it becomes unreachable because there are no pointers to it left. In the example above, fiblist is a global variable, so the answer to "when does it get freed?" would be "never". (I believe it's called a CAF leak.) Whether this is very cool or a major performance issue depends on your point of view; you're trading space for time. (Then again, the Fibonacci numbers can be computed in O(1) time, and nobody ever needs Fibonacci numbers in the first place, so this is obviously example code.)
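For contrast, a sketch of how to avoid pinning the list for the whole program run: make it local to a function instead of a top-level CAF, so it becomes unreachable (and collectable) once the call returns. (Caveat: with optimizations on, GHC's full-laziness transformation may float the list back out to the top level anyway, since it doesn't depend on n.)

```haskell
-- The list lives only for the duration of one call; after nthFib
-- returns, nothing references fibs and the GC may reclaim it.
-- The flip side: successive calls recompute it from scratch.
nthFib :: Int -> Integer
nthFib n = fibs !! n
  where
    fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

main :: IO ()
main = print (nthFib 10)  -- 55
```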

In the example above, fiblist is a global variable, so the answer to "when does it get freed?" would be "never". (I believe it's called a CAF leak.)
Is that actually true? I've heard lots of references to this, but I'm not sure it is true. Sure, it's harder for it to get collected when everyone can potentially access it, but the mechanisms for checking who holds pointers to a value are still valid for CAFs, and collection can still work on them, I think. The commentary at http://is.gd/ehEuL seems to suggest that CAFs can be garbage collected, too. My apologies for confusing the issue, if this is not correct, though! Dan

On Sat, Aug 14, 2010 at 5:41 PM, Andrew Coppin
(Then again, the Fibonacci numbers can be computed in O(1) time, and nobody ever needs Fibonacci numbers in the first place, so this is obviously example code.)
A bit off-topic, but I don't think there exists a function that computes Fibonacci numbers in O(1) time. There exists a closed-form expression to calculate the nth Fibonacci number, but the complexity of that expression is not O(1):

phi = (1 + sqrt 5) / 2
fib n = ((phi ** n) - (1 - phi) ** n) / sqrt 5

The use of (**) should make the complexity at least O(n). Please correct me if I'm wrong (or sloppy). Regards, Roel

On Monday 16 August 2010 14:37:34, Roel van Dijk wrote:
On Sat, Aug 14, 2010 at 5:41 PM, Andrew Coppin
wrote: (Then again, the Fibonacci numbers can be computed in O(1) time, and nobody ever needs Fibonacci numbers in the first place, so this is obviously example code.)
A bit off-topic, but I don't think there exists a function that computes fibonacci numbers in O(1) time. There exists a closed-form expression to calculate the nth Fibonacci number, but the complexity of that expression is not O(1):
phi = (1 + sqrt 5) / 2
fib n = ((phi ** n) - (1 - phi) ** n) / sqrt 5
The use of (**) should make the complexity at least O(n).
Using Double for phi and n, that expression is calculated in constant time. If you use (^) or (^^) instead of (**) and use Int [or Integer and neglect the complexity of Integer arithmetic, assuming additions and divisions of Integers to be O(1)] for n, it's O(log n). However, whether using (**), (^) or (^^), with Double for phi, that formula gives wrong results even for rather small n (> 70 resp. > 75).

You can calculate the n-th Fibonacci number in O(log n) steps using Integer arithmetic to get correct results. Multiplying two Integers with k bits takes O(k^(1+x)) with 0 < x <= 1; the exact value of x depends on the algorithm used, of course. I don't remember what's the best found so far, but x < 0.5 is reached. Since the abovementioned O(log n)-steps algorithm includes multiplication of Integers with Theta(n) bits, its complexity is O(n^(1+y)) for some y >= x.
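One way to write such an O(log n)-steps Integer algorithm is via the fast-doubling identities F(2k) = F(k)*(2*F(k+1) - F(k)) and F(2k+1) = F(k)^2 + F(k+1)^2 (a sketch, not necessarily the formulation Daniel had in mind):

```haskell
-- go n returns (F(n), F(n+1)); each step halves n, so O(log n) steps.
fib :: Integer -> Integer
fib = fst . go
  where
    go 0 = (0, 1)
    go n =
      let (a, b) = go (n `div` 2)   -- (F(k), F(k+1)) for k = n `div` 2
          c = a * (2 * b - a)       -- F(2k)
          d = a * a + b * b         -- F(2k+1)
      in if even n then (c, d) else (d, c + d)

main :: IO ()
main = print (fib 100)  -- 354224848179261915075
```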
Please correct me if I'm wrong (or sloppy).
Regards, Roel

[continuing off topic] On Aug 16, 2010, at 3:13 PM, Daniel Fischer wrote:
You can calculate the n-th Fibonacci number in O(log n) steps using Integer arithmetic to get correct results.
Yes, I was delighted when I saw this for the first time. It works by computing

/1 1\^n
\1 0/

(this is meant to be a matrix to the nth power) by repeated squaring. Wikipedia knows the details:

http://en.wikipedia.org/wiki/Fibonacci_number#Matrix_form

My Haskell implementation of this approach is on Hackage:

http://hackage.haskell.org/package/fibonacci

and github:

http://github.com/sebfisch/fibonacci/blob/master/Data/Numbers/Fibonacci.hs

With this implementation, printing the result of a call to, say, fib 100000000 takes *much* longer than computing it.

[almost on topic again] I am a bit concerned about the memory usage. If you know how to fix it, please send me patches (ideally via github)!

Cheers, Sebastian

-- Underestimating the novelty of the future is a time-honored tradition. (D.G.)
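The matrix-power idea can be sketched in a few lines (a hypothetical minimal version, not the actual code of the fibonacci package):

```haskell
-- 2x2 matrices as 4-tuples; power by repeated squaring gives
-- [[1,1],[1,0]]^n = [[F(n+1),F(n)],[F(n),F(n-1)]].
type M = (Integer, Integer, Integer, Integer)

mul :: M -> M -> M
mul (a, b, c, d) (e, f, g, h) =
  (a*e + b*g, a*f + b*h, c*e + d*g, c*f + d*h)

matPow :: M -> Integer -> M
matPow _ 0 = (1, 0, 0, 1)  -- identity matrix
matPow m n
  | even n    = sq
  | otherwise = mul m sq
  where sq   = mul half half
        half = matPow m (n `div` 2)

-- Read F(n) off the top-right entry of the power.
fibMat :: Integer -> Integer
fibMat n = f where (_, f, _, _) = matPow (1, 1, 1, 0) n

main :: IO ()
main = print (fibMat 50)  -- 12586269025
```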

re := {fib(n+2) = fib(n+1) + fib(n)};
inits := {fib(0) = 1, fib(1) = 1};
    {fib(n + 2) = fib(n + 1) + fib(n)}
    {fib(0) = 1, fib(1) = 1}
gfun[rectoproc](re union inits, fib(n));
Since we're off-topic... Any sequence of numbers given by a linear recurrence equation with constant coefficients can be computed quickly using asymptotically efficient matrix operations. In fact, the code to do this can be derived automatically from the recurrence itself. Here is what you get for Fibonacci (using Maple):

proc (n)
local i1, loc0, loc1, loc2, tmp2, tmp1, i2;
  if n <= 44 then
    loc0 := 1; loc1 := 1;
    if n < 0 then error "index must be %1 or greater", 0
    elif n = 0 then return loc0
    elif n = 1 then return loc1
    end if;
    for i1 to n-1 do
      loc2 := loc0+loc1; loc0 := loc1; loc1 := loc2
    end do;
    loc1
  else
    tmp2 := Matrix(2, 2, {(1, 1) = 1, (1, 2) = 1, (2, 1) = 1, (2, 2) = 0});
    tmp1 := Vector(2, {(1) = 1, (2) = 1});
    i2 := convert(n-1, base, 2);
    if i2[1] = 1 then tmp1 := Vector(2, {(1) = 2, (2) = 1}) end if;
    for i1 in subsop(1 = (), i2) do
      tmp2 := LinearAlgebra:-MatrixMatrixMultiply(tmp2, tmp2);
      if i1 = 1 then tmp1 := LinearAlgebra:-MatrixVectorMultiply(tmp2, tmp1) end if
    end do;
    tmp1[1]
  end if
end proc

Direct computation is done for small n, and then asymptotically fast linear algebra is used for larger n. This should be a nice Template Haskell exercise.

Jacques

Sebastian Fischer wrote:
[continuing off topic]
On Aug 16, 2010, at 3:13 PM, Daniel Fischer wrote:
You can calculate the n-th Fibonacci number in O(log n) steps using Integer arithmetic to get correct results.
Yes, I was delighted when I saw this for the first time. It works by computing
/1 1\^n
\1 0/
(This is meant to be a matrix to the nth power) by repeated squaring. Wikipedia knows the details:
http://en.wikipedia.org/wiki/Fibonacci_number#Matrix_form
My Haskell implementation of this approach is on Hackage:
http://hackage.haskell.org/package/fibonacci
and github:
http://github.com/sebfisch/fibonacci/blob/master/Data/Numbers/Fibonacci.hs
With this implementation, printing the result of a call to, say
fib 100000000
takes *much* longer than computing it.
[almost on topic again]
I am a bit concerned about the memory usage. If you know how to fix it, please send me patches (ideally via github)!
Cheers, Sebastian

Roel van Dijk wrote:
On Sat, Aug 14, 2010 at 5:41 PM, Andrew Coppin
wrote: (Then again, the Fibonacci numbers can be computed in O(1) time, and nobody ever needs Fibonacci numbers in the first place, so this is obviously example code.)
A bit off-topic, but I don't think there exists a function that computes fibonacci numbers in O(1) time. There exists a closed-form expression to calculate the nth Fibonacci number, but the complexity of that expression is not O(1):
phi = (1 + sqrt 5) / 2 fib n = ((phi ** n) - (1 - phi) ** n) / sqrt 5
The use of (**) should make the complexity at least O(n). Please correct me if I'm wrong (or sloppy).
This depends on your definition of "operation". As somebody else pointed out, if you're only dealing with limited-precision arithmetic, you can probably consider all the arithmetic operations to be O(1), in which case the Nth Fibonacci number is O(1). Only if you're dealing with utterly huge numbers do you need to even care about how long it takes to add two numbers. This neatly leads us back to my second assertion: In all my years of computer programming, I've never seen one single program that actually *needs* the Fibonacci numbers in the first place (let alone in arbitrary-precision).

On Mon, Aug 16, 2010 at 1:37 PM, Andrew Coppin
This neatly leads us back to my second assertion: In all my years of computer programming, I've never seen one single program that actually *needs* the Fibonacci numbers in the first place (let alone in arbitrary-precision).
I think there are variants on AVL trees that use something related to a Fibonacci sequence for balancing. I don't remember the details, though. Antoine

There's a Fibonacci Heap: http://en.wikipedia.org/wiki/Fibonacci_heap
Not sure what else though :)
On Mon, Aug 16, 2010 at 11:14 PM, Antoine Latter
On Mon, Aug 16, 2010 at 1:37 PM, Andrew Coppin
wrote: This neatly leads us back to my second assertion: In all my years of computer programming, I've never seen one single program that actually *needs* the Fibonacci numbers in the first place (let alone in arbitrary-precision).
I think there are variants on AVL trees that use something related to a Fibonacci sequence for balancing. I don't remember the details, though.
Antoine
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

On Aug 17, 2010, at 12:37 AM, Roel van Dijk wrote:
phi = (1 + sqrt 5) / 2 fib n = ((phi ** n) - (1 - phi) ** n) / sqrt 5
The use of (**) should make the complexity at least O(n). Please correct me if I'm wrong (or sloppy).
Using the classic

x**0 = 1
x**1 = x
x**(2k+0) = (x**2)**k        k > 0
x**(2k+1) = (x**2)**k * x    k >= 1

powers of smallish numbers or matrices can be computed in logarithmic time. Using x**n = exp(n*log(x)) and floating point arithmetic, the whole thing can be done in O(1) time, at the price of some inaccuracy. Using double precision arithmetic, I get fib(52) = 32951280099.0001, which is clearly wrong. Truncation will save you, up to fib(72) = 498454011879265.1875, which is wrong in the units place.
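A quick way to watch the inaccuracy appear is to put Binet's formula in Double next to the exact list (a sketch; the exact threshold where rounding fails depends on the floating-point details):

```haskell
-- Binet's formula in Double: with rounding it agrees with the exact
-- list for small n, but drifts once F(n) outgrows the 53-bit mantissa.
phi :: Double
phi = (1 + sqrt 5) / 2

fibBinet :: Integer -> Integer
fibBinet n = round ((phi ^^ n - (1 - phi) ^^ n) / sqrt 5)

fibExact :: [Integer]
fibExact = 0 : 1 : zipWith (+) fibExact (tail fibExact)

main :: IO ()
main = do
  print (fibBinet 30 == fibExact !! 30)  -- True: still exact
  print (fibBinet 90 == fibExact !! 90)  -- likely False: rounding error
```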

On Tue, Aug 17, 2010 at 3:53 AM, Richard O'Keefe
On Aug 17, 2010, at 12:37 AM, Roel van Dijk wrote:
phi = (1 + sqrt 5) / 2 fib n = ((phi ** n) - (1 - phi) ** n) / sqrt 5
The use of (**) should make the complexity at least O(n). Please correct me if I'm wrong (or sloppy).
Using the classic

x**0 = 1
x**1 = x
x**(2k+0) = (x**2)**k        k > 0
x**(2k+1) = (x**2)**k * x    k >= 1

powers of smallish numbers or matrices can be computed in logarithmic time.

That is an interesting trick. So there exists an algorithm that calculates Fibonacci numbers in O(log n) time.
Using x**n = exp(n*log(x)) and floating point arithmetic, the whole thing can be done in O(1) time, at the price of some inaccuracy.

It will calculate a subset of the Fibonacci numbers in O(1) time. Thinking about it, I think you can easily calculate subsets of any function in O(1) time, as long as the function is guaranteed to terminate. Simply precompute the answers and store them in an array.

On Aug 17, 2010, at 8:39 AM, Roel van Dijk wrote:
That is an interesting trick. So there exists an algorithm that calculates Fibonacci numbers in O(log n) time.
This is what my program does. But it's only O(long n) if you assume multiplication is constant time (which it is not for large numbers). Sebastian -- Underestimating the novelty of the future is a time-honored tradition. (D.G.)

Sebastian Fischer
On Aug 17, 2010, at 8:39 AM, Roel van Dijk wrote:
That is an interesting trick. So there exists an algorithm that calculates Fibonacci numbers in O(log n) time.
This is what my program does. But it's only O(long n) [snip]
Are you getting Haskell mixed up with C or something? We don't have a "long" value type... ;-) -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

On Aug 17, 2010, at 6:39 PM, Roel van Dijk wrote:
Using x**n = exp(n*log(x)) and floating point arithmetic, the whole thing can be done in O(1) time, and the price of some inaccuracy. It will calculate a subset of the Fibonacci numbers in O(1) time. Thinking about it I think you can easily calculate subsets of any function in O(1) time, as long as the function is guaranteed to terminate. Simply precompute the answers and store them in an array.
It is what, 30 years, since someone pointed out that the simplest way to implement the full range of factorial() that can be handled by a machine integer is using a tiny tiny array.

F(n) is approximately phi^n/sqrt(5), so log2 F(n) is approximately 0.7*n - 1.2. The largest 64-bit IEEE-format number is about 2^1023, the largest 128-bit one about 2^16383. So the largest F(n) we can approximate with floats is

n = 1,476 or thereabouts for 64-bit
n = 23,608 or thereabouts for 128-bit

That's more Fibonacci numbers than I personally have any need for right now, but you're right, it's not all of them, and even the larger number is a quantity we *could* precompute and store, and the precomputation would of course be O(1) per stored item.
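The precomputed-table idea, sketched for the machine-integer case (F(92) is the largest Fibonacci number that fits in a signed 64-bit integer):

```haskell
import Data.Array (Array, listArray, (!))
import Data.Int (Int64)

-- All Fibonacci numbers representable as Int64, built once; lookups
-- after that are O(1) array indexing.
fibTable :: Array Int Int64
fibTable = listArray (0, 92) (take 93 fibs)
  where fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

fibLookup :: Int -> Int64
fibLookup = (fibTable !)

main :: IO ()
main = print (fibLookup 92)
```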

On Sat, 14 Aug 2010, Tako Schotanus wrote:
I was reading this article:
http://scienceblogs.com/goodmath/2009/11/writing_basic_functions_in_has.php
And came to the part where it shows:
fiblist = 0 : 1 : (zipWith (+) fiblist (tail fiblist))
But then I read that "Once it's been referenced, then the list up to where you looked is concrete - the computations won't be repeated."
It is so implemented. If you *really* wanted a weak reference that could be garbage collected and rebuilt (you don't), it could be made to happen.
and I started wondering how that works. Because this seems to mean that functions could have unknown (to the caller) memory requirements.
This is true in any programming language or runtime. Nothing special has happened: you could implement the same thing in C/C++/Java/Python, but it would take 10-100 lines of code.
How does one, programming in Haskell, keep that in check? And when does that memory get reclaimed?
Haskell is garbage collected; as soon as the fiblist is no longer reachable, and the runtime wants to reclaim the memory, it will. If fiblist is a top-level declaration it will always be reachable. Friendly! --Lane

First of all, thanks to the people who responded :) On Sat, Aug 14, 2010 at 17:49, Christopher Lane Hinson <lane@downstairspeople.org> wrote:
On Sat, 14 Aug 2010, Tako Schotanus wrote:
I was reading this article:
http://scienceblogs.com/goodmath/2009/11/writing_basic_functions_in_has.php
And came to the part where it shows:
fiblist = 0 : 1 : (zipWith (+) fiblist (tail fiblist))
But then I read that "Once it's been referenced, then the list up to where you looked is concrete - the computations won't be repeated."
It is so implemented. If you *really* wanted a weak reference that could be garbage collected and rebuilt (you don't), it could be made to happen.
I understand, if you don't want to keep the memory tied up you either define the reference locally or you use another algorithm.
and I started wondering how that works.
Because this seems to mean that functions could have unknown (to the caller) memory requirements.
This is true in any programming language or runtime. Nothing special has happened: you could implement the same thing in C/C++/Java/Python, but it would take 10-100 lines of code.
Sure, although the memory use would normally be more obvious and it would actually be more work to make the result globally permanent. (the difference between the 10 lines and the 100 lines you refer to probably)
How does one, programming in Haskell, keep that in check?
And when does that memory get reclaimed?
Haskell is garbage collected; as soon as the fiblist is no longer reachable, and the runtime wants to reclaim the memory, it will.
If fiblist is a top-level declaration it will always be reachable.
Ok, makes sense. Just to make this clear, I'm not complaining nor suggesting there is anything wrong with the way Haskell does things (minds immeasurably superior to mine working on this stuff, I'm not going to pretend to know better), it's just one of those "surprises" for somebody who has no experience yet with Haskell. Thanks, -Tako
participants (11)
- Andrew Coppin
- Antoine Latter
- Christopher Lane Hinson
- Daniel Fischer
- Daniel Peebles
- Ivan Lazar Miljenovic
- Jacques Carette
- Richard O'Keefe
- Roel van Dijk
- Sebastian Fischer
- Tako Schotanus