
On Tuesday 17 August 2010 17:33:15, Sebastian Fischer wrote:
BTW, what sort of memory usage are we talking about here?
./calcfib 100000000 +RTS -s 451,876,020 bytes allocated in the heap 10,376 bytes copied during GC 19,530,184 bytes maximum residency (9 sample(s)) 12,193,760 bytes maximum slop 97 MB total memory in use
That's compatible with the figures I got for a similar programme. I just wanted to make sure you had no much higher usage.
I'm not sure how to interpret "bytes allocated" and "maximum residency" especially because the program spends no time during GC. But the 97 MB "total memory" correspond to what my process monitor shows.
The bytes allocated figure says how many bytes the RTS allocated in total during the run of the programme. It's not closely related to memory requirements, for example if you have an almost tight loop, which just has too many variables to keep them all in registers, you'll get high allocation figures for long running loops even though it runs in small constant space (of course, allocating, say, one Int on the heap in each iteration costs performance versus an entirely-in-registers-loop). The maximum residency, iirc, is the maximal size of live heap the garbage collector finds in any of the major GCs (there were 9, but since there were only few items, all 9 major and 40 minor GCs together took too little time to display a positive value). 19.5MB could be the one final Fibonacci number plus three of about half the size, that seems a reasonable value (and if the GC ran at slightly different times, you might have gotten different values). The total memory in use also includes garbage waiting to be collected. I don't remember any details fo GHC's GC algorithm, but having total memory about five times maximum residency doesn't seem unbelievable.
I have now tried your code and I didn't find the memory usage too extreme. Be aware that fib (10^8) requires about 70 million bits and you need several times that for the computation.
Then, roughly ten 70m bit numbers fit into the total memory used:
ghci> (97 * 1024^2 * 8) / (70 * 10^6) 11.624213942857143
I expected to retain about three of such large numbers, not ten, and although the recursion depth is not deep I expected some garbage. Both expectations may be a mistake, of course.
Yes, the memory usage is higher than I'd expect too, but it's well-behaved (sublinear growth), so I'm not overly concerned.
If the relation is
a_n = c_1*a_(n-1) + ... + c_k*a_(n-k)
you have the k×k matrix [...] to raise to the n-th power, multiply the resulting matrix with the vector of initial values (a_(k-1), a_(k-2), ..., a_0) and take the last component
Interesting.
(a propos of this, your Fibonacci numbers are off by one, fib 0 = 0, fib 9 = 34, fib 10 = 55).
Wikipedia also starts with zero but states that "some sources" omit the leading zero. I took the liberty to do the same (and documented it like this) as it seems to match the algorithm better. It also corresponds to the rabbit analogy.
It breaks however the nice properties that fib n = (phi^n - (1-phi)^n)/sqrt 5 fib n `mod` 5 == 0 <=> n `mod` 5 == 0 p prime => fib (p - (p|5)) `mod` p == 0 fib n = (-1)^(n+1)*fib (-n) [where, due to typographical limitations, I wrote the Legendre symbol as (p|q)] For aesthetical reasons, I continue to regard setting fib 0 = 1 as wrong.
These matrices have a special structure that allows doing a multiplication in O(k^2). You might want to look into the Cayley-Hamilton theorem for the latter.
I don't see the link to the CH theorem yet -- probably because I didn't know it.
Cayley-Hamilton allows you to replace the matrix exponentiation by calculating (X^n) modulo the characteristic polynomial of M and thus matrix multiplication by multiplication of polynomials (of degree <= (k-1)). The link is pretty indirect, however, and Cayley-Hamilton/multiplication of polynomials is the endpoint of a twisted path.
I did observe that all matrices during the computation have the form
/a b\ \b c/
which simplifies multiplications.
Yes, the powers of a symmetric matrix are always symmetric. That follows from trans(A·B) = trans(B)·trans(A), so trans(A^n) = (trans(A))^n which, if A is symmetric, i.e. trans(A) = A, reduces to A^n.
Is this related to CH?
No, entirely unrelated. It's the fact that the coefficient of fib (n-2) is 1 in the recursion formula for fib n.
Or can I further improve the multiplications using insights from CH?
It's not worth it for Fibonacci numbers: for small k, an O(k^3) algorithm is better than an O(k^2) algorithm with larger constant factors. If you had a linear recursion involving say a_(n-1000) it'd be a huge gain.