
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Dec 19, 2008, at 7:40 AM, Daniel Kraft wrote:
data Fraction = Fraction Int Int
to hold rational numbers (maybe there's already some built-in type for this in Haskell, much like for instance Scheme has a rational type?),
There is one. It is called Rational. :)
and then I compute a list of pairs of those numbers, that is [(Fraction, Fraction)]. Fraction is declared an instance of Ord.
This list has up to 3 million elements. If I do
main = print $ length $ points
then the program prints out the length fine and takes around 6s to finish (compiled with GHC -O3). Ok, but I acknowledge that length isn't quite an expensive function, as I can imagine that Haskell does not compute and hold the entire list for this but instead each element at once and discards it afterwards.
Unless something has changed since I last checked, there is little difference between ghc -O2 and ghc -O3, and the latter can even be slower much of the time. It may depend on the situation, but I'm just letting you know. In this case, the runtime should not actually be computing _any_ of the elements of the list since it doesn't care what their values are. You are only counting them, not using them, so it really only computes the spine of the list.
Doing
main = print $ length $ map (\(x, _) -> x == Fraction 1 2) points
instead, gives a slightly longer runtime (6.6s), but in this case I'm sure that at least each element is computed; right?
No. The elements' values are still not actually used, so you are still only evaluating the spine of the list.
main = print $ length $ reverse points
gives 11.9s, and here I guess (?) that for this to work, the entire list is computed and hold in memory.
This is beginning to sound like you think Haskell lists are arrays. They aren't. They are actually linked lists. In order to reverse a linked list, you have to travel all the way to the end and then reconstruct it from there. This explains the time increase. It doesn't really have anything to do with computing the elements or holding anything in memory.
However, trying to do
import List main = print $ length $ sort points
makes memory usage go up and the program does not finish in 15m, also spending most time waiting for swapped out memory. What am I doing wrong, why is sort this expensive in this case? I would assume that computing and holding the whole list does not take too much memory, given its size and data type; doing the very same calculation in C should be straight forward. And sort should be O(n * log n) for time and also not much more expensive in memory, right?
This is actually the first time you really evaluated any of the elements of the list, hence a massive increase in memory use. - - Jake -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.8 (Darwin) iEYEARECAAYFAklLsaoACgkQye5hVyvIUKlJYgCfYtwPv/neOnl3+wIu8VhIqfoA lXMAn3EmANZbSRyYeOiXtOdGl7hxCj34 =NJwT -----END PGP SIGNATURE-----