Re: [Haskell-cafe] Re: sort and lazyness (?)

19 Dec 2008

      On Fri, Dec 19, 2008 at 7:58 AM, Daniel Kraft  wrote:
...
Not having looked at your code, I think you are benefiting from
...
fusion/deforestation in the first three main functions. In this case,
although you may appear to be evaluating the entire list, in fact the
list elements can be discarded as they are generated, so functions like
length and reverse can run using constant space, rather than O(n) space.
How does reverse work in constant space?  At the moment I can't imagine it
doing so; that's why I tried it, but of course you could be right.
No, you are correct, reverse is not constant space.  However, as Duncan
explained, reverse does not force any elements of the list, so even if you
had a list of elements which consumed 1Mb each (when fully evaluated), they
would not be forced so the memory would look exactly the same.
...
The sort function, however, requires that the entire list is retained,
...
hence greater memory usage. I also think you are optimistic in the
memory requirements of your 3 million element list. A list of Ints will
take a lot more than 4 bytes per element (on 32-bit machines) because
there's overhead for the list pointers, plus possibly the boxes for the
Ints themselves. I think there are 3 machine words for each list entry
(pointer to this element, pointer to next element, info-table pointer),
and 2 words for each Int, but I'm just guessing:
http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/HeapObje
cts
Of course that's the case, but the list being 3 million elements, and not,
say 100 (which would still fit into memory for a simple C array of ints) I
thought would make it possible.  Otherwise, how can one handle such amounts
in data anyway?  Only using arrays?
Well, if you know the size beforehand and you know the whole thing needs to
fit into memory at the same time, an array is usually a better choice than a
list.  Lists are more like loops -- i.e. control flow rather than data,
whereas Arrays are definitely data.  I realize the imprecision of that
statement...

However, I am not sure what all this jabber about swapping is.  28 bytes/elt
* 3,000,000 elts = 84 Mb, which easily fits into a modern machine's memory.

There are a lot of traps for the unwary in memory performance though.
Depending on how things are defined, you may be computing *too* lazily,
building up thunks when you should just be crunching numbers.  Still,
swapping on this 3,000,000 element list is absurd, and we should look closer
into your example.  Post the rest (eg. the instances?)?

Luke
...
You might get some mileage by suggesting to GHC that your Fraction type
...
is strict e.g.
data Fraction = Fraction !Int !Int
...
which might persuade it to unbox the Ints, giving some space savings.
I already tried so, but this doesn't change anything to the performance.  I
will however try now to use the provided rational type, maybe this helps.
Thanks for the answers,
Daniel
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] Re: sort and lazyness (?)

Luke Palmer