
On Fri, Mar 09, 2007 at 04:02:04PM -0000, Claus Reinke wrote:
but if one does whole-array updates, there isn't much to be gained by avoiding intermediate copies, is there? whether you fill a new memory area, or overwrite an old one, you still have to loop through the whole thing, writing and reading.
I would gain a 20% reduction in memory use, which could come out to a few gigabytes for large problems. That's worth some degree of effort. Actually, it'd be a bit less than 20%, but for large problems it would approach 20%.
ah, ok, i'm not used to thinking in such scales;-) (perhaps you should get in touch with those SAC people, after all - i don't know what their state of play is, but many years ago, they started in an office near mine, and they were definitely thinking about large arrays, even about how to distribute them, and computations on them; also, they used to bench against Fortran and vendor-optimised compilers, not against generic C) but still, even for memory use: if you can run the algorithm in-place, that means that the total of memory in use never exceeds the size of one array, even if the memory manager might not notice. the slice of indices in the new array already written can never again be accessed in the old array, the slice of indices in the old array still valid can not yet have been written in the new array. assuming that GHC's GC isn't clever enough to reclaim parts of arrays, and that wasting even virtual memory on unused space slows down the code by driving it into the wrong level of the memory hierarchy, would it help to split your arrays into slices represented as separate arrays, so that each slice could be abandoned during GC as soon as the algorithm is done with it? to avoid having to deal with complex internal boundary conditions, one might try to define suitable abstractions, eg instead of Array (Int,Int) Int one could try Array Int (Array Int Int), and define all operations on that. or, if we are talking about a one-dimensional vector, perhaps use divMod to distribute the indices over a couple of sub-vectors (from 1xN to 2x(N/2))? given these wild speculations, perhaps i should mention my reason: i've been a fan of reference-counting and the in-place update possibilities it opens, and i have often been annoyed by the persistent reluctance of well-written GC- base code to perform worse than RC-based counterparts. with such an apparent waste of resources and lack of precise knowledge, it seems unfair, but i had to admit it seems to work rather better than i expected (and when reality doesn't adhere to theory, reality is unlikely to give in first;-). whether this still holds for the level of number-crunching you are considering, i can't say - according to some of their publications, SAC seems to do quite well, at a performance-level Haskell has yet to reach, at the price of lower expressiveness for non-array programs (but higher-level array operations).. claus