Re: [Haskell-cafe] idea for avoiding temporaries

10 Mar 2007

      ...
On Fri, Mar 09, 2007 at 04:02:04PM -0000, Claus Reinke wrote:
...
but if one does whole-array updates, there isn't much to be gained by
avoiding intermediate copies, is there? whether you fill a new memory
area, or overwrite an old one, you still have to loop through the whole
thing, writing and reading.
I would gain a 20% reduction in memory use, which could come out to a few
gigabytes for large problems.  That's worth some degree of effort.
Actually, it'd be a bit less than 20%, but for large problems it would
approach 20%.
ah, ok, i'm not used to thinking in such scales;-) (perhaps you should get in touch
with those SAC people, after all - i don't know what their state of play is, but 
many years ago, they started in an office near mine, and they were definitely
thinking about large arrays, even about how to distribute them, and computations
on them; also, they used to bench against Fortran and vendor-optimised compilers,
not against generic C)

but still, even for memory use: if you can run the algorithm in-place, that means
that the total of memory in use never exceeds the size of one array, even if the
memory manager might not notice. the slice of indices in the new array already 
written can never again be accessed in the old array, the slice of indices in the 
old array still valid can not yet have been written in the new array.

assuming that GHC's GC isn't clever enough to reclaim parts of arrays, and 
that wasting even virtual memory on unused space slows down the code by driving
it into the wrong level of the memory hierarchy, would it help to split your arrays 
into slices represented as separate arrays, so that each slice could be abandoned 
during GC as soon as the algorithm is done with it?

to avoid having to deal with complex internal boundary conditions, one might
try to define suitable abstractions, eg instead of Array (Int,Int) Int one could
try Array Int (Array Int Int), and define all operations on that. or, if we are
talking about a one-dimensional vector, perhaps use divMod to distribute
the indices over a couple of sub-vectors (from 1xN to 2x(N/2))?

given these wild speculations, perhaps i should mention my reason: i've been
a fan of reference-counting and the in-place update possibilities it opens, and
i have often been annoyed by the persistent reluctance of well-written GC-
base code to perform worse than RC-based counterparts. with such an 
apparent waste of resources and lack of precise knowledge, it seems unfair,
but i had to admit it seems to work rather better than i expected (and when
reality doesn't adhere to theory, reality is unlikely to give in first;-). whether 
this still holds for the level of number-crunching you are considering, i can't
say - according to some of their publications, SAC seems to do quite well,
at a performance-level Haskell has yet to reach, at the price of lower 
expressiveness for non-array programs (but higher-level array operations)..

claus