
Hi

On Fri, 2010-03-26 at 21:33 +0000, Sebastian Sylvan wrote:
> Reorganizing data on the fly sounds like it may be a pretty sensible idea now that cache misses are so bad (in comparison). The fact that Haskell data is generally immutable helps too. However, I think your scheme sounds a bit complicated for my tastes, and I don't really understand the idea of "lengths": what would "n consecutive elements" mean for a tree? Surely this would only work for list-like data structures?
First of all, I had assumed that a data type's memory footprint was independent of its value, similar to how a C union always occupies the same amount of memory. After reading your post I reevaluated that assumption, and it may well be wrong. However, if the assumption holds, we could lay out the memory in preorder fashion. The following tree:

          a
         / \
        /   \
       b     c
      / \   / \
     d   e f   g

would have the following memory layout: a, b, d, e, c, f, g. A pointer to element "a" would have length 6 (a plus 6 additional elements). The preorder layout would also allow us to point to part of the structure. For example, we could make a pointer to element "b" with length 2 (b plus 2 additional elements). In this way, we do not need to break a larger chunk into smaller ones to enable sharing of data. Nor would we need to scan through the whole structure when only pointing to a part of it.
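To make the "length" idea a bit more concrete, here is a rough Haskell sketch. The Tree type, flatten function and vector representation are just my own model of the layout, not a proposal for the actual runtime representation:

import qualified Data.Vector as V

-- A binary tree, and its preorder flattening in which every node is paired
-- with its "length": the number of additional elements in its subtree.
data Tree a = Leaf | Node a (Tree a) (Tree a)

flatten :: Tree a -> V.Vector (a, Int)
flatten t = V.fromList (go t)
  where
    go Leaf         = []
    go (Node x l r) = let ls = go l
                          rs = go r
                      in (x, length ls + length rs) : ls ++ rs

-- The tree above flattens to a,b,d,e,c,f,g with lengths 6,2,0,0,2,0,0, so an
-- index into the vector together with the stored length identifies a whole
-- subtree (e.g. index of "b" plus length 2) without scanning the structure.
example :: V.Vector (Char, Int)
example = flatten (Node 'a'
                    (Node 'b' (Node 'd' Leaf Leaf) (Node 'e' Leaf Leaf))
                    (Node 'c' (Node 'f' Leaf Leaf) (Node 'g' Leaf Leaf)))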
> Something that seems a bit easier to work with would be to use just two data layouts, one which uses pointers everywhere like now, and one in which every single field is expanded one level (each of them could in turn be expanded one more level, but you don't know that until you look at it). If one of the fields can't be expanded (e.g. because it would become cyclic) then you just have to keep the "fully pointerfied" version.
> It's still a bit tricky because the offset to a specific member would depend on the sizes of the previous fields in the layout (which can vary depending on the actual value). It would always be possible to figure it out at runtime, though, by scanning through the fields each time, finding the next one by incrementing by the size of the current one, and you could just disable this feature for large data types. You could potentially store a heavily quantized lookup table (possibly at the expense of padding some members to align all members on a large common multiple from the base).
Would scanning not run the risk of changing the asymptotic complexity of programs? And what does "heavily quantized lookup table" mean? It is the "heavily quantized" part I do not get.
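If I understand the scanning correctly, it would look roughly like this sketch (again only my own model, with made-up names and field sizes):

-- Each field of a value is stored inline, and its size in words can vary
-- with the actual value stored in it.
type FieldSize = Int

-- Finding field i means summing the sizes of all the fields before it, so a
-- single field access costs O(i) instead of O(1). That is the part that
-- worries me about the asymptotic complexity.
fieldOffset :: [FieldSize] -> Int -> Int
fieldOffset sizes i = sum (take i sizes)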
> The general idea seems interesting to me, but then I'm hardly an expert. I like the notion of using the fact that the language is free to change the data representation at runtime to improve performance by "collapsing" an immutable (recursive) data structure once it's been created. Partly because it's something that would be harder to do in other languages!
> -- Sebastian Sylvan
/Mads