
On Fri, Mar 26, 2010 at 8:28 PM, Mads Lindstrøm wrote:
Hi
For some time I have been thinking about an idea which could limit Haskell's memory footprint. I don't know if the idea is crazy or clever, but I would love to hear people's thoughts about it. The short story is that I propose the garbage collector should not just reclaim unused memory; it should also diminish the need for pointers by replacing nested data structures with larger chunks of consecutive memory. In other words, I would diminish the need for pointers for arbitrary recursive data types, just as replacing a linked list with an array eliminates the need for pointers.
I will explain my idea with an example, using a data type we all know and love:
data List a = Cons a (List a) | Nil
Each Cons cell uses two pointers - one for the element and one for the rest of the list. If we could somehow merge the element and the rest of the list into consecutive memory, we would be able to eliminate the pointer to the rest of the list. On 64-bit architectures, each merge would save us 8 bytes of "wasted" memory. If we could merge n elements together, we could save n*8 bytes of memory.
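To make the saving concrete, here is a purely illustrative source-level sketch (ChunkedList and toList are made-up names; the actual proposal would change the heap representation inside the runtime, not the source types). A chunk of up to four elements shares a single tail pointer, so every extra embedded element saves one 8-byte pointer on a 64-bit machine:

data List a = Cons a (List a) | Nil   -- as above

-- Sketch: up to four elements share one tail pointer.
data ChunkedList a
  = Chunk1 a (ChunkedList a)
  | Chunk2 a a (ChunkedList a)
  | Chunk3 a a a (ChunkedList a)
  | Chunk4 a a a a (ChunkedList a)
  | End

-- Unfolding back to the plain representation shows no information is lost.
toList :: ChunkedList a -> List a
toList End                = Nil
toList (Chunk1 a r)       = Cons a (toList r)
toList (Chunk2 a b r)     = Cons a (Cons b (toList r))
toList (Chunk3 a b c r)   = Cons a (Cons b (Cons c (toList r)))
toList (Chunk4 a b c d r) = Cons a (Cons b (Cons c (Cons d (toList r))))

A Chunk4 cell holds four elements behind a single tail pointer, where the plain list would need four separate tail pointers.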
64-bit pointers are wasteful in another way, as nobody has anywhere near 2^64 bytes of memory, and nobody is going to have that much memory for decades to come. Thus, we could use the 8 most significant bits for something besides pointing to memory, without losing any significant ability to address our memory. This would be similar to how GHC uses some of the least significant bits for pointer tagging.
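As a rough sketch of the bit layout (the 8/56 split and the helper names are just illustrative assumptions, not anything GHC actually does), the extra information could live in the top byte of a 64-bit word:

import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word64)

-- A tagged pointer: extra tag in the top 8 bits, address in the low 56 bits.
type TaggedPtr = Word64

addrMask :: Word64
addrMask = (1 `shiftL` 56) - 1

packPtr :: Word64 -> Word64 -> TaggedPtr
packPtr tag addr = (tag `shiftL` 56) .|. (addr .&. addrMask)

ptrTag :: TaggedPtr -> Word64
ptrTag p = p `shiftR` 56

ptrAddr :: TaggedPtr -> Word64
ptrAddr p = p .&. addrMask

Dropping to 56 address bits still leaves 2^56 bytes (64 PiB) of addressable memory.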
I suggest using the 8 most significant bits for what could be termed the length of the pointer. A length of zero would mean that the pointer points to a Cons cell just like we have them today. A length of one would mean that one extra element is embedded into the Cons cell. That is, we would have this block of consecutive memory:
<Cons marker> <pointer to element>
<Cons marker> <pointer to element> <pointer to rest of list>

We could also call this a chunk of length two. A pointer of length two would mean two embedded list elements (a chunk of length three). And so on...
By embedding the "length" of the list in the pointer, it becomes possible that one pointer could point to the beginning of a chunk of n consecutive elements, while another one could point to the middle of the same chunk (but that pointer would have a different length). This would enable sharing without having to break the chunk up into smaller pieces.
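Continuing the bit-layout sketch above (the 16-byte cell size and the chunkTail name are assumptions, purely for illustration): taking the tail inside a chunk is just address arithmetic on the tagged pointer, which is why a suffix of the chunk can be shared without copying or splitting anything.

import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word64)

type TaggedPtr = Word64

addrMask :: Word64
addrMask = (1 `shiftL` 56) - 1

cellSize :: Word64
cellSize = 16  -- assumed: one embedded cell = Cons marker + element pointer

-- Tail of a chunked Cons cell.  While elements remain embedded (length > 0),
-- the rest of the list starts one cell further on with one embedded element
-- fewer; no new heap object is built, so the suffix is shared in place.
-- At length 0 the real tail pointer has to be loaded from the heap cell.
chunkTail :: TaggedPtr -> Maybe TaggedPtr
chunkTail p
  | len > 0   = Just (((len - 1) `shiftL` 56) .|. ((addr + cellSize) .&. addrMask))
  | otherwise = Nothing
  where
    len  = p `shiftR` 56
    addr = p .&. addrMask

A length-two pointer to the start of a chunk and the length-one pointer produced by chunkTail then refer to overlapping memory, which is exactly the sharing described above.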
How should chunks be created? Or, if a function creates a lot of length-one chunks, how are they turned into longer chunks? I imagine that it would be the job of the garbage collector to merge the different elements of a list into larger chunks. Of course, some code, like reading a file, could create chunks directly.
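As a source-level model of that merging step (the real work would happen while the collector copies live data; compact and the chunk constructors are made-up names), regrouping length-one chunks into longer ones could look like this:

data List a = Cons a (List a) | Nil

data ChunkedList a
  = Chunk1 a (ChunkedList a)
  | Chunk4 a a a a (ChunkedList a)
  | End

-- Regroup a plain list into the largest chunks possible, the way a copying
-- collector could lay cells out consecutively when it evacuates a list.
compact :: List a -> ChunkedList a
compact (Cons a (Cons b (Cons c (Cons d rest)))) = Chunk4 a b c d (compact rest)
compact (Cons a rest)                            = Chunk1 a (compact rest)
compact Nil                                      = End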
I have used lists as an example, but I imagine that GHC could do this for any recursively defined data type.
I can also see a few, but big, disadvantages. Mainly, my scheme would complicate dereferencing a pointer, which would make the compiled code larger, and it might make branch prediction harder. The latter would only be a problem on pipelined architectures, but popular processors from Intel and AMD are both heavily pipelined.
Reorganizing data on the fly sounds like it may be a pretty sensible idea now that cache misses are so bad (in comparison). The fact that Haskell data is generally immutable helps too. However, I think your scheme sounds a bit complicated for my tastes, and I don't really understand the idea of "lengths": what would "n consecutive elements" mean for a tree? Surely this wouldn't work just for list-like data structures?

Something that seems a bit easier to work with would be to use just two data layouts: one which uses pointers everywhere like now, and one in which every single field is expanded one level (each of them could in turn be expanded one more level, but you don't know that until you look at it). If one of the fields can't be expanded (e.g. because it would become cyclic), then you just have to keep the "fully pointerfied" version.

It's still a bit tricky, because the offset to a specific member would depend on the specific sizes of the previous fields in the layout (which can vary depending on the actual value). It would always be possible to figure it out at runtime, though, by scanning through the fields each time, finding the next one by incrementing the size, and you could just disable this feature for large data types. You could potentially store a heavily quantized lookup table (possibly at the expense of padding some members to align all members on a large common multiple from the base).

The general idea seems interesting to me, but then I'm hardly an expert. I like the notion of using the fact that the language is free to change the data representation at runtime to improve performance by "collapsing" an immutable (recursive) data structure once it's been created. Partly because it's something that would be harder to do in other languages!

-- Sebastian Sylvan
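A tiny sketch of the offset scan described in that reply (FieldLayout, fieldSize, and fieldOffset are invented names, and sizes are counted in words purely for illustration): because an expanded field's size depends on the value stored in it, finding field i means summing the sizes of the fields before it.

-- Hypothetical per-field layout info: a field is either an ordinary
-- one-word pointer or an inlined ("expanded") child of n words.
data FieldLayout = Pointer | Inlined Int

fieldSize :: FieldLayout -> Int
fieldSize Pointer     = 1
fieldSize (Inlined n) = n

-- Offset (in words) of field i: scan the earlier fields and add up their
-- actual sizes - the runtime fallback described above.
fieldOffset :: [FieldLayout] -> Int -> Int
fieldOffset layout i = sum (map fieldSize (take i layout))

For example, fieldOffset [Inlined 3, Pointer, Inlined 2] 2 returns 4.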