
On Tuesday 19 June 2007 01:00, Simon Marlow wrote:
Isaac Dupree wrote:
I was thinking, since we have a copying garbage collector that reassigns all pointers anyway, could we detect the "identical" heap objects and for each set of identical ones, have all thunks that pointed to any of them, now point to the single one that is created in the target space?
I think what you're proposing is often called "hash consing", except that hash-consing is usually done at construction time, you want to do it at GC time.
My take is it would only be worthwhile if there was a *lot* of sharing to be gained by doing it, and in most cases there wouldn't be. This is just a guess based on my experience poking around in the heap though - feel free to try it out and prove me wrong :-)
Presumably it's possible to do a piece-meal app-level hash consing of values by defining an appropriate identity memoisation. Though maybe the optimiser is so cunning it figures out you're returning an equal value and skips it. It'd be nice if there were system wide hash consing memoisations that could be called when substantial sharing was anticipated. Maybe an easy approximation is keeping your String trie, or whatever, in a global IORef. Or what about hash consing after an equality test? Less opportunity but perhaps less cost. Daniel