
On Fri, 7 Dec 2001, Adrian Hey replied:
On Friday 07 December 2001 2:27 pm, D. Tweed wrote:
On Fri, 7 Dec 2001, Adrian Hey wrote: The first is.. Does the compiler keep a unique copy of expressions which consist of just a single zero arity constructor (eg. [],True,Nothing..) as a CAF which is referenced each time the constructor appears in an expression, or does it duplicate the constructor (expression) each time it's used.
Well I suppose if it's necessary to create a new indirection heap record for each reference, then there's not really any point in having a single copy of the value itself. But I don't see why that should be so. Even if it is indirected for some reason it should still be possible to share the indirection record I think.
Maybe CAF is the wrong word to use here since there's no application involved. What I mean is.. are zero arity constructors referenced the same way as a top level constant would be? (Not that I know how that's done, but I presume there's only 1 copy of top level constants in memory at any one time.)
My understanding is that GHC tries to have it both ways. Here's my understanding of how it works (implementors should probably chime in if my memory is faulty or out of date): 1) There is a single, canonical copy of every nullary constructor. This canonical copy is used wherever possible---just like a top-level constant. 2) Updating a thunk with an indirection makes it expensive to obtain the thunk's value. The extra indirection does not require allocation---the only reason we need indirections at all is to overwrite memory that previously held a thunk. The real problem is that it takes time to chase down indirections once they exist. Therefore when a thunk evaluates to a nullary constructor, it is overwritten directly. This effectively creates another copy of the nullary constructor. 3) When the GC runs, instead of copying these newly-created nullary constructors, it replaces them with the canonical copy. [The GC also eliminates indirections, and thus helps us no matter what we do in 2) above] Thus we get the best of both worlds---efficient thunk update and reasonable space consumption. -Jan-Willem Maessen