Re: Weak reference semantics - why does a dead weak ref keep its value alive?

23 May 2014

Hello Luite,

GHC's separation of weak references into keys and values is a
generalization which can be useful to avoid space leaks; the motivation
for the design is described in "Stretching the storage manager: weak
pointers and stable names in Haskell".

In particular, the variant of weak reference you suggest is the
/ephemeron/ semantics in Hayes.  Their reachability rule is:

    The value field of an ephemeron is reachable if both (a) the
    ephemeron (weak pointer object) is reachable, and (b) the key is
    reachable.

The paper goes into more detail on why our semantics might be
preferred, but it boils down to:

(1) Our semantics is simpler.
(2) Under the ephemeron semantics, it is not clear when to run
    finalizers (if you garbage collect the weak pointer objects early,
    you won't be able to run their finalizers!).
(3) The ephemeron semantics can be simulated.
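
To make the difference between the two reachability rules concrete,
here is a small toy model. It is only a sketch and has nothing to do with
how GHC's collector is actually implemented: the types and names (Heap,
WeakPtr, Rule, reachable, example) are made up for illustration. It
computes the reachable set as a fixed point, once under the rule from the
System.Mem.Weak documentation quoted in your message and once under the
ephemeron rule above. The example heap mirrors your program at the point
where only 'k' is still referenced and nothing points to 'w' any more.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import Data.Set (Set)

-- Hypothetical toy model; objects are identified by name.
type Obj = String

-- A weak pointer object with its key and value (finalizers omitted).
data WeakPtr = WeakPtr { weakName :: Obj, weakKey :: Obj, weakValue :: Obj }

data Heap = Heap
  { roots       :: [Obj]
  , strongEdges :: Map.Map Obj [Obj]  -- ordinary pointers only; weak
                                      -- pointer fields are NOT listed here
  , weakPtrs    :: [WeakPtr]
  }

data Rule = GhcRule | EphemeronRule

-- One round of propagation: add everything the current live set reaches.
step :: Rule -> Heap -> Set Obj -> Set Obj
step rule heap live = Set.unions [live, viaStrong, viaWeak]
  where
    viaStrong = Set.fromList
      [ t | o <- Set.toList live
          , t <- Map.findWithDefault [] o (strongEdges heap) ]
    viaWeak = Set.fromList (concatMap contribution (weakPtrs heap))
    contribution (WeakPtr w k v)
      | k `Set.notMember` live = []
      | otherwise = case rule of
          -- Documented rule: a live key keeps both the weak pointer
          -- object and its value alive.
          GhcRule       -> [w, v]
          -- Ephemeron rule: the value is live only if the weak pointer
          -- object is itself reachable and the key is reachable.
          EphemeronRule -> [v | w `Set.member` live]

-- Iterate from the root set until nothing new becomes reachable.
reachable :: Rule -> Heap -> Set Obj
reachable rule heap = go (Set.fromList (roots heap))
  where
    go live
      | next == live = live
      | otherwise    = go next
      where next = step rule heap live

-- Only "k" is a root; "w" is a weak pointer from key "k" to value "v",
-- and no reachable object points to "w".
example :: Heap
example = Heap
  { roots       = ["k"]
  , strongEdges = Map.empty
  , weakPtrs    = [WeakPtr "w" "k" "v"]
  }

main :: IO ()
main = do
  print (Set.toList (reachable GhcRule example))        -- ["k","v","w"]
  print (Set.toList (reachable EphemeronRule example))  -- ["k"]
```

Under the documented rule, 'w' and 'v' stay alive for as long as 'k' does;
under the ephemeron rule both can be dropped as soon as nothing refers to
'w'.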

Cheers,
Edward

Excerpts from Luite Stegeman's message of 2014-05-23 06:40:07 -0700:
...
Hi all,

I'm reviewing and improving my weak references implementation for GHCJS,
among other things to make sure that the profiling/stack trace support,
currently being implemented by Ömer Sinan Ağacan as a GSoC project, has the
correct heap information to work with.

JavaScript does not have weak references with 'observable deadness', so we
have to walk the heap data structures from time to time to test
reachability, schedule finalizers and break the weak links. I'm trying to
find a design that keeps GHC's semantics, but where JavaScript's own
garbage collector can get rid of as much data as possible, even before we
run our own heap scan.

Now I ran into the following peculiarity, from the semantics
(System.Mem.Weak documentation):

    A heap object is reachable if:

    - It is a member of the root set.
    - It is directly pointed to by a reachable object, other than a weak
      pointer object.
    - It is a weak pointer object whose key is reachable.
    - It is the value or finalizer of a weak pointer object whose key is
      reachable.

This says that even if nothing has a reference to a weak pointer, as long
as there are references to its key, the value is considered to be
reachable. For example, in this program

    import Data.Maybe
    import System.Mem.Weak
    import System.Mem
    import Control.Exception
    import System.IO
    import Control.Concurrent

    gc = performGC >> threadDelay 1000000

    main = do
      hSetBuffering stdout NoBuffering
      k <- evaluate "k"
      v <- evaluate "v"
      addFinalizer v (putStrLn "fv")
      w <- evaluate =<< mkWeak k v (Just $ putStrLn "fkv")
      putStrLn =<< fmap (fromMaybe ".") (deRefWeak w)
      addFinalizer w (putStrLn "fw")
      gc
      putStrLn k
      gc

there is no way to reach 'v' after 'w' has been finalized, but it is still
kept alive. This agrees with the semantics in the documentation, but what's
the reason for this?

luite

Edward Z. Yang