
Hello Luite,

GHC's separation of weak references into keys and values is a generalization which can be useful to avoid space leaks; the motivation for the design is described in "Stretching the storage manager: weak pointers and stable names in Haskell".

In particular, the variant of weak reference you suggest is the /ephemeron/ semantics in Hayes. Their reachability rule is:

    The value field of an ephemeron is reachable if both (a) the ephemeron (weak pointer object) is reachable, and (b) the key is reachable.

The paper goes into more detail about why our semantics might be preferred, but it boils down to:

(1) Our semantics is simpler,
(2) In this semantics, it is not clear when to run finalizers (if you garbage collect the weak pointer objects early, you won't be able to run their finalizers!)
(3) These semantics can be simulated

Cheers,
Edward

Excerpts from Luite Stegeman's message of 2014-05-23 06:40:07 -0700:
Hi all,
I'm reviewing and improving my weak references implementation for GHCJS, among other things to make sure that the profiling/stack trace support, currently being implemented by Ömer Sinan Ağacan as a GSoC project, has the correct heap information to work with.
JavaScript does not have weak references with 'observable deadness', so we have to walk the heap data structures from time to time to test reachability, schedule finalizers and break the weak links. I'm trying to find a design that keeps GHC's semantics, but where JavaScript's own garbage collector can get rid of as much data as possible, even before we run our own heap scan.
Now I ran into the following peculiarity, from the semantics (System.Mem.Weak documentation):
A heap object is reachable if:
  - It is a member of the root set.
  - It is directly pointed to by a reachable object, other than a weak pointer object.
  - It is a weak pointer object whose key is reachable.
  - It is the value or finalizer of a weak pointer object whose key is reachable.
This says that even if nothing has a reference to a weak pointer, as long as there are references to its key, the value is considered to be reachable. For example, in this program:
import Data.Maybe
import System.Mem.Weak
import System.Mem
import Control.Exception
import System.IO
import Control.Concurrent

gc = performGC >> threadDelay 1000000

main = do
  hSetBuffering stdout NoBuffering
  k <- evaluate "k"
  v <- evaluate "v"
  addFinalizer v (putStrLn "fv")
  w <- evaluate =<< mkWeak k v (Just $ putStrLn "fkv")
  putStrLn =<< fmap (fromMaybe ".") (deRefWeak w)
  addFinalizer w (putStrLn "fw")
  gc
  putStrLn k
  gc
there is no way to reach `v` after `w` has been finalized, but it is still kept alive. This agrees with the semantics in the documentation, but what's the reason for this?
luite