Questions about pointers in GHC

Hi all,

I have a few questions about heap-allocated memory pointers in GHC. Both code and data live in the heap, and both can have pointers to other objects. However, according to my understanding, most heap-allocated objects are immutable. Moreover, even if objects may be mutated, most pointers should not be updated. I have heard that the runtime system implements several features by updating pointers, but I am not sure which features they are.

My question is thus: (1) is there a way to trace mutations of heap objects (for instance by modifying the GC system code) [I don't need to know what the change is, all I want to know is what has been changed], and (2) is there a way I could trace pointer updates [in this case I'd like to know both the old and new objects, preferably by name]? More specifically, are these possible by annotating/changing the runtime system only (i.e. not touching code generation)?

Thanks,
Ray
Department of Computer Science
Tufts University

Xuanrui Qi writes:
> Hi all,
Hi,
> I have a few questions about heap-allocated memory pointers in GHC. Both code and data live in the heap, and both can have pointers to other objects. However, according to my understanding, most heap-allocated objects would be immutable. Moreover even if objects might be mutated, most pointers should not be updated. I have heard that the runtime system might implement several features by updating pointers, but I am not sure which features they are.
Code generally doesn't live on the dynamically-allocated heap. Rather, it is mmap'd immutably into the process's address space. However, closures, which can live in the heap, generally contain pointers to code and can indeed be mutated.
> My question is thus: (1) is there a way to trace mutations of heap objects (for instance by modifying the GC system code) [I don't need to know what the change is, all I want to know is what has been changed],
Indeed it is possible. The RTS already tracks most mutations via a write barrier to ensure safety of its generational GC. See the recordMutable macro defined in Cmm.h.
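
To make the write-barrier idea concrete, here is a minimal toy model in C. It is not the RTS's code: `Obj`, `write_field`, and `mut_list` are hypothetical names invented for this sketch, and the real hook is the recordMutable macro mentioned above. The sketch assumes a two-generation heap and records which old-generation object was written to, not what the change was, which matches question (1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy heap object (hypothetical layout, not GHC's closure layout). */
typedef struct Obj {
    int gen;            /* 0 = young generation, 1 = old generation */
    bool on_mut_list;   /* already recorded since the last collection? */
    struct Obj *field;  /* the single mutable pointer field */
} Obj;

#define MUT_LIST_CAP 64
static Obj *mut_list[MUT_LIST_CAP];  /* old objects a minor GC must rescan */
static size_t mut_list_len = 0;

/* Write barrier: every pointer store in this toy model goes through here.
 * An old-generation object is recorded the first time it is mutated, so the
 * list tells us WHICH objects changed, not how they changed. */
static void write_field(Obj *obj, Obj *new_target)
{
    obj->field = new_target;
    if (obj->gen > 0 && !obj->on_mut_list) {
        assert(mut_list_len < MUT_LIST_CAP);
        obj->on_mut_list = true;
        mut_list[mut_list_len++] = obj;
    }
}
```

A tracer built on this idea would only need to dump `mut_list` before each collection clears it. Writes to young objects are deliberately not recorded in this sketch, which is one reason a barrier of this kind tracks "most" rather than all mutations.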
> and (2) is there a way I could trace pointer updates [in this case I'd like to know both the old and new objects, preferably by name]? More specifically, are these possible by annotating/changing the runtime system only (i.e. not touching code generation)?
This will be a fair bit harder, requiring modification of the RTS's mutation operations (see PrimOps.cmm and Updates.cmm) at the very least.
Cheers,
- Ben
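
As a rough illustration of what instrumenting a mutation operation could look like, the following C sketch logs the old and new pointer for each store. Everything here is an assumption made for illustration: `UpdateEvent`, `traced_store`, and `trace_log` are hypothetical names, and the real operations live in Cmm (PrimOps.cmm and Updates.cmm), not in a C helper like this one.

```c
#include <assert.h>
#include <stddef.h>

/* One logged pointer update: which object, what it pointed to, what it
 * points to now. All names in this block are hypothetical. */
typedef struct {
    const void *obj;
    const void *old_val;
    const void *new_val;
} UpdateEvent;

#define TRACE_CAP 128
static UpdateEvent trace_log[TRACE_CAP];
static size_t trace_len = 0;

/* Store new_val into *slot, logging the old and new targets first.
 * Events beyond TRACE_CAP are dropped rather than overflowing the log. */
static void traced_store(void **slot, void *new_val, const void *owner)
{
    assert(slot != NULL);
    if (trace_len < TRACE_CAP) {
        trace_log[trace_len].obj = owner;
        trace_log[trace_len].old_val = *slot;
        trace_log[trace_len].new_val = new_val;
        trace_len++;
    }
    *slot = new_val;
}
```

This only yields raw addresses. Reporting objects "by name" would additionally require mapping each address back to some description of the closure, which this sketch does not attempt.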

I'll add that another place where mutation occurs is when demanding an unevaluated thunk (this acts like a read barrier). This once-written property of thunk objects is handled specially by the garbage collector [1].

[1] https://www.microsoft.com/en-us/research/wp-content/uploads/2008/06/par-gc-i... (I believe this is the current version in GHC's master branch)
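
The once-written behaviour of thunks can be sketched with a toy model in C. This is a simplification under stated assumptions, not GHC's representation: `Closure`, `force`, and `expensive` are hypothetical names, and the result is cached inline here, whereas a real update installs an indirection to a separate value object.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { THUNK, IND } Tag;

/* Toy closure: either a suspended computation or an already-updated result. */
typedef struct Closure {
    Tag tag;
    int (*code)(void);  /* THUNK: the suspended computation */
    int value;          /* IND: the cached result */
} Closure;

static int eval_count = 0;

/* Example computation that counts how often it actually runs. */
static int expensive(void)
{
    eval_count++;
    return 6 * 7;
}

/* Demanding a closure: a thunk is evaluated and then overwritten in place.
 * That overwrite is the only mutation the object ever sees. */
static int force(Closure *c)
{
    if (c->tag == THUNK) {
        assert(c->code != NULL);
        int v = c->code();
        c->value = v;
        c->code = NULL;
        c->tag = IND;
    }
    return c->value;
}
```

Forcing the same closure twice runs `expensive` only once; the second call takes the `IND` path. A mutation tracer would see exactly one update per thunk, at the point where `tag` flips.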
On Mar 10, 2018, at 12:42 PM, Ben Gamari wrote:
> [...]

_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
participants (3): Ben Gamari, Kavon Farvardin, Xuanrui Qi