
Ben Lippmeier
On 31 Aug 2020, at 5:54 pm, Moritz Angermann
wrote: If anyone has some creative ideas, I'd love to hear them. I've been wondering if just logging allocations (offset, range, type) would help figure out what we expected to be there; and then maybe try to break on the allocation (and subsequent writes).
I'm sure some have been down this road before.
Force a GC before every allocation, and make the GC check the validity of the objects before it moves anything. I think this used to be possible by compiling the runtime system in debug mode.
The usual pain of heap corruption is that once the heap is corrupted it may be several GC cycles before you get the actual crash, and in the meantime the objects have all been moved around. The GC walks over all the objects by nature, so get it to validate the heap every time it does, then force it to run as often as you possibly can.
Indeed. Small nurseries (using +RTS -A), deterministic GC behavior (with +RTS -V0 -I0), and sanity checking (with +RTS -DS) are all very useful for this.
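To make the suggestions above concrete, here is a minimal sketch of forcing the collector to run as often as possible from user code. The helper name foldWithGC, the toy workload, and the -A64k nursery size are illustrative assumptions, not something from this thread; performMajorGC comes from System.Mem in base.

```haskell
module Main (main) where

import Control.Exception (evaluate)
import Control.Monad (foldM)
import System.Mem (performMajorGC)

-- | Hypothetical helper: run a left fold, forcing a major GC after every
-- step. With the debug RTS and +RTS -DS, each collection also runs the
-- heap sanity checks, so corruption should surface close to where it
-- was introduced rather than several GC cycles later.
foldWithGC :: (acc -> a -> acc) -> acc -> [a] -> IO acc
foldWithGC step = foldM go
  where
    go acc x = do
      acc' <- evaluate (step acc x)  -- force this step to WHNF first
      performMajorGC                 -- the GC walks the whole heap here
      pure acc'

main :: IO ()
main = do
  -- Stand-in workload; replace with the code suspected of corrupting
  -- the heap (for example, a call into an FFI binding).
  total <- foldWithGC (+) (0 :: Int) [1 .. 1000]
  print total
```

Assuming the file is Main.hs, something like "ghc -debug -rtsopts Main.hs" followed by "./Main +RTS -DS -A64k -V0 -I0" should link against the debug runtime and enable the checks. This is much slower than a normal run, which is the price of catching the corruption early.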
A user space approach is to use a library like vacuum or packman that also walks over the heap objects directly.
http://hackage.haskell.org/package/vacuum-2.2.0.0/docs/GHC-Vacuum.html
https://hackage.haskell.org/package/packman
For what it's worth, the ghc-debug [1] project which Sven Tennie, Matt Pickering, and I have been working on over the last year or so was in part motivated by precisely this use-case. It would allow the heap of one Haskell process to be traversed by another process. This is useful for both debugging and profiling use-cases.

Cheers,

- Ben

[1] https://github.com/bgamari/ghc-debug