How to work out why a data constructor is allocated using gdb?

Hi,

I am trying to work out questions such as

* Why are there thousands of Module data constructors allocated when building something with GHC?
* What is allocating a lot of strings when building GHC?

In order to do this I can use gdb to find some of the Module/String closures, but then I'm a bit stuck about what to do next.

gdb displays a list of all the Module closures, for example, and then you can usually try to find the retainer for a Module by using `findPtr`.

If the retainer is a THUNK closure, it is easy, as THUNK closures have DWARF information which maps straight to a particular line. However, if the retainer is just some other data constructor (for example, the Module is stored in a Map), it's data constructors all the way up and none of them have DWARF info. I need to fall back to domain-specific knowledge to work out where such a sequence of constructors might appear in my program.

* Is there anything better I can do to map a constructor allocation to a more precise source location?

The string closures were causing me particular issues, as `findPtr` was not showing any retainers, so it's hard to work out why they are not GCd.

* In what situations can an object be retained but show no retainer when using `findPtr`?

Cheers,
Matt

It occurred to me today that in order to work out where a constructor
is allocated from you can look in the profiling header.
The profiling header contains information about the cost centre stack
at the precise point of allocation. You should be able
to look at this to see what the call stack was and the source position
of the cost centre at the bottom of the stack.
Cheers,
Matt
On Tue, Aug 13, 2019 at 8:29 AM Matthew Pickering wrote: [...]

In the unlikely event that anyone cares about this or attempts it: if you
combine `-prof` and `-debug`, invoking any heap profiling mode will take
about 10x longer than usual.
So you can still use `-prof` to populate the profiling headers, but don't
also try to perform a heap census.
Cheers,
Matt
On Fri, Oct 4, 2019 at 4:57 PM Matthew Pickering wrote: [...]

When I'm interested in only one specific object, I add a watchpoint to the
object's header and then do reverse execution. If it stops in mutator code, I
inspect the Haskell stack to figure out which Haskell code is currently
executing (this is currently quite hard; !1654 helps a lot, but it can't be done
on the whole compiler and libraries: the compiler should be booted without it,
and then programs can be built with !1654).
If it stops in GC code, I add a conditional breakpoint after the line
`q = UNTAG_CLOSURE(q);` in evacuate() that stops if `q` is the object I'm
looking for. That gives me the previous location of the object, and I repeat
the process (add a watchpoint to the header, reverse-continue ...).
It's a painful process that can be automated to some extent. For example, I once
added a print statement to copy_tag and wrote a script that maps all locations
of an object during its lifetime using the print statement.
Ömer
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
participants (2)
- Matthew Pickering
- Ömer Sinan Ağacan