What does the DWARF information generated by your GHC branch look like?

Hi!

(I've CCed ghc-devs on this email, as I think the question is of general interest.)

I enjoyed reading your paper [1] and I have some questions.

* What does the generated DWARF information look like? For example, will you fill in the .debug_line section so that standard tools like "perf report" and gprof can be used on Haskell code? Code pointers would be appreciated.

* Does your GHC allow DWARF information to be generated without actually using any of the RTS (e.g. eventlog) machinery? This would be very useful if you want to use "perf record/report" only, in order to achieve minimal overhead when that matters. Another way to ask the same question: do you have a ghc -g flag that has no implication for the runtime settings?

* Do you generate DW_TAG_subprogram sections in the .debug_info section so that other tools can figure out the name of Haskell functions?

Cheers,
Johan

1. "Causality of Optimized Haskell: What is burning our cycles?" http://eprints.whiterose.ac.uk/76448/

[copy of the dropped reply, for anybody interested]

Johan Tibell wrote:
> I enjoyed reading your paper [1] and I have some questions.
Thanks! The DWARF patches are currently under review for Trac #3693. Any feedback would be very appreciated: https://github.com/scpmw/ghc/commits/profiling-import
> * What does the generated DWARF information look like?
So far we generate:
- .debug_info: Information about all generated procedures and blocks.
- .debug_line: Source-code links for all generated code.
- .debug_frame: Unwind information for the GHC stack.
- .debug_ghc: Everything we can't properly represent as DWARF.
> will you fill in the .debug_line section so that standard tools like "perf report" and gprof can be used on Haskell code?
Yes, even though from a few quick tests the results of "perf report" aren't too useful, as source code links are pretty coarse and jump around a lot - especially for optimised Haskell code. There's the option to instead annotate with source code links to a generated ".dump-simpl" file, which might turn out to be more useful.
> Code pointers would be appreciated.
Is this about how .debug_line information is generated? We take the same approach as LLVM (and GCC, I think) and simply annotate the assembly with suitable .file & .loc directives. That way we can leave all the heavy lifting to the assembler. Current patch is here: https://github.com/scpmw/ghc/commit/c5294576
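To illustrate the .file/.loc approach described above, here is a minimal Python sketch. It is not the code from the patch: the function name annotate_with_loc and the (source file, line, column, instructions) tuple layout are made up for this example. It only shows the general idea of interleaving GNU-assembler-style .file and .loc directives with the emitted instructions, so that the assembler itself builds the .debug_line table.

```python
def annotate_with_loc(blocks):
    """Interleave .file/.loc directives with assembly instructions.

    `blocks` is a list of (source_file, line, column, instructions)
    tuples. This layout is a hypothetical stand-in for whatever the
    code generator really tracks per block.
    """
    file_numbers = {}  # source file name -> file index used by .loc
    out = []
    for source_file, line, column, instructions in blocks:
        if source_file not in file_numbers:
            # Register each source file once; indices start at 1.
            file_numbers[source_file] = len(file_numbers) + 1
            out.append(f'\t.file {file_numbers[source_file]} "{source_file}"')
        # Everything emitted after this directive is attributed to
        # the given file/line/column until the next .loc.
        out.append(f"\t.loc {file_numbers[source_file]} {line} {column}")
        out.extend(f"\t{instr}" for instr in instructions)
    return out


if __name__ == "__main__":
    example = [
        ("Main.hs", 3, 1, ["movq %rbx, %rax"]),
        ("Main.hs", 4, 5, ["jmp *(%rbp)"]),
        ("Lib.hs", 10, 1, ["ret"]),
    ]
    print("\n".join(annotate_with_loc(example)))
```

The sketch does no line-table encoding of its own, which is the point of the approach: the directives are plain text in the assembly output and the assembler does the rest.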
> * Does your GHC allow DWARF information to be generated without actually using any of the RTS (e.g. eventlog) machinery?
The RTS just serves as a DWARF interpreter for its own executable (+ libraries) in this, so yes, it's fully independent. On the other hand, having special code allows us to avoid a few subtleties about Haskell code that are hard to communicate to standard debugging tools (especially concerning stack tracing).
> Another way to ask the same question, do you have a ghc -g flag that has no implication for the runtime settings?
Right now -g does not affect the RTS at all. We might want to change that at some point though so we can get rid of the libdwarf dependency.
> * Do you generate DW_TAG_subprogram sections in the .debug_info section so that other tools can figure out the name of Haskell functions?
Yes, we are setting the "name" attribute to a suitable Haskell name. Sadly, at least GDB seems to ignore it and falls back to the symbol name. I investigated this some time ago, and I think the reason was that it doesn't recognize the Haskell language ID (which isn't standardized, obviously). Simply pretending to be C(++) might fix this, but I would be a bit scared of other side-effects.

Greetings,
Peter Wortmann