
#3693: Show stack traces
-------------------------------------+------------------------------------
        Reporter:  jpet              |            Owner:
            Type:  feature request   |           Status:  new
        Priority:  normal            |        Milestone:  7.6.2
       Component:  Runtime System    |          Version:  6.10.4
      Resolution:                    |         Keywords:
Operating System:  Unknown/Multiple  |     Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown      |       Difficulty:  Unknown
       Test Case:                    |       Blocked By:
        Blocking:                    |  Related Tickets:
-------------------------------------+------------------------------------
Changes (by scpmw):

 * difficulty:   => Unknown


Comment:

 Some progress on this - our
 [https://github.com/scpmw/ghc/commits/profiling-ncg DWARF / profiling branch]
 now generates DWARF using NCG, complete with .debug_frame unwind records
 that are at least valid at the start of blocks. That's enough for GDB to
 backtrace:

 {{{
 $ gdb stack-trace -q
 Reading symbols from stack-trace...done.
 (gdb) break stg_raisezh
 Breakpoint 1 at 0x68f2f8: file rts/Exception.cmm, line 433.
 (gdb) run
 Starting program: stack-trace

 Breakpoint 1, stg_raisezh () at rts/Exception.cmm:433
 433     {
 (gdb) bt
 #0  stg_raisezh () at rts/Exception.cmm:433
 #1  0x0000000000694330 in ?? () at rts/Updates.cmm:57
 #2  0x00000000004047a0 in Main_zdwfibzuerr_info () at stack-trace.hs:7
 #3  0x0000000000404770 in Main_zdwfibzuerr_info () at stack-trace.hs:5
 }}}

 Note however that we sometimes end up with "??", because GDB looks at IP-1
 for information about the supposed "call" instruction. With tables next to
 code, IP-1 is guaranteed to be inside the info table, so right now I just
 declare the info table as belonging to a source file so that at least the
 "at bla" bit is correct.

 Furthermore, using Max's patch and some more or less involved lookups we
 can do the backtracing from the RTS as well, which then looks roughly as
 follows ([https://github.com/scpmw/ghc/commits/stack-tracing code here]):

 {{{
 stack-trace: Barf!
 Loading debug data...
 Stack trace:
    0: stg_bh_upd_frame_ret (at rts/Updates.cmm:86:1-91:2)
    1: fib_err (at stack-trace.hs:7:37-7:42)
    2: fib_err (at stack-trace.hs:7:37-7:42)
    3: fib_err (at stack-trace.hs:7:37-7:42)
    4: fib_err (at stack-trace.hs:7:37-7:42)
 }}}

 Here the RTS reads in DWARF from our own executable as well as from
 libraries using libdwarf, then looks up the code pointers to get labels.
 These labels are then looked up in the debug info table we normally use
 for profiling, which yields (amongst other things) references to the
 original source code.

 Caveats:

  * The quality of the result obviously depends a lot on what we can
    actually find on the stack. Ironically, higher optimization levels
    sometimes produce better results, as there's a higher chance for
    functions to have specialized return code.

  * Only works on Linux at this point - the main sticking point would be
    undoing relocation and finding the right files to read DWARF from
    (right now I read /proc/self/maps for that).

  * Depends on a significant amount of code from my profiling-ncg branch,
    which has a few changes that might be controversial - such as adding
    CmmTick and CmmContext to CmmNode instead of tracking source code ticks
    per-block or per-procedure. The reasoning is basically that this
    solution seems to have the least overhead on Cmm optimizations.

 Something I am wondering right now is what kind of API would be most
 appropriate for this kind of thing. This comes down pretty much to how
 much extra functionality we want - we would probably want to be able to
 request stack dumps manually, but would we want to be able to make the
 stack contents transparent to the Haskell program? If yes, we could
 convert stack dumps into Haskell lists, possibly offering whatever extra
 bits of information while we're at it (e.g. Core references). On the other
 hand that would be really, really impure - and quite some work to boot.

 Thoughts? Any particular features people would like to see?

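
To make the /proc/self/maps caveat above a bit more concrete, here is a rough C sketch of the kind of lookup involved: parse a maps line, keep only executable file-backed mappings, and turn a code pointer into an offset within the backing file. This is not the code from the stack-tracing branch. The names `CodeMapping`, `parse_maps_line`, `addr_to_file_offset` and `find_code_mapping` are made up for illustration. It also stops at a file offset: getting from there to the address that DWARF uses would additionally need the ELF segment's virtual address, which is not shown.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one mapping parsed from /proc/<pid>/maps. */
typedef struct {
    unsigned long long start;   /* first mapped address */
    unsigned long long end;     /* one past the last mapped address */
    unsigned long long offset;  /* offset of the mapping within the file */
    char path[4096];            /* backing file, empty if anonymous */
} CodeMapping;

/* Parse one maps line of the form
 *   start-end perms offset dev inode   pathname
 * Returns 1 for an executable, file-backed mapping, 0 otherwise.
 * Simplification: a path containing spaces is cut at the first space. */
int parse_maps_line(const char *line, CodeMapping *out)
{
    char perms[5] = "";
    int n;

    out->path[0] = '\0';
    n = sscanf(line, "%llx-%llx %4s %llx %*s %*s %4095s",
               &out->start, &out->end, perms, &out->offset, out->path);
    if (n < 4) return 0;                       /* malformed line */
    if (strchr(perms, 'x') == NULL) return 0;  /* not executable */
    if (out->path[0] != '/') return 0;         /* anonymous, [stack], [vdso] */
    return 1;
}

/* Translate a runtime code address into an offset within the mapped
 * file. Returns 0 if the address is outside the mapping. */
int addr_to_file_offset(const CodeMapping *m, unsigned long long addr,
                        unsigned long long *file_off)
{
    if (addr < m->start || addr >= m->end) return 0;
    *file_off = addr - m->start + m->offset;
    return 1;
}

/* Scan /proc/self/maps (Linux only) for the executable mapping that
 * contains addr. Returns 1 and fills *out on success, 0 otherwise. */
int find_code_mapping(unsigned long long addr, CodeMapping *out)
{
    char line[8192];
    int found = 0;
    FILE *f = fopen("/proc/self/maps", "r");

    if (f == NULL) return 0;
    while (!found && fgets(line, sizeof line, f) != NULL) {
        if (parse_maps_line(line, out) &&
            addr >= out->start && addr < out->end)
            found = 1;
    }
    fclose(f);
    return found;
}
```

A caller would pass a return address found on the stack to `find_code_mapping`, use `path` to decide which file to read debug data from, and use `addr_to_file_offset` as the first step of undoing relocation.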
--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/3693#comment:44>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler