[GHC] #14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
-------------------------------------+-------------------------------------
Reporter: nh2 | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 8.0.2
Keywords: gdb, debugging | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure: Runtime performance bug
Test Case: | Blocked By:
Blocking: | Related Tickets: #9706
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
GHC 8.0.2 on Linux changed the memory allocator to always allocate 1TB virtual memory on startup (#9706).
I now have a production Haskell program running in a loop and would like to debug where it is stuck, on another machine, thus attaching with `gdb -p` and running `generate-core-file`.
But core dumping takes forever, I Ctrl-C'd it when it reached 140 GB in size (my machine only has 64 GB RAM btw.); after the Ctrl-C the size of the core file on the file system was reported as `1.1T` (probably it's a sparse file now).
Is there a workaround for this?
For example, if I could dump only the resident or actually allocated pages, that would probably help.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Comment (by nh2):
I found some info on selective page dumping on https://stackoverflow.com/questions/11734583/why-core-file-is-more-than-virtual-memory but I'm not sure what the right dumping approach is for programs running under the GHC 8.0 RTS.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:1

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Changes (by nh2):
* cc: simonmar, gcampax, ezyang (added)
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:2

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Comment (by nicolast):
I guess setting the `MADV_DONTDUMP` flag on the region using `madvise(2)`, then resetting the flag when chunks of said memory are being used, could work. Not sure about the performance impact of setting and resetting that flag over and over...
May make sense to do it over chunks of, say, 32MB at a time, which would still result in 'large' (though likely very compressible) coredumps for small programs, yet manageable.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:3
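A minimal sketch of the `MADV_DONTDUMP` idea from the comment above, assuming Linux and glibc. It is not the GHC RTS code: `reserve_region` and `commit_chunk` are hypothetical helper names introduced only for illustration. The reservation is excluded from core dumps once, and each chunk is marked dumpable again at the moment it is put into use.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: reserve `size` bytes of address space without
 * committing memory, and ask the kernel to leave the range out of
 * core dumps. Returns NULL if the reservation fails. */
static void *reserve_region(size_t size)
{
    void *base = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
#if defined(MADV_DONTDUMP)
    /* Best effort: a failure here only means larger core files. */
    (void) madvise(base, size, MADV_DONTDUMP);
#endif
    return base;
}

/* Hypothetical helper: make one chunk of the reservation usable and
 * include it in core dumps again. Returns 0 on success, -1 on error. */
static int commit_chunk(void *addr, size_t len)
{
    /* mprotect(2) and madvise(2) both require a page-aligned address. */
    assert((uintptr_t) addr % (uintptr_t) sysconf(_SC_PAGESIZE) == 0);
    if (mprotect(addr, len, PROT_READ | PROT_WRITE) != 0)
        return -1;
#if defined(MADV_DODUMP)
    if (madvise(addr, len, MADV_DODUMP) != 0)
        return -1;
#endif
    return 0;
}
```

With the 32MB chunk size suggested above, a caller would invoke `reserve_region` once at startup and `commit_chunk` each time the heap grows into a new chunk, so the extra `madvise` calls occur per chunk rather than per allocation.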

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Comment (by simonmar):
I'm surprised this is an issue; I'm sure I've core-dumped processes with the 1TB address space without any problems. The core files look huge, but they're sparse. I wonder what's being dumped.
We could easily add a flag to change the size of the region, but adding a flag to disable the region completely would add a performance overhead because we'd have to check the flag repeatedly in the inner loop of the GC, so I'd really like to avoid that if possible.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:4

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Comment (by nh2):
@simonmar Could you try to reproduce? Try this:
{{{
import Control.Concurrent
main = threadDelay 1000000000
}}}
Compile with `ghc --make thefile.hs` (8.0.2), get the pid with `ps`, attach with `sudo gdb -p thepid`, and run `generate-core-file`.
For me (Ubuntu 16.04) that runs forever, writing GB after GB to `core.*` in gdb's working directory.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:5

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Comment (by bgamari):
Indeed I can reproduce this.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:6

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Changes (by nicolast):
* cc: nicolast (added)
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:7

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Comment (by simonmar):
Ah, so this is something to do with gdb's generate-core-file. Ordinary core dumps work just fine, e.g. if I send SIGQUIT to the process by hitting `^\`:
{{{
ghc --version
The Glorious Glasgow Haskell Compilation System, version 8.0.2
ghc foo.hs
./foo
^\Quit (core dumped)
ls -l core
-rw------- 1 smarlow smarlow 1589248 Sep 8 07:26 core
gdb foo core
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from foo...done.
[New LWP 21001]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./foo'.
Program terminated with signal SIGQUIT, Quit.
#0 0x00007fcd7124c573 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:84
84 ../sysdeps/unix/syscall-template.S: No such file or directory.
}}}
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:8

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Comment (by bgamari):
Ahh, so it does. Well this is unfortunate.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:9

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Comment (by nh2):
Is there another way to take a core dump of a running program without terminating it that could be used in this situation?
Also, is it known why the gdb approach doesn't work?
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:10

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Comment (by nicolast):
May also depend on GDB version: https://sourceware.org/bugzilla/show_bug.cgi?id=16092
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:11

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Comment (by nh2):
> May also depend on GDB version: https://sourceware.org/bugzilla/show_bug.cgi?id=16092
My gdb (>= 7.11.1) certainly has the feature; I found it can be checked with `show use-coredump-filter`. There must be more smarts or difference in behaviour that Linux has but GDB doesn't.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:12

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Comment (by bgamari):
It sounds like maybe we should be setting `MADV_DONTDUMP` where available (and later revert it with `MADV_DODUMP`).
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:13

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell
programs
Comment (by bgamari):
So Phab:D3929 doesn't currently address the issue. I've found that
`strace` produces some suspicious looking output when run on a program
compiled with that patch,
{{{
$ cat >hi.hs <

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell
programs
Comment (by bgamari):
Ahh, I see, `advise` isn't a bitmap; I've confirmed that this,
{{{#!c
#include

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Comment (by bgamari):
With Phab:D3929 appropriately updated, the test from comment:5 produces a 7MByte core dump from `gdb`.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:16
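The remark earlier in the thread that the `madvise(2)` advice argument "isn't a bitmap" can be illustrated with a short sketch. The code that accompanied that comment is cut off in this archive, so the following is not the patch itself: it assumes Linux and glibc, and `advise_each` is a hypothetical helper name. It only shows the general rule that each advice value needs its own `madvise` call.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: apply `n` advice values to one address range.
 * The advice argument of madvise(2) is a single enumerated value, not a
 * set of flags. OR-ing two values together just produces another number,
 * which the kernel either rejects or treats as some unrelated advice,
 * so each value is passed in a separate call.
 * Returns 0 if every call succeeds, -1 on the first failure. */
static int advise_each(void *addr, size_t len, const int *advice, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (madvise(addr, len, advice[i]) != 0)
            return -1;
    }
    return 0;
}
```

A caller that wants two hints on the same range would pass both values in the array rather than combining them with `|`.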

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell
programs
Comment (by Ben Gamari

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Changes (by bgamari):
* milestone: => 8.4.1
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:18

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Changes (by bgamari):
* status: new => closed
* resolution: => fixed
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:19

#14192: Change to 1TB VIRT allocation makes it impossible to core-dump Haskell programs
Comment (by nh2):
I can confirm that with this patch, `generate-core-file` works fine.
For a simple 2-line application, it generates me a ~8 MB core file, and I can re-load that core file into gdb and look at the backtrace successfully.
Nice work!
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14192#comment:20