[GHC] #12038: Shutdown interacts badly with requestSync()

#12038: Shutdown interacts badly with requestSync()
-------------------------------------+-------------------------------------
Reporter: simonmar | Owner:
Type: bug | Status: new
Priority: normal | Milestone: 8.2.1
Component: Runtime System | Version: 7.10.3
Keywords: | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure: None/Unknown
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12038
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

Description changed by simonmar:

New description:

 I've been investigating #10860, and the problem goes pretty deep, so I'm
 going to record what I know here and come back to fix it properly later.

 We have this mechanism `requestSync()` for operations that need to seize
 control of the whole runtime to do something. It is used by

  * `scheduleDoGC()`
  * `setNumCapabilities()`
  * `forkProcess()`

 `requestSync()` ensures that only one of these operations wins, the
 others will `yieldCapability()` to the winner, before continuing with
 their own sync.

 The problem is that this interacts badly with shutdown. Shutdown might
 start at any time (initiated by `exitScheduler()`). If it starts during a
 sync, then a deadlock is likely: some capabilities will be already shut
 down, and cannot be acquired by `acquireAllCapabilities()`. This happens
 in #10860.

 Really, shutdown should play the `requestSync()` game too, but that
 requires a lot of thought.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12038#comment:1
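
The "only one of these operations wins" rule in the description can be illustrated with a small model. This is only a sketch under stated assumptions, not the RTS's actual code: the names `sync_kind`, `pending_sync`, `current_sync`, `try_request_sync` and `finish_sync` are hypothetical, and the model uses a C11 compare-and-swap on a single shared slot to pick the winner. A loser learns which sync is in progress so that it could yield to it and retry afterwards.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three operations named in the ticket. */
typedef enum { SYNC_GC, SYNC_SET_CAPS, SYNC_FORK } sync_kind;

typedef struct { sync_kind kind; } pending_sync;

/* Shared slot: NULL means no sync is pending; otherwise it points at the
 * request that currently owns the right to seize the whole runtime. */
static _Atomic(pending_sync *) current_sync = NULL;

/* Try to install `mine` as the pending sync.
 * Returns true if we won. On failure, *winner is set to the request that
 * is already in progress, so the caller can yield to it and retry. */
static bool try_request_sync(pending_sync *mine, pending_sync **winner)
{
    pending_sync *expected = NULL;
    if (atomic_compare_exchange_strong(&current_sync, &expected, mine)) {
        return true;
    }
    *winner = expected;   /* the CAS wrote the current owner into expected */
    return false;
}

/* Release the slot once the winner has finished its sync. */
static void finish_sync(pending_sync *mine)
{
    pending_sync *expected = mine;
    bool released =
        atomic_compare_exchange_strong(&current_sync, &expected, NULL);
    assert(released);     /* only the current owner may release the slot */
    (void)released;
}
```

In this model, a shutdown path that starts tearing capabilities down without first calling something like `try_request_sync` is invisible to the arbitration, which is the gap the description points at.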

#12038: Shutdown interacts badly with requestSync()
-------------------------------------+-------------------------------------
Reporter: simonmar | Owner:
Type: bug | Status: new
Priority: normal | Milestone: 8.2.1
Component: Runtime System | Version: 7.10.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking: 10860
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Simon Marlow

Comment (by Ben Gamari

Comment (by bgamari):

 I've still seen failures of `setnumcapabilities001` since comment:3 was
 pushed.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12038#comment:6

Comment (by simonmar):

 Oh dear. Which ways? What errors?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12038#comment:7

Changes (by bgamari):

 * milestone: 8.2.1 => 8.4.1

Comment:

 > Oh dear. Which ways? What errors?

 Unfortunately it doesn't happen very often; I'll try to paste some output
 here next time I see this rear its ugly head.

 Bumping to 8.4.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12038#comment:8

Comment (by bgamari):

 Here's one example,
 {{{
 +++ "/tmp/ghctest-8lrukibe/test spaces/./concurrent/should_run/setnumcapabilities001.run/setnumcapabilities001.run.stderr.normalised" 2017-01-05 17:29:54.673595299 -0500
 @@ -0,0 +1 @@
 +setnumcapabilities001: sendWakeup: invalid argument (Bad file descriptor)
 }}}

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12038#comment:9

Comment (by bgamari):

 I've left the test running for several thousand iterations in the
 background and have seen the error mentioned in comment:9 pop up a handful
 of times. On the bright side, this appears to be the only failure mode.

 It seems what is happening here is that the IO manager is trying to wake
 up a manager thread which has already exited. We could simply add an
 `IORef` to `Control` to indicate that the manager has exited, but it's not
 clear to me whether this would merely be working around some more sinister
 root cause.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12038#comment:10
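
The failure mode in comment:10 (waking a manager whose wakeup descriptor is already closed) and the proposed "has exited" flag can be modelled in a few lines. This is a hedged sketch in C rather than the Haskell `IORef` that the comment proposes, and every name here (`control`, `control_open`, `control_close`, `send_wakeup_unguarded`, `send_wakeup_guarded`) is hypothetical. It assumes a POSIX pipe as the wakeup channel.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical model of a manager's control channel. */
typedef struct {
    int  wakeup_fd;  /* write end of the wakeup pipe */
    bool exited;     /* set once the manager has shut down */
} control;

/* Create the wakeup pipe. The read end is handed back to the caller,
 * standing in for the manager thread that would block on it. */
static int control_open(control *c, int *read_fd)
{
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }
    *read_fd = fds[0];
    c->wakeup_fd = fds[1];
    c->exited = false;
    return 0;
}

/* Manager shutdown: record the exit, then close the descriptor. */
static void control_close(control *c)
{
    assert(!c->exited);  /* closing twice would be a bug in this model */
    c->exited = true;
    close(c->wakeup_fd);
}

/* Writes one byte to wake the manager.
 * Returns 0 on success, or the errno value reported by write(). */
static int send_wakeup_unguarded(const control *c)
{
    char byte = 1;
    return write(c->wakeup_fd, &byte, 1) == 1 ? 0 : errno;
}

/* Same, but treats "manager already exited" as nothing to do. */
static int send_wakeup_guarded(const control *c)
{
    if (c->exited) {
        return 0;
    }
    return send_wakeup_unguarded(c);
}
```

After `control_close`, the unguarded call fails with `EBADF`, which matches the "Bad file descriptor" text in comment:9, while the guarded call returns quietly. The flag check and the write are not atomic here, so a concurrent shutdown could still slip in between them; that is in line with the comment's worry that a flag may only paper over the underlying ordering problem.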

Comment (by bgamari):

 I scattered a few `HasCallStacks` about and let it run again and saw the
 following failure,
 {{{
 setnumcapabilities001: sendWakeup
 CallStack (from HasCallStack):
   sendWakeup, called at libraries/base/GHC/Event/TimerManager.hs:205:19 in base:GHC.Event.TimerManager
   wakeManager, called at libraries/base/GHC/Event/TimerManager.hs:223:7 in base:GHC.Event.TimerManager
   registerTimeout, called at libraries/base/GHC/Event/Thread.hs:59:10 in base:GHC.Event.Thread: invalid argument (Bad file descriptor)
 }}}

 The last frame of the callstack corresponds to the `registerTimeout` in
 `threadDelay`.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12038#comment:11

Changes (by bgamari):

 * status: new => patch
 * differential: => Phab:D2926

Comment:

 Phab:D2926 is a somewhat questionable patch I put together between
 builds. I'm a bit unsure as to whether the approach is the sort of thing
 we want, though.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12038#comment:12

Comment (by Ben Gamari

Changes (by bgamari):

 * status: patch => closed
 * resolution: => fixed
 * milestone: 8.4.1 => 8.2.1

Comment:

 I think comment:13 should fix it.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12038#comment:14

Comment (by Ben Gamari

Changes (by simonmar):

 * status: closed => new
 * resolution: fixed =>

Comment:

 @bgamari: I think the original issue that this ticket describes still
 exists.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12038#comment:16

Changes (by dfeuer):

 * related: => #5553

Comment:

 The fix for this ''appears'' to have fixed #5553 as well.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12038#comment:18

Comment (by dfeuer):

 Well, I guess not the ''fix'' for this, but Ben's patch.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12038#comment:19

Changes (by bgamari):

 * milestone: 8.6.1 =>

Comment:

 Demilestoning due to lack of progress.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12038#comment:21