[GHC] #14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
----------------------------------------+---------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 8.2.1
Keywords: | Operating System: MacOS X
Architecture: Unknown/Multiple | Type of failure: None/Unknown
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
----------------------------------------+---------------------------------
I noticed that `forkprocess01` failed on a Darwin build of commit
eb86e867694bceedfb47a527d71429197ffe6dda with

{{{
Stderr ( forkprocess01 ):
forkprocess01: internal error: multiple ACQUIRE_LOCK: rts/Task.c 228
    (GHC version 8.3.20171128 for x86_64_apple_darwin)
    Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
/bin/sh: line 1: 5124 Abort trap: 6 ./forkprocess01 +RTS -ls -RTS
*** unexpected failure for forkprocess01(threaded1_ls)
}}}

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: MacOS X | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

* version: 8.2.1 => 8.3

Comment:

Since this appears to be OS X-specific, I suspect this is due to the
`ITimer.c` implementation.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:1

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: new
Priority: high | Milestone:
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: MacOS X | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

* priority: normal => high

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:2

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: new
Priority: high | Milestone:
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: MacOS X | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by bgamari):

I seem to be seeing this more and more often. The fact that this failure
is in `forkprocess01` definitely raises questions. Maybe this isn't an RTS
bug at all but rather just a manifestation of one of the many gotchas
associated with `fork`.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:3

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: new
Priority: highest | Milestone:
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: MacOS X | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

* priority: high => highest

Comment:

We really need to do something about this; a significant fraction of OS X
builds are failing due to this test. I'm going to `skip` it on OS X for
the time being.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:4

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: new
Priority: highest | Milestone:
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: MacOS X | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by osa1):

How can I reproduce this? I'm trying this on ghc-mini (ghc at
df2c3b3364834d2fd038192c89348fc50a2e0475), and `forkprocess01` passes
every time.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:5

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: new
Priority: highest | Milestone:
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: MacOS X | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by bgamari):

Hmm, perhaps it only occurs under load? How many times did you run the
test? I would set it in a loop and let it run for an hour or so before
drawing any conclusions.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:6

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: new
Priority: highest | Milestone:
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: MacOS X | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by osa1):

> How many times did you run the test? I would set it in a loop and let it
> run for an hour or so before drawing any conclusions.

I ran it 100 times and it passed every time. I'll keep it running for a
few hours and see if that makes any difference.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:7

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: new
Priority: highest | Milestone:
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: MacOS X | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by osa1):

Here's a code path that may be causing this:

- rts/Schedule.c `forkProcess()` (called by the library) acquires
  `all_tasks_mutex` (in line 1987)
- `forkProcess()` calls `fork()`
- If in the parent process (which means all locks are still held), it
  releases a few locks (but not `all_tasks_mutex`) and calls
  `releaseCapability_` for all capabilities.
- In rts/Capability.c `releaseCapability_()`, when these conditions hold:

  1. `cap->n_returning_tasks == 0`
  2. There is not a pending sync
  3. Next thread in the run queue is not a bound one
  4. The capability has spare workers (`cap->spare_workers` is not `NULL`)
  5. The capability's run queue is not empty (`cap->n_run_queue != 0`)
     and we're not shutting down (`sched_state != SCHED_SHUTTING_DOWN`)

  When all these hold, `releaseCapability_()` calls `startWorkerTask()`
  (rts/Task.c), which in turn calls `newTask()`, which tries to take
  `all_tasks_mutex`, causing this bug.

Btw, if I'm reading this correctly there is at least one more bug. The
fork(2) man page says the state of mutexes is also replicated in the child
process, so `all_tasks_mutex` will still be held in the child process.
However, in the "child" branch of `forkProcess()` we initialize
`all_tasks_mutex` without releasing it, and the `pthread_mutex_init` man
page says "Attempting to initialize an already initialized mutex results in
undefined behavior."

So far I've run this test more than 1500 times on ghc-mini and it passed
every time. I'll try to reproduce locally based on the information above.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:8
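
The claim that a mutex held at `fork()` time is still locked in the child can be checked in isolation. The following is a minimal C sketch, not RTS code: the function name `child_sees_locked_mutex` and the local mutex `m` are hypothetical stand-ins for the `forkProcess()` / `all_tasks_mutex` situation described above. It locks a mutex, forks, and has the child report through its exit status whether `pthread_mutex_trylock` returned `EBUSY`.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of the RTS).
 * Returns 1 if the child found the inherited mutex locked,
 * 0 if the child could lock it, and -1 on any setup error. */
int child_sees_locked_mutex(void)
{
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

    /* Parent takes the lock before forking, as forkProcess() is
     * described as doing with all_tasks_mutex. */
    if (pthread_mutex_lock(&m) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        pthread_mutex_unlock(&m);
        return -1;
    }

    if (pid == 0) {
        /* Child: its copy of the address space includes the mutex
         * in the locked state, so trylock should report EBUSY. */
        int rc = pthread_mutex_trylock(&m);
        _exit(rc == EBUSY ? 1 : 0);
    }

    /* Parent: collect the child's verdict, then release the lock. */
    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    pthread_mutex_unlock(&m);

    if (waited != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A result of 1 from `child_sees_locked_mutex()` indicates the child inherited the locked state. This only illustrates the inheritance; it says nothing about what re-initializing such a mutex does, which the man page quoted above leaves undefined.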

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: new
Priority: highest | Milestone:
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: MacOS X | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by osa1):

I confirmed that if the code path described above is taken we get this
error. gdb output:

{{{
call startWorkerTask(cap)
Main: internal error: multiple ACQUIRE_LOCK: rts/Task.c 228
    (GHC version 8.5.20180301 for x86_64_unknown_linux)
    Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug

Thread 1 "Main" received signal SIGABRT, Aborted.
0x00007ffff6c8d428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
}}}

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:9
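
The double acquisition confirmed above can be modelled outside the RTS. The following is a minimal C sketch under stated assumptions, not the RTS implementation: `tasks_mutex`, `new_task_model` and `fork_process_parent_model` are hypothetical names standing in for `all_tasks_mutex`, `newTask()` and the parent branch of `forkProcess()`. It also assumes that the "multiple ACQUIRE_LOCK" panic corresponds to a POSIX error-checking mutex returning `EDEADLK` when the owning thread locks it a second time.

```c
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_ERRORCHECK */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the RTS's all_tasks_mutex. */
static pthread_mutex_t tasks_mutex;

/* Initialise the mutex as error-checking, so that a second lock by
 * the owning thread returns EDEADLK instead of blocking forever. */
static void init_tasks_mutex(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&tasks_mutex, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);
}

/* Models newTask(): tries to take the mutex and returns the result
 * of that attempt, unlocking again only if the lock succeeded. */
static int new_task_model(void)
{
    int rc = pthread_mutex_lock(&tasks_mutex);
    if (rc == 0)
        pthread_mutex_unlock(&tasks_mutex);
    return rc;
}

/* Models the parent branch of forkProcess(): the mutex is still held
 * when control reaches the code that wants to create a new task.
 * Returns whatever the inner lock attempt reported. */
int fork_process_parent_model(void)
{
    init_tasks_mutex();

    int rc = pthread_mutex_lock(&tasks_mutex);   /* outer acquire */
    assert(rc == 0);

    int inner = new_task_model();                /* second acquire */

    pthread_mutex_unlock(&tasks_mutex);
    pthread_mutex_destroy(&tasks_mutex);
    return inner;
}
```

In this model the inner attempt is expected to report `EDEADLK`, which is the condition a debug build could turn into a panic. With a default (non-error-checking) mutex the same path would typically block instead, so the error-checking type is what makes the problem visible.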

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: new
Priority: highest | Milestone:
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: MacOS X | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by osa1):

I managed to reproduce this locally on my Linux laptop:

{{{
$ ./Main +RTS -N
Main: internal error: multiple ACQUIRE_LOCK: rts/Task.c 228
    (GHC version 8.5.20180301 for x86_64_unknown_linux)
    Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
in child process
[2] 6823 abort (core dumped) ./Main +RTS -N
$ ./Main +RTS -N
in parent process
in child process
Just (Exited (ExitFailure 72))
$ ./Main +RTS -N
Main: internal error: multiple ACQUIRE_LOCK: rts/Task.c 228
    (GHC version 8.5.20180301 for x86_64_unknown_linux)
    Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
in child process
[2] 6927 abort (core dumped) ./Main +RTS -N
}}}

reproducer:

{{{
import System.Exit
import System.Posix.Process
import Control.Concurrent

main = do
  p <- forkProcess $ putStrLn "in child process" >> exitWith (ExitFailure 72)
  putStrLn "in parent process"
  r <- getProcessStatus True False p
  yield
  print r
}}}

compile with:

{{{
ghc-stage2 -O0 Main.hs -debug -rtsopts -threaded -fforce-recomp
}}}

run with `+RTS -N`

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:10

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: new
Priority: highest | Milestone:
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: MacOS X | Architecture:
| Unknown/Multiple
Type of failure: None/Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: #1391, #9295, | Differential Rev(s):
#9296 |
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by osa1):

* related: => #1391, #9295, #9296

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:11

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: patch
Priority: highest | Milestone:
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #1391, #9295, | Differential Rev(s): Phab:D4460
#9296 |
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by osa1):

* status: new => patch
* failure: None/Unknown => Runtime crash
* differential: => Phab:D4460
* os: MacOS X => Unknown/Multiple

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:12

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: patch
Priority: highest | Milestone:
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #1391, #9295, | Differential Rev(s): Phab:D4460
#9296, #14431 |
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

* related: #1391, #9295, #9296 => #1391, #9295, #9296, #14431

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:13

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: patch
Priority: highest | Milestone:
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #1391, #9295, | Differential Rev(s): Phab:D4460
#9296, #14431 |
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: merge
Priority: highest | Milestone: 8.4.2
Component: Runtime System | Version: 8.3
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #1391, #9295, | Differential Rev(s): Phab:D4460
#9296, #14431 |
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

* status: patch => merge
* milestone: => 8.4.2

Comment:

Excellent sleuthing, osa1!

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:15

#14538: forkprocess01 fails occassionally on with multiple ACQUIRE_LOCK panic
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: closed
Priority: highest | Milestone: 8.4.1
Component: Runtime System | Version: 8.3
Resolution: fixed | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: #1391, #9295, | Differential Rev(s): Phab:D4460
#9296, #14431 |
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

* status: merge => closed
* resolution: => fixed
* milestone: 8.4.2 => 8.4.1

Comment:

Merged in 0dc2a358a954b0b858e91843ade52bb0a28c392d.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14538#comment:16