[GHC] #13434: hs_try_putmvar003 is timing out / segfaulting

#13434: hs_try_putmvar003 is timing out / segfaulting
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 8.1
Keywords: | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure: Runtime crash
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
This is reproducible essentially every time I validate. During validation, the program times out, but when I run it by hand, it segfaults. Running in debug sanity mode, I get:

{{{
hstry: internal error: MVAR_CLEAN on mutable list
(GHC version 8.3.20170316 for x86_64_unknown_linux)
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
}}}

Simon Marlow, I'm happy to investigate more if you can't repro.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13434
GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13434: hs_try_putmvar003 is timing out / segfaulting
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by trommler):

* cc: trommler (added)

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13434#comment:1

#13434: hs_try_putmvar003 is timing out / segfaulting
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by simonmar):

I haven't seen this, and it doesn't seem to be happening in Harbormaster. If you could investigate more, that would be great.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13434#comment:2

#13434: hs_try_putmvar003 is timing out / segfaulting
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by ezyang):

I was able to reproduce on a 64-bit Ubuntu machine. Here are my steps:

1. Download ghc-head from hvr's PPA https://launchpad.net/~hvr/+archive/ubuntu/ghc
2. Build the test case: `ghc-head -c hs_try_putmvar003.hs` and then `ghc-head hs_try_putmvar003_c.c hs_try_putmvar003.o -debug`
3. Repeatedly run `./a.out 1 1 1 1 +RTS -DS` until it hangs or segfaults.

In 25 runs, 4 failed in different ways (assertion failure, segfault, or deadlock). It's a bit annoying because the C stub uses pthreads whether or not the threaded runtime is linked, so I don't have a deterministic reproduction.

Scanning over the code, I didn't see anything obviously wrong, so I'm going to do some bisecting and see whether this always failed or whether it's a regression introduced by an unrelated change (levity polymorphism causing the unsafeCoerce to be bad, maybe?)

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13434#comment:3
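The steps above amount to running one binary many times and sorting the outcomes into clean exits, crashes and hangs. A minimal sketch of such a harness in C follows, assuming a POSIX system. The names `run_once`, `run_many` and `enum outcome` are hypothetical and not part of the GHC test suite, and the 10 ms polling step is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Outcome of one run of the program under test (hypothetical classification). */
enum outcome { RUN_OK, RUN_EXIT_FAILURE, RUN_SIGNALLED, RUN_TIMED_OUT, RUN_ERROR };

/* Run argv once. The parent polls for completion and kills the child with
 * SIGKILL once timeout_secs have elapsed, so a hang is reported as
 * RUN_TIMED_OUT instead of blocking the harness. */
enum outcome run_once(char *const argv[], unsigned timeout_secs)
{
    pid_t pid = fork();
    if (pid < 0)
        return RUN_ERROR;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                       /* exec failed */
    }

    const struct timespec step = { 0, 10 * 1000 * 1000 };   /* 10 ms */
    unsigned long polls_left = (unsigned long)timeout_secs * 100;
    int status = 0;

    for (;;) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            break;                        /* child has terminated */
        if (r < 0)
            return RUN_ERROR;
        if (polls_left-- == 0) {
            kill(pid, SIGKILL);           /* treat as a hang */
            waitpid(pid, &status, 0);     /* reap the child */
            return RUN_TIMED_OUT;
        }
        nanosleep(&step, NULL);
    }

    if (WIFSIGNALED(status))
        return RUN_SIGNALLED;             /* e.g. SIGSEGV, or SIGABRT from abort() */
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return RUN_OK;
    return RUN_EXIT_FAILURE;
}

/* Run argv `runs` times and tally outcomes; counts is indexed by enum outcome. */
void run_many(char *const argv[], int runs, unsigned timeout_secs, int counts[5])
{
    for (int i = 0; i < 5; i++)
        counts[i] = 0;
    for (int i = 0; i < runs; i++)
        counts[run_once(argv, timeout_secs)]++;
}
```

With `argv` set to `{"./a.out", "1", "1", "1", "1", "+RTS", "-DS", NULL}`, a call such as `run_many(argv, 25, 60, counts)` would repeat the 25-run experiment and report how many runs crashed or hung. The 60-second limit is an assumption, not a value taken from the ticket.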

#13434: hs_try_putmvar003 is timing out / segfaulting
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by bgamari):

I have also seen this occasionally.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13434#comment:4

#13434: hs_try_putmvar003 is timing out / segfaulting
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by simonmar):

So that repro compiles the test without `-threaded`, but `-threaded` is required because the test calls into the RTS using multiple threads. I'll fix the test so that it fails when linked without `-threaded` rather than randomly crashing.

There could still be an issue, but this isn't it.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13434#comment:5

#13434: hs_try_putmvar003 is timing out / segfaulting
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking: 13722
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by bgamari):

jared-w seems to be seeing this reproducibly on his machine. We haven't yet determined what it is about his setup that makes this so, but he runs Arch Linux on a dual-core machine.

`+RTS -Ds` says the following before hanging:

{{{
...
7ff838ff9700: cap 0: schedule()
7ff838ff9700: giving up capability 0
7ff838ff9700: passing capability 0 to worker 0x7ff8397fa700
7ff8397fa700: resuming capability 0
7ff8397fa700: cap 0: running thread 515 (ThreadRunGHC)
7ff8397fa700: cap 0: thread 515 stopped (blocked on an MVar)
thread 515 @ 0x4200369d98 is blocked on an MVar @ 0x420003c798 (TSO_DIRTY)
7ff8397fa700: giving up capability 0
7ff8397fa700: freeing capability 0
7ff894ff9700: cap 0: waking up thread 515 on cap 0
7ff894ff9700: passing capability 0 to worker 0x7ff8397fa700
7ff8397fa700: woken up on capability 0
7ff8397fa700: resuming capability 0
7ff8397fa700: cap 0: running thread 515 (ThreadRunGHC)
7ff8397fa700: cap 0: waking up thread 117 on cap 0
7ff8397fa700: cap 0: thread 515 stopped (finished)
7ff8397fa700: cap 0: running thread 117 (ThreadRunGHC)
7ff8397fa700: cap 0: thread 117 stopped (suspended while making a foreign call)
7ff8397fa700: freeing capability 0
7ff8b718b700: returning; I want capability 0
7ff8b718b700: resuming capability 0
7ff8b718b700: cap 0: running thread 3 (ThreadRunGHC)
7ff8b718b700: cap 0: thread 3 stopped (yielding)
7ff8b718b700: cap 0: running thread 3 (ThreadRunGHC)
7ff8b718b700: cap 0: thread 3 stopped (suspended while making a foreign call)
7ff8b718b700: passing capability 0 to worker 0x7ff838ff9700
7ff838ff9700: woken up on capability 0
7ff838ff9700: resuming capability 0
7ff838ff9700: deadlocked, forcing major GC...
7ff838ff9700: cap 0: requesting parallel GC
7ff838ff9700: 0 idle caps
all threads:
threads on capability 0:
other threads:
thread 117 @ 0x4200368920 is blocked on an external call (TSO_DIRTY)
thread 116 @ 0x42002c6b58 is blocked on an external call (TSO_DIRTY)
thread 115 @ 0x42003f0d88 is blocked on an external call
thread 114 @ 0x42003fc858 is blocked on an external call (TSO_DIRTY)
thread 24 @ 0x4200361e28 is blocked on an external call (TSO_DIRTY)
thread 23 @ 0x42003ebdb0 is blocked on an external call (TSO_DIRTY)
thread 22 @ 0x42003d5858 is blocked on an external call (TSO_DIRTY)
thread 21 @ 0x42003d1400 is blocked on an external call (TSO_DIRTY)
thread 20 @ 0x42003e1400 is blocked on an external call (TSO_DIRTY)
thread 19 @ 0x42003d2858 is blocked on an external call
thread 18 @ 0x42003a40a0 is blocked on an external call
thread 17 @ 0x4200397a88 is blocked on an external call (TSO_DIRTY)
thread 16 @ 0x42003bcec8 is blocked on an external call (TSO_DIRTY)
thread 15 @ 0x4200393f28 is blocked on an external call (TSO_DIRTY)
thread 14 @ 0x420039d9e8 is blocked on an external call (TSO_DIRTY)
thread 5 @ 0x42003744c8 is blocked on an external call (TSO_DIRTY)
thread 4 @ 0x420036e358 is blocked on an MVar @ 0x420036da10
thread 3 @ 0x42002ba0f0 ["TimerManager"] is blocked on an external call (TSO_DIRTY)
thread 2 @ 0x42002ba168 ["IOManager on cap 0"] is blocked on an external call
7ff838ff9700: cap 0: starting GC
7ff838ff9700: cap 0: GC working
7ff838ff9700: cap 0: GC idle
7ff838ff9700: cap 0: GC done
7ff838ff9700: cap 0: GC idle
7ff838ff9700: cap 0: GC done
7ff838ff9700: cap 0: GC idle
7ff838ff9700: cap 0: GC done
7ff838ff9700: cap 0: all caps stopped for GC
7ff838ff9700: cap 0: finished GC
7ff838ff9700: giving up capability 0
7ff838ff9700: freeing capability 0
}}}

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13434#comment:7

#13434: hs_try_putmvar003 is timing out / segfaulting
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone:
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking: 13722
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by bgamari):

I can also reproduce the failure in my NixOS VM. Both my NixOS installation and jared-w's machine run glibc-2.25, whereas my usual development machine and Harbormaster both use 2.24. This may or may not be the cause.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13434#comment:8

#13434: hs_try_putmvar003 is timing out / segfaulting
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: (none)
Type: bug | Status: new
Priority: normal | Milestone: 8.4.1
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking: 13722
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

* milestone: => 8.4.1

Comment:

It seems that the test case is getting stuck on `pthread_mutex_destroy` and `pthread_cond_destroy`. Commenting out both of these calls in `destroyCallbackQueue` allows the test to run to completion.

The issue appears to be that the `do { ... } while(1)` loop in `callback` never terminates, and therefore the mutex is never unlocked. I suspect this fails only now because `glibc` is now stricter about checking that the mutex is not locked before freeing it.

Simon, how did you intend for the loop in `callback` to terminate?

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13434#comment:9
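The hypothesis above is that a worker which never leaves its loop keeps the mutex locked, so tearing the queue down cannot succeed. POSIX leaves the result of destroying a locked mutex undefined. One common way to make such a loop terminate is a shutdown flag checked under the lock, followed by a join before the destroy calls. The sketch below illustrates that pattern only. `work_queue`, `worker`, `queue_push` and `run_and_destroy` are hypothetical names, and this is not the code of the test or of Phab:D3724.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical queue; not the structure used by hs_try_putmvar003_c.c. */
struct work_queue {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int  pending;     /* items waiting to be handled */
    int  handled;     /* items the worker has processed */
    bool shutdown;    /* set once no more items will arrive */
};

/* Worker loop: waits for work, drains it, and exits once shutdown is
 * requested and nothing is pending. The mutex is released on exit. */
static void *worker(void *arg)
{
    struct work_queue *q = arg;
    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (q->pending == 0 && !q->shutdown)
            pthread_cond_wait(&q->cond, &q->lock);
        if (q->pending == 0)
            break;                    /* shutdown requested, queue drained */
        q->pending--;
        q->handled++;
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

static void queue_push(struct work_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->pending++;
    pthread_cond_signal(&q->cond);
    pthread_mutex_unlock(&q->lock);
}

/* Push `items` work items, shut the worker down, and only then destroy
 * the mutex and condition variable. Returns the number of items handled. */
int run_and_destroy(int items)
{
    struct work_queue q = { .pending = 0, .handled = 0, .shutdown = false };
    pthread_t t;
    int rc;

    rc = pthread_mutex_init(&q.lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&q.cond, NULL);
    assert(rc == 0);
    rc = pthread_create(&t, NULL, worker, &q);
    assert(rc == 0);

    for (int i = 0; i < items; i++)
        queue_push(&q);

    /* Ask the worker to stop, then wait until it has left its loop. */
    pthread_mutex_lock(&q.lock);
    q.shutdown = true;
    pthread_cond_signal(&q.cond);
    pthread_mutex_unlock(&q.lock);
    pthread_join(t, NULL);

    /* No thread holds the mutex or waits on the condition variable now. */
    rc = pthread_mutex_destroy(&q.lock);
    assert(rc == 0);
    rc = pthread_cond_destroy(&q.cond);
    assert(rc == 0);
    return q.handled;
}
```

Joining the worker before the destroy calls is the design choice that matters here: it guarantees the lock is free and no waiter remains, whatever the C library does when that precondition is violated.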

#13434: hs_try_putmvar003 is timing out / segfaulting
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: (none)
Type: bug | Status: patch
Priority: normal | Milestone: 8.4.1
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking: 13722
Related Tickets: | Differential Rev(s): Phab:D3724
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

* status: new => patch
* differential: => Phab:D3724

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13434#comment:10

#13434: hs_try_putmvar003 is timing out / segfaulting
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: (none)
Type: bug | Status: patch
Priority: normal | Milestone: 8.4.1
Component: Runtime System | Version: 8.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking: 13722
Related Tickets: | Differential Rev(s): Phab:D3724
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#13434: hs_try_putmvar003 is timing out / segfaulting
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: (none)
Type: bug | Status: closed
Priority: normal | Milestone: 8.2.1
Component: Runtime System | Version: 8.1
Resolution: fixed | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Runtime crash | Test Case:
Blocked By: | Blocking: 13722
Related Tickets: | Differential Rev(s): Phab:D3724
Wiki Page: |
-------------------------------------+-------------------------------------
Changes (by bgamari):

* status: patch => closed
* resolution: => fixed
* milestone: 8.4.1 => 8.2.1

Comment:

Merging to `ghc-8.2` as well to ensure clean validation.

-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13434#comment:12