[GHC] #12230: Non-deterministic ghc-iserv terminated error

#12230: Non-deterministic ghc-iserv terminated error
-------------------------------------+-------------------------------------
Reporter: ezyang | Owner: simonmar
Type: bug | Status: new
Priority: highest | Milestone: 8.2.1
Component: GHCi | Version: 8.1
Keywords: | Operating System: Unknown/Multiple
Architecture: | Type of failure: None/Unknown
Unknown/Multiple |
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
I noticed on a recent validate that I have been getting non-deterministic
test failures due to external interpreter:

{{{
=====> T10891(ext-interp) 12 of 15 [0, 1, 0] [77/9256]
cd "./th/T10891.run" && "/home/hs01/ezyang/ghc-validate/inplace/test spaces/ghc-stage2" -c T10891.hs -dcore-lint -dcmm-lint -dno-debug-output -no-user-package-db -rtsopts -fno-warn-missed-specialisations -fshow-warning-groups -XTemplateHaskell -package template-haskell -fexternal-interpreter -v0 > T10891.comp.stderr 2>&1
Compile failed (exit code 1) errors were:

T10891.hs:30:3: error:
    • Exception when trying to run compile-time code:
        ghc-stage2: ghc-iserv terminated (-11)
      Code: let
              display :: Name -> Q ()
              display q = ...
            in do { display ''C; display ''C'; .... }
    • In the untyped splice:
        $(let
            display :: Name -> Q ()
            display q = do { ... }
          in do { display ''C; display ''C'; display ''C''; .... })
}}}

More tests error the higher I crank up parallelism; on a recent full test
run I got something like twenty failures of this kind when I have twelve
threads.

There are at least two problems here. The first is the actual failure, but
the second is that there isn't enough diagnostic information here to tell
what the actual problem is. Combined with the nondeterministic nature of
this bug I'm not sure how to debug it.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12230
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

Comment (by simonmar):

ugh. `ghc-iserv terminated (-11)` means that the ghc-iserv process died
with a segfault.

I'm not seeing this here, even with a large number of threads. But
concurrency in the test suite shouldn't really affect it, because GHC
itself and ghc-iserv are both still single-threaded. Is there anything
unusual about your platform or setup that might give us a clue?

You should have a core dump from `ghc-iserv`. Can you load it up in gdb and
see what the stack trace is?

You can also try `-opti-v` to get debug output about the messages being
exchanged by GHC and ghc-iserv, but this probably won't help much if the
problem is non-deterministic.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12230#comment:1
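
To illustrate the reading of `(-11)` given above, here is a minimal sketch. It assumes the number printed after `ghc-iserv terminated` follows the common POSIX-style convention in which a negative status `-N` means "killed by signal `N`". The helper name `describe_exit` is hypothetical and is not part of GHC; the signal names come from Python's standard `signal` module on whatever machine runs the script.

```python
import signal


def describe_exit(status: int) -> str:
    """Describe an exit status, treating a negative value -N as 'killed by signal N'.

    Assumption: the status follows the POSIX-style convention. This is an
    illustration only, not code taken from GHC or ghc-iserv.
    """
    if status >= 0:
        return f"exited with code {status}"
    try:
        # Look the signal number up in the platform's signal table.
        name = signal.Signals(-status).name
    except ValueError:
        name = f"unknown signal {-status}"
    return f"killed by {name}"


if __name__ == "__main__":
    print(describe_exit(-11))
```

On Linux, signal 11 is SIGSEGV, which is consistent with the segfault diagnosis in the comment above.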

Comment (by ezyang):

Here is the most common trace I get:

{{{
[Current thread is 1 (Thread 0x2b203c5129c0 (LWP 3558))]
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000000000000000 in ?? ()
}}}

However, I got lucky on one run and here is a more useful trace (I've
posted it here http://hs01.scs.stanford.edu/ ):

{{{
#0  0x00002ab6bffd82a8 in raise () from /usr/lib/libc.so.6
[Current thread is 1 (Thread 0x2ab6c0b49700 (LWP 7518))]
(gdb) bt
#0  0x00002ab6bffd82a8 in raise () from /usr/lib/libc.so.6
#1  0x00002ab6bffd972a in abort () from /usr/lib/libc.so.6
#2  0x0000000000c98303 in rtsFatalInternalErrorFn (s=0xd2b456 "invalid closure, info=%p", ap=0x2ab6c0b48a28) at rts/RtsMessages.c:182
#3  0x0000000000c97f35 in barf (s=0xd2b456 "invalid closure, info=%p") at rts/RtsMessages.c:46
#4  0x0000000000ccfb33 in evacuate1 (p=0x41919558) at rts/sm/Evac.c:416
#5  0x0000000000cca042 in scavenge_large_srt_bitmap (large_srt=0x41916068) at rts/sm/Scav.c:308
#6  0x0000000000cca08c in scavenge_srt (srt=0x41916068, srt_bitmap=4294967295) at rts/sm/Scav.c:330
#7  0x0000000000cca172 in scavenge_fun_srt (info=0x406da200) at rts/sm/Scav.c:390
#8  0x0000000000ccc0f6 in scavenge_static () at rts/sm/Scav.c:1747
#9  0x0000000000ccc770 in scavenge_loop1 () at rts/sm/Scav.c:2081
#10 0x0000000000ca8ad5 in scavenge_until_all_done () at rts/sm/GC.c:968
#11 0x0000000000ca7810 in GarbageCollect (collect_gen=1, do_heap_census=rtsFalse, gc_type=2, cap=0x103fa00 <MainCapability>) at rts/sm/GC.c:403
#12 0x0000000000c962c8 in scheduleDoGC (pcap=0x2ab6c0b48ea8, task=0x278b5c0, force_major=rtsTrue) at rts/Schedule.c:1804
#13 0x0000000000c94f80 in scheduleDetectDeadlock (pcap=0x2ab6c0b48ea8, task=0x278b5c0) at rts/Schedule.c:931
#14 0x0000000000c93f4b in schedule (initialCapability=0x103fa00 <MainCapability>, task=0x278b5c0) at rts/Schedule.c:277
#15 0x0000000000c973d4 in scheduleWorker (cap=0x103fa00 <MainCapability>, task=0x278b5c0) at rts/Schedule.c:2516
#16 0x0000000000c9abea in workerStart (task=0x278b5c0) at rts/Task.c:443
#17 0x00002ab6bf812424 in start_thread () from /usr/lib/libpthread.so.0
#18 0x00002ab6c008ccbd in clone () from /usr/lib/libc.so.6
}}}

Here's another one I got:

{{{
Program terminated with signal SIGILL, Illegal instruction.
#0  0x0000000040ed7710 in ?? ()
[Current thread is 1 (Thread 0x2b2f3a3c09c0 (LWP 30537))]
(gdb) bt
#0  0x0000000040ed7710 in ?? ()
#1  0x0000000000000000 in ?? ()
}}}

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12230#comment:2

Comment (by simonmar):

Ok, there's some memory corruption and the stack traces aren't very
helpful. I need to repro this somehow.

There must be something about your setup that's different from mine - I'm
assuming you're on 64-bit Linux? Any custom validate.mk settings?

Can you isolate a single test that fails and run it repeatedly? How often
does it fail? Can you get a `-opti-v` dump from a failing run?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12230#comment:3
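
One way to answer "how often does it fail?" for a single isolated test is to run the same command many times and count the non-zero exits. The sketch below is only an illustration: `count_failures` is a hypothetical helper, not part of the GHC testsuite driver, and the command in the `__main__` block is a stand-in that fails at random. The real single-test invocation would have to be substituted for it.

```python
import subprocess
import sys
from typing import Sequence


def count_failures(cmd: Sequence[str], runs: int) -> int:
    """Run `cmd` `runs` times and count the runs that did not exit with status 0.

    A process killed by a signal is reported by subprocess with a negative
    return code, so it is counted as a failure as well.
    """
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            failures += 1
    return failures


if __name__ == "__main__":
    # Placeholder command (an assumption for the demo): exits 1 about 20% of the time.
    flaky = "import random, sys; sys.exit(1 if random.random() < 0.2 else 0)"
    cmd = [sys.executable, "-c", flaky]
    runs = 10
    failed = count_failures(cmd, runs)
    print(f"{failed}/{runs} runs failed")
```

Because the failure is non-deterministic, a larger number of runs gives a steadier estimate of the failure rate.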

Comment (by erikd):

On x86_64/Linux (Debian testing) with `BuildFlavour` == `perf-llvm` I get
40+ (of the 42 tests I'm running) failing every single time.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12230#comment:4

Comment (by simonmar):

I think this is probably something in the RTS linker. I'm debugging it on
Edward's machine but I'm stuck on a permission problem for now.

@erikd My guess is that if you add `DYNAMIC_GHC_PROGRAMS=NO` to your
build.mk with `perf-llvm`, then GHCi will be completely broken. Perhaps
something in LLVM is tickling a missing or broken case in the linker.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12230#comment:5

Comment (by simonmar):

Also, on Edward's machine:

{{{
[simonmar@hs01 th]$ gcc --version
gcc (GCC) 5.3.0
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
}}}

That's a very recent version of gcc, which makes me think this is a bad
interaction between something new gcc is doing and the RTS linker.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12230#comment:6

Comment (by erikd):

@simonmar Yes, with `DYNAMIC_GHC_PROGRAMS=NO` I now get "775 unexpected
failures".

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12230#comment:7

Comment (by simonmar):

Thanks, that's useful. We should probably have a separate ticket for that;
want to make one?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12230#comment:8

Comment (by erikd):

Created #12238.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12230#comment:9

Changes (by simonmar):

 * differential: => Phab:D2371

Comment:

Proposed fix in Phab:D2371

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12230#comment:10

Comment (by Simon Marlow

Changes (by simonmar):

 * status: new => closed
 * resolution: => fixed

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/12230#comment:12