[GHC] #10840: Periodic alarm signals can cause a retry loop to get stuck

#10840: Periodic alarm signals can cause a retry loop to get stuck
-------------------------------------+-------------------------------------
Reporter: Rufflewind | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.10.2
Keywords: | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure: None/Unknown
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Revisions: |
-------------------------------------+-------------------------------------
The [https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Signals periodic alarm signals emitted by the GHC RTS] can [https://github.com/haskell/directory/issues/35 hang the program under certain circumstances]. In particular, any retry loop of the form

{{{
#!c
while (interruptible_syscall() == FAILED && errno == EINTR);
}}}

can cause a hang if the syscall takes a long time and cannot be resumed, as it will be forced to restart from the beginning each time it gets interrupted. If the syscall takes longer than the interval between successive alarm signals, it will be stuck in this loop forever.

This was found to occur with `statfs64` (indirectly called by `realpath`) and `open` on SSHFS on Mac OS X.

[https://mail.haskell.org/pipermail/ghc-devs/2015-September/009770.html Using a safe foreign import seems to mitigate the issue], but [https://mail.haskell.org/pipermail/ghc-devs/2015-September/009793.html according to Simon Marlow this is probably an accident]. So far there seems to be no way to guarantee the suspension of signals except through low-level tools like `pthread_sigmask`.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
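As a rough illustration of the low-level approach mentioned at the end of the report, the following C sketch blocks one signal in the calling thread with `pthread_sigmask` for the duration of a call and then restores the previous mask. The helper names `with_signal_blocked` and `signal_is_blocked` are hypothetical, and this is not code from the ticket, from the RTS, or from the `directory` package. Which signal the RTS timer actually delivers is left to the caller and should be checked against the RTS source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: run fn(arg) with `signo` blocked in the calling
 * thread, then restore the previous signal mask.
 * Returns fn's result, or -1 with errno set if the mask cannot be changed. */
int with_signal_blocked(int signo, int (*fn)(void *), void *arg)
{
    assert(fn != NULL);

    sigset_t block, saved;
    sigemptyset(&block);
    if (sigaddset(&block, signo) != 0)
        return -1;                      /* invalid signal number */

    int rc = pthread_sigmask(SIG_BLOCK, &block, &saved);
    if (rc != 0) {
        errno = rc;                     /* pthread_sigmask returns the error */
        return -1;
    }

    int result = fn(arg);               /* cannot be interrupted by signo */
    int saved_errno = errno;

    pthread_sigmask(SIG_SETMASK, &saved, NULL);
    errno = saved_errno;                /* preserve fn's errno for the caller */
    return result;
}

/* Example callback: reports whether the signal number pointed to by `arg`
 * is currently blocked in the calling thread (1 = blocked, 0 = not). */
int signal_is_blocked(void *arg)
{
    int signo = *(int *)arg;
    sigset_t current;
    pthread_sigmask(SIG_BLOCK, NULL, &current);  /* query only, no change */
    return sigismember(&current, signo);
}
```

The mask change only affects the calling thread, so the signal can still be delivered to other threads in the process. A signal that arrives while blocked stays pending and is delivered once the previous mask is restored.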

#10840: Periodic alarm signals can cause a retry loop to get stuck
Changes (by simonmar):
 * cc: simonmar (added)
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:1

#10840: Periodic alarm signals can cause a retry loop to get stuck
Changes (by hamish):
 * cc: hamish (added)
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:2

#10840: Periodic alarm signals can cause a retry loop to get stuck
Changes (by hsyl20):
 * cc: hsyl20 (added)
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:3

#10840: Periodic alarm signals can cause a retry loop to get stuck
Comment (by hsyl20):
On Mac OS, you can try enabling `USE_PTHREAD_FOR_ITIMER` in rts/posix/Itimer.c to make the RTS avoid using alarm signals. Maybe we should make it a flag (or the default) for the threaded RTS?

The same problem may occur on Linux. We can use the same solution there and even improve on it by using the timerfd_* syscalls instead of usleep (cf. the following attachment).
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:4
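The timerfd idea in this comment might look roughly like the sketch below: a periodic timer is exposed as a file descriptor, and a ticker thread blocks in `read` until the next expiry, so no signal is delivered to the process. This is a Linux-only illustration under stated assumptions; the names `ticker_open` and `ticker_wait` are hypothetical and are not taken from the attached patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/timerfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a timer fd that expires every `interval_ms`
 * milliseconds. Returns the fd, or -1 on error (errno set). */
int ticker_open(long interval_ms)
{
    assert(interval_ms > 0);

    int fd = timerfd_create(CLOCK_MONOTONIC, 0);
    if (fd < 0)
        return -1;

    struct itimerspec spec;
    spec.it_interval.tv_sec  = interval_ms / 1000;
    spec.it_interval.tv_nsec = (interval_ms % 1000) * 1000000L;
    spec.it_value = spec.it_interval;   /* first expiry after one interval */

    if (timerfd_settime(fd, 0, &spec, NULL) != 0) {
        int saved_errno = errno;
        close(fd);
        errno = saved_errno;
        return -1;
    }
    return fd;
}

/* Hypothetical helper: block until at least one expiry has occurred.
 * Returns the number of expirations since the previous read, or -1. */
int64_t ticker_wait(int fd)
{
    uint64_t expirations;
    ssize_t n;
    do {
        /* Retrying on EINTR is harmless here: the timer keeps counting
         * while the read is interrupted, so no progress is lost. */
        n = read(fd, &expirations, sizeof expirations);
    } while (n < 0 && errno == EINTR);

    if (n != (ssize_t)sizeof expirations)
        return -1;
    return (int64_t)expirations;
}
```

A ticker thread would call `ticker_wait` in a loop and run the periodic work after each return. Because the tick arrives through a file descriptor instead of a signal, slow syscalls in other threads are not interrupted by it.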

#10840: Periodic alarm signals can cause a retry loop to get stuck
Changes (by hsyl20):
 * Attachment "0001-rts-timer-use-timerfd_-on-Linux-instead-of-signals.patch" added.

Make the RTS use timerfd_* on Linux
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840

#10840: Periodic alarm signals can cause a retry loop to get stuck
Comment (by simonmar):
I like this patch a lot; those signals have been causing problems for a long time. Want to put it on Phabricator so we can review it and get it in?
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:5

#10840: Periodic alarm signals can cause a retry loop to get stuck
Comment (by hsyl20):
Done: https://phabricator.haskell.org/D1947

A potential issue is that these syscalls were introduced in Linux 2.6.25. Do we want to support older kernels?

The issue remains for Mac OS X.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:6

#10840: Periodic alarm signals can cause a retry loop to get stuck
Comment (by Ben Gamari

#10840: Periodic alarm signals can cause a retry loop to get stuck
Changes (by bgamari):
 * status: new => merge
 * milestone: => 8.0.1

Comment:
This seems like a bad enough issue that we may want to merge to 8.0.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:8

#10840: Periodic alarm signals can cause a retry loop to get stuck
Changes (by bgamari):
 * status: merge => closed
 * resolution: => fixed

Comment:
Merged to `ghc-8.0` as bbdc52f3a6e6a28e209fb8f65699121d4ef3a4e3 and fd3e581b7c9142247601774afc98e49f63b8af45.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:9

#10840: Periodic alarm signals can cause a retry loop to get stuck
Changes (by hsyl20):
 * status: closed => new
 * resolution: fixed =>

Comment:
The ticket was for Mac OS X. We only solved the issue on Linux.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:10

#10840: Periodic alarm signals can cause a retry loop to get stuck
Changes (by thoughtpolice):
 * priority: normal => high

Comment:
I'm bumping the priority on this and keeping it in 8.0.1, as it seems rather bad to leave this open on OS X, especially if the fix in comment:4 can work (enabling `USE_PTHREAD_FOR_ITIMER`, if I'm following correctly). Would anyone like to test this?
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:11

#10840: Periodic alarm signals can cause a retry loop to get stuck
Changes (by bgamari):
 * owner: => bgamari

Comment:
I can test comment:4.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:12

#10840: Periodic alarm signals can cause a retry loop to get stuck
Comment (by bgamari):
Unfortunately I have observed non-reproducible hangs while validating on OS X with `USE_PTHREAD_FOR_ITIMER`.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:13

#10840: Periodic alarm signals can cause a retry loop to get stuck
Changes (by bgamari):
 * milestone: 8.0.1 => 8.0.2

Comment:
This won't happen for 8.0.1.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:14

#10840: Periodic alarm signals can cause a retry loop to get stuck
Comment (by bgamari):
Unfortunately it seems that the patch as merged in comment:9 has a subtle race condition (see #11830) and introduces unnecessary wake-ups (see #11965, #1623). I'm going to revert this for 8.0.1, and perhaps we can give it another try in 8.0.2 if these issues have been sorted out.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:15

#10840: Periodic alarm signals can cause a retry loop to get stuck
Comment (by hsyl20):
In the threaded RTS, on platforms without "timerfd" we could create a "ghc ticker" thread too and block the alarm signal in all threads except this one. I think this would fix the reported issue on OS X.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:16

#10840: Periodic alarm signals can cause a retry loop to get stuck
Comment (by hsyl20):
Replying to [comment:16 hsyl20]:
> In the threaded RTS, on platforms without "timerfd" we could create a "ghc ticker" thread too and block the alarm signal in all threads except this one. I think this would fix the reported issue on OS X.

Forget this idea, @simonmar told me that it won't work nicely with threads that don't belong to GHC.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:17

#10840: Periodic alarm signals can cause a retry loop to get stuck
Comment (by bgamari):
The hangs should now be fixed. I'll try reenabling `USE_PTHREAD_FOR_ITIMER` once I have access to the OS X test again.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:18

#10840: Periodic alarm signals can cause a retry loop to get stuck
Changes (by thomie):
 * os: Unknown/Multiple => MacOS X
 * milestone: 8.0.2 => 8.2.1

Comment:
If I understand correctly, this is just an `OS X` issue now. And the commits for #11830, #11965 and this ticket have not been merged to the ghc-8.0 branch.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:19

#10840: Periodic alarm signals can cause a retry loop to get stuck
Changes (by bgamari):
 * status: new => patch
 * differential: => Phab:D2796

Comment:
Here is a patch enabling the pthread-based itimer implementation on Darwin. Let's see how Harbormaster fares.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:20
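For readers unfamiliar with the idea, a pthread-based ticker can be sketched as a dedicated thread that sleeps for one interval and then performs the periodic work, so no alarm signal is involved. The C sketch below is an illustration only and is not the code in rts/posix/Itimer.c or in the patch; the type `sleep_ticker` and its functions are hypothetical, and the "work" here is just incrementing a counter.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical ticker state: one thread that wakes up every interval. */
typedef struct {
    pthread_t   thread;
    long        interval_ms;
    atomic_bool stop;    /* set by sleep_ticker_stop to end the loop */
    atomic_long ticks;   /* number of intervals that have elapsed */
} sleep_ticker;

/* Thread body: sleep for one interval, record a tick, repeat until stopped. */
static void *sleep_ticker_run(void *p)
{
    sleep_ticker *t = p;
    struct timespec interval;
    interval.tv_sec  = t->interval_ms / 1000;
    interval.tv_nsec = (t->interval_ms % 1000) * 1000000L;

    while (!atomic_load(&t->stop)) {
        nanosleep(&interval, NULL);
        atomic_fetch_add(&t->ticks, 1);  /* stand-in for the periodic work */
    }
    return NULL;
}

/* Start the ticker thread. Returns 0 on success, or a pthread error code. */
int sleep_ticker_start(sleep_ticker *t, long interval_ms)
{
    assert(t != NULL && interval_ms > 0);
    t->interval_ms = interval_ms;
    atomic_init(&t->stop, false);
    atomic_init(&t->ticks, 0);
    return pthread_create(&t->thread, NULL, sleep_ticker_run, t);
}

/* Ask the thread to stop, wait for it, and return the tick count. */
long sleep_ticker_stop(sleep_ticker *t)
{
    atomic_store(&t->stop, true);
    pthread_join(t->thread, NULL);
    return atomic_load(&t->ticks);
}
```

Compared with a signal-driven timer, the trade-off is an extra thread and somewhat less precise tick timing, in exchange for never interrupting syscalls running in other threads.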

#10840: Periodic alarm signals can cause a retry loop to get stuck
Comment (by Ben Gamari

#10840: Periodic alarm signals can cause a retry loop to get stuck
Changes (by bgamari):
 * status: patch => closed
 * resolution: => fixed
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10840#comment:22