Signal handler scheduling on single-threaded RTS

I've been poking at the bad behavior Brian described here:

    http://blog.ezyang.com/2010/08/interrupting-ghc/comment-page-1/#comment-1334

and in the process noticed something kind of interesting about thread scheduling in non-multithreaded mode (i.e. without -threaded).

When the single-threaded RTS receives a signal, it writes it to pending_handler_buf, and hopes that eventually startSignalHandlers creates the threads to handle the signal. This doesn't happen instantaneously, which is /really/ obvious if you're looping on an FFI call (as a program using readline might). Here's an example (with some extra debug statements from me):

    installing Haskell signal handler
    cap 0: thread 1 stopped (suspended while making a foreign call)
    > ^Cstoring pending signal
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (suspended while making a foreign call)
    cap 0: running thread 1 (ThreadRunGHC)
    Just ""
    cap 0: thread 1 stopped (suspended while making a foreign call)
    > cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (yielding)
    cap 0: thread 1 appended to run queue
    starting signal handlers
    scheduling a thread to handle signal (signo=2)
    cap 0: created thread 2
    cap 0: thread 2 appended to run queue
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (suspended while making a foreign call)
    cap 0: running thread 1 (ThreadRunGHC)

As you can see, the signal is stored immediately, but thread 1 gets another crack at running the FFI call before it yields and we start signal handlers. Then, it turns out, thread 1 /never/ yields to anyone else, unless I add the following patch:

    hunk ./rts/posix/Signals.c 432
                 &base_GHCziConcziSignal_runHandlers_closure,
                 rts_mkPtr(cap, info)),
                 rts_mkInt(cap, info->si_signo))));
    +        contextSwitchCapability(&MainCapability);
         }
         unblockUserSignals();

With that patch, the program finally realizes that there's a signal and handles it about seven <ENTER> presses later.

I wonder if we should make it so that contextSwitchCapability(..) takes effect more promptly, possibly by checking for it before we start a safe FFI call and yielding before we go into the FFI.

Cheers,
Edward

P.S. -threaded execution is broken in a different way: the ^C doesn't get handled until the FFI call returns to Haskell. But that's another issue entirely, and one in which readline is partially to blame. :-)
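For readers following along, here is a toy model in C of the sequence described above. It is only a sketch and not RTS code: the Cap struct and the functions store_pending_signal, start_signal_handlers and yield_before_safe_ffi are hypothetical names made up for illustration (start_signal_handlers loosely mirrors the startSignalHandlers mentioned in the message), and no real signals or threads are involved. It assumes the patch is applied, so starting handlers also requests a context switch, and it models the proposed check before a safe foreign call.

```c
#include <assert.h>
#include <stdbool.h>

#define PENDING_MAX 16

/* Toy model of one capability. Every name here is hypothetical. */
typedef struct {
    int  pending[PENDING_MAX]; /* signals stored but not yet scheduled */
    int  n_pending;
    int  runnable_handlers;    /* handler threads sitting on the run queue */
    int  handlers_run;         /* handler threads that actually got to run */
    bool context_switch;       /* "please yield" request for the running thread */
} Cap;

/* Models the signal arriving: record it and return immediately.
 * Nothing is scheduled at this point. */
static void store_pending_signal(Cap *cap, int signo)
{
    assert(cap->n_pending < PENDING_MAX);
    cap->pending[cap->n_pending++] = signo;
}

/* Models the scheduler regaining control: create one handler thread per
 * pending signal. Setting context_switch models the effect of the patch
 * (assumption: without it, the running thread is never asked to yield). */
static void start_signal_handlers(Cap *cap)
{
    cap->runnable_handlers += cap->n_pending;
    cap->n_pending = 0;
    if (cap->runnable_handlers > 0)
        cap->context_switch = true;
}

/* Models the proposed check: before entering a safe foreign call, yield if
 * a context switch was requested, letting the queued handler threads run.
 * Returns true if the thread yielded instead of entering the call. */
static bool yield_before_safe_ffi(Cap *cap)
{
    if (!cap->context_switch)
        return false;          /* go straight into the foreign call */
    cap->context_switch = false;
    cap->handlers_run += cap->runnable_handlers;
    cap->runnable_handlers = 0;
    return true;
}
```

In this model, storing a signal alone does not make yield_before_safe_ffi yield; only after start_signal_handlers has run does the next attempted foreign call give way to the handler thread. That is the ordering the debug log shows, with the delay compressed to a single step.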

Relatedly, it seems that it takes a nontrivial amount of time for the multithreaded RTS to "realize" that a thread has emitted a signal. I wonder if there's a more direct way an FFI call can say "when I get back to Haskell, immediately start propagating an exception."

Edward

Oh, there's quite a simple fix for this: don't have the FFI call handle the SIGINT; only handle a signal the RTS generates. I guess we should do a little more legwork to make sure interruptible(?) user threads don't see signals.

Edward
participants (1)
-
Edward Z. Yang