Signal handler scheduling on single-threaded RTS

11 Nov 2010

I've been poking at the bad behavior Brian described here:

    http://blog.ezyang.com/2010/08/interrupting-ghc/comment-page-1/#comment-1334

and in the process noticed something kind of interesting about
thread scheduling in non-multithreaded mode (i.e. without -threaded).

When the single-threaded RTS receives a signal, it writes it to
pending_handler_buf, and hopes that eventually startSignalHandlers creates
the threads to handle the signal.  This doesn't happen instantaneously,
which is /really/ obvious if you're looping on an FFI call (as a program
using readline might).  Here's an example (with some extra debug statements
from me):

    installing Haskell signal handler
    cap 0: thread 1 stopped (suspended while making a foreign call)
    > ^Cstoring pending signal

    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (suspended while making a foreign call)
    cap 0: running thread 1 (ThreadRunGHC)
    Just ""
    cap 0: thread 1 stopped (suspended while making a foreign call)
    > 
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (yielding)
    cap 0: thread 1 appended to run queue
    starting signal handlers
    scheduling a thread to handle signal (signo=2)
    cap 0: created thread 2
    cap 0: thread 2 appended to run queue
    cap 0: running thread 1 (ThreadRunGHC)
    cap 0: thread 1 stopped (suspended while making a foreign call)
    cap 0: running thread 1 (ThreadRunGHC)

As you can see, the signal is stored immediately, but thread 1 gets
another crack at running the FFI call before it yields and we start
signal handlers.
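
To make the store-now, handle-later split concrete, here is a toy model in
plain C. It is only a sketch of the general pattern, not the RTS code: every
toy_* name is made up for illustration, and the real pending_handler_buf and
startSignalHandlers do more than this.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

#define TOY_PENDING_MAX 16

/* Hypothetical stand-in for the pending-signal buffer. */
static volatile sig_atomic_t toy_pending[TOY_PENDING_MAX];
static volatile sig_atomic_t toy_n_pending = 0;

/* Runs in signal context: record the signal number and return.
 * If the buffer is full, this toy version simply drops the signal. */
void toy_store_pending(int signo)
{
    if (toy_n_pending < TOY_PENDING_MAX) {
        toy_pending[toy_n_pending] = signo;
        toy_n_pending = toy_n_pending + 1;
    }
}

/* Called from the (toy) scheduler when it gets around to it.
 * Copies up to out_len pending signal numbers into out, empties the
 * buffer (anything that did not fit in out is discarded), and returns
 * how many were copied.  Signals are blocked while the buffer is
 * drained so the handler cannot write to it at the same time. */
size_t toy_start_handlers(int *out, size_t out_len)
{
    sigset_t all, old;
    size_t n = 0;

    sigfillset(&all);
    sigprocmask(SIG_BLOCK, &all, &old);
    while (n < (size_t)toy_n_pending && n < out_len) {
        out[n] = (int)toy_pending[n];
        n++;
    }
    toy_n_pending = 0;
    sigprocmask(SIG_SETMASK, &old, NULL);
    return n;
}
```

If toy_store_pending is installed with signal(SIGINT, toy_store_pending),
a ^C only fills the buffer; nothing reacts to it until some later call to
toy_start_handlers, which is the gap visible in the log above.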

Then, it turns out, thread 1 /never/ yields to anyone else, unless I
add the following patch:

hunk ./rts/posix/Signals.c 432
                                            &base_GHCziConcziSignal_runHandlers_closure,
                                            rts_mkPtr(cap, info)),
                                  rts_mkInt(cap, info->si_signo))));
+    contextSwitchCapability(&MainCapability);
   }

   unblockUserSignals();

In which case the program finally realizes that there's a signal and handles
it about seven <ENTER> presses later.

I wonder if we should make contextSwitchCapability(..) take effect more
promptly, possibly by checking for a pending context switch before we start
a safe FFI call and yielding instead of going into the FFI.
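
Roughly what I have in mind, as a toy model in plain C. None of this is
actual RTS code: struct toy_cap, toy_context_switch and toy_safe_call are
hypothetical names, and the real scheduler would put the thread back on the
run queue rather than return a flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a capability: just the flag we care about
 * plus a counter so the effect is observable. */
struct toy_cap {
    volatile bool context_switch;   /* set when someone wants a switch */
    unsigned yields;                /* how often we yielded instead of calling */
};

/* Counter bumped by the stand-in foreign function below. */
int toy_calls_made = 0;

/* Stand-in for a blocking foreign function such as readline. */
void toy_foreign_call(void)
{
    toy_calls_made++;
}

/* Request a context switch, in the spirit of contextSwitchCapability. */
void toy_context_switch(struct toy_cap *cap)
{
    cap->context_switch = true;
}

/* Check the flag *before* entering the foreign call.  Returns true if
 * the call was made, false if the thread yielded instead; in the latter
 * case the caller is expected to retry once it is scheduled again. */
bool toy_safe_call(struct toy_cap *cap, void (*fn)(void))
{
    if (cap->context_switch) {
        cap->context_switch = false;
        cap->yields++;
        return false;
    }
    fn();
    return true;
}
```

With this shape, a context switch requested while thread 1 is between
foreign calls costs at most one retry, instead of waiting for the thread to
yield on its own.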

Cheers,
Edward

P.S. -threaded execution is broken in a different way: the ^C doesn't get handled
until the FFI call returns to Haskell. But that's another issue entirely, and one
in which readline is partially to blame. :-)

Edward Z. Yang
