Signals and external bindings...

I've just finished a QAD project in Python, and expect them to ask
me to build the production version of the same project. I'd like to
switch to Haskell, but (again, for those who noticed) have some
questions I'd like answered.
One problem I ran into is that I use Unix signals to control the
various processes. In CPython, this causes a problem with extension
code because the Python signal handler at the C level just notes the
signal, then waits to run the Python "signal handler" the next time
the interpreter gets control. Since my extensions are doing the heavy
lifting, they often run for long periods (by which I mean tens of
minutes), meaning the signal is ignored for long periods.
Since I expect to use such extensions (the wrappers are available) and
want to leave the control mechanisms in place, I'd like to know if I'm
going to have similar problems in Haskell.
Thanks,

On Wed, May 2, 2012 at 1:00 PM, Mike Meyer wrote:
> One problem I ran into is that I use Unix signals to control the various processes. In CPython, this causes a problem with extension code because the Python signal handler at the C level just notes the signal, then waits to run the Python "signal handler" the next time the interpreter gets control. Since my extensions are doing the heavy lifting, they often run for long periods (by which I mean tens of minutes), meaning the signal is ignored for long periods.
> Since I expect to use such extensions (the wrappers are available) and want to leave the control mechanisms in place, I'd like to know if I'm going to have similar problems in Haskell.
Yes, and in pretty much any other language that requires its own runtime as well. Cross-runtime borders are *always* a problem for signals and various resources that may need to be cleaned up.
You can use wrappers which save the old signal handlers and install new ones which clean up after your plugins and return. Doing so, and thereby learning the hard way what "clean up after your plugins" entails (unless you were very careful designing and writing them in the first place), will teach you why nobody tries to automatically handle it for you. (In the general case, your plugin has to track *everything* so it can unwind (or commit, as appropriate) memory and resource allocations on signal.)
--
brandon s allbery allbery.b@gmail.com
wandering unix systems administrator (available) (412) 475-9364 vm/sms
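
The save-and-reinstall wrapper described above might look roughly like this in Python (the language of the original prototype). It is only a sketch: plugin_cleanup_handler and the cleanup callable are hypothetical names, and in CPython the installed function still runs only once the interpreter regains control, so it does not interrupt a long-running extension call.

```python
import contextlib
import signal


@contextlib.contextmanager
def plugin_cleanup_handler(signum, cleanup):
    """Install a handler that runs `cleanup`; restore the old handler on exit.

    `cleanup` is a hypothetical zero-argument callable supplied by the
    plugin author. Knowing what it has to undo is the hard part.
    """
    def handler(received_signum, frame):
        cleanup()

    # signal.signal returns the previously installed handler.
    previous = signal.signal(signum, handler)
    try:
        yield previous
    finally:
        # None means the old handler was not installed from Python,
        # so there is nothing we can hand back to signal.signal.
        if previous is not None:
            signal.signal(signum, previous)
```

The plugin call would go inside the with block; everything that cleanup has to track is exactly the bookkeeping the message above warns about.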

On Wed, 2 May 2012 13:46:39 -0400
Brandon Allbery wrote:
> You can use wrappers which save the old signal handlers and install new ones which clean up after your plugins and return. Doing so, and thereby learning the hard way what "clean up after your plugins" entails (unless you were very careful designing and writing them in the first place), will teach you why nobody tries to automatically handle it for you. (In the general case, your plugin has to track *everything* so it can unwind (or commit, as appropriate) memory and resource allocations on signal.)
So unless I'm using something like Qt which can catch the signals and
run its own code to call back to my code and then shut itself down,
I'm pretty much SOL.
Thanks,

On Wed, May 2, 2012 at 2:29 PM, Mike Meyer wrote:
> So unless I'm using something like Qt which can catch the signals and run its own code to call back to my code and then shut itself down, I'm pretty much SOL.
Pretty much. And Qt only makes it a little easier; I think there are ways to do that with Haskell as well, but it doesn't necessarily gain you much, aside from confusion if something goes wrong because of all the boundary crossings.
--
brandon s allbery

On 2 May 2012 19:29, Mike Meyer wrote:
> So unless I'm using something like Qt which can catch the signals and run its own code to call back to my code and then shut itself down, I'm pretty much SOL.
What would happen if there was a Haskell process that ran as a dispatcher/telephone exchange? If this received a message from a C process and then sent the signal, wouldn't this work?

On Thu, May 3, 2012 at 11:48 AM, umptious wrote:
> What would happen if there was a Haskell process that ran as a dispatcher/telephone exchange? If this received a message from a C process and then sent the signal, wouldn't this work?
What problem do you think this is solving? It isn't solving the problem of the Haskell runtime being unable to clean up the internal state of arbitrary C code, which is the reason C code is run the way it is by most other environments.
--
brandon s allbery

> What problem do you think this is solving? Because it isn't solving the problem with the Haskell runtime being unable to clean up the internal state of arbitrary C code, which is the reason C code is run the way it is by most other environments.
Well the OP wrote
>>Since my extensions are doing the heavy lifting, they often run for long periods (by which I mean 10s of minutes). Meaning the signal is ignored for long periods.<<
and you responded
>>Cross-runtime borders are *always* a problem for signals and various resources that may need to be cleaned up.<<
So if both statements are true and non-misleading, a solution with a dispatcher thread to catch messages would avoid this problem, because there would be no cross-runtime border for signals. And the dispatcher thread could respond to messages immediately with "I'm taking responsibility for this message now", possibly simplifying the C code. Or, if the C code has to send signals, wouldn't it be possible for a dispatcher thread to catch them, and wouldn't that simplify the architecture?
> Because it isn't solving the problem with the Haskell runtime being unable to clean up the internal state of arbitrary C code, which is the reason C code is run the way it is by most other environments.
Well, NOTHING Haskell could do could clean up after literally arbitrary C code! But was the OP asking that question? It would seem a rather strange one. I thought the problem was that the additional complexity in said clean-up came from the very long time required to respond to signals, and the OP was asking if there was any way to avoid this - which is a very different question.

On Fri, May 4, 2012 at 12:23 PM, umptious wrote:
> Well the OP wrote
> >>Since my extensions are doing the heavy lifting, they often run for long periods (by which I mean 10s of minutes). Meaning the signal is ignored for long periods.<<
> and you responded
> >>Cross-runtime borders are *always* a problem for signals and various resources that may need to be cleaned up.<<
> So if both statements are true and non-misleading, a solution with a dispatcher thread to catch messages would avoid this problem. Because there would be no cross-runtime border for signals. And the dispatcher thread could respond to messages immediately with "I'm taking responsibility for this message now" possibly simplifying the C code.
This is about OS signals, not some kind of routable or controllable messages. Your options are ignore, die, or a *single* signal handler which can only safely do a limited number of things.
In any case, the OP is looking for some kind of unwind-on-signal capability so that receipt of a signal safely aborts the C function in such a way that the Haskell or Python runtime can resume and neither memory nor resources (it's unclear what the hooks are doing, but conceivably they could be working with files etc.) will be leaked. In practice, when a signal arrives you can't guarantee much of anything, and in particular neither malloc() nor stdio is guaranteed to be consistent; if you want to recover cleanly, you can only set a flag that the main flow of execution checks regularly (or some morally equivalent mechanism; event-driven systems, including GHC's own runtime, generally write a byte down a designated pipe which is then handled by the standard event loop). There just isn't a good way to deal with this otherwise even in an all-C system; when you're mixing disparate runtimes, it's close to impossible to handle it sanely.
> Well, NOTHING Haskell could do could clean up after literally arbitrary C code! But was the OP asking that question? It would seem a rather strange one. I thought the problem was that the additional complexity in said clean-up came from the very long time required to respond to signals, and the OP was asking if there was any way to avoid this - which is a very different question.
Only a different question if you don't have any idea what's going on... The OP specifically noted that signals are blocked during cross-calls from Python and asked if the same is true of Haskell; the answer is yes, and also for most other runtimes. I explained *why* they are blocked. What the OP wants is for a signal to abort the C function and return control to the Haskell or Python runtime. Do you understand why this requires arbitrary cleanup capability to be safe? (Hrm; do you understand that C doesn't have garbage collection or even reference counting, but is entirely manual resource management? I didn't explicitly say that, although what I did say should have implied it.)
--
brandon s allbery
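
The "write a byte down a designated pipe" mechanism mentioned above can be sketched in Python, whose standard library exposes it as signal.set_wakeup_fd. This illustrates the general pattern only, not GHC's runtime; the function names make_signal_pipe and wait_for_signals are made up for the example.

```python
import os
import select
import signal


def make_signal_pipe(signums):
    """Route the given signals through a pipe that an event loop can poll."""
    read_fd, write_fd = os.pipe()
    # set_wakeup_fd requires a non-blocking descriptor.
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)
    # CPython's C-level handler writes the signal number, as one byte,
    # to this descriptor each time a signal arrives.
    signal.set_wakeup_fd(write_fd)
    for signum in signums:
        # A Python-level handler must be installed for the C-level
        # handler to be active; here it deliberately does nothing.
        signal.signal(signum, lambda received_signum, frame: None)
    return read_fd, write_fd


def wait_for_signals(read_fd, timeout):
    """Return the signal numbers that arrived, or [] after `timeout` seconds."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    if not ready:
        return []
    return list(os.read(read_fd, 64))
```

The event loop treats read_fd like any other descriptor, so all real work happens in the main flow of execution rather than in the signal context. None of this helps while a foreign call is hogging the thread that runs the loop.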

On Fri, 4 May 2012 12:54:00 -0400
Brandon Allbery wrote:
> What the OP wants is for a signal to abort the C function and return control to the Haskell or Python runtime.
Not quite. What I want is for my code in Haskell or Python to run when the signal arrives (basically, in the signal handler), as opposed to waiting until the C code returns control to the Haskell or Python runtime. But that's just the first step.
> Do you understand why this requires arbitrary cleanup capability to be safe?
Well, it's clear to me that what I want to do requires arbitrary
cleanup capabilities. If the C library doesn't provide a mechanism to
shut itself down cleanly, nothing will work.
But I realized I never got a more fundamental question answered: when
does a signal handler written in Haskell run? Is it like C, in that it
runs when the signal arrives, even if I'm currently executing some
wrapped C code, and I have to deal with funky rules about what it can
do? Or is it like Python, and it will run the first time the Haskell
runtime gets control after the signal arrives?
Thanks,

On Fri, May 4, 2012 at 1:26 PM, Mike Meyer wrote:
> But I realized I never got a more fundamental question answered: when does a signal handler written in Haskell run? Is it like C, in that it runs when the signal arrives, even if I'm currently executing some wrapped C code, and I have to deal with funky rules about what it can do? Or is it like Python, and it will run the first time the Haskell runtime gets control after the signal arrives?
I'm not sure it *does* run. That is, I don't think a Haskell-based signal handler is even possible. When I talk about Haskell's signal handlers, I mean C handlers within the GHC runtime.
1. The GHC runtime only protects garbage collection from signals; it doesn't protect anything else. In particular, unless memory allocation is somehow atomic, a signal arriving while something is doing an allocation dare not itself do any allocation, which means it can do very nearly nothing.
2. Registering a Haskell-based signal handler would be a special case of arranging for C to call into Haskell code (since that is in fact what it would be doing; the real signal handler is C invoked from an assembler trampoline). As things currently work, this can only be done from calls to C which suspend the Haskell runtime in a known safe state; asynchronous signals don't provide any way to arrange this, any more than they ensure C's malloc() arena is consistent, etc.
If a Haskell-hosted signal handler mechanism were to be provided, it would have to be done by invoking the handler from the next run of the event loop. The existing signal handlers work this way already, as I mentioned; the actual signal handler writes a byte down a pipe, and the runtime acts on it from its event loop. This is because there is so little that is safe to do from any signal handler, and even less that is safe within the context of GHC's runtime.
--
brandon s allbery

On Fri, May 4, 2012 at 1:19 PM, Brandon Allbery wrote:
> I'm not sure it *does* run. That is, I don't think a Haskell-based signal handler is even possible. When I talk about Haskell's signal handlers, I mean C handlers within the GHC runtime.
Well, there is this:
http://www.haskell.org/ghc/docs/latest/html/libraries/unix-2.5.1.0/System-Po...
But it's a bit short on what exact semantics it provides. It looks like it uses this chunk of code on GHC (on Posix systems):
https://github.com/ghc/ghc/blob/376210565e4dff2679246c6ebbcdbb3163c9e8a5/rts...
Antoine

On Fri, May 4, 2012 at 3:55 PM, Antoine Latter wrote:
> Well, there is this:
> http://www.haskell.org/ghc/docs/latest/html/libraries/unix-2.5.1.0/System-Po...
OK, looks like that does what I said in my final paragraph (which was written based on the runtime Signals.c source).
--
brandon s allbery

On Fri, 4 May 2012 14:19:37 -0400
Brandon Allbery wrote:
> I'm not sure it *does* run. That is, I don't think a Haskell-based signal handler is even possible. When I talk about Haskell's signal handlers, I mean C handlers within the GHC runtime.
Ok, now we're just having vocabulary issues. "Signal handler" is used to mean two different things:
1) Code that runs in the context provided by the OS for handling the signal. As you say, this is typically written in C, and has a bunch of restrictions on what it can and cannot do.
2) Code that someone writes to be run in response to a signal from the OS.
In C (and presumably C# and C++), the two are identical. The restrictions may well cause you to write handler-1 code that just sets a notice of some kind to run code in your main context when it's convenient. The latter is technically handler-2 code, but calling such code a signal handler isn't common.
> 1. The GHC runtime only protects garbage collection from signals, it doesn't protect anything else. In particular, unless memory allocation is somehow atomic, a signal arriving while something is doing an allocation dare not itself do any allocation, which means it can do very nearly nothing.
I have to wonder how the Boehm GC handles such things?
> If a Haskell-hosted signal handler mechanism were to be provided, it would have to be done by invoking the handler from the next run of the event loop. The existing signal handlers work this way already, as I mentioned already; the actual signal handler writes a byte down a pipe, and the runtime acts on it from its event loop. This is because there is so little that is safe to do from any signal handler, and even less that is safe within the context of GHC's runtime.
Well, one is provided, and implemented as you describe. Given a
non-trivial runtime for the language, it seems you either have to do
that, or provide a handler-1 environment with so many restrictions
that you can't do anything useful, or write a handler-1 that runs the
handler-2 once the runtime becomes available again.
I guess my question was "which of those does Haskell do", and the
answer is the last one.
The languages that do the latter all seem to mask the fact that you're
writing handler-2 code: the API is similar to the OS API, and the
handler-1 code is pretty much invisible.
Thanks,
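
The handler-1/handler-2 split can be modelled in a few lines of Python. This is a sketch of the idea only: DeferredSignals and its method names are invented here, and in CPython the function passed to signal.signal is itself already run from the interpreter loop rather than in the OS signal context, so _note merely plays the role of handler-1.

```python
import signal


class DeferredSignals:
    """Record signals on arrival; run the user's code later from the main flow."""

    def __init__(self):
        self._pending = []   # signal numbers noted but not yet acted on
        self._actions = {}   # signal number -> user-supplied callable

    def register(self, signum, action):
        """Associate `action` (the handler-2 code) with `signum`."""
        self._actions[signum] = action
        signal.signal(signum, self._note)

    def _note(self, signum, frame):
        # Handler-1 role: do as little as possible, just note the signal.
        self._pending.append(signum)

    def run_pending(self):
        """Handler-2 role: call this from the main loop when it is convenient."""
        ran = 0
        while self._pending:
            signum = self._pending.pop(0)
            self._actions[signum](signum)
            ran += 1
        return ran
```

The API a user sees (register) looks much like the OS one, which is how the handler-1 half ends up pretty much invisible.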

On Fri, 4 May 2012 17:23:36 +0100
umptious wrote:
> So if both statements are true and non-misleading, a solution with a dispatcher thread to catch messages would avoid this problem. Because there would be no cross-runtime border for signals. And the dispatcher thread could respond to messages immediately with "I'm taking responsibility for this message now" possibly simplifying the C code.
My (aka the OP's) problem isn't with signals per se, it's with shutting down the C code cleanly. The way things are handled in the current implementation does pretty much what I understand you to be saying here: the runtime catches the signal. Since it was jumped into from the C code, it can't really do anything about it, so it notes it for the next time it's actually in control (which is how I read your "then sent the signal"). The problem is that the C code may run for minutes before that happens.
> Or the C code has to send signals, then wouldn't it be possible for a dispatcher thread to catch them, and wouldn't that simplify architecture?
No, that doesn't help. I don't have threads now, so why add another problem? In any case, now the dispatcher thread catches the signal while waiting "in the runtime" and can process it immediately, but it's still got to deal with shutting down the thread running the C code. Either way, unless the C code provides some way to say "stop this", you're SOL. And that may not be enough. If you're running in a signal handler triggered while running the C code, you have to follow whatever rules *it* has for doing so. In my case that was "you can't call any of our methods in a signal handler."
> Well, NOTHING Haskell could do could clean up after literally arbitrary C code! But was the OP asking that question? It would seem a rather strange one. I thought the problem was that the additional complexity in said clean-up came from the very long time required to respond to signals, and the OP was asking if there was any way to avoid this - which is a very different question.
Actually, the question I was trying to ask is:
"Will my Haskell signal handlers get run when the signal arrives, or
do I have to wait for the C code to return control to Haskell for them
to run?" I think the answer is "I have to wait".
The way I beat the problem in the current code was to add a timer to
the C code that does nothing every second, but does it via a callback
into my runtime. That's when my signal handler gets run. Since that's
not running in the actual signal handler (which just notes the signal
for when control returns to my runtime), it can use the methods the
C library provides to stop it cleanly.
It works. It's ugly. It's very touchy about code changes. It was a
pain to figure out. But it does work.
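
For anyone trying the same trick, the shape of the timer workaround is roughly as follows. This is a pure-Python simulation under stated assumptions: FakeLibrary is a hypothetical stand-in for the C library (its periodic on_tick hook plays the once-a-second timer), and StopFlag stands in for the real signal handler that only notes the signal.

```python
class FakeLibrary:
    """Hypothetical stand-in for a long-running C library."""

    def __init__(self, total_steps, tick_every):
        self.total_steps = total_steps
        self.tick_every = tick_every
        self.steps_done = 0
        self._stop_requested = False

    def request_stop(self):
        """The library's own clean-shutdown entry point."""
        self._stop_requested = True

    def run(self, on_tick):
        """Do the work, calling `on_tick(self)` every `tick_every` steps."""
        for step in range(1, self.total_steps + 1):
            if self._stop_requested:
                break
            self.steps_done = step
            if step % self.tick_every == 0:
                on_tick(self)
        return self.steps_done


class StopFlag:
    """Plays the part of the real signal handler: it only notes the signal."""

    def __init__(self):
        self.pending = False

    def note_signal(self, signum=None, frame=None):
        self.pending = True


def make_tick(flag):
    """Build the timer callback that acts on a noted signal."""
    def on_tick(library):
        # We are in the runtime's normal context here, not in the signal
        # handler, so calling the library's stop method is allowed.
        if flag.pending:
            library.request_stop()
    return on_tick
```

In real use note_signal would be registered with signal.signal, and the worst-case delay before the library stops is one timer period rather than the length of the whole C call.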

It is a tricky subject, but not insurmountable.
http://blog.ezyang.com/2010/08/interrupting-ghc/
http://www.haskell.org/ghc/docs/7.2.2/html/users_guide/ffi.html#ffi-interrup...
http://blog.ezyang.com/2010/11/its-just-a-longjmp-to-the-left/
Cheers,
Edward
Excerpts from Mike Meyer's message of Wed May 02 13:00:41 -0400 2012:
> I've just finished a QAD project in Python, and expect them to ask me to build the production version of the same project. I'd like to switch to Haskell, but (again, for those who noticed) have some questions I'd like answered.
> One problem I ran into is that I use Unix signals to control the various processes. In CPython, this causes a problem with extension code because the Python signal handler at the C level just notes the signal, then waits to run the Python "signal handler" the next time the interpreter gets control. Since my extensions are doing the heavy lifting, they often run for long periods (by which I mean tens of minutes), meaning the signal is ignored for long periods.
> Since I expect to use such extensions (the wrappers are available) and want to leave the control mechanisms in place, I'd like to know if I'm going to have similar problems in Haskell.
> Thanks,
participants (5)
- Antoine Latter
- Brandon Allbery
- Edward Z. Yang
- Mike Meyer
- umptious