Why it's dangerous to fork off a new process in Glasgow Haskell

I have some code which forks off a new process which does some non-Haskelly stuff. Thus it does the standard things, something like

   processId <- Posix.forkProcess
   case processId of
      Just _ -> -- parent process, continue normally
         ...
      Nothing -> -- child process
         do
            ... -- rearrange plumbing of stdin/stdout/stderr
            Posix.executeFile [blah]

The GHC runProcess function uses a similar mechanism.

Unfortunately this has developed an extremely irritating bug. The problem is that my program has a lot of worker threads which do things like communicating with servers, already running programs, and so on. These worker threads get replicated in the child process (like everything else) which means that you now have two clients communicating with the server, where there ought to be one, result chaos. (Or in this case, the entire program coming to a halt.) I've had this code for several years (I inherited it), but the problem seems to have recently got much worse; I don't know whether this is because of a change in GHC's scheduling with ghc5.04, or because I am now compiling much more complicated programs.

Whatever, I simply couldn't find a way of fixing this in Haskell. It would be good if there were a way of telling GHC's RTS scheduler "Please don't run any other threads apart from this one until further notice". If so, you could do

   [block other threads]
   [do fork]
   [if parent process
       then
          unblock other threads
          ...
       else -- child process
          rearrange plumbing and do exec
       ]

But there isn't.

So in the end I resorted to writing a C function which does both the forking and replumbing and execing. Since GHC doesn't run native threads while foreign C code is executing, this fixes the problem.

It would be nice if there were a better way . . .
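For reference, the fork / replumb / exec pattern described in this message can be sketched with the present-day `unix` package. This is an illustration under assumptions, not the code from the original program: the module names (`System.Posix.Process`, `System.Posix.IO`) and the `forkProcess :: IO () -> IO ProcessID` signature are those of current releases rather than the `Posix` module quoted above, and the helper name `runAndCapture`, the `echo` command and the single-pipe layout are invented to keep the example small.

```haskell
import Control.Exception (evaluate)
import System.IO (hGetContents)
import System.Posix.IO (closeFd, createPipe, dupTo, fdToHandle, stdOutput)
import System.Posix.Process (executeFile, forkProcess, getProcessStatus)

-- | Hypothetical helper: run a command in a child process and collect
-- everything it writes to stdout.
runAndCapture :: FilePath -> [String] -> IO String
runAndCapture cmd args = do
  (readEnd, writeEnd) <- createPipe
  pid <- forkProcess $ do
    -- Child: make the pipe's write end the new stdout, drop the
    -- descriptors that are no longer needed, then exec the tool.
    _ <- dupTo writeEnd stdOutput
    closeFd readEnd
    closeFd writeEnd
    executeFile cmd True args Nothing   -- True: look the command up on PATH
  -- Parent: close our copy of the write end so the read below sees EOF.
  closeFd writeEnd
  h <- fdToHandle readEnd
  out <- hGetContents h
  _ <- evaluate (length out)            -- force the lazy read up to EOF
  _ <- getProcessStatus True False pid  -- block until the child is reaped
  pure out

main :: IO ()
main = runAndCapture "echo" ["hello from the child"] >>= putStr
```

Current documentation for `forkProcess` in the `unix` package states that only the calling thread is copied into the child, which is a different behaviour from the thread replication reported in this message for GHC 5.04.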

It would be good if there were a way of telling GHC's RTS scheduler "Please don't run any other threads apart from this one until further notice".
It seems like you're only worried about the threads which interact with your servers. A thread which continues printing prime numbers or whatever to the screen probably wouldn't cause problems.

It also sounds like you want to be able to tell the RTS 'and if one of the other threads is already talking to the servers, let it finish first'.

In other words, it sounds like the right thing is to protect access to the servers with some kind of lock (MVar, Semaphore, etc.).

Is there a reason why this won't work?

[I'm currently working on a system which lets you define your own scheduler hierarchy. Each scheduler has different consequences for preemption and, hence, the kind of locks that its children may or may not need to protect themselves from their siblings. Reasoning about which locks to use where is... challenging. I'd be interested if your program cannot be solved with a carefully placed lock but can be solved with a novel scheduler hierarchy. Not that it'd do _you_ any good though - the scheduler hierarchy is for embedded systems which tend not to be written in Haskell...]

--
Alastair Reid
alastair@reid-consulting-uk.ltd.uk
Reid Consulting (UK) Limited
http://www.reid-consulting-uk.ltd.uk/alastair/
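The lock suggested here might look roughly like the following. It is a minimal sketch using only `Control.Concurrent.MVar` from `base`; the names `withServerLock` and `worker` are hypothetical, the "server conversation" is just a `putStrLn`, and the fork itself is represented by a placeholder action rather than a real `forkProcess` call.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
  (MVar, newEmptyMVar, newMVar, putMVar, takeMVar, withMVar)

-- | Run an action while holding the (hypothetical) server lock.
withServerLock :: MVar () -> IO a -> IO a
withServerLock lock action = withMVar lock (\() -> action)

-- | Stand-in for one complete request/response exchange with a server.
worker :: MVar () -> Int -> IO ()
worker lock n =
  withServerLock lock
    (putStrLn ("worker: request " ++ show n ++ " sent and answered"))

main :: IO ()
main = do
  lock <- newMVar ()
  done <- newEmptyMVar
  -- A worker thread that talks to the "server" three times.
  _ <- forkIO (mapM_ (worker lock) [1 .. 3 :: Int] >> putMVar done ())
  -- The forking thread takes the same lock, so the fork can never
  -- happen while a request is half finished.
  withServerLock lock (putStrLn "main: fork/exec would happen here")
  takeMVar done
```

If the runtime copies every thread into the child, as described in the original message, the child's copy of the lock would presumably still be held at the moment of the fork, so replicated workers would block on it until exec replaces the process. That is an inference from the behaviour described in this thread, not something checked against GHC 5.04.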

Alastair Reid wrote:
It would be good if there were a way of telling GHC's RTS scheduler "Please don't run any other threads apart from this one until further notice".
It seems like you're only worried about the threads which interact with your servers. A thread which continues printing prime numbers or whatever to the screen probably wouldn't cause problems.
What is "the screen"? If you mean stdout, it most certainly would cause problems since stdout of the child process corresponds to the tool's output, so the tool would appear to emit prime numbers at the start.
It also sounds like you want to be able to tell the RTS 'and if one of the other threads is already talking to the servers, let it finish first'.
In other words, it sounds like the right thing is to protect access to the servers with some kind of lock (MVar, Semaphore, etc.).
Is there a reason why this won't work?
So before I send every character to any external device (and I have quite a few of them) I have to check an MVar? No thank you. It's not just servers, I also have external tools and so on. [snip]

In local.glasgow-haskell-users, you wrote:
In other words, it sounds like the right thing is to protect access to the servers with some kind of lock (MVar, Semaphore, etc.). Is there a reason why this won't work?
That's what I thought, too, but then it turned out to be a real hassle to rewrite the entire application to synchronise on a specific location. I'm also quite sure that this would introduce accidental deadlocks for non-trivial concurrent systems.

Assume you have some kind of asynchronous database running in a separate thread, listening on a channel for requests. Now you have to introduce a construct which not only waits (suspends) on the channel, but also on another MVar (or add a new "command" for shut down). I tried this, but it really turned out to be much more overhead, especially in terms of maintainability of the source code.

Granted, from the software-engineering point of view, it is much cleaner, but on the other hand, you have the already existing semantics of fork() in a pthread which does exactly the same: only fork the current thread and not the others.

I was toying with two functions called freezeThread and thawThread which would allow you to selectively halt and resume threads, but even the Java guys dropped that from their specs because of the potential havoc you could cause.

--
plonk :: m a -> m ()
http://www-i2.informatik.rwth-aachen.de/stolz/ *** PGP *** S/MIME
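The 'new "command" for shut down' variant mentioned in this message can be sketched as follows, using only `Control.Concurrent` from `base`. The `Request` type, `dbWorker` and `query` are hypothetical names, and the "database" simply echoes the query; the point is only to show the extra constructor and acknowledgement that every such worker would need.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)

-- | Messages understood by the (hypothetical) database thread.
data Request
  = Query String (MVar String)  -- a query plus a slot for the reply
  | Shutdown (MVar ())          -- stop, then acknowledge

-- | The worker loop: serve queries until told to shut down.
dbWorker :: Chan Request -> IO ()
dbWorker chan = loop
  where
    loop = do
      req <- readChan chan
      case req of
        Query q reply -> putMVar reply ("result for " ++ q) >> loop
        Shutdown ack  -> putMVar ack ()   -- the loop ends here

-- | Send one query and wait for its answer.
query :: Chan Request -> String -> IO String
query chan q = do
  reply <- newEmptyMVar
  writeChan chan (Query q reply)
  takeMVar reply

main :: IO ()
main = do
  chan <- newChan
  _ <- forkIO (dbWorker chan)
  query chan "select 1" >>= putStrLn
  -- Before forking, ask the worker to stop and wait until it has.
  ack <- newEmptyMVar
  writeChan chan (Shutdown ack)
  takeMVar ack
  putStrLn "worker stopped; a fork would be safe now"
```

Even in this small form the maintenance cost described above is visible: the request type, the loop and every caller that wants to fork all have to know about the shutdown protocol.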

I have some code which forks off a new process which does some non-Haskelly stuff. Thus it does the standard things, something like
   processId <- Posix.forkProcess
   case processId of
      Just _ -> -- parent process, continue normally
         ...
      Nothing -> -- child process
         do
            ... -- rearrange plumbing of stdin/stdout/stderr
            Posix.executeFile [blah]
The GHC runProcess function uses a similar mechanism.
Unfortunately this has developed an extremely irritating bug. The problem is that my program has a lot of worker threads which do things like communicating with servers, already running programs, and so on. These worker threads get replicated in the child process (like everything else) which means that you now have two clients communicating with the server, where there ought to be one, result chaos. (Or in this case, the entire program coming to a halt.) I've had this code for several years (I inherited it), but the problem seems to have recently got much worse; I don't know whether this is because of a change in GHC's scheduling with ghc5.04, or because I am now compiling much more complicated programs.
Whatever, I simply couldn't find a way of fixing this in Haskell. It would be good if there were a way of telling GHC's RTS scheduler "Please don't run any other threads apart from this one until further notice". If so, you could do
   [block other threads]
   [do fork]
   [if parent process
       then
          unblock other threads
          ...
       else -- child process
          rearrange plumbing and do exec
       ]
But there isn't.
So in the end I resorted to writing a C function which does both the forking and replumbing and execing. Since GHC doesn't run native threads while foreign C code is executing,
Have a look at GHC.Conc.forkProcess, which Volker Stolz
contributed a while ago. Does what you want, but I think you're
actually doing the Right Thing by implementing the fork() & exec()
outside of Haskell.
--sigbjorn
----- Original Message -----
From: "George Russell"
It would be nice if there were a better way . . .
_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Sigbjorn Finne wrote:
Have a look at GHC.Conc.forkProcess, which Volker Stolz contributed a while ago. Does what you want, but I think you're actually doing the Right Thing by implementing the fork() & exec() outside of Haskell.
[snip] Interesting. How exactly does it work, by the way? Posix.runProcess really should use it I think. I wish Posix.runProcess was available and worked on both Windows and Unix. I am now in the unhappy position of having had to code equivalent functions in C on both. The signature of Posix.runProcess is not quite adequate though; one wants some control over the child thread. At a minimum, it would be nice to be able to kill it.

In local.glasgow-haskell-users, you wrote:
Sigbjorn Finne wrote:
Have a look at GHC.Conc.forkProcess
Interesting. How exactly does it work, by the way?
Trade secret ;) Simply drop the TSO pointers to all other threads from the queues so the RTS won't find them again (GC is a different issue).
Posix.runProcess really should use it I think.
No, it's better to be able to choose the way to handle this. Maybe add a flag to Posix.runProcess. But the whole GHC.Conc.forkProcess isn't finished, yet, anyway.
The signature of Posix.runProcess is not quite adequate though; one wants some control over the child thread. At a minimum, it would be nice to be able to kill it.
[on "child thread":] process?
You still get the child's pid, so feel free to mess with it.

--
plonk :: m a -> m ()
http://www-i2.informatik.rwth-aachen.de/stolz/ *** PGP *** S/MIME
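On that last point, a pid is indeed enough to signal the child. A minimal sketch, assuming the present-day `unix` package (`System.Posix.Signals.signalProcess`) rather than the `Posix` module discussed in this thread; the helper name `killChild` and the `sleep 60` child are made up for illustration.

```haskell
import System.Posix.Process
  (ProcessStatus (..), executeFile, forkProcess, getProcessStatus)
import System.Posix.Signals (sigTERM, signalProcess)
import System.Posix.Types (ProcessID)

-- | Hypothetical helper: ask a child to terminate, then reap it and
-- report how it ended.
killChild :: ProcessID -> IO (Maybe ProcessStatus)
killChild pid = do
  signalProcess sigTERM pid
  -- Block until the child has changed state; stopped children are ignored.
  getProcessStatus True False pid

main :: IO ()
main = do
  -- Start a long-running child, standing in for an external tool.
  pid <- forkProcess (executeFile "sleep" True ["60"] Nothing)
  status <- killChild pid
  print status
```

This only covers Unix. It does not address the wish, raised earlier in the thread, for one interface that also works on Windows.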
participants (4)
- Alastair Reid
- George Russell
- Sigbjorn Finne
- Volker Stolz