System.Process bugs on Unix

I have another concern about System.Process. I have this code:

    (recordfn, recordh) <- openTempFile macdir "new-"
    ...
    (c1, macroh, c2, xmacroph) <- runInteractiveProcess "xmacrorec2" ["-k", "0xffff"] Nothing Nothing
    ...
    grepph <- runProcess "egrep" ["-v", "^(ButtonRelease|ButtonPress|MotionNotify)"] Nothing Nothing (Just macroh) (Just recordh) Nothing

Now, egrep is dying with:

    grep: (standard input): Resource temporarily unavailable

And strace shows that, sure enough, read(0,...) in grep is getting EAGAIN. Sounds like somebody forgot to take the fd out of non-blocking mode.

This renders runProcess pretty well useless, IMHO.

-- John
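The symptom John reports can be reproduced without GHC or egrep. The following is a minimal C sketch, assuming a POSIX system; the helper name `errno_from_empty_nonblocking_read` is made up for illustration and is not part of any library. It puts the read end of a pipe into non-blocking mode and reads while the pipe is empty, which is the situation egrep finds itself in on its inherited stdin.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create a pipe, put its read end into
 * non-blocking mode, and read while the pipe is still empty.
 * Returns the errno set by read(), 0 if read() did not fail,
 * or -1 if the setup itself failed. */
int errno_from_empty_nonblocking_read(void)
{
    int fds[2];
    char buf[16];
    int result = -1;

    if (pipe(fds) != 0)
        return -1;

    int flags = fcntl(fds[0], F_GETFL);
    if (flags != -1 && fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) != -1) {
        /* The write end is still open and nothing has been written,
         * so a non-blocking read cannot return data and cannot wait. */
        ssize_t n = read(fds[0], buf, sizeof buf);
        result = (n == -1) ? errno : 0;
    }

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

On Linux the returned value is EAGAIN, which strerror() renders as "Resource temporarily unavailable", the message egrep prints.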

John Goerzen wrote:
> And strace shows that, sure enough, read(0,...) in grep is getting EAGAIN.
> Sounds like somebody forgot to take the fd out of non-blocking mode.

Yep, that looks like what's happening. See System/Process/Internals.hs:runProcessPosix. I wonder why the lower-level runProcess that it calls is written in C.

Bryan O'Sullivan wrote:
> John Goerzen wrote:
>> And strace shows that, sure enough, read(0,...) in grep is getting EAGAIN.
>> Sounds like somebody forgot to take the fd out of non-blocking mode.
> Yep, that looks like what's happening. See System/Process/Internals.hs:runProcessPosix. I wonder why the lower-level runProcess that it calls is written in C.

We can't just take a file descriptor out of non-blocking mode, because due to broken POSIX semantics that would screw up GHC's use of the file descriptor (there's no way to set non-blocking mode per-FD).

We can "fix" this by modifying the IO library to work with FDs in blocking mode, which is possible in the threaded RTS, but we haven't completed this yet, see

    http://hackage.haskell.org/trac/ghc/ticket/724

I've just milestoned this bug for 6.6.2, so I promise to at least try to fix it before then...

Cheers,
Simon

On Thu, Mar 29, 2007 at 10:55:43AM +0100, Simon Marlow wrote:
> Bryan O'Sullivan wrote:
>> John Goerzen wrote:
>>> And strace shows that, sure enough, read(0,...) in grep is getting EAGAIN.
>>> Sounds like somebody forgot to take the fd out of non-blocking mode.
>> Yep, that looks like what's happening. See System/Process/Internals.hs:runProcessPosix. I wonder why the lower-level runProcess that it calls is written in C.
> We can't just take a file descriptor out of non-blocking mode, because due to broken POSIX semantics that would screw up GHC's use of the file descriptor (there's no way to set non-blocking mode per-FD). We can "fix"

See fcntl(2) -- F_SETFL can change the O_NONBLOCK flag. I don't think this should be a problem for the RTS since it can be done only on the endpoint of the pipe that is used post-fork.

-- John
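For reference, the fcntl(2) sequence John mentions looks roughly like the sketch below. The function names `is_nonblocking` and `set_nonblocking` are illustrative, not taken from GHC or the process library. Note that F_SETFL acts on the open file description behind the descriptor, so whether the change stays confined to one process depends on who else holds that description.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if O_NONBLOCK is set for fd, 0 if it is clear, -1 on error. */
int is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}

/* Sets O_NONBLOCK when enable is non-zero, clears it otherwise.
 * All other file status flags are preserved.
 * Returns 0 on success, -1 on error (errno set by fcntl). */
int set_nonblocking(int fd, int enable)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;

    if (enable)
        flags |= O_NONBLOCK;
    else
        flags &= ~O_NONBLOCK;

    return fcntl(fd, F_SETFL, flags) == -1 ? -1 : 0;
}
```

A child process could call `set_nonblocking(STDIN_FILENO, 0)` after fork() and before exec() to hand the new program a blocking stdin.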

Simon Marlow wrote:
> We can't just take a file descriptor out of non-blocking mode, because due to broken POSIX semantics that would screw up GHC's use of the file descriptor (there's no way to set non-blocking mode per-FD).

Are you sure? I'd expect fcntl will do the trick. And you don't have to set non-blocking mode until after the fork, in the child, so GHC shouldn't see any consequences at all.

> I've just milestoned this bug for 6.6.2, so I promise to at least try to fix it before then...

I've got a patch partially written, which shouldn't take more than an hour to finish off and test. So don't worry about this just yet.

Bryan O'Sullivan wrote:
> Simon Marlow wrote:
>> We can't just take a file descriptor out of non-blocking mode, because due to broken POSIX semantics that would screw up GHC's use of the file descriptor (there's no way to set non-blocking mode per-FD).
> Are you sure? I'd expect fcntl will do the trick. And you don't have to set non-blocking mode until after the fork, in the child, so GHC shouldn't see any consequences at all.

I'm sure, yes. The non-blocking flag is part of the "open file description" (see [1], [2]), not the file descriptor. This is why you use F_SETFL rather than F_SETFD to set it with fcntl(). When you dup() a file descriptor, the two FDs share a non-blocking flag, just as they share a file pointer.

A TTY also has a single non-blocking flag, which all FDs (read & write) share. So when GHC sets its stdin to non-blocking mode, and stdin is the current TTY, anyone else writing to the TTY also gets non-blocking mode, hence the tee bug.

Just to make sure I wasn't deluded, I just managed to demonstrate this with a couple of small test programs, which I'll attach - you need to compile them both, then run "./nonblock1 | ./nonblock2", and hit enter. Notice that the C program is seeing O_NONBLOCK set on its stdout/stderr descriptors, despite not having set it.

Fortunately each end of a pipe has a separate non-blocking flag, which is why runInteractiveProcess doesn't get into difficulties, although runProcess does (see John's original message).

[1] http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap03.html#tag_0...
[2] http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap03.html#tag_0...

>> I've just milestoned this bug for 6.6.2, so I promise to at least try to fix it before then...
> I've got a patch partially written, which shouldn't take more than an hour to finish off and test. So don't worry about this just yet.

neat - is it based on the patch attached to the bug, or did you do something different?

Cheers,
Simon
import System.IO
main = do getLine; putStr "Hello World!\n"; hFlush stdout; getLine
#include
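The C attachment is cut off in the archive, so the following is not Simon's program. It is a separate minimal sketch, assuming a POSIX system and using made-up helper names, of the two claims in his message: a dup()ed descriptor shares the non-blocking flag with the original, while the two ends of a pipe do not.

```c
#include <fcntl.h>
#include <unistd.h>

/* 1 if O_NONBLOCK is set for fd, 0 if clear, -1 on error. */
static int nonblock_flag(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    return flags == -1 ? -1 : ((flags & O_NONBLOCK) != 0);
}

/* Set O_NONBLOCK through `fd`, then report what `other` sees:
 * 1 if the flag is visible there, 0 if not, -1 on error. */
static int flag_leaks_to(int fd, int other)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return nonblock_flag(other);
}

/* dup() case: both descriptors refer to one open file description. */
int dup_shares_nonblock(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    int copy = dup(fds[0]);
    int seen = (copy == -1) ? -1 : flag_leaks_to(fds[0], copy);

    if (copy != -1)
        close(copy);
    close(fds[0]);
    close(fds[1]);
    return seen;
}

/* pipe() case: the read end and the write end are separate
 * open file descriptions, each with its own status flags. */
int pipe_ends_share_nonblock(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    int seen = flag_leaks_to(fds[0], fds[1]);

    close(fds[0]);
    close(fds[1]);
    return seen;
}
```

Here `dup_shares_nonblock()` returns 1 and `pipe_ends_share_nonblock()` returns 0. The same sharing applies to descriptors inherited across fork(), which is how a flag set by one process becomes visible in another.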

Simon Marlow wrote:
> I'm sure, yes. The non-blocking flag is part of the "open file description" (see [1], [2]), not the file descriptor.

Ah, yes indeed. I had misunderstood what runProcess was trying to do. Given that, the patch I believed would help was not, in fact, going to do so.

If there's a way to work around this behaviour, it's not obvious to me. Had you any thoughts in mind?

Bryan O'Sullivan wrote:
> Simon Marlow wrote:
>> I'm sure, yes. The non-blocking flag is part of the "open file description" (see [1], [2]), not the file descriptor.
> Ah, yes indeed. I had misunderstood what runProcess was trying to do. Given that, the patch I believed would help was not, in fact, going to do so.
> If there's a way to work around this behaviour, it's not obvious to me. Had you any thoughts in mind?

We use O_NONBLOCK to do multithreaded I/O with our lightweight threads. If you're prepared to use OS threads, then you don't need O_NONBLOCK, you can just make a non-blocking foreign call to do the I/O, e.g. foreign import safe "read". This makes the call in a separate OS thread, so other Haskell threads aren't blocked. Only the threaded RTS supports OS threads, though.

The idea in my patch (not my idea, I think it's been used elsewhere) is to allow both kinds of FDs: non-blocking I/O for FDs that we are sure are local to the current process, and ordinary blocking I/O with OS threads for FDs that might be shared with another process.

In the non-threaded RTS you have two options: continue to use O_NONBLOCK and keep the tee bug, or use blocking I/O for the standard FDs (all other threads will be blocked while you do I/O on a std FD).

Cheers,
Simon
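The choices Simon lays out can be written down as a small decision function. This is only a sketch of the policy as described in his message, not code from GHC or from the patch; the enum, the function name, and the parameters are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical labels for the I/O strategies described above. */
enum io_strategy {
    IO_NONBLOCKING,          /* O_NONBLOCK on the FD, lightweight threads */
    IO_BLOCKING_OS_THREAD,   /* blocking call made in a separate OS thread */
    IO_BLOCKING_ALL_THREADS  /* blocking call that stalls every Haskell thread */
};

/* threaded_rts:       the program runs on the threaded RTS
 * fd_may_be_shared:   another process might hold the same open file
 *                     description (for example the standard FDs)
 * keep_nonblock:      non-threaded RTS only: keep O_NONBLOCK on shared
 *                     FDs and accept the tee bug */
enum io_strategy choose_io_strategy(bool threaded_rts,
                                    bool fd_may_be_shared,
                                    bool keep_nonblock)
{
    if (!fd_may_be_shared)
        return IO_NONBLOCKING;          /* FD is private, the flag harms nobody */

    if (threaded_rts)
        return IO_BLOCKING_OS_THREAD;   /* other Haskell threads keep running */

    /* Non-threaded RTS: both remaining options have a cost. */
    return keep_nonblock ? IO_NONBLOCKING : IO_BLOCKING_ALL_THREADS;
}
```

For example, `choose_io_strategy(true, true, false)` models stdin under the threaded RTS and yields `IO_BLOCKING_OS_THREAD`.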
participants (3)
- Bryan O'Sullivan
- John Goerzen
- Simon Marlow