
Hello Donn,

Python has a detailed discussion of this suggestion:

* https://www.python.org/dev/peps/pep-0433/#close-file-descriptors-after-fork
* https://www.python.org/dev/peps/pep-0446/#closing-all-open-file-descriptors

It highlights some problems with this approach, most notably: problems on Windows, not solving the problem when you exec() without fork(), and looping up to MAXFD being slow. That loop is what the current Haskell `runInteractiveProcess` code (http://hackage.haskell.org/package/process-1.2.3.0/src/cbits/runProcess.c) seems to be doing; Python improved upon this by not looping up to MAXFD, but instead looking up the open FDs in /proc/<PID>/fd/, after people complained about that loop of close() syscalls being very slow when many FDs were open.
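For illustration, here is a rough Haskell sketch of that /proc-based lookup. This is *not* the process package's actual code (that is the C in runProcess.c linked above), and in a real implementation the closing has to happen between fork() and exec(), where running arbitrary Haskell IO isn't safe; so take it purely as a sketch of the idea:

    import Control.Exception (IOException, catch)
    import Control.Monad (forM_, unless)
    import Data.Maybe (mapMaybe)
    import System.Directory (getDirectoryContents)
    import System.Posix.IO (closeFd)
    import System.Posix.Types (Fd)
    import Text.Read (readMaybe)

    -- | Close every open file descriptor except the ones in @keep@, using
    -- the list of open FDs from /proc/self/fd (Linux-specific) instead of
    -- looping up to MAXFD.
    closeAllFdsExcept :: [Fd] -> IO ()
    closeAllFdsExcept keep = do
      entries <- getDirectoryContents "/proc/self/fd"
      -- "." and ".." fail to parse as numbers and are dropped by mapMaybe.
      let fds = map fromIntegral (mapMaybe readMaybe entries :: [Int])
      forM_ fds $ \fd ->
        unless (fd `elem` keep) $
          -- The FD that was used to read the directory itself may already
          -- be closed again at this point, so ignore close() failures.
          closeFd fd `catch` ignoreIOError

    ignoreIOError :: IOException -> IO ()
    ignoreIOError _ = return ()

(A real implementation would presumably also need a fallback for systems without /proc.)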
> ... any time you do that, you set up the conditions for breaking something that works in C, which I hate to see happen with Haskell.
While I understand your opinion here, I'm not sure that "breaking something that works in C" is the right description. O_CLOEXEC changes a default setting, but does not irrevocably disable any feature that is available in C. The difference is that you'd have to say which FDs you want to keep in the child - which to my knowledge is OK, since it is much more common to work with *some* designated FDs in the child process than with all of them.

To elaborate a bit: if you wanted to write a program where a child process accesses the parent's Fds, in most cases you would already have those Fds in some Haskell variables you're working with. In that case, it is easy to `setFdOption fd CloseOnExec False` on those if CLOEXEC is the default, and everybody is happy. If CLOEXEC is not the default, then you get a problem with all those Fds on which you do *not* have a grip in your program, and it's much harder to fix problems with resources that sit around invisibly in the background than with those that you have in variables that you use.

In other words, CLOEXEC is something that is easy to *undo* locally when you don't want it, but hard to *do* globally when you need it.
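To make the "easy to undo locally" point concrete, here is a small sketch. It assumes a hypothetical world where CLOEXEC is the default for new FDs, and uses the openFd signature of the current unix package (unix-2.7); the file name and shell command are made up for the example:

    import System.Posix.IO (FdOption (CloseOnExec), OpenMode (WriteOnly),
                            defaultFileFlags, openFd, setFdOption)
    import System.Posix.Process (executeFile, forkProcess)

    main :: IO ()
    main = do
      -- Imagine this FD had been created with close-on-exec set by default.
      fd <- openFd "/tmp/example.log" WriteOnly (Just 0o644) defaultFileFlags

      -- We *want* the child to inherit exactly this one FD, so clear the
      -- flag on it alone.
      setFdOption fd CloseOnExec False

      -- Spawn a child that writes to the inherited FD; the FD number is
      -- passed on the command line purely for the sake of the example.
      _ <- forkProcess $
             executeFile "/bin/sh" False
               ["-c", "echo hello from the child >&" ++ show (fromIntegral fd :: Int)]
               Nothing
      return ()

Every other FD in the process would then stay close-on-exec without the program having to know about it.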
Let me know what you think about this.

Niklas

On 22/07/15 04:47, Donn Cave wrote:

quoth Niklas Hambüchen, ...
The scope of this problem is quite general, unfortunately: it will happen for any program that uses parallel threads and that runs two or more external processes at some time. It cannot be fixed by the part that starts the external process (e.g. you can't write a reliable `readProcess` function that doesn't have this problem, since the problem is rooted in the Fds, and there is no version of `exec()` that doesn't inherit parent Fds).
This problem is a general problem in C on Unix, and was discovered quite late.
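For concreteness, a rough sketch of the kind of race being described - deliberately simplified, non-deterministic, and ignoring the usual forkProcess caveats with the threaded RTS:

    import Control.Concurrent (forkIO, threadDelay)
    import System.Posix.IO (FdOption (CloseOnExec), createPipe, setFdOption)
    import System.Posix.Process (executeFile, forkProcess)

    main :: IO ()
    main = do
      -- Thread 2: concurrently spawn some external process (here: sleep 10).
      _ <- forkIO $ do
        _ <- forkProcess $ executeFile "sleep" True ["10"] Nothing
        return ()

      -- Thread 1: create a pipe and only afterwards mark it close-on-exec.
      -- If the fork() above happens in between, the child keeps both pipe
      -- FDs open even after this process closes them, e.g. delaying EOF.
      (readEnd, writeEnd) <- createPipe
      setFdOption readEnd  CloseOnExec True
      setFdOption writeEnd CloseOnExec True

      threadDelay 1000000  -- keep the parent around briefly for the example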
I believe it has actually been a familiar issue for decades. I don't have any code handy to check, but I'm pretty sure the UNIX system(3) and popen(3) functions closed extraneous file descriptors back in the early '90s, and probably had been doing it for some time by then.
I believe this approach to the problem is supported in System.Process, via close_fds. The implementation is a walk through the open FDs in the child fork, closing anything not called for by the procedure's parameters prior to the exec.
That approach has the advantage that it applies to all file descriptors, whether created by open(2) or by other means - socket, dup(2), etc.
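A minimal use of that flag with the current System.Process API (the command run here is just an example) might look like:

    import System.Process

    main :: IO ()
    main = do
      -- Run an external command with close_fds = True: in the forked child,
      -- file descriptors other than the configured std handles are closed
      -- before the exec, so stray parent FDs are not inherited.
      (_, _, _, ph) <- createProcess (proc "ls" ["-l"]) { close_fds = True }
      exitCode <- waitForProcess ph
      print exitCode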
I like this already-implemented solution much better than adding a new flag to "all" opens (really only those opens that occur within the Haskell runtime, and of course not FDs created by external libraries). The O_CLOEXEC proposal wouldn't be the worst or most gratuitous way Haskell tampers with normal UNIX parameters, but any time you do that, you set up the conditions for breaking something that works in C, which I hate to see happen with Haskell.
Donn

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe