
Ben Rudiak-Gould
Is there *any* way to get, without an exploitable race condition, two filehandles to the same file which don't share a file pointer?
AFAIK it's not possible if the only thing you have is one of the descriptors. Of course, independent open() calls which refer to the same file have separate file pointers (I mean the true filename, not /proc/*/fd/*).
On Linux the current file position is stored in struct file in the kernel. struct file includes "void *private_data", whose contents depend on the nature of the file; in particular, they can be reference counted. Among the polymorphic operations on files in struct file_operations there is nothing which clones a struct file, so a device driver has no means to specify how the private_data of its files should be duplicated (e.g. by bumping the reference count). If my understanding is correct, this implies that the kernel has no way to clone an arbitrary struct file.
If you don't like the shared current position of seekable files, just don't use it: use pread/pwrite.
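The difference between a duplicated descriptor, an independent open() and pread can be checked with a few lines. This is only a minimal sketch in Python (chosen for brevity; its os module wraps the same system calls). The function name demo_offsets and the scratch file are made up for the example, and it assumes a POSIX system where os.pread is available.

```python
import os
import tempfile

def demo_offsets():
    # Scratch file with known contents (hypothetical, created only for this demo).
    fd0, path = tempfile.mkstemp()
    os.write(fd0, b"0123456789")
    os.close(fd0)

    a = os.open(path, os.O_RDONLY)
    b = os.dup(a)                    # shares the open file (and its position) with a
    c = os.open(path, os.O_RDONLY)   # independent open(): its own position

    os.read(a, 4)                    # advances the position shared by a and b
    shared = os.lseek(b, 0, os.SEEK_CUR)
    independent = os.lseek(c, 0, os.SEEK_CUR)

    # pread takes an explicit offset and does not move the current position.
    chunk = os.pread(a, 3, 7)
    after_pread = os.lseek(a, 0, os.SEEK_CUR)

    for fd in (a, b, c):
        os.close(fd)
    os.remove(path)
    return shared, independent, chunk, after_pread
```

Calling demo_offsets() should return (4, 0, b"789", 4): the dup'ed descriptor sees the read, the independently opened one does not, and pread leaves the position where it was.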
Is there any way to pass a filehandle as stdin to an untrusted/uncooperative child process in such a way that the child can't interfere with your attempts to (say) append to the same file?
You can set the O_APPEND flag to force each write to happen at the end of the file, but that doesn't prevent the child from clearing the flag again. And if it's untrusted, how do you know that it won't truncate the file, or just write garbage to it where you would have written something? If the file is seekable, you can use pread/pwrite. If it's not seekable, the concept of concurrent but non-interfering reads or writes is meaningless.
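Both halves of the O_APPEND point can be seen in a small experiment. Again this is only a sketch in Python under the assumption of a POSIX system (it was written with Linux in mind); demo_append and the scratch file are hypothetical names for the example. It first shows that a write lands at the end despite a seek, then that anyone holding the descriptor can clear the flag with fcntl.

```python
import fcntl
import os
import tempfile

def demo_append():
    # Scratch file with known contents (hypothetical, created only for this demo).
    fd0, path = tempfile.mkstemp()
    os.write(fd0, b"AAAA")
    os.close(fd0)

    fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    os.lseek(fd, 0, os.SEEK_SET)
    os.write(fd, b"BB")              # O_APPEND: goes to the end despite the seek

    # Any holder of the descriptor can clear the flag again.
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~os.O_APPEND)
    os.lseek(fd, 0, os.SEEK_SET)
    os.write(fd, b"CC")              # now overwrites at offset 0

    os.close(fd)
    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)
    return data
```

On a system where the flag can be cleared this way, demo_append() should return b"CCAABB": the first write was appended, the second one clobbered the start of the file.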
I think we just need more kinds of streams. With regard to file-backed streams, there are three cases:
1. We open a file and use it in-process.
2. We open a file and share it with child processes.
3. We get a handle at process startup which happens to be a file.
I disagree. IMHO the only distinction is whether we want to perform I/O at the current position (shared between processes) or at an explicitly specified position (possible only for seekable files). Neither can be emulated in terms of the other.
In case 2 we could avoid OS problems by creating a pipe and managing our end in-process.
It's not transparent: it translates only read and write, but not sendto/recvfrom, setsockopt, ioctl, lseek etc., and obviously it will stop working when our process finishes but the other does not. A pipe can be created when the program really wants one, but it should not be created automatically whenever we redirect stdin/stdout/stderr of another program to a file we have opened.
Case 3 is the most interesting. In an ideal world I would argue for treating stdin/out/err simply as streams, but that's not practical. Failing that, if we have pread and pwrite, we should provide two versions of stdin/out/err, one of type InputStream/OutputStream and the other of type Maybe File. We can safely layer other streams on top of these files (if they exist) without interfering with the stream operation.
I'm not sure what you mean. Haskell should not use pread/pwrite for functions like putStr, even if stdout is seekable. The current file position *should* be shared between processes by default; otherwise redirection of stdout to a file will break if the program delegates some of its work, along with the corresponding output, to other programs it runs.
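Here is a minimal sketch of why the shared position matters for redirection, in Python for brevity and assuming a POSIX system. The name demo_shared_stdout and the scratch file are made up for the example. The parent writes a line, runs a child whose stdout is the same descriptor, then writes another line, all at the current position.

```python
import os
import subprocess
import sys
import tempfile

def demo_shared_stdout():
    # Scratch file standing in for a redirected stdout (hypothetical).
    fd, path = tempfile.mkstemp()
    os.write(fd, b"parent-1\n")

    # The child inherits fd as its stdout, so its output advances
    # the same file position the parent uses afterwards.
    subprocess.run(
        [sys.executable, "-c", "import os; os.write(1, b'child\\n')"],
        stdout=fd,
        check=True,
    )

    os.write(fd, b"parent-2\n")
    os.close(fd)
    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)
    return data
```

demo_shared_stdout() should return b"parent-1\nchild\nparent-2\n". Had the parent tracked its own offset and used pwrite instead, its second line would have landed on top of the child's output.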
Indeed, file positions are exactly as evil as indices into shared memory arrays, which is to say not evil at all. But suppose each shared memory array came with a shared "current index", and there was no way to create additional ones.
Bad analogy: if you open() the file independently, the position is not shared. The position is not tied to a file with its shared contents but to the given *open* file structure. And there is pread/pwrite (on some OSes at least). It's not suitable as the basic API of all reads and writes though.
-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/