
Marcin 'Qrczak' Kowalczyk wrote:
Ben Rudiak-Gould writes:
fileRead can be implemented in terms of OS primitives,
Only if they already support reading from a fixed offset (like pread). I'm not sure whether we can rely on something like this always being available, or whether it should be emulated using lseek, which is safe only as long as we are the only process using the given open file.
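For reference, the two options might look roughly like this. This is only a sketch, assuming a POSIX system and Python's `os` module as a thin wrapper over the syscalls; `pread_emulated` is a made-up name, not part of any library discussed here.

```python
import os
import tempfile

def pread_emulated(fd, length, offset):
    """Hypothetical helper: emulate pread with lseek + read.

    Not atomic: anything else that uses the same open file between the
    lseek calls will see (or disturb) the temporary offset.
    """
    saved = os.lseek(fd, 0, os.SEEK_CUR)       # remember the current offset
    try:
        os.lseek(fd, offset, os.SEEK_SET)      # move to the requested offset
        return os.read(fd, length)
    finally:
        os.lseek(fd, saved, os.SEEK_SET)       # put the offset back

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"0123456789")
    os.lseek(fd, 2, os.SEEK_SET)

    native = os.pread(fd, 4, 5)                # real pread: offset untouched
    offset_after_native = os.lseek(fd, 0, os.SEEK_CUR)

    emulated = pread_emulated(fd, 4, 5)        # emulation: offset restored
    offset_after_emulated = os.lseek(fd, 0, os.SEEK_CUR)
finally:
    os.close(fd)
    os.unlink(path)
```

Both calls return the same bytes and leave the offset where it was, but only the real pread does so in a single step.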
First of all, I don't think any OS shares file pointers between processes. Otherwise it would be practically impossible to safely use an inherited filehandle via any API. Different threads using the same filehandle do share a file pointer (which is a major nuisance in my experience, because of the lack of an atomic seek-read/write), but a Posix fork duplicates the file pointer along with all other state. I can't believe I'm wrong about this, but someone please correct me if I am. This limits the problem to a single process. If you're only using GHC's lightweight threads, there's no problem at all. If you're using OS threads, the worst thing that could happen is that you might have to protect handle access with a critical section. I don't think this would lead to a noticeable performance hit when combined with the other overhead of file read/write operations (or lazy evaluation for that matter).
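The critical-section idea for OS threads could be sketched as follows. This assumes a POSIX system and Python's `os` and `threading` modules; `LockedFile` and `read_at` are hypothetical names used only for illustration.

```python
import os
import tempfile
import threading

class LockedFile:
    """Hypothetical wrapper: seek + read made atomic with respect to
    other threads of the same process by holding a lock."""

    def __init__(self, fd):
        self._fd = fd
        self._lock = threading.Lock()

    def read_at(self, length, offset):
        with self._lock:                       # the critical section
            os.lseek(self._fd, offset, os.SEEK_SET)
            return os.read(self._fd, length)

fd, path = tempfile.mkstemp()
try:
    # 64 records of 16 bytes each; record i consists of the byte value i.
    os.write(fd, b"".join(bytes([i]) * 16 for i in range(64)))
    shared = LockedFile(fd)
    results = {}

    def worker(i):
        results[i] = shared.read_at(16, i * 16)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(64)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    all_correct = all(results[i] == bytes([i]) * 16 for i in range(64))
finally:
    os.close(fd)
    os.unlink(path)
```

The lock only protects against threads that go through the same wrapper; it does nothing about anyone else holding the same open file.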
pread requires that the file be seekable, which means that it can't be used for all file handles: not for pipes, sockets, terminals, or various other devices.
The file interface in this library is only used for files, which are always seekable (by definition). If you want to treat a file as a stream, you create an InputStream or OutputStream backed by the file. Such streams maintain internal (per-stream) file pointers.
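A stream with its own file pointer might be sketched like this. It assumes a POSIX system with pread; `FileInputStream` is a hypothetical name and not the interface of the library under discussion.

```python
import os
import tempfile

class FileInputStream:
    """Hypothetical stream over a file: keeps a private position and
    reads with pread, so the OS-level offset is never moved."""

    def __init__(self, fd, start=0):
        self._fd = fd
        self._pos = start

    def read(self, length):
        data = os.pread(self._fd, length, self._pos)
        self._pos += len(data)                 # advance only our own pointer
        return data

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"abcdefghij")
    a = FileInputStream(fd)                    # starts at offset 0
    b = FileInputStream(fd, start=5)           # independent pointer at 5
    chunks = [a.read(3), b.read(3), a.read(3), b.read(3)]
    os_offset = os.lseek(fd, 0, os.SEEK_CUR)   # still where os.write left it
finally:
    os.close(fd)
    os.unlink(path)
```

The two streams interleave reads without disturbing each other, and the system file pointer stays wherever it was before they were created.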
Not if it must cooperate with other processes, and you *do* want to set a file position before running another program with redirected standard I/O. In this case it's not enough that you set a private Haskell variable holding its logical file position - you must perform the lseek syscall.
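The redirection case can be tried out with a small experiment. This is a sketch assuming a POSIX system; the child is simply another Python interpreter that copies its standard input to its standard output. A descriptor inherited by a child refers to the same open file description as in the parent, so the offset set with lseek is the one the child starts from.

```python
import os
import subprocess
import sys
import tempfile

fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"0123456789")
    os.lseek(fd, 4, os.SEEK_SET)               # the real syscall, not a private variable

    # Child copies stdin to stdout; its stdin is our descriptor.
    child = subprocess.run(
        [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"],
        stdin=fd,
        stdout=subprocess.PIPE,
        check=True,
    )
    child_output = child.stdout                # data from offset 4 onwards

    # The child's reads moved the offset seen by the parent as well.
    parent_offset_after = os.lseek(fd, 0, os.SEEK_CUR)
finally:
    os.close(fd)
    os.unlink(path)
```

The child sees only the data after the position the parent set, and after it exits the parent finds the offset at the end of the file.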
If you're using Posix fork/exec, you can use Posix lseek without losing portability. If you're using a higher-level Haskell library to spawn the program, it will be Stream-aware (if it supports redirection at all) and will know how to set the system file pointer when necessary.
Doing something differently from everybody else carries a risk of limited interoperability, even if the new way is "better", and thus must be carefully evaluated to check whether all the functionality given up is unimportant enough to lose.
Very true. (But hardly a new problem for Haskell.)
-- Ben