
Bulat Ziganshin wrote:
I'm now writing a new I/O library of sorts. One area where it currently falls short of the existing Handle implementation in GHC is asynchronous I/O operations. Can you please briefly describe how this is done in GHC and, in particular, why multiple buffers are used?
Multiple buffers were introduced to cope with the semantics we wanted for hPutStr. The problem is that you don't want hPutStr to hold a lock on the Handle while it evaluates its argument list, because that could take arbitrary time. Furthermore, things like this:

    putStr (trace "foo" "bar")

used to cause deadlocks, because putStr holds the lock, evaluates its argument list, which causes trace to also attempt to acquire the lock on stdout, leading to deadlock.

So, putStr first grabs a buffer from the Handle, then unlocks the Handle while it fills up the buffer, then it takes the lock again to write the buffer. Since another thread might try to putStr while the lock is released, we need multiple buffers.

For async IO on Unix, we use non-blocking read() calls, and if read() indicates that we need to block, we send a request to the IO Manager thread (see GHC.Conc) which calls select() on behalf of all the threads waiting for I/O.

For async IO on Windows, we either use the threaded RTS's blocking foreign call mechanism to invoke read(), or the non-threaded RTS has a similar mechanism internally.

We ought to be using the various alternatives to select(), but we haven't got around to that yet.
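
To make the putStr scheme described above concrete (grab a buffer, fill it with the lock released, then re-lock to write), here is a minimal sketch. It is not GHC's actual Handle code: MyHandle, HandleState, takeBuffer, commitBuffer, myPutStr and contents are all hypothetical names, the "device" is just a list of strings, and a buffer is an IORef String rather than a real byte buffer.

```haskell
import Control.Concurrent.MVar
import Control.Exception (evaluate)
import Data.IORef

-- Hypothetical, simplified handle state. The MVar acts as the Handle lock.
data HandleState = HandleState
  { spareBufs :: [IORef String]  -- pool of spare buffers
  , written   :: [String]        -- stand-in for the output device, newest first
  }

newtype MyHandle = MyHandle (MVar HandleState)

newMyHandle :: IO MyHandle
newMyHandle = MyHandle <$> newMVar (HandleState [] [])

-- Step 1: hold the lock only long enough to take a spare buffer,
-- allocating a fresh one if the pool is empty.
takeBuffer :: MyHandle -> IO (IORef String)
takeBuffer (MyHandle mv) = modifyMVar mv $ \st ->
  case spareBufs st of
    (b:bs) -> return (st { spareBufs = bs }, b)
    []     -> do
      b <- newIORef ""
      return (st, b)

-- Step 3: re-take the lock, write the buffer out, and return it to the pool.
commitBuffer :: MyHandle -> IORef String -> IO ()
commitBuffer (MyHandle mv) buf = modifyMVar_ mv $ \st -> do
  s <- readIORef buf
  writeIORef buf ""
  return st { written = s : written st, spareBufs = buf : spareBufs st }

myPutStr :: MyHandle -> String -> IO ()
myPutStr h str = do
  buf <- takeBuffer h
  -- Step 2: the lock is NOT held here. Forcing the string may take
  -- arbitrarily long, or may itself write to the same handle.
  _ <- evaluate (foldr seq () str)
  writeIORef buf str
  commitBuffer h buf

-- Everything written so far, oldest first.
contents :: MyHandle -> IO [String]
contents (MyHandle mv) = reverse . written <$> readMVar mv
```

Because the argument is forced between the two locked sections, an argument whose evaluation writes to the same handle (as trace does with stdout) simply takes its own buffer from the pool instead of deadlocking.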
Moreover, I have an idea for implementing async I/O without complex bureaucracy: use mmapped files, maybe together with multiple buffers.
I don't think we should restrict the implementation to mmap'd files, for all the reasons that Einar gave. Lots of things aren't mmapable, mainly.

My vision for an I/O library is this:

- a single class supporting binary input (resp. output) that is implemented by various transports: files, sockets, mmap'd files, memory and arrays. Windowed mmap is an option here too.
- layers of binary filters on top of this: you could add buffering, and compression/decompression.
- a layer of text translation at the top.

This is more or less how the Stream-based I/O library that I was working on is structured. The binary I/O library would talk to a binary transport, perhaps with a layer of buffering, whereas text-based applications talk to the text layer.

Cheers,
Simon
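
As an illustration of the layering described above (a binary transport class, a buffering filter on top, and a text layer at the top), here is a small output-only sketch. It is not the Stream-based library itself: ByteSink, MemSink, Buffered, putText and the other names are hypothetical, bytes are plain [Word8] lists for simplicity, and the text layer assumes a Latin-1 encoding with '?' substituted for anything outside that range.

```haskell
import Control.Monad (when)
import Data.Char (ord)
import Data.IORef
import Data.Word (Word8)

-- Bottom layer: a class for binary output transports (hypothetical).
class ByteSink s where
  putBytes  :: s -> [Word8] -> IO ()
  flushSink :: s -> IO ()
  flushSink _ = return ()

-- One transport: an in-memory sink that appends to a list.
newtype MemSink = MemSink (IORef [Word8])

newMemSink :: IO MemSink
newMemSink = MemSink <$> newIORef []

instance ByteSink MemSink where
  putBytes (MemSink r) bs = modifyIORef' r (++ bs)

memContents :: MemSink -> IO [Word8]
memContents (MemSink r) = readIORef r

-- Middle layer: a buffering filter that wraps any other sink.
data Buffered s = Buffered Int (IORef [Word8]) s

newBuffered :: Int -> s -> IO (Buffered s)
newBuffered limit inner = do
  r <- newIORef []
  return (Buffered limit r inner)

instance ByteSink s => ByteSink (Buffered s) where
  putBytes b@(Buffered limit r _) bs = do
    modifyIORef' r (++ bs)
    pending <- readIORef r
    -- Pass data down only once the buffer has reached its limit.
    when (length pending >= limit) (flushSink b)
  flushSink (Buffered _ r inner) = do
    pending <- readIORef r
    writeIORef r []
    putBytes inner pending
    flushSink inner

-- Top layer: text translation (Latin-1 assumed for this sketch).
encodeLatin1 :: Char -> Word8
encodeLatin1 c
  | ord c < 256 = fromIntegral (ord c)
  | otherwise   = 63  -- '?'

putText :: ByteSink s => s -> String -> IO ()
putText s = putBytes s . map encodeLatin1
```

A text-oriented program would call putText on a Buffered MemSink (or a buffered file or socket transport, if such instances were written), while a binary library would call putBytes directly on whichever layer it wants.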