
Dean Herington wrote:
I've long thought that I/O buffering behavior--not just in Haskell, but in most places I've encountered it--was unnecessarily complicated. Perhaps it could be simplified dramatically by treating it as strictly a performance optimization.
This isn't entirely possible; there will always be situations where it matters exactly when and how the data gets passed to the OS. My experience has been that the simplest solution in such situations is not to use ANSI stdio buffering at all.
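(For concreteness, here is roughly what that advice looks like in Haskell rather than C stdio: where timing matters, either switch buffering off on the handle, or keep buffering for efficiency and flush explicitly at the points that matter. A minimal sketch:

    import System.IO

    main :: IO ()
    main = do
      -- Option 1: turn buffering off on this handle, so every write
      -- reaches the OS immediately.
      --   hSetBuffering stdout NoBuffering
      -- Option 2: keep buffering for efficiency, but flush explicitly
      -- at the points where timing matters, e.g. a newline-less prompt.
      putStr "Name: "
      hFlush stdout
      name <- getLine
      putStrLn ("Hello, " ++ name)
)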
Here's a sketch of the approach.
Writing a sequence of characters across the interface I'm proposing is a request by the writing program for those characters to appear at their destination "soon". Ideally, "soon" would be "immediately"; however, the characters' appearance may deliberately be delayed ("buffered"), for efficiency, as long as such delay is "unobtrusive" to a human user of the program. Buffering timeouts would depend on the device; for a terminal, perhaps 50-100 ms would be appropriate. Such an interval would tend not to be noticeable to a human user but would be long enough to collect, say, an entire line of output so that it appears "in one piece". The use of a reasonable timeout would avoid the confusing behavior where a newline-less prompt doesn't appear until the prompted data is entered.
With this scheme, I/O buffering no longer has any real semantic content. (In particular, the interface never guarantees indefinite delay in outputting written characters. Line buffering, if semantically important, needs to be done above the level of this interface.)
That's already true, at least in C: if you output a line which is longer than the buffer, the buffer will be flushed before it contains a newline (i.e. the line won't be written atomically).
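(The same holds for GHC's System.IO: a line-buffered handle still flushes whenever its buffer fills, so a sufficiently long line reaches the OS in pieces before its newline arrives. A small sketch, where 100000 is just an arbitrary length chosen to exceed the default buffer size:

    import System.IO

    main :: IO ()
    main = do
      hSetBuffering stdout LineBuffering
      -- No newline yet, but the string exceeds the buffer, so parts of
      -- it are already written out before the newline below arrives.
      putStr (replicate 100000 'x')
      putStrLn ""
)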
Hence, buffering control could be completely eliminated from the interface. However, I would retain it to provide (non-semantic) control over buffering. The optional buffer size currently has such an effect. A timeout value could be added for fine tuning. (Note that a timeout of zero would have an effect similar to Haskell's current NoBuffering.) Lastly, the "flush" operation would remain, as a hint that it's not worth waiting even the limited timeout period before endeavoring to make the characters appear.
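(To make the proposed interface concrete, here is a rough Haskell sketch of what such purely non-semantic buffering control might look like. All of the names below are hypothetical; nothing like them exists in System.IO today.

    import System.IO (Handle)

    -- Hypothetical interface sketching the proposal: buffer size and
    -- flush timeout are tuning hints only, with no semantic content.
    data BufferHint = BufferHint
      { hintBufferSize   :: Maybe Int  -- preferred buffer size, in bytes
      , hintFlushTimeout :: Maybe Int  -- maximum delay, in milliseconds,
                                       -- before buffered output is pushed
                                       -- out; Just 0 acts like NoBuffering
      }

    hSetBufferHint :: Handle -> BufferHint -> IO ()
    hSetBufferHint = undefined  -- sketch only, not implemented

    -- "flush" survives only as a hint that it is not worth waiting even
    -- the timeout before trying to make the characters appear.
    hFlushSoon :: Handle -> IO ()
    hFlushSoon = undefined  -- sketch only, not implemented
)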
Is such an approach feasible?
Possibly. As things stand, anyone who writes code which relies upon output being held back until a flush is asking for trouble. So, your approach wouldn't make it any harder to write correct code, although it might make it significantly more obvious if code was incorrect. AFAICT, the biggest problem would be providing an upper bound on the delay, as that implies some form of preemptive concurrency.
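(With GHC's lightweight threads, one way to get such an upper bound in user space is simply to flush the handle from a background thread on a fixed period. A rough sketch; periodicFlush is a name introduced here, not a library function:

    import Control.Concurrent (forkIO, threadDelay)
    import Control.Monad (forever)
    import System.IO

    -- Flush the handle every 'ms' milliseconds from a background thread,
    -- bounding how long written characters can sit in the buffer while
    -- the handle keeps its normal buffering for efficiency.
    periodicFlush :: Int -> Handle -> IO ()
    periodicFlush ms h = do
      _ <- forkIO (forever (threadDelay (ms * 1000) >> hFlush h))
      return ()

    main :: IO ()
    main = do
      periodicFlush 100 stdout
      putStr "prompt without a newline> "  -- appears within ~100 ms,
                                           -- with no explicit hFlush
      _ <- getLine
      putStrLn "done"
)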
Has it been implemented anywhere?
Not that I know of.
Would such behavior best be implemented by the operating system?
No. The OS (i.e. kernel) doesn't know anything about user-space buffering. Furthermore, one of the main functions of user-space buffering is to minimise the number of system calls, so putting it into the OS would be pointless.
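(In Haskell terms, for example: with buffering disabled, each of the small writes below would become its own write() system call, while with block buffering they are coalesced into a handful of large ones. A sketch; "out.txt" and the 8192-byte buffer size are just placeholders:

    import Control.Monad (replicateM_)
    import System.IO

    main :: IO ()
    main = do
      h <- openFile "out.txt" WriteMode
      -- BlockBuffering collects the 10,000 one-byte writes below into
      -- buffer-sized chunks, so only a few write() calls reach the kernel.
      hSetBuffering h (BlockBuffering (Just 8192))
      replicateM_ 10000 (hPutStr h "x")
      hClose h
)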
Could it be implemented by the runtime system?
It depends on what you mean by "the runtime system"; it would have to be implemented in user-space.
--
Glynn Clements