
On Sat, 2009-10-10 at 02:51 -0700, oleg@okmij.org wrote:
> > The reason it's hard is that to demonstrate a difference you have to get the lazy I/O to commute with some other I/O, and GHC will never do that.
>
> The keyword here is GHC. I may well believe that GHC is able to divine programmer's true intent and so it always does the right thing. But writing in the language standard ``do what the version x.y.z of GHC does'' does not seem very appropriate, or helpful to other implementors.
With access to unsafeInterleaveIO it's fairly straightforward to show that it is non-deterministic. These programs that bypass the safety mechanisms on hGetContents just get us back to having access to the non-deterministic semantics of unsafeInterleaveIO.
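To make that concrete, here is a minimal sketch of the kind of program meant. It is not taken from the thread: the name `observe` and the shared IORef are illustrative assumptions. It defers two actions with unsafeInterleaveIO and then forces their results in either order, so the value one of them yields depends on the order of evaluation rather than on the order in which the actions were written.

```haskell
import Control.Exception (evaluate)
import Data.IORef (newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical helper: defer two actions on a shared reference, then
-- force their results in the order selected by the argument.
observe :: Bool -> IO (Int, Int)
observe forceXFirst = do
  ref <- newIORef (0 :: Int)
  -- Neither action runs here; each runs when its result is first demanded.
  x <- unsafeInterleaveIO (writeIORef ref 1 >> readIORef ref)
  y <- unsafeInterleaveIO (readIORef ref)
  -- evaluate sequences the forcing explicitly within IO.
  if forceXFirst
    then evaluate x >> evaluate y >> return ()
    else evaluate y >> evaluate x >> return ()
  return (x, y)

main :: IO ()
main = do
  observe True  >>= print  -- prints (1,1)
  observe False >>= print  -- prints (1,0)
```

The two calls run the same deferred actions, yet y differs because its read happens either after or before the write that x triggers.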
> > Haskell's IO library is carefully designed to not run into this problem on its own. It's normally not possible to get two Handles with the same FD...
>
> Is this behavior specified somewhere, or is this just an artifact of a particular GHC implementation?
It is in the Haskell 98 report, in the design of the IO library. It does not mention FDs, of course. The IO/Handle functions it provides give no (portable) way to obtain two read handles on the same OS file descriptor. The hGetContents behaviour of semi-closing is there to stop you from getting two lazy lists from the same read Handle.

There's nothing semantically wrong with bypassing those restrictions (e.g. openFile "/dev/fd/0"); it just means you end up with a non-deterministic IO program, which is something we typically try to avoid.

I am a bit perplexed by this whole discussion. It seems to come down to saying that unsafeInterleaveIO is non-deterministic and that things implemented on top of it are also non-deterministic. The standard IO library puts up some barriers to restrict the non-determinism, but if you walk around the barriers then you can still find it. It's not clear to me what is supposed to be surprising or alarming here.

Duncan