reliable (bi)directional pipe to a process

I'd like to point out a reliable, proven and simple way of interacting with another process, via unidirectional or bidirectional pipes. The method supports Unix sockets, pipes, and TCP sockets.
I too have noticed insidious bugs in the GHC run-time when communicating with another process via a pipe. I tried to use runInteractiveProcess; it worked -- up to file sizes of about 300Kb. Then the GHC run-time seems to `lose synchronization' -- and corrupts IO buffers, receiving stuff that cannot possibly have been sent. This is because handle operations are asynchronous and the GHC scheduler seems to have race conditions. That behavior was totally unacceptable. I was writing production code, and couldn't afford such errors.
Therefore, I wrote a simple foreign function interface to the code sys_open that I have been using for about fifteen years. This code does work, in production, for very large file sizes and long-running processes, on many Unix and Unix-like systems. I was told once about a Cygwin port.
http://okmij.org/ftp/syscall-interpose.html#Application
http://okmij.org/ftp/packages/sys_open.c
http://okmij.org/ftp/Haskell/MySysOpen.hs
Please see the test at the end of the file MySysOpen.hs. The test interacts with another process over a bi-directional pipe, repeatedly sending and receiving data. The amount of received data is large (about 510K).

oleg:
I'd like to point out a reliable, proven and simple way of interacting with another process, via unidirectional or bidirectional pipes. The method supports Unix sockets, pipes, and TCP sockets.
I too have noticed insidious bugs in the GHC run-time when communicating with another process via a pipe. I tried to use runInteractiveProcess; it worked -- up to file sizes of about 300Kb. Then the GHC run-time seems to `lose synchronization' -- and corrupts IO buffers, receiving stuff that cannot possibly have been sent. This is because handle operations are asynchronous and the GHC scheduler seems to have race conditions. That behavior was totally unacceptable. I was writing production code, and couldn't afford such errors.
Did you file a bug report? Can you follow up with information we can use to chase this down?
Therefore, I wrote a simple foreign function interface to the code sys_open that I have been using for about fifteen years. This code does work, in production, for very large file sizes and long-running processes, on many Unix and Unix-like systems. I was told once about a Cygwin port.
http://okmij.org/ftp/syscall-interpose.html#Application
http://okmij.org/ftp/packages/sys_open.c
http://okmij.org/ftp/Haskell/MySysOpen.hs
Please see the test at the end of the file MySysOpen.hs. The test interacts with another process over a bi-directional pipe, repeatedly sending and receiving data. The amount of received data is large (about 510K).
To file a bug, go here: http://hackage.haskell.org/trac/ghc/newticket?type=bug -- Don

There is actually a real wealth of material on generalizing I/O on your site -- it's definitely something I will be ever more interested in. Now that I think about it, I can remember a time when a program that did a lot of stuff with Amazon would mysteriously run out of file descriptors, and I just had to put some shell around it to kick it over and over again. -- _jsn

I too have noticed insidious bugs in GHC run-time when communicating with another process via a pipe. I tried to use runInteractiveProcess; it worked -- up to file sizes of about 300Kb.
Yeah, I seem to be running into similar strange problems, so I'll definitely be checking out your code. (Why didn't I file bug reports? Because I couldn't isolate the problem well, so my whole setup including external programs would have to be duplicated, and I don't expect GHC headquarters to be willing to do this. No complaints.) J.W.

As an aside, my present problem really seems to be fixed -- I am able to move files of more than 2MB from one process to another within my Haskell program. In my program, I take the input file, turn it into a ByteString, pass it to process one, capture the result as a ByteString, pass it to process two, capture that result as a ByteString and then put the ByteString into a file. Of course we'll see whether it really works in the next few days here -- but after I got the forking right I really think there's little else to be done. However, it's clear this has been a problem in the past. Even if we just use this thread to collect "incidents" -- less specific than bugs, but more precise than simple suggestions of inadequacy -- it would help everyone to know what the limits are with the present runtime. -- _jsn
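
For readers following along, here is a minimal sketch of the kind of pipeline described above: a strict ByteString is fed to one process, the output is captured and fed to the next. This is not the code from the message, which was not posted. It assumes the `process` and `bytestring` packages, and the names `pipeThrough` and `runPipeline` are made up for illustration. The writer runs in a forked thread so that a child which fills its output pipe before consuming all of its input cannot deadlock the parent.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, try)
import Control.Monad (foldM)
import qualified Data.ByteString as B
import System.Exit (ExitCode (..))
import System.IO (hClose)
import System.Process

-- | Hypothetical helper: feed a strict ByteString to a child's stdin
-- and collect everything the child writes to stdout.
pipeThrough :: FilePath -> [String] -> B.ByteString -> IO B.ByteString
pipeThrough cmd args input = do
  (Just hin, Just hout, _, ph) <-
    createProcess (proc cmd args) { std_in = CreatePipe, std_out = CreatePipe }
  done <- newEmptyMVar
  -- Write in a separate thread; reading and writing in the same thread
  -- can deadlock once the data exceeds the OS pipe buffer.
  _ <- forkIO $ do
    r <- try (B.hPut hin input >> hClose hin) :: IO (Either SomeException ())
    putMVar done r
  out <- B.hGetContents hout  -- strict: reads to EOF, then closes the handle
  _ <- takeMVar done          -- wait for the writer thread to finish
  code <- waitForProcess ph
  case code of
    ExitSuccess -> return out
    ExitFailure n ->
      ioError (userError (cmd ++ " exited with code " ++ show n))

-- | Hypothetical helper: run the input through each command in turn.
runPipeline :: [(FilePath, [String])] -> B.ByteString -> IO B.ByteString
runPipeline stages input =
  foldM (\bs (cmd, args) -> pipeThrough cmd args bs) input stages
```

A file-to-file run would then be `B.readFile src >>= runPipeline stages >>= B.writeFile dst`, where `stages` names the two external programs. Whether this avoids the corruption reported earlier in the thread is not something the sketch can show; it only illustrates the fork-the-writer pattern.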

When I run the test case in the file, the first read_back gets as far as count=9890, then hangs: I never see "Doing it again", and the CPU is idle (with ghc-6.10.1).

oleg@okmij.org wrote:
I'd like to point out a reliable, proven and simple way of interacting with another process, via unidirectional or bidirectional pipes. The method supports Unix sockets, pipes, and TCP sockets.
I too have noticed insidious bugs in the GHC run-time when communicating with another process via a pipe. I tried to use runInteractiveProcess; it worked -- up to file sizes of about 300Kb. Then the GHC run-time seems to `lose synchronization' -- and corrupts IO buffers, receiving stuff that cannot possibly have been sent. This is because handle operations are asynchronous and the GHC scheduler seems to have race conditions. That behavior was totally unacceptable. I was writing production code, and couldn't afford such errors.
If there are bugs of this kind, we really need to get them fixed - I'm not aware of anything like this being reported. Please, if anyone can reproduce this, or even if you can't, submit a bug giving as many details as you can. Cheers, Simon
Participants (5):
- Don Stewart
- Jason Dusek
- Johannes Waldmann
- oleg@okmij.org
- Simon Marlow