Re: [Haskell-cafe] Blocking IO & FIFOs

2012/10/20 Asten, W.G.G. van (Wilfried, Student M-CSC)
Would you be happy with a solution like this:
- First create two handles to two files in the tmp directory - Then use StdStream's UseHandle to redirect std_err and std_out (using CreatePipe for std_in) to these files - Then write your query to the Handle for std_in - waitForProcess - Collect std_out and std_err from the temporary files.
If that is not satisfactory you may want to check out conduit-process (http://hackage.haskell.org/packages/archive/process-conduit/0.5.0.2/doc/html...) that also does some interaction with a process and interleaves std_in and std_out. It should not be to hard to combine std_err into that concept. I also faced this same problem and in one case solved it by using a temporary file to hold my content which was also deleted afterwards (This immediately prevented the content building up in memory).
For my application, it's important to be able to run multiple queries against the same Bash session. Waiting for Bash to shut down is thus not a viable way to finalize the response. Perhaps I can spawn two cats with their outputs connected to the FIFOs and wait for them to terminate. -- Jason Dusek pgp // solidsnack // C1EBC57DC55144F35460C8DF1FD4C6C1FED18A2B

Perhaps an interleaving solution as in process-conduit is still viable:
- Check if one or both of the fifo's are still ready (Based on your
statement about the reading end receiving EOF hIsEOF should work
here). If both fifos are done the query is finished so break the loop.
- Check if some output is available on oh. If so read some of it. Repeat.
- Otherwise check if some output is available on eh. If so read some
of it. Repeat
loop
I also see you don't do anything with the std_out and std_err pipes of
bash as given by runInteractiveProcess. These could also cause a
problem even when the FIFO's are working correctly. Replace these by
handles to the null file or let the output be dumped on the parent's
std_in and std_out (StdStream Inherit).
On Sat, Oct 20, 2012 at 4:00 PM, Jason Dusek
For my application, it's important to be able to run multiple queries against the same Bash session. Waiting for Bash to shut down is thus not a viable way to finalize the response.
Perhaps I can spawn two cats with their outputs connected to the FIFOs and wait for them to terminate.

2012/10/20 Wilfried van Asten
Perhaps an interleaving solution as in process-conduit is still viable:
- Check if one or both of the fifo's are still ready (Based on your statement about the reading end receiving EOF hIsEOF should work here). If both fifos are done the query is finished so break the loop.
Alas, checking for EOF does not work. I mentioned this in passing in my prior email; the code was somewhat involved and I have deleted it. Here is a simple example of something that does not work as expected: In the first terminal: :; mkfifo fifo :; ghci -- :m + GHC.IO.Handle.FD System.IO -- do { h <- openFileBlocking "fifo" ReadMode ; hGetContents h } In the second terminal, *after* doing everything in the first terminal: :; cat > fifo < type some characters here > ^D Notice that the characters appear in the first terminal, as the output of hGetContents. Sending ^D to end cat does not register any effect in GHCi; hGetContents dutifully waits and you can in fact run cat on the FIFO again to send more characters to the same instances of hGetContents. This would seem to be due to non-blocking IO, deep in the IO manager.
- Check if some output is available on oh. If so read some of it. Repeat.
- Otherwise check if some output is available on eh. If so read some of it. Repeat loop
I also see you don't do anything with the std_out and std_err pipes of bash as given by runInteractiveProcess. These could also cause a problem even when the FIFO's are working correctly. Replace these by handles to the null file or let the output be dumped on the parent's std_in and std_out (StdStream Inherit).
I would prefer to leave them be, since they're passed in from the caller, who nominally owns them. If you mean that I should close them in `start', well, that would make it hard to debug this stuff; and if I simply tie them to the parent's file descriptors, it will make it hard to deal with more than a few CoBashes at one time while testing. Using cat to read the FIFOs and allowing Haskell to read from cat does work, actually. https://gist.github.com/3923673 Shell really is such a nice language for tying together processes. -- Jason Dusek pgp // solidsnack // C1EBC57DC55144F35460C8DF1FD4C6C1FED18A2B {-# LANGUAGE OverloadedStrings , ScopedTypeVariables , ParallelListComp , TupleSections #-} module CoBash where import Control.Applicative import Control.Concurrent import Control.Concurrent.MVar import Control.Exception import Control.Monad import Data.Bits import Data.ByteString.Char8 (ByteString) import qualified Data.ByteString.Char8 as Bytes import Data.Maybe import Data.Monoid import qualified GHC.IO.Handle.FD import System.IO import System.IO.Error import System.Process import System.Posix.ByteString import System.IO.Temp import qualified Text.ShellEscape as Esc start :: IO (Handle, Handle, Handle, ProcessHandle) start = runInteractiveProcess "bash" [] Nothing (Just []) query :: (Handle, Handle, Handle, ProcessHandle) -> ByteString -> IO (ByteString, ByteString) query (i, _, _, _) query = withFIFOs query' where query' ofo efo = do Bytes.hPut i cmd hFlush i [ob, eb] <- backgroundReadFIFOs [ofo, efo] return (ob, eb) where cmd = Bytes.unlines ["{", query, "} 1>" <> ofo <> " 2>" <> efo] shutdown :: (Handle, Handle, Handle, ProcessHandle) -> IO () shutdown (i, _, _, p) = () <$ hClose i <* waitForProcess p openFIFO path = GHC.IO.Handle.FD.openFileBlocking (Bytes.unpack path) ReadMode -- | Run an IO action with two FIFOs in scope, which will removed after it -- completes. withFIFOs :: (RawFilePath -> RawFilePath -> IO a) -> IO a withFIFOs m = withSystemTempDirectory "cobash." m' where m' = (uncurry m =<<) . mk . Bytes.pack mk d = (o, e) <$ (createNamedPipe o mode >> createNamedPipe e mode) where (o, e) = (d <> "/o", d <> "/e") mode = ownerReadMode .|. ownerWriteMode .|. namedPipeMode drainFIFO :: ByteString -> IO ByteString drainFIFO path = do (i, o, e, p) <- bash ["-c", "exec cat <"<>(Bytes.unpack path)] hClose i hClose e Bytes.hGetContents o <* waitForProcess p backgroundReadFIFOs theFIFOs = do cells <- sequence (newEmptyMVar <$ theFIFOs) sequence_ [ forkIO (drainFIFO p >>= putMVar c) | p <- theFIFOs | c <- cells ] sequence (takeMVar <$> cells) bash args = runInteractiveProcess "bash" args Nothing (Just [])

Quoth Jason Dusek
For my application, it's important to be able to run multiple queries against the same Bash session. Waiting for Bash to shut down is thus not a viable way to finalize the response.
You could redirect to disk files and also use a pipe to wait for exit. I suppose you redirect I/O for individual queries by applying shell redirections to the command? So like this, uname -a > /tmp/cmd1 2> /tmp/cmd2 7> /tmp/exitpipe ... then read from /tmp/exitpipe, ignore empty result and read command outputs from the disk files. Donn

2012/10/20 Donn Cave
Quoth Jason Dusek
, ... For my application, it's important to be able to run multiple queries against the same Bash session. Waiting for Bash to shut down is thus not a viable way to finalize the response.
You could redirect to disk files and also use a pipe to wait for exit.
I suppose you redirect I/O for individual queries by applying shell redirections to the command? So like this,
uname -a > /tmp/cmd1 2> /tmp/cmd2 7> /tmp/exitpipe
... then read from /tmp/exitpipe, ignore empty result and read command outputs from the disk files.
If I could somehow arrange to detect EOF when /tmp/exitpipe is closed, then I might as well redirect 1 and 2 to FIFOs and wait for them to EOF, collecting the output. However, all of my experiments suggest that there is simply no way in Haskell to detect the closing of the write end of a FIFO. With `openFileBlocking', one can detect when a FIFO is *opened*; but not when it is closed. -- Jason Dusek pgp // solidsnack // C1EBC57DC55144F35460C8DF1FD4C6C1FED18A2B

From Jason Dusek
, ... If I could somehow arrange to detect EOF when /tmp/exitpipe is closed, then I might as well redirect 1 and 2 to FIFOs and wait for them to EOF, collecting the output. However, all of my experiments suggest that there is simply no way in Haskell to detect the closing of the write end of a FIFO. With `openFileBlocking', one can detect when a FIFO is *opened*; but not when it is closed.
It looks to me like our colleague in another followup may have it working in an example. I have run into some trouble myself, with an example program demonstrating the approach I proposed. With a pre-existing named pipe, that I would just keep using, for whatever reason it worked the first time, failed the second, and so forth, working every other time. If the test program created the named pipe, it failed every time. There are probably reasons for all this, but I haven't looked very hard. That was using "withFile". If I use POSIX I/O, it works fine. So it looks to me like there is indeed a way in Haskell to detect a closed FIFO, it just may not be Haskell I/O without a lot more work ironing out the possible causes of failure. I believe that doesn't need to be a problem for you, though, because 1) your application is by nature exclusive to POSIX platforms, and 2) you need the named pipe only to detect command process exit, and you can still apply Haskell I/O to the more interesting data that accumulates in the command output disk file. And there may be an answer for my problems with Haskell I/O. Could be as simple as using openFileBlocking, which apparently isn't supported in the ghc I'm using. Could have something to do with the fine points of named pipes - for example, I believe you're supposed to open them O_RDWR in situations you'd think would call for O_READONLY. (Though the latter worked for me with POSIX I/O.) While I'm here ... I share the concern expressed in an earlier followup about the outputs from bash in runInteractiveProcess. This looks like a feature of runInteractiveProcess that makes it intrinsically something like a "code smell". input-only and output-only processes are commonly used and fairly tractable, where input-output processes are unusual and and fragile, so it's an unfortunate convenience. I think the idea is that you'd use createProcess specifying only the input redirection. Donn

2012/10/21 Donn Cave
From Jason Dusek
: If I could somehow arrange to detect EOF when /tmp/exitpipe is closed, then I might as well redirect 1 and 2 to FIFOs and wait for them to EOF, collecting the output.
However, all of my experiments suggest that there is simply no way in Haskell to detect the closing of the write end of a FIFO. With `openFileBlocking', one can detect when a FIFO is *opened*; but not when it is closed.
[...] If I use POSIX I/O, it works fine. So it looks to me like there is indeed a way in Haskell to detect a closed FIFO, it just may not be Haskell I/O without a lot more work ironing out the possible causes of failure.
Sadly, I can not do Posix IO on handles or read a ByteString from a Posix FD.
2) you need the named pipe only to detect command process exit, and you can still apply Haskell I/O to the more interesting data that accumulates in the command output disk file.
Writing data to disk for communicating with other processes is not a good pattern, I think.
And there may be an answer for my problems with Haskell I/O. Could be as simple as using openFileBlocking, which apparently isn't supported in the ghc I'm using. Could have something to do with the fine points of named pipes - for example, I believe you're supposed to open them O_RDWR in situations you'd think would call for O_READONLY. (Though the latter worked for me with POSIX I/O.)
It is okay to open it O_READONLY if blocking when there is no writer is acceptable. For this application, it is.
While I'm here ... I share the concern expressed in an earlier followup about the outputs from bash in runInteractiveProcess. This looks like a feature of runInteractiveProcess that makes it intrinsically something like a "code smell". input-only and output-only processes are commonly used and fairly tractable, where input-output processes are unusual and and fragile, so it's an unfortunate convenience. I think the idea is that you'd use createProcess specifying only the input redirection.
I am averse to adding "just in case" code that may not do anything; too much system level code attains an air of mystery this way. -- Jason Dusek pgp // solidsnack // C1EBC57DC55144F35460C8DF1FD4C6C1FED18A2B

Hi Everyone, Thanks for all your help. I've put the first working version of this on GitHub: https://github.com/solidsnack/coproc Many improvements suggested in thread have not been implemented as yet but I hope to integrate them as part of expanding the tool to cover other interpreters, like Python or PSQL. -- Jason Dusek pgp // solidsnack // C1EBC57DC55144F35460C8DF1FD4C6C1FED18A2B
participants (4)
-
Donn Cave
-
Gregory Collins
-
Jason Dusek
-
Wilfried van Asten