memory usage in repeated reading of an external program's output

Hello, I have this very simple program that executes an external program, reads its output and prints it (the program is "date"). The readings is done with pipes. The problem is that memory usage constantly increases over time. Profiling does not show garbage collection of any sort. File descriptors and handles seem to be properly closed. Still I cannot find out where the problem lays. Can it be related to the fact that runProcess closes the handles so that the write file descriptor of the pipe is left open? using a "closeFd w" after runProcess gives a Bad fd error. Moreover, fdToHandle "converts" the fd into a handle, so I presume that closing the second should be enough. And indeed removing or inserting rc <- handleToFd rh and closeFd rc doesn't change anything. Thanks for your help. Andrea The code: ---------------- module Main where import System.Process import System.Posix.IO import System.IO import Control.Concurrent runComLoop :: String -> IO () runComLoop command = do (r,w) <- createPipe wh <- fdToHandle w hSetBuffering wh LineBuffering p <- runProcess command [] Nothing Nothing Nothing (Just wh) (Just wh) rh <- fdToHandle r str <- hGetLine rh rc <- handleToFd rh hClose rh closeFd rc -- get and print the status of handles swh <- hShow wh srh <- hShow rh putStrLn $ show swh putStrLn $ show srh putStrLn str threadDelay $ 100000 * 1 runComLoop command main = runComLoop "date"

On 21/06/07, Andrea Rossato
runComLoop :: String -> IO () runComLoop command = do (r,w) <- createPipe wh <- fdToHandle w hSetBuffering wh LineBuffering p <- runProcess command [] Nothing Nothing Nothing (Just wh) (Just wh) rh <- fdToHandle r str <- hGetLine rh rc <- handleToFd rh hClose rh closeFd rc -- get and print the status of handles swh <- hShow wh srh <- hShow rh putStrLn $ show swh putStrLn $ show srh
putStrLn str threadDelay $ 100000 * 1 runComLoop command
main = runComLoop "date"
I honestly don't know the answer to this, and no doubt someone intelligent will swoop in momentarily - but have you tried decomposing runComLoop into smaller chunks to see which is using the most memory? You could also try the 'bracket' function, which I find enormously useful for showing which handles, etc are in scope. Cheers, D.

On Jun 21, 2007, at 6:40 , Andrea Rossato wrote:
I have this very simple program that executes an external program, reads its output and prints it (the program is "date"). The readings is done with pipes.
The problem is that memory usage constantly increases over time. Profiling does not show garbage collection of any sort.
Huh. Thank you; I'd observed that behavior in one of my programs but hadn't sat down to figure out why yet, mostly because I have a workaround: if you cap the heap (+RTS -M...) it *does* do GC. Which makes it sound like something I don't know how to debug. :/ -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allbery@kf8nh.com system administrator [openafs,heimdal,too many hats] allbery@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH

On Thu, Jun 21, 2007 at 08:18:23AM -0400, Brandon S. Allbery KF8NH wrote:
On Jun 21, 2007, at 6:40 , Andrea Rossato wrote:
I have this very simple program that executes an external program, reads its output and prints it (the program is "date"). The readings is done with pipes.
The problem is that memory usage constantly increases over time. Profiling does not show garbage collection of any sort.
Huh. Thank you; I'd observed that behavior in one of my programs but hadn't sat down to figure out why yet, mostly because I have a workaround: if you cap the heap (+RTS -M...) it *does* do GC.
Which makes it sound like something I don't know how to debug. :/
well, I followed Dougal's suggestion and riduced the program into pieces. I'm also testing the difference of runProcess and runInteractiveProcess. The first seems better but I need some more test. In the first case pipes2Handles gets 80.4% alloc. In the second is runRunIntProcess to get 88%. Results are pretty much the same after all. Now I'm going to profile for memory usage: I've seen that some GC happens if you are patient enough. Thanks for your kind attention. Andrea This is the code broken up: module Main where import System.Process import System.Posix.IO import System.IO import Control.Concurrent readOutput rh = do str <- hGetLine rh return str mkPipe = do (r,w) <- createPipe return (r,w) pipes2Handles r w = do wh <- fdToHandle w rh <- fdToHandle r return (rh,wh) runRunProcess wh c = do p <- runProcess c [] Nothing Nothing Nothing (Just wh) (Just wh) return p closeHandle rh wh = do hClose wh hClose rh runWithRunProcess c = do (r,w) <- mkPipe (rh,wh) <- pipes2Handles r w p <- runRunProcess wh c exit <- waitForProcess p str <- readOutput rh closeHandle wh rh putStrLn str threadDelay $ 100000 * 1 runWithRunProcess c runRunIntProcess c = do (sin,sout,serr,p) <- runInteractiveProcess c [] Nothing Nothing return $! (sin,sout,serr,p) cHandles i o e = do hClose i hClose o hClose e runWithRunIntProcess c = do (i,o,e,p) <- runRunIntProcess c exit <- waitForProcess p str <- readOutput o cHandles i o e putStrLn str threadDelay $ 100000 * 1 runWithRunIntProcess c main' = runWithRunProcess "date" main = runWithRunIntProcess "date"

On Thu, Jun 21, 2007 at 09:44:43AM -0700, Bryan O'Sullivan wrote:
Yes, the process will hit a steady state of a few megabytes of heap after a short time.
By the way, your program leaks ProcessHandles.
Yes, I've noticed, indeed. Still I do not understand you reference to the leak problem. Could you please elaborate a bit? Thanks for your kind attention. Andrea

Andrea Rossato wrote:
Still I do not understand you reference to the leak problem. Could you please elaborate a bit?
The runProcess function returns a ProcessHandle. If you don't call waitForProcess on that handle, you'll leak those handles. On Unix-like systems, this means you'll accumulate zombie processes and potentially fill your process table, DoSing your machine.

On Thu, Jun 21, 2007 at 01:36:16PM -0700, Bryan O'Sullivan wrote:
Andrea Rossato wrote:
Still I do not understand you reference to the leak problem. Could you please elaborate a bit?
The runProcess function returns a ProcessHandle. If you don't call waitForProcess on that handle, you'll leak those handles. On Unix-like systems, this means you'll accumulate zombie processes and potentially fill your process table, DoSing your machine.
ahhh, yes, I found out the hard way. By the way, in the code I'm writing I was waiting for the exit code of the process. I forgot to copy it in the first example, and when I run it...;-) as you can see, the second version already corrected the problem. Thanks for your kind attention. All the best. andrea
participants (4)
-
Andrea Rossato
-
Brandon S. Allbery KF8NH
-
Bryan O'Sullivan
-
Dougal Stanton