Proposal: overhaul System.Process

I've made some improvements to System.Process that I'd like to get
feedback on. Everything so far is backwards compatible in the sense
that I've only added to the API - everything that was there before is
still available, with the same semantics (except where bugs have been
fixed).
Haddock for the proposed new System.Process:
http://darcs.haskell.org/~simonmar/process/System-Process.html
Ticket:
http://hackage.haskell.org/trac/ghc/ticket/2233
Discussion period: 4 weeks (20 May)
Summary of changes:
Tue Apr 22 15:02:16 PDT 2008 Simon Marlow

Hi
I've made some improvements to System.Process that I'd like to get feedback on.
It looks a lot nicer! I may be able to stop my standard trick of system "cmd > stdout.txt 2> stderr.txt" then readFile.

The only function I was a bit concerned with was readProcess:

readProcess :: FilePath -> [String] -> String -> IO (Either ExitCode String)

I would have thought (ExitCode,String) was more appropriate. This interface means that readProcess cannot be lazy, as it must have the ExitCode before it generates the Right. Additionally, it's probably quite important to have the output if something fails.

I'd also like clarification on whether the result string is the stdout handle, or both stdout and stderr - I can see arguments for both variants, so perhaps both could be provided?

Thanks, Neil

Neil Mitchell wrote:
I would have thought (ExitCode,String) was more appropriate.
Yes, definitely. What happens to stderr with this function, by the way? Is it tied to stdout (probably the right thing to do), or to /dev/null, or is it closed (eek!)? The haddock should make that clear. It would be useful if there was a readProcess variant that gave back a String each for stdout and stderr.

Bryan O'Sullivan wrote:
Neil Mitchell wrote:
I would have thought (ExitCode,String) was more appropriate.
Yes, definitely.
Good point. Although I'm not sure I'm keen on readProcess being lazy (but you can have a lazy variant if you want).
What happens to stderr with this function, by the way? Is it tied to stdout (probably the right thing to do), or to /dev/null, or is it closed (eek!)?
None of the above :) Currently it's inherited from the parent. Unfortunately it's not easy to tie stderr and stdout to the same pipe - createProcess can't do that, and readProcess is defined in terms of it.
It would be useful if there was a readProcess variant that gave back a String each for stdout and stderr.
Would it be reasonable for that to be the only variant? Cheers, Simon

Hi
What happens to stderr with this function, by the way? Is it tied to stdout (probably the right thing to do), or to /dev/null, or is it closed (eek!)?
None of the above :) Currently it's inherited from the parent. Unfortunately it's not easy to tie stderr and stdout to the same pipe - createProcess can't do that, and readProcess is defined in terms of it.
It would be useful if they were tied, but not essential - it's still a big improvement over the current situation.
It would be useful if there was a readProcess variant that gave back a String each for stdout and stderr.
Would it be reasonable for that to be the only variant?
If you are implementing this function strictly, then that should be sufficient. If it were lazy you'd probably want three variants:

1) Only return stdout, and dump stderr onto the normal stderr.
2) Return both stdout and stderr separately.
3) Tie stdout and stderr.

I guess people who want laziness can implement it themselves directly, taking care to get whatever laziness it is that they want.

Thanks, Neil

On Tue, 2008-04-22 at 15:45 -0700, Simon Marlow wrote:
What happens to stderr with this function, by the way? Is it tied to stdout (probably the right thing to do), or to /dev/null, or is it closed (eek!)?
None of the above :) Currently it's inherited from the parent. Unfortunately it's not easy to tie stderr and stdout to the same pipe - createProcess can't do that, and readProcess is defined in terms of it.
I don't understand the restriction. What if we just pass stdout as the handle to use for stdout and stderr? The types say that's possible, so what would go wrong? Duncan

Duncan Coutts wrote:
On Tue, 2008-04-22 at 15:45 -0700, Simon Marlow wrote:
What happens to stderr with this function, by the way? Is it tied to stdout (probably the right thing to do), or to /dev/null, or is it closed (eek!)? None of the above :) Currently it's inherited from the parent. Unfortunately it's not easy to tie stderr and stdout to the same pipe - createProcess can't do that, and readProcess is defined in terms of it.
I don't understand the restriction. What if we just pass stdout as the handle to use for stdout and stderr. The types say that's possible, so what would go wrong?
Yes that's possible, but what you can't do is create a new pipe and attach both stdout and stderr to it. Cheers, Simon
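To make the distinction concrete, here is a hedged sketch of the per-stream control the CreateProcess record gives you, using the record fields as they eventually shipped in the process package (each of std_in/std_out/std_err independently set to Inherit, CreatePipe, or UseHandle h). It captures the child's stdout on a fresh pipe while redirecting its stderr to an existing file Handle — exactly the kind of wiring that is expressible, as opposed to the "one fresh pipe attached to both streams" case Simon describes. The file name err.txt is illustrative.

```haskell
import System.IO
import System.Process  -- createProcess, proc, StdStream(..), waitForProcess

-- Sketch: each stream of the child is wired up independently.
captureBoth :: IO (String, String)
captureBoth = do
  errFile <- openFile "err.txt" WriteMode
  (_, Just outH, _, ph) <-
    createProcess (proc "sh" ["-c", "echo to-out; echo to-err 1>&2"])
      { std_out = CreatePipe          -- stdout comes back on a fresh pipe
      , std_err = UseHandle errFile   -- stderr goes to the file Handle
      }
  hClose errFile                      -- harmless if already closed by createProcess
  out <- hGetContents outH
  _ <- length out `seq` waitForProcess ph  -- drain the pipe, then reap the child
  err <- readFile "err.txt"
  return (out, err)

main :: IO ()
main = do
  (out, err) <- captureBoth
  putStr out   -- "to-out\n"
  putStr err   -- "to-err\n"
```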

duncan.coutts:
On Tue, 2008-04-22 at 15:35 -0700, Bryan O'Sullivan wrote:
Neil Mitchell wrote:
I would have thought (ExitCode,String) was more appropriate.
Yes, definitely.
Yes, I mentioned this to Don previously when he published his popen code. I think he agreed.
Duncan
I'd changed, but not pushed out, process-light:

-- | readProcess forks an external process, reads its standard output
-- strictly, blocking until the process terminates, and returns either
-- the output string, or, in the case of non-zero exit status, an error
-- code, and any output.
--
-- Output is returned strictly, so this is not suitable for
-- interactive applications.
--
-- Users of this library should compile with -threaded if they
-- want other Haskell threads to keep running while waiting on
-- the result of readProcess.
--
-- > > readProcess "date" [] []
-- > Right "Thu Feb 7 10:03:39 PST 2008\n"
--
-- The arguments are:
--
-- * The command to run, which must be in the $PATH, or an absolute path
--
-- * A list of separate command line arguments to the program
--
-- * A string to pass on the standard input to the program.
--
readProcess :: FilePath   -- ^ command to run
            -> [String]   -- ^ any arguments
            -> String     -- ^ standard input
            -> IO (Either (ExitCode,String) String)
               -- ^ either the stdout, or an exitcode and any output
readProcess cmd args input = C.handle (return . handler) $ do
    (inh,outh,errh,pid) <- runInteractiveProcess cmd args Nothing Nothing
    output  <- hGetContents outh
    outMVar <- newEmptyMVar
    forkIO $ (C.evaluate (length output) >> putMVar outMVar ())
    when (not (null input)) $ hPutStr inh input
    takeMVar outMVar
    ex <- C.catch (waitForProcess pid) (\_e -> return ExitSuccess)
    hClose outh
    hClose inh  -- done with stdin
    hClose errh -- ignore stderr
    return $ case ex of
        ExitSuccess   -> Right output
        ExitFailure _ -> Left (ex, output)
  where
    handler (C.ExitException e) = Left (e, "")
    handler e                   = Left (ExitFailure 1, show e)

On Tue, 2008-04-22 at 15:52 -0700, Don Stewart wrote:
duncan.coutts:
I would have thought (ExitCode,String) was more appropriate.
Yes, definitely.
Yes, I mentioned this to Don previously when he published his popen code. I think he agreed.
I'd changed, but not pushed out, process-light:
readProcess :: FilePath   -- ^ command to run
            -> [String]   -- ^ any arguments
            -> String     -- ^ standard input
            -> IO (Either (ExitCode,String) String)
               -- ^ either the stdout, or an exitcode and any output
You don't need the Either. ExitCode already covers the case when the process terminates successfully. Duncan

duncan.coutts:
On Tue, 2008-04-22 at 15:52 -0700, Don Stewart wrote:
duncan.coutts:
I would have thought (ExitCode,String) was more appropriate.
Yes, definitely.
Yes, I mentioned this to Don previously when he published his popen code. I think he agreed.
I'd changed, but not pushed out, process-light:
readProcess :: FilePath   -- ^ command to run
            -> [String]   -- ^ any arguments
            -> String     -- ^ standard input
            -> IO (Either (ExitCode,String) String)
               -- ^ either the stdout, or an exitcode and any output
You don't need the Either. ExitCode already covers the case when the process terminates successfully.
But we want to force people to check the failure case. Just returning the tuple doesn't help there. -- Don

On Tue, 2008-04-22 at 16:09 -0700, Don Stewart wrote:
duncan.coutts:
readProcess :: FilePath   -- ^ command to run
            -> [String]   -- ^ any arguments
            -> String     -- ^ standard input
            -> IO (Either (ExitCode,String) String)
               -- ^ either the stdout, or an exitcode and any output
You don't need the Either. ExitCode already covers the case when the process terminates successfully.
But we want to force people to check the failure case. Just returning the tuple doesn't help there.
In Cabal we have two versions:

usualConvenientVersion :: FilePath -> [String] -> IO String
moreGeneralVersion     :: FilePath -> [String] -> IO (String, ExitCode)

The first version - that we expect to use most often - just throws an exception if the exit code is non-0. In our experience this is almost always the right thing to do. There is only one place in Cabal where we expect the command to fail but we need the output anyway.

We previously had all our process functions return an ExitCode, and they were routinely ignored. I suppose the fact that with readProcess people will be interested in the result does help the situation.

Duncan

Duncan Coutts wrote:
In Cabal we have two versions:
usualConvenientVersion :: FilePath -> [String] -> IO String
moreGeneralVersion :: FilePath -> [String] -> IO (String, ExitCode)
I have written those two functions perhaps twenty times, with the semantics that Duncan describes, in maybe four different languages. It would indeed be very nice to have them in the standard toolbox :-)
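The two Cabal-style functions can be sketched on top of the readProcessWithExitCode that the process package eventually shipped (note: the released function returns stderr as a third component, unlike the pair discussed in this thread). The names below are Duncan's placeholders, not a real API:

```haskell
import System.Exit (ExitCode(..))
import System.Process (readProcessWithExitCode)

-- Convenient version: throw an exception on non-zero exit, return stdout.
usualConvenientVersion :: FilePath -> [String] -> IO String
usualConvenientVersion cmd args = do
  (ec, out, err) <- readProcessWithExitCode cmd args ""
  case ec of
    ExitSuccess   -> return out
    ExitFailure n ->
      ioError (userError (cmd ++ ": exit code " ++ show n ++ "\n" ++ err))

-- General version: never throws for a non-zero exit; the caller decides.
moreGeneralVersion :: FilePath -> [String] -> IO (String, ExitCode)
moreGeneralVersion cmd args = do
  (ec, out, _err) <- readProcessWithExitCode cmd args ""
  return (out, ec)

main :: IO ()
main = usualConvenientVersion "echo" ["hello"] >>= putStr
```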

On Tue, Apr 22, 2008 at 04:09:56PM -0700, Don Stewart wrote:
duncan.coutts:
On Tue, 2008-04-22 at 15:52 -0700, Don Stewart wrote:
duncan.coutts:
I would have thought (ExitCode,String) was more appropriate.
Yes, definitely.
Yes, I mentioned this to Don previously when he published his popen code. I think he agreed.
I'd changed, but not pushed out, process-light:
readProcess :: FilePath   -- ^ command to run
            -> [String]   -- ^ any arguments
            -> String     -- ^ standard input
            -> IO (Either (ExitCode,String) String)
               -- ^ either the stdout, or an exitcode and any output
You don't need the Either. ExitCode already covers the case when the process terminates successfully.
But we want to force people to check the failure case. Just returning the tuple doesn't help there.
But it is much more elegant, cleaner code, and more in line with the underlying semantics. A general library API shouldn't force complexity on its users. Also, it has two redundant cases (Left (ExitSuccess,out)) and (Right out), which is a far worse bug in an API.

Also, it means you can't have lazy output, since you won't know the error code until the process has finished completely.

John
-- John Meacham - ⑆repetae.net⑆john⑈

john:
On Tue, Apr 22, 2008 at 04:09:56PM -0700, Don Stewart wrote:
duncan.coutts:
On Tue, 2008-04-22 at 15:52 -0700, Don Stewart wrote:
duncan.coutts:
I would have thought (ExitCode,String) was more appropriate.
Yes, definitely.
Yes, I mentioned this to Don previously when he published his popen code. I think he agreed.
I'd changed, but not pushed out, process-light:
readProcess :: FilePath   -- ^ command to run
            -> [String]   -- ^ any arguments
            -> String     -- ^ standard input
            -> IO (Either (ExitCode,String) String)
               -- ^ either the stdout, or an exitcode and any output
You don't need the Either. ExitCode already covers the case when the process terminates successfully.
But we want to force people to check the failure case. Just returning the tuple doesn't help there.
But it is much more elegant, cleaner code, and more in line with the underlying semantics. A general library API shouldn't force complexity on its users. Also, it has two redundant cases (Left (ExitSuccess,out)) and (Right out), which is a far worse bug in an API.
Also, it means you can't have lazy output, since you won't know the error code until the process has finished completely.
Yeah, this was originally written for lambdabot, where lazy output just wasn't an option -- and possibly dangerous. Finding a good type that encourages the kind of "correctness" approach to handling errors that we like in Haskell would be good though -- if we can improve safety, cheaply, let's do it! -- Don

On Tue, Apr 22, 2008 at 07:45:38PM -0700, Don Stewart wrote:
But it is much more elegant, cleaner code, and more in line with the underlying semantics. A general library API shouldn't force complexity on its users. Also, it has two redundant cases (Left (ExitSuccess,out)) and (Right out), which is a far worse bug in an API.
Also, it means you can't have lazy output, since you won't know the error code until the process has finished completely.
Yeah, this was originally written for lambdabot, where lazy output just wasn't an option -- and possibly dangerous.
Finding a good type that encourages the kind of "correctness" approach to handling errors that we like in Haskell would be good though -- if we can improve safety, cheaply, let's do it!
It seems more verbose and ambiguous than safe, because now you have to look up the documentation to figure out what the difference between (Left (ExitSuccess,s)) and (Right s) is, taking up precious, precious mindspace remembering it and introducing another place a bug can be introduced. Code clarity does a lot more for correctness (and debuggability) than dubious measures to improve some idea of safety.

John
-- John Meacham - ⑆repetae.net⑆john⑈

On Tue, Apr 22, 2008 at 7:54 PM, John Meacham
On Tue, Apr 22, 2008 at 07:45:38PM -0700, Don Stewart wrote:
Finding a good type that encourages the kind of "correctness" approach to handling errors that we like in Haskell would be good though -- if we can improve safety, cheaply, let's do it!
It seems more verbose and ambiguous than safe, because now you have to look up the documentation to figure out what the difference between (Left (ExitSuccess,s)) and (Right s) is, taking up precious, precious mindspace remembering it and introducing another place a bug can be introduced. Code clarity does a lot more for correctness (and debuggability) than dubious measures to improve some idea of safety.
Personally, I'd rather have a version that just throws an exception when the exit code is non-zero. As Duncan mentioned, this is usually what you want to do. Given that the IO monad already has pretty nice (and flexible) error handling, and that this is only a convenience function, which is easily implemented in terms of createProcess, it seems like we should make it actually be convenient. Using Either for error handling means that we can't use this for "simple" cases where the right thing is to fail when the function fails. Using a tuple as the output means that for "simple" cases, folks will almost always do the wrong thing, which is to ignore errors. David

David Roundy wrote:
On Tue, Apr 22, 2008 at 7:54 PM, John Meacham
wrote: On Tue, Apr 22, 2008 at 07:45:38PM -0700, Don Stewart wrote:
Finding a good type that encourages the kind of "correctness" approach to handling errors that we like in Haskell would be good though -- if we can improve safety, cheaply, let's do it!
It seems more verbose and ambiguous than safe, because now you have to look up the documentation to figure out what the difference between (Left (ExitSuccess,s)) and (Right s) is, taking up precious, precious mindspace to remembering it and introducing another place a bug can be introduced. Code clarity does a lot more for correctness (and debugability) than dubious measures to improve some idea of safety.
Personally, I'd rather have a version that just throws an exception when the exit code is non-zero. As Duncan mentioned, this is usually what you want to do. Given that the IO monad already has pretty nice (and flexible) error handling, and that this is only a convenience function, which is easily implemented in terms of createProcess, it seems like we should make it actually be convenient. Using Either for error handling means that we can't use this for "simple" cases where the right thing is to fail when the function fails. Using a tuple as the output means that for "simple" cases, folks will almost always do the wrong thing, which is to ignore errors.
Ok, here's the new proposal.

readProcess :: FilePath   -- ^ command to run
            -> [String]   -- ^ any arguments
            -> String     -- ^ standard input
            -> IO String  -- ^ stdout + stderr

readProcessMayFail :: FilePath   -- ^ command to run
                   -> [String]   -- ^ any arguments
                   -> String     -- ^ standard input
                   -> IO (ExitCode,String)   -- ^ exitcode, and stdout + stderr

It turns out to be dead easy to bind stderr and stdout to the same pipe. After a couple of minor tweaks the following now works:

createProcess (proc cmd args){ std_out = CreatePipe, std_err = UseHandle stdout }

So now we have:

Prelude System.Process> readProcessMayFail "ls" ["/foo"] ""
(ExitFailure 2,"ls: /foo: No such file or directory\n")
Prelude System.Process> readProcess "ls" ["/foo"] ""
*** Exception: readProcess: ls: failed

Look ok?

Incidentally, for those who know of such things, should readProcess do the same signal management that system currently does? That is, ignore SIGINT and SIGQUIT in the parent and restore them to the default in the child?

Cheers, Simon

Simon Marlow wrote:
Incedentally, for those that know of such things, should readProcess do the same signal management that system currently does? That is, ignore SIGINT and SIGQUIT in the parent and restore them to the default in the child?
Why does system do that in the first place? Are we not calling the underlying platform's system(3)? Since these functions are supposed to be similar to popen(3), they shouldn't touch signals. The POSIX.2 rationale explicitly states that popen implementations that mess with the parent's signals while waiting for the child are non-conforming.

On Wed, Apr 23, 2008 at 11:29:42AM -0700, Simon Marlow wrote:
So now we have:
Prelude System.Process> readProcessMayFail "ls" ["/foo"] ""
(ExitFailure 2,"ls: /foo: No such file or directory\n")
Prelude System.Process> readProcess "ls" ["/foo"] ""
*** Exception: readProcess: ls: failed
Look ok?
Looks fine as an API. As an implementation, I'd prefer for the exception thrown to include stderr (and wouldn't mind if the output didn't include stderr). It'd be much nicer if we had:

Prelude System.Process> readProcess "ls" ["/foo"] ""
*** Exception: readProcess: ls /foo: No such file or directory

This would mean that correct programs could use readProcess without sacrificing nice feedback when something unusual happens. Of course, we can't guarantee that stderr will give any hint as to what went wrong, but that's not our bug. We could also potentially include both stdout and stderr, or just the last few lines of the stdout/stderr combination. But it'd be nice to be able to use readProcess rather than being forced to write our own in order to give better error messages to our users.

-- David Roundy
Department of Physics, Oregon State University

David Roundy wrote:
On Wed, Apr 23, 2008 at 11:29:42AM -0700, Simon Marlow wrote:
So now we have:
Prelude System.Process> readProcessMayFail "ls" ["/foo"] ""
(ExitFailure 2,"ls: /foo: No such file or directory\n")
Prelude System.Process> readProcess "ls" ["/foo"] ""
*** Exception: readProcess: ls: failed
Look ok?
Looks fine as an API. As an implementation, I'd prefer for the exception thrown to include stderr (and wouldn't mind if the output didn't include stderr). It'd be much nicer if we had:
Prelude System.Process> readProcess "ls" ["/foo"] ""
*** Exception: readProcess: ls /foo: No such file or directory
This would mean that correct programs could use readProcess without sacrificing nice feedback when something unusual happens. Of course, we can't guarantee that stderr will give any hint as to what went wrong, but that's not our bug. We could also potentially include both stdout and stderr, or just the last few lines of the stdout/stderr combination.
Yes, there are a couple of problems here:

1. stdout and stderr are tied together, so we don't know which parts of the output are stderr.

2. the output might be multi-line, and it's not clear how much or which parts to include. The easy answer is just "include it all", but then the error messages could get arbitrarily long and potentially include a lot of superfluous information.

However, I can certainly include the arguments and the exit code in the exception, which I'm not currently doing.

Cheers, Simon

On Wed, Apr 23, 2008 at 01:27:24PM -0700, Simon Marlow wrote:
David Roundy wrote:
On Wed, Apr 23, 2008 at 11:29:42AM -0700, Simon Marlow wrote:
So now we have:
Prelude System.Process> readProcessMayFail "ls" ["/foo"] ""
(ExitFailure 2,"ls: /foo: No such file or directory\n")
Prelude System.Process> readProcess "ls" ["/foo"] ""
*** Exception: readProcess: ls: failed
Look ok?
Looks fine as an API. As an implementation, I'd prefer for the exception thrown to include stderr (and wouldn't mind if the output didn't include stderr). It'd be much nicer if we had:
Prelude System.Process> readProcess "ls" ["/foo"] ""
*** Exception: readProcess: ls /foo: No such file or directory
This would mean that correct programs could use readProcess without sacrificing nice feedback when something unusual happens. Of course, we can't guarantee that stderr will give any hint as to what went wrong, but that's not our bug. We could also potentially include both stdout and stderr, or just the last few lines of the stdout/stderr combination.
Yes, there are a couple of problems here:
1. stdout and stderr are tied together, so we don't know which parts of the output are stderr.
2. the output might be multi-line, and it's not clear how much or which parts to include.
The easy answer is just "include it all", but then the error messages could get arbitrarily long and potentially include a lot of superfluous information.
However, I can certainly include the arguments and the exit code in the exception, which I'm not currently doing.
Why not then leave the stderr out of the output, and just print it to stderr? It's the standard location to send error output, and I'd hate to lose it.

-- David Roundy
Department of Physics, Oregon State University

David Roundy wrote:
Why not then leave the stderr out of the output, and just print it to stderr? It's the standard location to send error output, and I'd hate to lose it.
Ok, so here's the new proposal:

readProcess :: FilePath   -- ^ command to run
            -> [String]   -- ^ any arguments
            -> String     -- ^ standard input
            -> IO String  -- ^ stdout only (stderr is inherited)

readProcessWithExitCode :: FilePath   -- ^ command to run
                        -> [String]   -- ^ any arguments
                        -> String     -- ^ standard input
                        -> IO (ExitCode,String)   -- ^ exitcode, and stdout + stderr

There's an inconsistency between the two variants in where stderr goes, but that seems unavoidable. And you can always roll your own if you want something different, it's only 12 lines of code and all the pieces are available separately. We can put the code for one of them in the docs as an example. Ok?

I'm also thinking of adding closeFds :: Bool to the CreateProcess record, to indicate that all FDs except 0..2 should be closed in the child. Python's version has this:

http://docs.python.org/lib/node528.html

(which is suspiciously similar to ours, clearly great minds think alike :-) and we have a ticket open for this in GHC:

http://hackage.haskell.org/trac/ghc/ticket/1415

Cheers, Simon
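"Roll your own" really is short. Here is a hedged sketch of such a variant (the name is mine, not part of the proposal) built from the pieces that are available separately: feed stdin from a String, read stdout strictly, and throw on a non-zero exit:

```haskell
import Control.Concurrent (forkIO)
import Control.Exception (evaluate)
import System.Exit (ExitCode(..))
import System.IO
import System.Process

myReadProcess :: FilePath -> [String] -> String -> IO String
myReadProcess cmd args input = do
  (Just hin, Just hout, _, ph) <-
    createProcess (proc cmd args)
      { std_in = CreatePipe, std_out = CreatePipe }
  out <- hGetContents hout
  -- write stdin in another thread so neither side can block the other
  _ <- forkIO (hPutStr hin input >> hClose hin)
  _ <- evaluate (length out)   -- force all of stdout before waiting
  ec <- waitForProcess ph
  case ec of
    ExitSuccess   -> return out
    ExitFailure n -> ioError (userError (cmd ++ ": exit code " ++ show n))

main :: IO ()
main = myReadProcess "cat" [] "hello\n" >>= putStr
```

The forkIO for the stdin writer matters: writing a large input and reading a large output from the same thread can deadlock once both pipe buffers fill up.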

On Tue, Apr 29, 2008 at 10:14:44AM -0700, Simon Marlow wrote:
David Roundy wrote:
Why not then leave the stderr out of the output, and just print it to stderr? It's the standard location to send error output, and I'd hate to lose it.
Ok, so here's the new proposal:
readProcess :: FilePath   -- ^ command to run
            -> [String]   -- ^ any arguments
            -> String     -- ^ standard input
            -> IO String  -- ^ stdout only (stderr is inherited)
readProcessWithExitCode :: FilePath   -- ^ command to run
                        -> [String]   -- ^ any arguments
                        -> String     -- ^ standard input
                        -> IO (ExitCode,String)   -- ^ exitcode, and stdout + stderr
Looks good to me! I imagine (hope?) readProcess will see more use than readProcessWithExitCode, since it's simpler and easier to use.

-- David Roundy
Department of Physics, Oregon State University

On Wed, Apr 23, 2008 at 11:29:42AM -0700, Simon Marlow wrote:
Ok, here's the new proposal.
readProcess :: FilePath   -- ^ command to run
            -> [String]   -- ^ any arguments
            -> String     -- ^ standard input
            -> IO String  -- ^ stdout + stderr
readProcessMayFail :: FilePath   -- ^ command to run
                   -> [String]   -- ^ any arguments
                   -> String     -- ^ standard input
                   -> IO (ExitCode,String)   -- ^ exitcode, and stdout + stderr
MayFail seems to be attached to the wrong one here. 'readProcess' is the one that might fail; the second one always actually succeeds, but returns an error code. I think readProcessWithExitCode is better.

John
-- John Meacham - ⑆repetae.net⑆john⑈

On 23/04/2008, John Meacham
On Wed, Apr 23, 2008 at 11:29:42AM -0700, Simon Marlow wrote:
Ok, here's the new proposal.
readProcess :: FilePath   -- ^ command to run
            -> [String]   -- ^ any arguments
            -> String     -- ^ standard input
            -> IO String  -- ^ stdout + stderr
readProcessMayFail :: FilePath   -- ^ command to run
                   -> [String]   -- ^ any arguments
                   -> String     -- ^ standard input
                   -> IO (ExitCode,String)   -- ^ exitcode, and stdout + stderr
MayFail seems to be attached to the wrong one here. 'readProcess' is the one that might fail, the second actual call always succeeds but returns an error code. I think readProcessWithExitCode is better.
Yes, well, the idea was that you would use readProcessMayFail when you are anticipating that the process might fail. Still, I like your suggestion of readProcessWithExitCode better, so I'll go with that. Cheers, Simon

On Tue, 2008-04-22 at 15:19 -0700, Simon Marlow wrote:
I've made some improvements to System.Process that I'd like to get feedback on. Everything so far is backwards compatible in the sense that I've only added to the API - everything that was there before is still available, with the same semantics (except where bugs have been fixed).
Haddock for the proposed new System.Process:
http://darcs.haskell.org/~simonmar/process/System-Process.html
Looks good.
Summary of changes:
Tue Apr 22 15:02:16 PDT 2008 Simon Marlow
* Overhaul System.Process - fix #1780: pipes created by runInteractiveProcess are set close-on-exec by default
- add a new, more general, form of process creation: createProcess Each of stdin, stdout and stderr may individually be taken from existing Handles or attached to new pipes. Also it has a nicer API.
Yay!
- add readProcess from Don Stewart's newpopen package. This function behaves like C's popen().
I'll double check that we can use this in Cabal, where we currently have to implement something similar using #ifdef, doing it differently for ghc vs nhc/hugs due to different compilers implementing different APIs, and the ghc API not being usable without pre-emptive threads (iirc). Our current function is:

:: FilePath -> [String] -> IO (String, ExitCode)

So that connects stdin to /dev/null; I expect we can implement that in terms of the new createProcess.
- Move System.Cmd.{system,rawSystem} into System.Process. Later we can deprecate System.Cmd.
Do you suppose we can rename the system/rawSystem given that we're already moving them from one module to another? Just off the top of my head, how about "runShellCommand" & "runProgram", better suggestions welcome.
- Don't use O_NONBLOCK for pipes, as it can confuse the process attached to the pipe (requires a fix to GHC.Handle in the base package).
- move the tests from the GHC testsuite into the package itself, and add a couple more
- bump the version to 2.0
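Duncan's Cabal helper above (FilePath -> [String] -> IO (String, ExitCode), with stdin connected to /dev/null) does look expressible with the new createProcess. A hedged, POSIX-only sketch (it opens /dev/null directly; the function name is illustrative, not Cabal's):

```haskell
import Control.Exception (evaluate)
import System.Exit (ExitCode(..))
import System.IO
import System.Process

-- Run a program with stdin from /dev/null, returning stdout + exit code.
readOutputNoStdin :: FilePath -> [String] -> IO (String, ExitCode)
readOutputNoStdin cmd args = do
  devNull <- openFile "/dev/null" ReadMode   -- child sees immediate EOF
  (_, Just hout, _, ph) <-
    createProcess (proc cmd args)
      { std_in  = UseHandle devNull
      , std_out = CreatePipe
      }
  out <- hGetContents hout
  _ <- evaluate (length out)   -- read all output before reaping the child
  ec <- waitForProcess ph
  return (out, ec)

main :: IO ()
main = readOutputNoStdin "echo" ["hi"] >>= print
```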

Duncan Coutts wrote:
Do you suppose we can rename the system/rawSystem given that we're already moving them from one module to another?
Just off the top of my head, how about "runShellCommand" & "runProgram", better suggestions welcome.
Well, ideally we'd do a complete renaming sweep, e.g. runProcess should be spawnProcess (or just removed entirely), then we could use runProcess for what is currently called rawSystem. But I've got enough flak for changing APIs in the past so I wimped out this time :-)

runShellCommand for system is not good, because we already have runCommand, which is the same except that it doesn't wait for completion. Something like runProcessAndWait would make sense, perhaps, but that's a mouthful.

Cheers, Simon

On Tue, 2008-04-22 at 16:35 -0700, Simon Marlow wrote:
Duncan Coutts wrote:
Do you suppose we can rename the system/rawSystem given that we're already moving them from one module to another?
Just off the top of my head, how about "runShellCommand" & "runProgram", better suggestions welcome.
Well, ideally we'd do a complete renaming sweep, e.g. runProcess should be spawnProcess (or just removed entirely), then we could use runProcess for what is currently called rawSystem. But I've got enough flak for changing APIs in the past so I wimped out this time :-)
Ah but this isn't a change, it's a new api, so we have complete freedom. We're adding a new replacement for system/rawSystem and deprecating the old module. Duncan

Duncan Coutts wrote:
On Tue, 2008-04-22 at 16:35 -0700, Simon Marlow wrote:
Duncan Coutts wrote:
Do you suppose we can rename the system/rawSystem given that we're already moving them from one module to another?
Just off the top of my head, how about "runShellCommand" & "runProgram", better suggestions welcome. Well, ideally we'd do a complete renaming sweep, e.g. runProcess should be spawnProcess (or just removed entirely), then we could use runProcess for what is currently called rawSystem. But I've got enough flak for changing APIs in the past so I wimped out this time :-)
Ah but this isn't a change, it's a new api, so we have complete freedom. We're adding a new replacement for system/rawSystem and deprecating the old module.
I can't think of a good naming scheme that doesn't break backwards compatibility. Suggestions welcome. Ideally we'd change runProcess to spawnProcess, and similarly for runCommand, runInteractiveCommand etc. But then what do we use for 'system' and 'rawSystem'? Good names for these are 'runCommand' and 'runProcess' respectively, but we can't re-use those names without breaking the API (we want to leave the old versions in place deprecated for a while). Cheers, Simon

Simon Marlow wrote:
Duncan Coutts wrote:
On Tue, 2008-04-22 at 16:35 -0700, Simon Marlow wrote:
Duncan Coutts wrote:
Do you suppose we can rename the system/rawSystem given that we're already moving them from one module to another?
Just off the top of my head, how about "runShellCommand" & "runProgram", better suggestions welcome. Well, ideally we'd do a complete renaming sweep, e.g. runProcess should be spawnProcess (or just removed entirely), then we could use runProcess for what is currently called rawSystem. But I've got enough flak for changing APIs in the past so I wimped out this time :-)
Ah but this isn't a change, it's a new api, so we have complete freedom. We're adding a new replacement for system/rawSystem and deprecating the old module.
I can't think of a good naming scheme that doesn't break backwards compatibility. Suggestions welcome.
Ideally we'd change runProcess to spawnProcess, and similarly for runCommand, runInteractiveCommand etc. But then what do we use for 'system' and 'rawSystem'? Good names for these are 'runCommand' and 'runProcess' respectively, but we can't re-use those names without breaking the API (we want to leave the old versions in place deprecated for a while).
Surely you can put the nicest new names in another module, System.Process.RunCommand (or whatever). Programs which use the old versions will still work; programs which want the nicely named new versions can import from the new module. One day, in a later version, you can move them out to the main namespace.

Jules

Hello Jules, Wednesday, May 14, 2008, 5:58:46 PM, you wrote:
Surely you can put the nicest new names in another module
System.Process.RunCommand
(or whatever)
programs which use the old versions will still work. Programs which want the nicely named new versions can import from the new module.
yes, it will work, but it may become a source of confusion: it would be impossible to copy-paste code between modules importing the old and new functions, and impossible to understand the behaviour of a code snippet without looking at the imports -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

Bulat Ziganshin wrote:
Hello Jules,
Wednesday, May 14, 2008, 5:58:46 PM, you wrote:
Surely you can put the nicest new names in another module
System.Process.RunCommand
(or whatever)
programs which use the old versions will still work. Programs which want the nicely named new versions can import from the new module.
yes, it will work, but it may become a source of confusion: it would be impossible to copy-paste code between modules importing the old and new functions, and impossible to understand the behaviour of a code snippet without looking at the imports
True enough. And this is already true: consider the Traversable/Foldable functions which 'replace' the Data.List / Control.Monad versions. I feel Haskell is slightly weak at this kind of book-keeping, actually. It's annoying that, having decided to prefer the 'new' mapM_ and hiding the old from Control.Monad, you also have to hide it from the other places which re-export it. I don't have a concrete suggestion for how to improve this :-( But the problem will surely come up again and again as the libraries get bigger and more things get either deprecated or generalised. Jules

The latest version of the Process overhaul is here: http://darcs.haskell.org/~simonmar/process/System-Process.html and the patch is here: http://darcs.haskell.org/~simonmar/process-2.0.patch

Changes from previous:
- changes to readProcess, and added readProcessWithExitCode, as discussed on the list.
- added close_fds flag to the CreateProcess record, to request closing all FDs in the child. See: http://hackage.haskell.org/trac/ghc/ticket/1415
- made it work on Windows, and the patch passes GHC's validate on both Windows and Linux.

Reminder: the deadline for discussion is 20 May (5 days). Cheers, Simon
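[Ed.: a minimal sketch of how the discussed readProcessWithExitCode is meant to be used, assuming the signature FilePath -> [String] -> String -> IO (ExitCode, String, String) shown in the linked Haddock.]

```haskell
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

main :: IO ()
main = do
  -- run "echo hello" with an empty stdin; stdout and stderr come back
  -- as separate strings, alongside the exit code
  (code, out, err) <- readProcessWithExitCode "echo" ["hello"] ""
  case code of
    ExitSuccess   -> putStr out                                   -- prints "hello"
    ExitFailure n -> putStrLn ("exit " ++ show n ++ ": " ++ err)
```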

Hi
- made it work on Windows, and the patch passes GHC's validate on both Windows and Linux.
Reminder: the deadline for discussion is 20 May (5 days).
Fully support. I think it could benefit from a slight clarification to the documentation in createProcess, to help pattern-match safety checking:

(aStdin, aStdout, aStderr, d) <- createProcess cp
isJust aStdin == (std_in == CreatePipe)

(and ditto for the out and err) I have talked to Simon on IRC, and he said he'll add some kind of clarification on this issue. I am looking forward to using this library! All we need now is a decent HTTP library... Thanks Neil
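[Ed.: the invariant Neil describes, spelled out as a runnable sketch against the proposed createProcess using the proc smart constructor from the Haddock; the Maybe Handle slots in the result are Just exactly for the streams requested as CreatePipe.]

```haskell
import System.IO (hClose, hGetContents, hPutStr)
import System.Process

main :: IO ()
main = do
  -- request pipes for stdin and stdout only; stderr is left at its
  -- default (Inherit), so its slot in the result is Nothing
  (Just hin, Just hout, Nothing, ph) <-
    createProcess (proc "cat" []) { std_in = CreatePipe, std_out = CreatePipe }
  hPutStr hin "round trip"
  hClose hin
  out <- hGetContents hout
  putStr out            -- forces the output before we reap the child
  _ <- waitForProcess ph
  return ()
```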

On Fri, May 16, 2008 at 02:34:48PM +0100, Neil Mitchell wrote:
- made it work on Windows, and the patch passes GHC's validate on both Windows and Linux.
Reminder: the deadline for discussion is 20 May (5 days).
Fully support. I think it could benefit from a slight clarification to the documentation in createProcess, to help pattern-match safety checking.
(aStdin, aStdout, aStderr, d) <- createProcess cp
isJust aStdin == (std_in == CreatePipe)
(and ditto for the out and err)
If only we had type families already, then we could define

-- These data constructors are exported
data Inherit    = Inherit
data CreatePipe = CreatePipe
data UseHandle  = UseHandle Handle
data NewHandle  = NewHandle Handle

-- This class is not exported! (so CreateProcess needn't worry about new
-- instances showing up)
class StdStream s where
  type Out s
  isInherit    :: s -> Bool
  isCreatePipe :: s -> Bool
  isHandle     :: s -> Maybe Handle

instance StdStream Inherit where
  type Out Inherit = ()
  ...

instance StdStream UseHandle where
  type Out UseHandle = ()
  ...

instance StdStream CreatePipe where
  type Out CreatePipe = NewHandle
  ...

data CreateProcess sin sout serr = CreateProcess
  { cmdspec   :: CmdSpec
  , cwd       :: Maybe FilePath
  , env       :: Maybe [(String, String)]
  , std_in    :: sin
  , std_out   :: sout
  , std_err   :: serr
  , close_fds :: Bool
  }

createProcess :: (StdStream sin, StdStream sout, StdStream serr)
              => CreateProcess sin sout serr
              -> IO (Out sin, Out sout, Out serr, ProcessHandle)

Then we could have a static guarantee that we only try to peek at actually-created pipes. I suppose this is a bit heavy infrastructure just to avoid runtime checks for "Just", but in a few years (say, post Haskell'...) it'd be nice to have safer interfaces like this. David

Hi
createProcess :: (StdStream sin, StdStream sout, StdStream serr) => CreateProcess sin sout serr -> IO (Out sin, Out sout, Out serr, ProcessHandle)
Then we could have a static guarantee that we only try to peek at actually-created pipes. I suppose this is a bit heavy infrastructure just to avoid runtime checks for "Just", but in a few years (say, post Haskell'...) it'd be nice to have safer instances like this.
Of course, you can already have these checks without any effort at all, using Catch: http://www-users.cs.york.ac.uk/~ndm/catch/ If only Catch didn't depend on Yhc... Thanks Neil

On Fri, May 16, 2008 at 04:13:35PM +0100, Neil Mitchell wrote:
createProcess :: (StdStream sin, StdStream sout, StdStream serr) => CreateProcess sin sout serr -> IO (Out sin, Out sout, Out serr, ProcessHandle)
Then we could have a static guarantee that we only try to peek at actually-created pipes. I suppose this is a bit heavy infrastructure just to avoid runtime checks for "Just", but in a few years (say, post Haskell'...) it'd be nice to have safer instances like this.
Of course, you can already have these checks without any effort at all, using Catch: http://www-users.cs.york.ac.uk/~ndm/catch/
But then you'd also have to restrict yourself to Haskell 98, right? Or at least to a subset of ghc's extensions. It's not nearly as nice as a solution in the type checker, since it relies on every user of the library running an extra tool if they want a safe interface. Also, it does nothing to eliminate the actual (admittedly trivial) runtime cost of the use of the Maybe type. Actually, though, now that I think about this, I'm curious... does this mean you intend to teach Catch about this particular interface? Or is it able to sneak into the source code of the library in order to infer that as long as we pass CreatePipe for stdin, the first element of the tuple will be a Just? I suppose it must. Which illustrates another advantage of a type-level solution: it allows programmers to infer the behavior of the function from its type, rather than requiring them to look at either its documentation or its implementation (both of which have significant disadvantages... e.g. either of them might not correctly describe all past and future versions of the library, while it's usually a safe assumption that a type-level constraint will lead to a type-check error on any version of the library that fails to satisfy said constraint). -- David Roundy Department of Physics Oregon State University

Hi
Of course, you can already have these checks without any effort at all, using Catch: http://www-users.cs.york.ac.uk/~ndm/catch/
But then you'd also have to restrict yourself to Haskell 98, right? Or at least to a subset of ghc's extensions.
The Catch tool, as currently implemented, works with Yhc's Core language. Currently you can only generate Yhc's Core language with Yhc, so for now the Catch implementation is restricted to Haskell 98 (plus pattern guards). The theory and implementation of Catch are unrestricted, and will cope with all of GHC's extensions including type families, GADTs, MPTCs + FDs, implicit parameters, rank-n types, linear implicit parameters, unboxed types - a slightly larger range of things than GHC's backend code generator can deal with :-) There is a converter from GHC Core to Yhc Core, but GHC's external Core is a bit too flaky/unfinished/in progress to have things work just yet; perhaps once 6.10 is out...
It's not nearly as nice as a solution in the type checker, since it relies on every user of the library running an extra tool if they want a safe interface.
True. But it's 0 lines of code, rather than type class hackery - you pay a lot less, you get very slightly less. You can always alias ghc to ghc && catch.
Also, it does nothing to eliminate the actual (admittedly trivial) runtime cost of the use of the Maybe type.
Use a supercompiler, which typically eliminates most things: http://www-users.cs.york.ac.uk/~ndm/supero/ - everything I wrote about Catch and Haskell 98 applies in exactly the same way to Supero. Plus remember that this command will always spawn a process - I think we could put a simple fib generator in there and it still wouldn't make a difference.
Actually, though, now that I think about this, I'm curious... does this mean you intend to teach Catch about this particular interface? Or is it able to sneak into the source code of the library in order to infer that as long as we pass CreatePipe for stdin, the first element of the tuple will be a Just? I suppose it must.
It will almost certainly be able to infer the relationship between the input and output from the implementation. You could of course make createProcess a primitive, in which case you could tell Catch this information.
Which illustrates another advantage of a type-level solution: it allows programmers to infer the behavior of the function from its type, rather than requiring them to look at either its documentation or its implementation (both of which have significant disadvantages... e.g. either of them might not correctly describe all past and future versions of the library, while it's usually a safe assumption that a type-level constraint will lead to a type-check error on any version of the library that fails to satisfy said constraint).
Catch can generate documentation which is precise, and can check annotations. Your point about a type signature being a stronger guarantee that is less likely to be broken in the future is completely valid. I'm not recommending we scrap the type system and embrace Catch instead. I'm just suggesting that sometimes, instead of hacking the type system in ways that make people's heads hurt, there may be an alternative* :-) Thanks Neil * Disclaimer: Alternative may be limited to Haskell 98. Alternative may not work as advertised. Alternative may be poorly supported, compared to something as impressive as GHC. Alternative may turn out to be not the right thing in every case.

On Fri, May 16, 2008 at 07:23:57PM +0100, Neil Mitchell wrote:
Hi
Hello!
It's not nearly as nice as a solution in the type checker, since it relies on every user of the library running an extra tool if they want a safe interface.
True. But its 0 lines of code, rather than type class hackery - you pay a lot less, you get very slightly less. You can always alias ghc to ghc && catch.
The problem is that you can't always do this if you're the library writer, since you can't control all your users' configurations, so there are still plenty of good reasons to try to design good APIs! [...]
I'm not recommending we scrap the type system and embrace Catch instead. I'm just suggesting that sometimes instead of hacking the type system in ways that makes peoples heads hurt, there may be an alternative* :-)
Indeed, I certainly appreciate the existence of Catch--and don't really care for the API I suggested. But I also would be very surprised if Manuel or Simon couldn't come up with a far prettier API that would be just as effective (if we were willing to use type families in this package, which I don't recommend). When I see an API like this one with a very simple relationship between input and output that is forced to be dynamically checked, I can't help but think that we can do better than this. David

David Roundy wrote:
If only we had type families already, then we could define
-- These data constructors are exported
data Inherit    = Inherit
data CreatePipe = CreatePipe
data UseHandle  = UseHandle Handle
data NewHandle  = NewHandle Handle

-- This class is not exported! (so CreateProcess needn't worry about new
-- instances showing up)
class StdStream s where
  type Out s
  isInherit    :: s -> Bool
  isCreatePipe :: s -> Bool
  isHandle     :: s -> Maybe Handle

instance StdStream Inherit where
  type Out Inherit = ()
  ...

instance StdStream UseHandle where
  type Out UseHandle = ()
  ...

instance StdStream CreatePipe where
  type Out CreatePipe = NewHandle
  ...

data CreateProcess sin sout serr = CreateProcess
  { cmdspec   :: CmdSpec
  , cwd       :: Maybe FilePath
  , env       :: Maybe [(String, String)]
  , std_in    :: sin
  , std_out   :: sout
  , std_err   :: serr
  , close_fds :: Bool
  }

createProcess :: (StdStream sin, StdStream sout, StdStream serr)
              => CreateProcess sin sout serr
              -> IO (Out sin, Out sout, Out serr, ProcessHandle)
Yes, this is neat. I also played around with an alternative formulation using GADTs, but decided against it because of course System.Process is supposed to present a portable API. I sure hope we can add something like this in the future though. Cheers, Simon
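[Ed.: for the curious, a rough GADT formulation of the kind Simon mentions might look like this. It is purely hypothetical - the names are made up and this is not part of the proposal - but it shows the idea: the type index records what createProcess would hand back for each stream, so peeking at a pipe that was never requested becomes a type error.]

```haskell
{-# LANGUAGE GADTs #-}
import System.IO (Handle)

-- hypothetical: the index is the result type for that stream
data StdStreamG a where
  InheritG    :: StdStreamG ()
  UseHandleG  :: Handle -> StdStreamG ()
  CreatePipeG :: StdStreamG Handle

-- only a CreatePipeG ever produces a fresh Handle
givesPipe :: StdStreamG a -> Bool
givesPipe CreatePipeG = True
givesPipe _           = False

main :: IO ()
main = print (givesPipe CreatePipeG)   -- prints True
```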

On Thu, May 15, 2008 at 10:45 AM, Simon Marlow
The latest version of the Process overhaul is here:
- changes to readProcess, and added readProcessWithExitCode, as discussed on the list.
I'd just like to argue briefly again against merging stdout and stderr. My previous message didn't get to the whole list because it was sent from a different address to the one I subscribed with.
* In other languages, and in production code that has to deal with errors in a nice way, I have always wanted separate stdout and stderr, and never come across a case where they needed to be merged.
* It's not type safe! Surely the Haskell spirit should not be to mix up two streams of output with fundamentally different meanings. Data goes to stdout, errors and warnings go to stderr.
* Most unix utilities are designed for separation to make sense.
* The library should discourage careless programming that can mix errors in with data.
-- Brian_Brunswick____brian@ithil.org____Wit____Disclaimer____!Shortsig_rules!

On Fri, 2008-05-16 at 15:44 +0100, Brian Brunswick wrote:
On Thu, May 15, 2008 at 10:45 AM, Simon Marlow
wrote: The latest version of the Process overhaul is here:
- changes to readProcess, and added readProcessWithExitCode, as discussed on the list.
I'd just like to argue briefly again against merging stdout and stderr.
I think I agree. A more sensible default would be to discard stderr completely or to use the inherited stderr. Though exactly which depends on the context, unfortunately. For Cabal we have a function like readProcess and it discards stderr. This seems to be the right thing for the various uses in Cabal (except stupid "python -V", which prints the version number to stderr). Duncan

On Fri, May 23, 2008 at 08:22:56PM +0100, Duncan Coutts wrote:
On Fri, 2008-05-16 at 15:44 +0100, Brian Brunswick wrote:
On Thu, May 15, 2008 at 10:45 AM, Simon Marlow
wrote: The latest version of the Process overhaul is here:
- changes to readProcess, and added readProcessWithExitCode, as discussed on the list.
I'd just like to argue briefly again against merging stdout and stderr.
I think I agree. A more sensible default would be to discard stderr completely or to use the inherited stderr. Though exactly which depends on the context unfortunately.
For Cabal we have a function like readProcess and it discards stderr. This seems to be the right thing for the various uses in Cabal (except stupid "python -V" which prints the version number to stderr).
I'd vote for inherited stderr instead, as otherwise it can be very hard for users (or developers!) to debug what goes wrong with an external command. -- David Roundy Department of Physics Oregon State University
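[Ed.: with the proposed createProcess, the inherited-stderr behaviour David votes for is easy to express. A sketch, with a hypothetical helper name, that captures the child's stdout while letting its stderr flow straight to the parent's stderr:]

```haskell
import System.IO (hGetContents)
import System.Process

-- capture stdout, let stderr go to the parent's stderr
-- (Inherit is already the default; it is spelled out here for clarity)
readStdoutInheritStderr :: FilePath -> [String] -> IO String
readStdoutInheritStderr cmd args = do
  (_, Just hout, _, ph) <-
    createProcess (proc cmd args) { std_out = CreatePipe, std_err = Inherit }
  out <- hGetContents hout
  _ <- length out `seq` waitForProcess ph   -- force the output before reaping
  return out

main :: IO ()
main = readStdoutInheritStderr "echo" ["ok"] >>= putStr
```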

Duncan Coutts wrote:
On Fri, 2008-05-16 at 15:44 +0100, Brian Brunswick wrote:
On Thu, May 15, 2008 at 10:45 AM, Simon Marlow
wrote: The latest version of the Process overhaul is here:
- changes to readProcess, and added readProcessWithExitCode, as discussed on the list.
I'd just like to argue briefly again against merging stdout and stderr.
I think I agree. A more sensible default would be to discard stderr completely or to use the inherited stderr. Though exactly which depends on the context unfortunately.
For Cabal we have a function like readProcess and it discards stderr. This seems to be the right thing for the various uses in Cabal (except stupid "python -V" which prints the version number to stderr).
Neil Mitchell and David Roundy argued against separating stdout and stderr for readProcessWithExitCode. Myself I don't have a strong opinion - although I'm somewhat swayed by Brian's points. It does seem slightly cleaner to separate the two in the default API. Any more opinions? Cheers, Simon

On 2008-05-27 09:22 +0100 (Tue), Simon Marlow wrote:
Neil Mitchell and David Roundy argued against separating stdout and stderr for readProcessWithExitCode. Myself I don't have a strong opinion - although I'm somewhat swayed by Brian's points. It does seem slightly cleaner to separate the two in the default API. Any more opinions?
One issue with merging them is, how do you merge them? Append one after the other?
Attempt to keep them in the order the stuff was printed? What happens
when operating system buffering gets in there?
I've seen enough messed-up merges of stdout and stderr that I'd
generally far prefer to keep them separate, myself.
cjs
--
Curt Sampson

Hi
Neil Mitchell and David Roundy argued against separating stdout and stderr for readProcessWithExitCode. Myself I don't have a strong opinion - although I'm somewhat swayed by Brian's points. It does seem slightly cleaner to separate the two in the default API. Any more opinions?
After reading Brian's arguments, I'm not convinced I was right. I still think it would be a shame if people just ignored the stderr, and blindly missed the errors which happened there, but I'm not overly convinced that interspersing them with stdout makes them any more visible. Thanks Neil

On 2008-05-27 13:25 +0100 (Tue), Neil Mitchell wrote:
I still think it would be a shame if people just ignored the stderr, and blindly missed the errors which happened there....
It certainly would be a shame, but is it so hard to miss? I'd think that
the type system would let you know in pretty strong terms that stderr is
coming back as well, and you'd have to ignore it on purpose.
cjs
--
Curt Sampson

On Thu, 2008-05-15 at 10:45 +0100, Simon Marlow wrote:
The latest version of the Process overhaul is here:
http://darcs.haskell.org/~simonmar/process/System-Process.html
We discussed some ideas on #ghc today and I'm posting them here so people can comment.

Deprecate:
* runCommand
* runProcess
* runInteractiveCommand
* runInteractiveProcess

The rationale is that these are all more limited than the new createProcess, and yet none of them are really convenient. It's better to have fewer variations if they can all be expressed easily using createProcess. It makes the API much simpler. Also we'd not add `system` or `rawSystem` to `System.Process` and instead add new equivalents with more consistent names.

That would leave just:
* createProcess
* readProcess
* readProcessWithExitCode

Then add the following:
* callProcess :: FilePath -> [String] -> IO ()
* callCommand :: String -> IO ()

These would be synchronous like the current system and rawSystem. The difference is they would throw IOErrors on failure rather than returning the ExitCode, which is so easily ignored. These are of course only convenience functions. If someone wants the exit code it should be easy to do it via createProcess and waitForProcess. We need to make sure that is indeed the case.

We'd also add async versions of the above:
* spawnProcess :: FilePath -> [String] -> IO ProcessHandle
* spawnCommand :: String -> IO ProcessHandle

that do not wait for the process to finish and return the ProcessHandle. Again these should be easy instances of createProcess. The docs should probably say as much so it's clear how to make minor variations.

We also discussed how it should be safe to GC the ProcessHandle and not end up with zombie processes on unix systems. On Windows it's actually easy because you can close the process handle, which means you don't get the exit status of the process. On unix we have to collect the exit status of every child process (actually you can ignore all of them, but you cannot ignore them selectively). The point is that with a convenient spawnProcess it's tempting to ignore the ProcessHandle result and never bother calling waitForProcess on it.
We do want to support that. At the moment doing that would leave zombie processes. We discussed a mechanism to allow GC'ing ProcessHandles that does not leave zombies. It'd probably involve keeping a Map PID (MVar ExitCode) and embedding a MVar ExitCode in the ProcessHandle. Then when we get notified that a child process terminated we would store the exit status in the MVar. Then waitForProcess would just wait on that MVar ExitCode. The one thing we have left that we cannot express with createProcess is the behaviour with respect to ^C handling. For some processes we want to delegate ^C handling to that child process (eg imagine calling ghci). For others we want to handle ^C 'normally'. For details see #2301: http://hackage.haskell.org/trac/ghc/ticket/2301 Duncan
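[Ed.: a sketch of the two proposed conveniences in use, assuming the signatures given above; callProcess raises an IOError on a non-zero exit instead of returning an ExitCode.]

```haskell
import Control.Exception (IOException, try)
import System.Process (callProcess, spawnProcess, waitForProcess)

main :: IO ()
main = do
  -- synchronous: a failing command surfaces as an exception,
  -- not as an easily-ignored ExitCode
  r <- try (callProcess "false" []) :: IO (Either IOException ())
  case r of
    Left e  -> putStrLn ("callProcess failed: " ++ show e)
    Right _ -> putStrLn "unexpectedly succeeded"

  -- asynchronous: get the ProcessHandle back and wait explicitly
  ph   <- spawnProcess "true" []
  code <- waitForProcess ph
  print code   -- prints ExitSuccess
```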

Following Duncan's suggestions below, I have a further round of changes to the process library for comment. The new Haddock docs are here: http://darcs.haskell.org/~simonmar/process/System-Process.html

Here is the patch message:

* More System.Process overhaul

New functions:
  callProcess  :: FilePath -> [String] -> IO ()
  callCommand  :: String -> IO ()
  spawnProcess :: FilePath -> [String] -> IO ProcessHandle
  spawnCommand :: String -> IO ProcessHandle

Changes:
- system and rawSystem have been removed from System.Process again (they were only there temporarily after the last round of changes; callCommand and callProcess now replace them respectively).

On Unix systems we now use SIGCHLD to detect process completion instead of calling waitpid(). This has several advantages:
- much cheaper: no extra OS threads to do the waiting
- doesn't require -threaded to get non-blocking waitForProcess
- waitForProcess can be interrupted
- no zombies left around (only relevant on Unix)

However, it relies on the new signal API (see separate proposal). And these advantages aren't available on Windows (yet).

Cheers, Simon

Duncan Coutts wrote:
On Thu, 2008-05-15 at 10:45 +0100, Simon Marlow wrote:
The latest version of the Process overhaul is here:
http://darcs.haskell.org/~simonmar/process/System-Process.html
We discussed some ideas on #ghc today and I'm posting them here so people can comment.
Deprecate: * runCommand * runProcess * runInteractiveCommand * runInteractiveProcess
The rationale is that these are all more limited than the new createProcess and yet none of them are really convenient. It's better to have fewer variations if they can all be expressed easily using createProcess. It makes the API much simpler.
Also we'd not add `system` or `rawSystem` to `System.Process` and instead add new equivalents with more consistent names.
That would leave just:
* createProcess * readProcess * readProcessWithExitCode
Then add the following:
* callProcess :: FilePath -> [String] -> IO () * callCommand :: String -> IO ()
These would be synchronous like the current system and rawSystem. The difference is they would throw IOErrors on failure rather than returning the ExitCode which is so easily ignored.
These are of course only convenience functions. If someone wants the exit code it should be easy to do it via createProcess and waitForProcess. We need to make sure that is indeed the case.
We'd also add async versions of the above:
* spawnProcess :: FilePath -> [String] -> IO ProcessHandle * spawnCommand :: String -> IO ProcessHandle
that do not wait for the process to finish and return the ProcessHandle. Again these should be easy instances of createProcess. The docs should probably say as much so it's clear how to make minor variations.
We also discussed how it should be safe to GC the ProcessHandle and not end up with zombie processes on unix systems. On Windows it's actually easy because you can close the process handle which means you don't get the exit status of the process. On unix we have to collect the exit status of every child process (actually you can ignore all of them, but you cannot ignore them selectively).
The point is that with a convenient spawnProcess it's tempting to ignore the ProcessHandle result and never bother calling waitForProcess on it. We do want to support that. At the moment doing that would leave zombie processes.
We discussed a mechanism to allow GC'ing ProcessHandles that does not leave zombies. It'd probably involve keeping a Map PID (MVar ExitCode) and embedding a MVar ExitCode in the ProcessHandle. Then when we get notified that a child process terminated we would store the exit status in the MVar. Then waitForProcess would just wait on that MVar ExitCode.
The one thing we have left that we cannot express with createProcess is the behaviour with respect to ^C handling. For some processes we want to delegate ^C handling to that child process (eg imagine calling ghci). For others we want to handle ^C 'normally'. For details see #2301: http://hackage.haskell.org/trac/ghc/ticket/2301
Duncan
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Hello, I think it's great to improve System.Process, and the new interface looks good... I just wanted to say that I found it quite easy to write my own wrappers around those functions, which implement an interface that I find easier to use, several years ago. The hardest part was not so much writing the new interface, but updating it whenever the standard interface changed and broke my code. So I'd like to cast my vote for backwards compatibility. The standard libraries will never be perfect, but constantly deprecating and removing functionality can really impair their usefulness for large projects. I don't mean to be negative but I wanted to voice that concern. Best wishes, Frederik Simon Marlow-7 wrote:
I've made some improvements to System.Process that I'd like to get feedback on. Everything so far is backwards compatible in the sense that I've only added to the API - everything that was there before is still available, with the same semantics (except where bugs have been fixed).
Haddock for the proposed new System.Process:
http://darcs.haskell.org/~simonmar/process/System-Process.html
Ticket:
http://hackage.haskell.org/trac/ghc/ticket/2233
Discussion period: 4 weeks (20 May)
Summary of changes:
Tue Apr 22 15:02:16 PDT 2008 Simon Marlow
* Overhaul System.Process - fix #1780: pipes created by runInteractiveProcess are set close-on-exec by default
- add a new, more general, form of process creation: createProcess Each of stdin, stdout and stderr may individually be taken from existing Handles or attached to new pipes. Also it has a nicer API.
- add readProcess from Don Stewart's newpopen package. This function behaves like C's popen().
- Move System.Cmd.{system,rawSystem} into System.Process. Later we can deprecate System.Cmd.
- Don't use O_NONBLOCK for pipes, as it can confuse the process attached to the pipe (requires a fix to GHC.Handle in the base package).
- move the tests from the GHC testsuite into the package itself, and add a couple more
- bump the version to 2.0
-- View this message in context: http://www.nabble.com/Proposal%3A-overhaul-System.Process-tp16844161p1746361... Sent from the Haskell - Libraries mailing list archive at Nabble.com.

On Sun, 2008-05-25 at 16:27 -0700, Frederik Eaton wrote:
Hello,
I think it's great to improve System.Process, and the new interface looks good... I just wanted to say that I found it quite easy to write my own wrappers around those functions, which implement an interface that I find easier to use, several years ago. The hardest part was not so much writing the new interface, but updating it whenever the standard interface changed and broke my code. So I'd like to cast my vote for backwards compatibility. The standard libraries will never be perfect, but constantly deprecating and removing functionality can really impair their usefulness for large projects. I don't mean to be negative but I wanted to voice that concern.
The proposal is to deprecate several functions but not to remove anything. The decision to remove anything is independent. On this issue, one thing that I think may help is to have a period where the deprecated functions still exist but are removed from the documentation. The rationale is that one of the main reasons for removing functions that are essentially duplicated by newer variants is that they clutter the API docs.

So the deprecation cycle would be something like:
1. normal
2. deprecated and still documented
3. deprecated and removed from documentation
4. removed

Another thing that would help is if haddock marked deprecated functions as such and kept them in a separate section of the index, at least in the synopsis. It should be easy for haddock 2.x to find deprecated functions, since it can just look at the pragmas, which are presumably preserved by the GHC API. Duncan
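[Ed.: on the compiler side the machinery for step 2 already exists - a DEPRECATED pragma makes GHC warn at every use site while the function keeps working. A sketch with a hypothetical alias; the names here are illustrative only.]

```haskell
import System.Process (ProcessHandle, spawnCommand, waitForProcess)

-- hypothetical backwards-compatibility alias; GHC prints the message
-- below at every call site, but the code still compiles and runs
{-# DEPRECATED runCommandOld "use spawnCommand from System.Process instead" #-}
runCommandOld :: String -> IO ProcessHandle
runCommandOld = spawnCommand

main :: IO ()
main = do
  ph   <- runCommandOld "true"   -- triggers the deprecation warning
  code <- waitForProcess ph
  print code                     -- prints ExitSuccess
```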

I'm sorry for my late reply. This all sounds reasonable. My only concern is that deprecating the old interfaces seems like a declaration of intent to eventually remove them. I believe that, to the extent possible, library development should start with simple low-level interfaces which expose OS-level functionality in a straightforward manner, and then build on top of them. I realise that running processes is an especially difficult area, because different operating systems have different interfaces, there is the possibility of pipes, and so forth. Perhaps this is why we have gone through multiple iterations of process-running APIs in the Haskell standard libraries. But I think that especially low-level interfaces like these should be created with the understanding that they will stay around forever, even if the cumulative result is slightly ugly. That can be an incentive to get them right the first time.

In my mind, if 'createProcess' could not be implemented via the old interfaces, then that means they weren't low-level enough to start with. And in that case, we should be working on providing better low-level interfaces, rather than new high-level ones (I think 'createProcess' is slightly higher-level than 'runProcess' because it has a notion of a shell, pipes, fancy data structures, etc.). On the other hand, if 'runProcess' and friends were sufficient to implement 'createProcess', then that means they provided a satisfactory low-level foundation. And I think low-level interfaces are built upon, not deprecated.

Duncan Coutts wrote:
The proposal is to deprecate several functions but not to remove anything. The decision to remove anything is independent.
On this issue, one thing that I think may help is to have a period where the deprecated functions still exist but are removed from the documentation. The rationale is that one of the main reasons for removing functions that are essentially duplicated by newer variants is that they clutter the API docs.
So the deprecation cycle would be something like:
1. normal
2. deprecated and still documented
3. deprecated and removed from documentation
4. removed
Another thing that would help is if haddock marked deprecated functions as such and kept them in a separate section of the index, at least in the synopsis. It should be easy for haddock 2.x to find deprecated functions since it can just look at the pragmas which are presumably preserved by the GHC api.

On Wed, Jun 18, 2008 at 03:37:00PM -0700, Frederik Eaton wrote:
I'm sorry for my late reply. This all sounds reasonable. My only concerns are that deprecating the old interfaces seems like a declaration of intent to eventually remove them. I believe that to the extent possible, library development should start with simple low-level interfaces which expose OS-level functionality in a straightforward manner, and then build on top of them. I realise that running processes is an especially difficult area, because different operating systems have different interfaces, there is the possibility of pipes and so forth. Perhaps this is why we have gone through multiple iterations of process-running APIs in the Haskell standard libraries. But I think that especially low-level interfaces like these should be created with the understanding that they will stay around forever, even if the cumulative result is slightly ugly. That can be an incentive to get them right the first time.
In my mind, if 'createProcess' could not be implemented via the old interfaces, then that means they weren't low-level enough to start with. And in that case, we should be working on providing better low-level interfaces, rather than new high-level ones (I think 'createProcess' is slightly higher-level than 'runProcess' because it has a notion of a shell, pipes, fancy data structures, etc.). On the other hand, if 'runProcess' and friends were sufficient to implement 'createProcess', then that means they provided a satisfactory low-level foundation. And I think low-level interfaces are built upon, not deprecated.
No, createProcess is more low-level than the previous interface, which is why it can't be implemented using the previous interfaces, but rather they can be implemented using createProcess. Which is why they can be deprecated, although I'd hope they won't be removed for at least a couple more ghc releases after createProcess is introduced. David

David Roundy-2 wrote:
No, createProcess is more low-level than the previous interface, which is why it can't be implemented using the previous interfaces, but rather they can be implemented using createProcess. Which is why they can be deprecated, although I'd hope they won't be removed for at least a couple more ghc releases after createProcess is introduced.
In what way is 'createProcess' more low-level than the previous interfaces? In any case, it still makes sense to me to keep the previous interfaces around indefinitely - even if they have to be redefined in terms of 'createProcess'. But 'createProcess' seems quite a bit more high-level than the others. It seems to combine a lot of different POSIX functions like 'exec' and 'system' and 'pipe' which could all be exposed individually through a simpler API. Frederik

On Wed, Jun 18, 2008 at 04:22:24PM -0700, Frederik Eaton wrote:
In what way is 'createProcess' more low-level than the previous interfaces? In any case, it still makes sense to me to keep the previous interfaces around indefinitely - even if they have to be redefined in terms of 'createProcess'. But 'createProcess' seems quite a bit more high-level than the others. It seems to combine a lot of different POSIX functions like 'exec' and 'system' and 'pipe' which could all be exposed individually through a simpler API.
I think another issue here is where the OS abstraction happens: opaquely in libraries, or on top of OS primitives themselves in haskell. Personally, I would like to see the OS abstraction happen in exposed haskell libraries: POSIX, Win32, SunOS, etc. libraries in haskell built on exposed haskell primitives, and things like 'createProcess' on top of them. Of course, the implementation of createProcess will have to conform to whatever is available on the current system (which is not as simple as posix vs windows; autoconf-esque feature checks will be needed) but I'd rather have that done in haskell than in C or opaquely in a library. This is more an opinion on implementation than standardization. I think it is fully fine to provide standard high level interfaces before the low level ones get worked out, as long as the low level ones aren't excluded in favor of just the high level interface. John -- John Meacham - ⑆repetae.net⑆john⑈

On Wed, Jun 18, 2008 at 04:22:24PM -0700, Frederik Eaton wrote:
David Roundy-2 wrote:
No, createProcess is more low-level than the previous interface, which is why it can't be implemented using the previous interfaces, but rather they can be implemented using createProcess. Which is why they can be deprecated, although I'd hope they won't be removed for at least a couple more ghc releases after createProcess is introduced.
In what way is 'createProcess' more low-level than the previous interfaces?
The easiest sense in which it's more low-level is that the previous two commands runProcess and runInteractiveProcess can both be implemented using createProcess, but not the other way around.

New API: (a subset thereof, actually)

    createProcess :: CreateProcess
                  -> IO (Maybe Handle, Maybe Handle, Maybe Handle, ProcessHandle)

    data CreateProcess = CreateProcess {
        cmdspec   :: CmdSpec,
        cwd       :: Maybe FilePath,
        env       :: Maybe [(String, String)],
        std_in    :: StdStream,
        std_out   :: StdStream,
        std_err   :: StdStream,
        close_fds :: Bool
      }

Old API:

    runProcess :: FilePath -> [String]
               -> Maybe FilePath -> Maybe [(String, String)]
               -> Maybe Handle -> Maybe Handle -> Maybe Handle
               -> IO ProcessHandle

    runInteractiveProcess :: FilePath -> [String]
                          -> Maybe FilePath -> Maybe [(String, String)]
                          -> IO (Handle, Handle, Handle, ProcessHandle)

In the old API, you are required to either use preexisting handles for each of stdin/stdout/stderr (possibly allowing the child process to inherit some or all of them), *or* create new pipes for all three.

You might consider trying to implement runInteractiveProcess using runProcess (you certainly can't do the opposite) by setting up pipes and passing them to runProcess, but there's no portable function that can be used to do this, so you'd be out of luck. You'd be stuck doing what darcs has long done: write to temporary files on disk, which is totally braindead.

createProcess is (so far as I can tell) the lowest-level *portable* Haskell API for spawning processes that has yet been proposed, in the sense that there are no other proposed functions that could be used to implement createProcess, and no other proposed functions that cannot be implemented using createProcess.

-- David Roundy Department of Physics Oregon State University
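David's claim - that the old functions are definable in terms of the new one - can be sketched directly against the proposed API. A sketch only; the helper name myRunInteractiveProcess is mine, and I assume the proposal's `proc` smart constructor and `CreatePipe` constructor:

```haskell
import System.IO
import System.Process

-- Sketch: the old runInteractiveProcess, expressed via createProcess.
-- All three standard streams become fresh pipes, as in the old function.
myRunInteractiveProcess
  :: FilePath -> [String]
  -> Maybe FilePath -> Maybe [(String, String)]
  -> IO (Handle, Handle, Handle, ProcessHandle)
myRunInteractiveProcess cmd args mbCwd mbEnv = do
  (Just inh, Just outh, Just errh, ph) <-
    createProcess (proc cmd args)
      { cwd     = mbCwd
      , env     = mbEnv
      , std_in  = CreatePipe
      , std_out = CreatePipe
      , std_err = CreatePipe
      }
  return (inh, outh, errh, ph)
```

The reverse direction fails for exactly the reason David gives: runProcess offers no way to hand it one newly created pipe end while inheriting the other streams.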

David Roundy-2 wrote:
On Wed, Jun 18, 2008 at 04:22:24PM -0700, Frederik Eaton wrote:
David Roundy-2 wrote:
No, createProcess is more low-level than the previous interface, which is why it can't be implemented using the previous interfaces, but rather they can be implemented using createProcess. Which is why they can be deprecated, although I'd hope they won't be removed for at least a couple more ghc releases after createProcess is introduced.
In what way is 'createProcess' more low-level than the previous interfaces?
The easiest sense in which it's more low-level is that the previous two commands runProcess and runInteractiveProcess can both be implemented using createProcess, but not the other way around.
New API: (a subset thereof, actually)
createProcess :: CreateProcess -> IO (Maybe Handle, Maybe Handle, Maybe Handle, ProcessHandle)
data CreateProcess = CreateProcess {
    cmdspec   :: CmdSpec,
    cwd       :: Maybe FilePath,
    env       :: Maybe [(String, String)],
    std_in    :: StdStream,
    std_out   :: StdStream,
    std_err   :: StdStream,
    close_fds :: Bool
  }
Old API:
runProcess :: FilePath -> [String] -> Maybe FilePath -> Maybe [(String, String)] -> Maybe Handle -> Maybe Handle -> Maybe Handle -> IO ProcessHandle
runInteractiveProcess :: FilePath -> [String] -> Maybe FilePath -> Maybe [(String, String)] -> IO (Handle, Handle, Handle, ProcessHandle)
In the old API, you are required to either use preexisting handles for each of stdin/stdout/stderr (possibly allowing the child process to inherit some or all of them), *or* create new pipes for all three.
You might consider trying to implement runInteractiveProcess using runProcess (you certainly can't do the opposite) by setting up pipes and passing them to runProcess, but there's no portable function that can be used to do this, so you'd be out of luck. You'd be stuck doing what darcs has long done: write to temporary files on disk, which is totally braindead.
createProcess is (so far as I can tell) the lowest-level *portable* Haskell API for spawning processes that has yet been proposed, in the sense that there are no other proposed functions that could be used to implement createProcess, and no other proposed functions that cannot be implemented using createProcess.
It sounds like we are confusing "low-level" with "powerful". I don't know enough about portability and the constraints imposed by portability to comment on what else could be done, but the fact that we keep changing interfaces for process execution suggests to me that the existing interfaces have not been low-level enough. For instance, why can't we create a pipe independently of a process? What about creating pipes to file descriptors other than standard input/output/error? If we had good Haskell interfaces to OS-level primitives, as John Meacham suggested, then we could build something that tries to be a "greatest common denominator" on top of these. I am skeptical that the result would look very much like 'createProcess', for instance doesn't Windows have a concept of file descriptors other than standard input/output/error? I suppose that I can always use System.Posix.Process, though, and hope that it doesn't change too much... Frederik

On Thu, Jun 19, 2008 at 04:20:52PM -0700, Frederik Eaton wrote:
David Roundy-2 wrote:
createProcess is (so far as I can tell) the lowest-level *portable* Haskell API for spawning processes that has yet been proposed, in the sense that there are no other proposed functions that could be used to implement createProcess, and no other proposed functions that cannot be implemented using createProcess.
It sounds like we are confusing "low-level" with "powerful". I don't know enough about portability and the constraints imposed by portability to comment on what else could be done, but the fact that we keep changing interfaces for process execution suggests to me that the existing interfaces have not been low-level enough. For instance, why can't we create a pipe independently of a process? What about creating pipes to file descriptors other than standard input/output/error? If we had good Haskell interfaces to OS-level primitives, as John Meacham suggested, then we could build something that tries to be a "greatest common denominator" on top of these. I am skeptical that the result would look very much like 'createProcess', for instance doesn't Windows have a concept of file descriptors other than standard input/output/error? I suppose that I can always use System.Posix.Process, though, and hope that it doesn't change too much...
The problem presumably is that there isn't sufficient similarity between systems to create smaller components which are themselves cross-platform and can be combined to write a function like createProcess. I don't know enough windows programming to confirm or deny whether it would be possible to create a cross-platform pipe-creation function, but I doubt it would be possible to create something with semantics sufficiently similar to posix pipes so as to be useful. There's certainly no guarantee that there exist "small" cross-platform pieces that can be used to construct a powerful process-spawning function. Perhaps Simon can enlighten us?

I just did a bit of a search and came across a bit of API from microsoft: http://msdn.microsoft.com/en-us/library/ms682425(VS.85).aspx

It looks vaguely similar to the posix approach: you create pipes, create backup copies of whichever of stdin/out/err you wish to change, set your stdin/out/err to the values you wish to pass to your child process, spawn the process, then you set stdin/out/err back to your saved values, and maybe close one end of the newly-created pipes (I'm not sure of this).

Under posix, the order is different, which probably means that you can't break this into smaller pieces. Under posix, you fork first, and only then do you modify stdin/out/err and close one end of any pipes you might have created, and then call exec. Since the order of operations is different, we can't write a portable version of Simon's createProcess out of smaller, prettier pieces. Obviously, we could write it out of functions like forkIfWeAreRunningPosix, etc., but that would be stupid.

That's my half-baked analysis, based on reading a couple of web pages... David
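The POSIX ordering David describes (create the pipe, fork, then rewire descriptors in the child and exec) can be sketched with the unix package's primitives. This is an illustration of the ordering only, not a proposed API; the helper name spawnWithStdoutPipe is mine:

```haskell
import System.IO
import System.Posix.IO (createPipe, dupTo, closeFd, stdOutput, fdToHandle)
import System.Posix.Process (forkProcess, executeFile)
import System.Posix.Types (Fd)

-- Sketch of the POSIX order: make the pipe first, fork, and only in the
-- child rewire stdout and call exec. On Windows the *parent* rewires its
-- own handles before CreateProcess, so the steps don't decompose the same way.
spawnWithStdoutPipe :: FilePath -> [String] -> IO Fd
spawnWithStdoutPipe cmd args = do
  (readEnd, writeEnd) <- createPipe
  _ <- forkProcess $ do
         _ <- dupTo writeEnd stdOutput      -- child: stdout now goes to the pipe
         closeFd readEnd
         executeFile cmd True args Nothing  -- True: search the PATH
  closeFd writeEnd                          -- parent keeps only the read end
  return readEnd
```

Note that nothing here is portable: createPipe, forkProcess and executeFile all live under System.Posix, which is exactly why a portable createProcess can't be assembled from pieces like these.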

On Wed, 2008-06-18 at 16:22 -0700, Frederik Eaton wrote:
In any case, it still makes sense to me to keep the previous interfaces around indefinitely - even if they have to be redefined in terms of 'createProcess'.
I wouldn't mind keeping them indefinitely (especially since they're just simple instances of createProcess) so long as they eventually get removed from the documentation (or at a very minimum, moved to a separate page where I don't have to look at them). I think that would also satisfy Curt and his point about the cruft in Java's libraries. Keeping stuff around so you don't break old programs is fine and great. Making the documentation large and the api very wide, so that people don't know which functions to pick, is actively bad. Duncan

On 2008-06-18 15:37 -0700 (Wed), Frederik Eaton wrote:
My only concerns are that deprecating the old interfaces seems like a declaration of intent to eventually remove them.
Just as one opinion, I prefer that old interfaces get removed. I have
no problem updating code to deal with this kind of stuff, and the cost
of that is easily paid off (for me) in the benefits I get from having
simple interfaces available. I found Java's libraries quite trying due
to the amount of cruft one was always wading through.
That said, I think it's a *very* good idea to keep old versions of
libraries and compilers available for download, for those folks that
want to run a program written for an older system and don't feel up to
tweaking the code to run on a newer one.
cjs
--
Curt Sampson

On Thu, Jun 19, 2008 at 12:57:58PM +0900, Curt Sampson wrote:
On 2008-06-18 15:37 -0700 (Wed), Frederik Eaton wrote:
My only concerns are that deprecating the old interfaces seems like a declaration of intent to eventually remove them.
Just as one opinion, I prefer that old interfaces get removed. I have no problem updating code to deal with this kind of stuff, and the cost of that is easily paid off (for me) in the benefits I get from having simple interfaces available. I found Java's libraries quite trying due to the amount of cruft one was always wading through.
Here's an idea: perhaps deprecated functions should be exported in a new module System.Process.Deprecated? Then they could be left in that module after they've been removed from System.Process, if they ever are removed. That would allow for a slightly easier transition.

I don't like the idea of forcing users of System.Process who are happy with the existing interface (e.g. darcs... which is not quite happy with it, but can get by) to either use #ifdefs (or something like that) or to limit their code to specific versions of ghc. Not too long ago, I was bothered by being unable to compile darcs on a system with only ghc 6.2. True, I could have compiled and installed a new ghc, but I didn't want to do so just to get a newer version of darcs, so I'm still using darcs 1.0.2 on that system (which I don't maintain, and is still running debian sarge).

Of course, moving the functions to a new module doesn't help *that* much, but it's easier to change the name of a module imported than to maintain two entirely different codebases for calling external programs (which is rather a fragile part of almost any code that will be using System.Process).

Perhaps a better option would be to keep the old interface indefinitely, but update haddock to allow deprecated interfaces to be documented only on a separate page. Then you needn't break anyone's code, and new users would also not need to wade through a half dozen ancient functions with weird naming conventions.

David
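David's suggestion amounts to a pure re-export module. A minimal sketch of what it could look like - the module name is his proposal and does not exist; the exact export list is my assumption:

```haskell
-- | Hypothetical compatibility module: the old entry points live on
-- here unchanged, so callers only have to change one import line,
-- not their process-handling code.
module System.Process.Deprecated
  ( runProcess
  , runInteractiveProcess
  , runCommand
  ) where

import System.Process
  ( runProcess, runInteractiveProcess, runCommand )
```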

David Roundy-2 wrote:
On Thu, Jun 19, 2008 at 12:57:58PM +0900, Curt Sampson wrote:
On 2008-06-18 15:37 -0700 (Wed), Frederik Eaton wrote:
My only concerns are that deprecating the old interfaces seems like a declaration of intent to eventually remove them.
Just as one opinion, I prefer that old interfaces get removed. I have no problem updating code to deal with this kind of stuff, and the cost of that is easily paid off (for me) in the benefits I get from having simple interfaces available. I found Java's libraries quite trying due to the amount of cruft one was always wading through.
Here's an idea: perhaps deprecated functions should be exported in a new module System.Process.Deprecated? Then they could be left in that module after they've been removed from System.Process, if they ever are removed. That would allow for a slightly easier transition. I don't like the idea of forcing users of System.Process who are happy with the existing interface (e.g. darcs... which is not quite happy with it, but can get by) to either use #ifdefs (or something like that) or to limit their code to specific versions of ghc. Not too long ago, I was bothered by being unable to compile darcs on a system with only ghc 6.2. True, I could have compiled and installed a new ghc, but I didn't want to do so just to get a newer version of darcs, so I'm still using darcs 1.0.2 on that system (which I don't maintain, and is still running debian sarge).
Of course, moving the functions to a new module doesn't help *that* much, but it's easier to change the name of a module imported than to maintain two entirely different codebases for calling external programs (which is rather a fragile part of almost any code that will be using System.Process).
Perhaps a better option would be to keep the old interface indefinitely, but update haddock to allow deprecated interfaces to be documented only on a separate page. Then you needn't break anyone's code, and new users would also not need to wade through a half dozen ancient functions with weird naming conventions.
What's wrong with Simon Marlow's proposed haddock documentation? http://darcs.haskell.org/~simonmar/process/System-Process.html It puts the new stuff at the top. The old stuff goes under "Specific variants of createProcess". The "Synopsis" section doesn't make this separation very clear, and that could be improved, but otherwise it seems good enough. Under 'runProcess' it says "Note: consider using the more general createProcess instead of runProcess.". Sure, Haddock could be updated to put certain interfaces on a separate page or under a collapsed tab. Certainly it would be better to do that than to remove the interfaces (or even put them under a different module name, which still leads to compatibility headaches), but who knows when somebody will have time to make such changes to Haddock. If people want to have a System.Process package with only createProcess and no runProcess, then I still think that the best way to do that would be to make a new package with a new name. That seems preferable to removing names from an existing package, because it doesn't require people to change their code. Frederik

On 2008-05-25 16:27 -0700 (Sun), Frederik Eaton wrote:
...but updating it whenever the standard interface changed and broke my code. So I'd like to cast my vote for backwards compatibility. The standard libraries will never be perfect, but constantly deprecating and removing functionality can really impair their usefulness for large projects.
Just as a counterpoint, I have to say that one of the things that
impressed me a lot about Haskell over the last two months as I've
started using it for real work is that the library interfaces are of
noticeably higher quality than other languages I've used (in the main, C,
Java and Ruby). Part of this may be due to having smarter people working
on things in the first place, but I suspect a reasonable amount is due
to the ability to change interfaces as one discovers how libraries
are really used and better ways to design them. I wouldn't want to
lose this, and end up with the kind of cruft that everybody knows is
broken but will never go away that exists in the Java and (to a lesser
degree) Ruby libraries. I'm willing to put up with a fair amount of
interface-change pain to this end.
cjs
--
Curt Sampson

Dear Curt,

Well, I thought the exact same thing when I started using Haskell. Then I found that nothing I wrote lasted for more than a year or so. I would put a lot of time into a project, it would be done, and then soon it wouldn't compile anymore. So whereas with other languages I had to deal with less slick interfaces, but got code that is still useful to me, many of the things I wrote in Haskell are bit-rotted.

This doesn't seem strictly necessary; for instance, I improved on the System.Process interface in my own libraries, and nobody had to change their code as a result. The problem is that when Simon Marlow improves on System.Process, everyone has to change their code. Why not release the new interface as a new library with a new name? Changing the standard is an easy way to point new users to whatever the current best interface is, but it is destructive as well: it defeats the purpose of "standard" by breaking old code. We could accomplish the same thing constructively with a web page that lists currently recommended libraries. Then, new users could use the new libraries, but code that depends on the old libraries would still work.

I want to write code that lasts for a long time. I don't want everything I write to become a maintenance hassle; I have found that I don't have time for that. It seems strange that the Haskell language, being "pure", doesn't allow destructive updates of data, yet the standard module interfaces are constantly subject to destructive updates, i.e. names change meaning or disappear, which can make it just as hard to reason about what my code is doing.

Frederik

Curt Sampson-2 wrote:
On 2008-05-25 16:27 -0700 (Sun), Frederik Eaton wrote:
...but updating it whenever the standard interface changed and broke my code. So I'd like to cast my vote for backwards compatibility. The standard libraries will never be perfect, but constantly deprecating and removing functionality can really impair their usefulness for large projects.
Just as a counterpoint, I have to say that one of the things that impressed me a lot about Haskell over the last two months as I've started using it for real work is that the library interfaces are of noticeably higher quality than other languages I've used (in the main, C, Java and Ruby). Part of this may be due to having smarter people working on things in the first place, but I suspect a reasonable amount is due to the ability to change interfaces as one discovers how libraries are really used and better ways to design them. I wouldn't want to lose this, and end up with the kind of cruft that everybody knows is broken but will never go away that exists in the Java and (to a lesser degree) Ruby libraries. I'm willing to put up with a fair amount of interface-change pain to this end.
cjs -- Curt Sampson
+81 90 7737 2974 Mobile sites and software consulting: http://www.starling-software.com _______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

By the way, I don't mean to suggest that I would ever have the time or motivation to do as much for Haskell as Simon has done, and clearly for the same reason he has a better perspective than I. I am not even sure if there are plans to ever remove the existing interfaces. I was just voicing a concern that came to my mind when I saw this thread. If it is not possible to address the concern, then I'm sorry for boring people with it. But it is something that worries me on occasion.

Best wishes,

Frederik

Frederik Eaton wrote:
Dear Curt,
Well, I thought the exact same thing when I started using Haskell. Then I found that nothing I wrote lasted for more than a year or so. I would put a lot of time into a project, it would be done, and then soon it wouldn't compile anymore. So whereas with other languages I had to deal with less slick interfaces, but got code that is still useful to me, many of the things I wrote in Haskell are bit-rotted. This doesn't seem strictly necessary, for instance, I improved on the System.Process interface in my own libraries, and nobody had to change their code as a result. The problem is that when Simon Marlow improves on System.Process, everyone has to change their code. Why not release the new interface as a new library with a new name? Changing the standard is an easy way to point new users to whatever the current best interface is, but it is destructive as well, it defeats the purpose of "standard" by breaking old code. We could accomplish the same thing constructively with a web page that lists currently recommended libraries. Then, new users could use the new libraries, but code that depends on the old libraries would still work. I want to write code that lasts for a long time. I don't want everything I write to become a maintenance hassle, I have found that I don't have time for that. It seems strange that the Haskell language, being "pure", doesn't allow destructive updates of data; but the standard module interfaces are being constantly subject to destructive updates, i.e. names change meaning or disappear, which can make it just as hard to reason about what my code is doing.
Frederik
Curt Sampson-2 wrote:
On 2008-05-25 16:27 -0700 (Sun), Frederik Eaton wrote:
...but updating it whenever the standard interface changed and broke my code. So I'd like to cast my vote for backwards compatibility. The standard libraries will never be perfect, but constantly deprecating and removing functionality can really impair their usefulness for large projects.
Just as a counterpoint, I have to say that one of the things that impressed me a lot about Haskell over the last two months as I've started using it for real work is that the library interfaces are of noticeably higher quality than other languages I've used (in the main, C, Java and Ruby). Part of this may be due to having smarter people working on things in the first place, but I suspect a reasonable amount is due to the ability to change interfaces as one discovers how libraries are really used and better ways to design them. I wouldn't want to lose this, and end up with the kind of cruft that everybody knows is broken but will never go away that exists in the Java and (to a lesser degree) Ruby libraries. I'm willing to put up with a fair amount of interface-change pain to this end.
cjs -- Curt Sampson
+81 90 7737 2974 Mobile sites and software consulting: http://www.starling-software.com

On Sun, 2008-05-25 at 17:16 -0700, Frederik Eaton wrote:
This doesn't seem strictly necessary, for instance, I improved on the System.Process interface in my own libraries, and nobody had to change their code as a result. The problem is that when Simon Marlow improves on System.Process, everyone has to change their code.
As I said before, you'll notice that Simon's improvements do not involve anyone having to change their code. New functions are being added and none are being removed. Duncan

Hello Curt, Monday, May 26, 2008, 3:44:30 AM, you wrote:
Just as a counterpoint, I have to say that one of the things that impressed me a lot about Haskell over the last two months as I've started using it for real work
after having used it for several years, you will be even more impressed by the need to fix all your programs, and the libraries they use, for every new ghc version released -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

after having used it for several years, you will be even more impressed by the need to fix all your programs, and the libraries they use, for every new ghc version released
Actually, this sort of thing doesn't bother me in the slightest; I'm
constantly fixing all my code to work with my own libraries, anyway, so
a little more work to deal with ghc library changes is no big deal.
I do recognise that I'm unusual in that I a) have extensive automated
tests for everything I write, and b) I do a very large amount of
refactoring, so I'm quite used to this sort of thing.
cjs
--
Curt Sampson

Hello Curt, Tuesday, May 27, 2008, 4:21:06 PM, you wrote:
after having used it for several years, you will be even more impressed by the need to fix all your programs, and the libraries they use, for every new ghc version released
Actually, this sort of thing doesn't bother me in the slightest; I'm constantly fixing all my code to work with my own libraries, anyway, so a little more work to deal with ghc library changes is no big deal.
I do recognise that I'm unusual in that I a) have extensive automated tests for everything I write, and b) I do a very large amount of refactoring, so I'm quite used to this sort of thing.
c) never used hackage or, even funnier, tried to keep your libs uploaded there upgraded for each ghc version d) never released open-source software -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

On 2008-05-27 16:25 +0400 (Tue), Bulat Ziganshin wrote:
I do recognise that I'm unusual in that...
c) never used hackage or, even funnier, tried to keep your libs uploaded there upgraded for each ghc version
I use hackage quite a bit, though admittedly it's only been with ghc 6.8.2. However, the code is open source; I don't imagine that updating hackage libraries would be any worse than many of the other library updates, bugfixes and so on I've had to do over the years. Again, I'm different; I keep source copies of most hackage libraries in my build framework for each application, and I can very easily keep local patches.
d) never released open-source software
You'd best stick to what you know about; I've been working on and
helping to release major pieces of open source software (e.g., NetBSD)
for fifteen years now, and have had plenty of packages for which I was
the sole or primary maintainer.
The basic issue here, I think, is that you're probably just using
inferior build systems, and so some things are harder for you than they
are for me. You probably won't believe that, because of course everybody
thinks that whatever they are currently doing is pretty much the best
possible way there is to do it.
cjs
--
Curt Sampson

Hello Curt, Tuesday, May 27, 2008, 4:36:29 PM, you wrote:
I use hackage quite a bit, though admittedly it's only been with gcc 6.8.2. However, the code is open source; I don't imagine that updating hackage libraries would be any worse than many of the other library updates, bugfixes and so on I've had to do over the years.
but it would be better to see things just continue to work with every new ghc version, instead of having to modify all the libs
Again, I'm different; I keep source copies of most hackage libraries in my build framework for each application, and I can very easily keep local patches.
yes, i'm doing the same, but i don't think it's a good way
d) never released open-source software
You'd best stick to what you know about; I've been working on and helping to release major pieces of open source software (e.g., NetBSD) for fifteen years now, and have had plenty of packages for which I was the sole or primary maintainer.
The basic issue here, I think, is that you're probably just using inferior build systems, and so some things are harder for you than they are for me. You probably won't believe that, because of course everybody thinks that whatever they are currently doing is pretty much the best possible way there is to do it.
well, just imagine how it would work if the ghc libraries were changed every year :) does "superior build system" mean that you will keep 15 copies of every library, one per year? -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com

by the need to fix all your programs and libraries with every new ghc version released
Actually, this sort of thing doesn't bother me in the slightest [...]
Whoa. a) I want my code to "just work", and remain working. b) I want to maintain as little code as possible (I'm not paid for coding, etc.), so it is essential to have external libraries that "just work" (see above). best regards, J. W.

On 2008-05-27 14:40 +0200 (Tue), Johannes Waldmann wrote:
b) I want to maintain as little code as possible (I'm not paid for coding, etc.),
I understand. On the other hand, I am paid for coding, and being a business
owner, I'm paid by what I produce, not by how much time I spend.
Bad interfaces cost me money. It's generally cheaper for me to change to
good interfaces than it is to stick with the bad ones.
I'm not saying that my word is the be-all and end-all of this, by the
way, but just trying to show you how the other half lives.
Anyway, my opinion is well known (if not well understood) by this point,
so I won't respond to any more of these. Just keep in mind that I have
analyzed these kinds of issues in detail, lived with them for years, and
have done it both ways, so I'm not talking from inexperience here.
cjs
--
Curt Sampson

Hello Curt, Tuesday, May 27, 2008, 4:43:48 PM, you wrote:
Bad interfaces cost me money. It's generally cheaper for me to change to good interfaces than it is to stick with the bad ones.
you don't consider the third way - using library versioning to control interface changes. search haskell.org for "package versioning policy". ideally, the base library should never change, because you cannot use multiple versions of base with one ghc version, and changes in everything else should be controlled via the PVP. btw, what business solutions do you develop with Haskell? (if it's not top secret :D) i agree that for closed-source, money-making software, updating to new versions of interfaces is much less of a problem - you can just skip upgrading to a new ghc version; i personally still use 6.6. but when you come to the open-source, free-software world, this means more problems and fewer people willing to keep things working. for example, i've dropped support for the libraries i once published and keep updating only the code of my own program. ultimately, this means fewer libs on hackage and therefore fewer opportunities to use haskell in your business -- Best regards, Bulat mailto:Bulat.Ziganshin@gmail.com
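For readers unfamiliar with the PVP being referred to: the idea is that a library bumps its major version (the first two components, A.B) on any interface change, so a dependent package can shield itself by declaring major-version upper bounds in its .cabal file. A minimal sketch of such bounds (the specific version numbers here are illustrative, not prescriptive):

```
-- In the dependent package's .cabal file:
build-depends: base       >= 3.0 && < 4,
               bytestring >= 0.9 && < 0.10,
               process    >= 1.0 && < 1.1
```

With bounds like these, a new major release of process (say 1.1) will simply not be selected for this package, rather than silently breaking its build.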

On 2008-05-27 16:55 +0400 (Tue), Bulat Ziganshin wrote:
you don't consider third way - use library versioning to control interface changes.
That would be lovely. When you've implemented it, let us know. However, given that we don't have this at the moment, we have to choose between the two extremes of never changing or removing library functions, and going hog-wild with changes.
ideally, the base library should never change because you cannot use multiple versions of the base library with one ghc version...
Ouch. ByteString, which I am quite heavily dependent on, is nice, but hardly perfect.
btw, what a business solutions you develop with a Haskell? (if it's not top secret :D)
I'm building an automatic options trading system. It eats a data feed, builds a mathematical model of the market, and places orders in an attempt to make money.
ultimately, this means less libs on hackage and therefore less opportunities to use haskell in your business
Not really. I'd rather have a few good libraries than a large quantity
of them. Haskell's libraries are far better than Java's, IMHO. And a big
reason for that is the willingness to change them, rather than wanting
to freeze them.
cjs
--
Curt Sampson
participants (13)
- Brian Brunswick
- Bryan O'Sullivan
- Bulat Ziganshin
- Curt Sampson
- David Roundy
- Don Stewart
- Duncan Coutts
- Frederik Eaton
- Johannes Waldmann
- John Meacham
- Jules Bean
- Neil Mitchell
- Simon Marlow