process: Confusion about UseHandle handles being closed.

I originally opened this as a Github issue[1] about this, but due to lack of attention, Herbert recommended that I send it to the libraries list as well. Here's the issue description copied verbatim from Github: When working on a library, I was surprised to find that the `Handle`s that I passed in for `std_in`, `std_out` and `std_err` via `UseHandle` were automatically closed. This is not clear from the documentation, and- at least for the use case I was interested in- the opposite of what I needed. There are valid cases where we'd want the `Handle` to remain open after the process runs to completion. The function `createProcess_` in the `.Internals` module has the behavior I was looking for, and for my purpose, I can simply import from there. I'd like to propose two changes: 1. Add clear documentation to `createProcess` indicating that it will close the `Handle` automatically. 2. Add a new function to be exported from `System.Process` with the semantics of `createProcess_`. I'm open to bikeshedding on the name, but perhaps sticking with `createProcess_` makes the most sense. Note that I do *not* think we should change the existing semantics of `createProcess`: I think it's a large breaking change, and should be avoided. I'm happy to provide pull requests for both of these, I just wanted to check if there was objection before going ahead with it. Michael [1] https://github.com/haskell/process/issues/2

On Tue, 25 Nov 2014, Michael Snoyman wrote:
1. Add clear documentation to `createProcess` indicating that it will close the `Handle` automatically. 2. Add a new function to be exported from `System.Process` with the semantics of `createProcess_`. I'm open to bikeshedding on the name, but perhaps sticking with `createProcess_` makes the most sense.
I think the cleanest solution would be to add a new field (e.g. leaveOpen) to the CreateProcess record. (Btw. why are the other fields named with underscores?) Otherwise, every new switch to createProcess multiplies the number of its variants. Users who create the CreateProcess structure using 'shell' and 'proc' won't need to adapt their code.

On 25-11-2014 08:39, Henning Thielemann wrote:
On Tue, 25 Nov 2014, Michael Snoyman wrote:
1. Add clear documentation to `createProcess` indicating that it will close the `Handle` automatically. 2. Add a new function to be exported from `System.Process` with the semantics of `createProcess_`. I'm open to bikeshedding on the name, but perhaps sticking with `createProcess_` makes the most sense.
I think the cleanest solution would be to add a new field (e.g. leaveOpen) to the CreateProcess record. (Btw. why are the other fields named with underscores?) Otherwise, every new switch to createProcess multiplies the number of its variants. Users who create the CreateProcess structure using 'shell' and 'proc' won't need to adapt their code.
This would require a major version bump to the library. It may not be worthy for such a small change. Cheers, -- Felipe.

On Tue, 25 Nov 2014, Felipe Lessa wrote:
On 25-11-2014 08:39, Henning Thielemann wrote:
I think the cleanest solution would be to add a new field (e.g. leaveOpen) to the CreateProcess record. (Btw. why are the other fields named with underscores?) Otherwise, every new switch to createProcess multiplies the number of its variants. Users who create the CreateProcess structure using 'shell' and 'proc' won't need to adapt their code.
This would require a major version bump to the library. It may not be worthy for such a small change.
The small change "adding createProcess_" is the start of a messy path. The future proof solution would be to make the CreateProcess record opaque and provide setter functions for it. I don't think there is a need to read fields of CreateProcess. This change would also justify a major version bump. In a first step, the 'process' package could only provide new setter functions (without underscores).

On Tue Nov 25 2014 at 9:33:19 PM Henning Thielemann < lemming@henning-thielemann.de> wrote:
On Tue, 25 Nov 2014, Felipe Lessa wrote:
On 25-11-2014 08:39, Henning Thielemann wrote:
I think the cleanest solution would be to add a new field (e.g. leaveOpen) to the CreateProcess record. (Btw. why are the other fields named with underscores?) Otherwise, every new switch to createProcess multiplies the number of its variants. Users who create the CreateProcess structure using 'shell' and 'proc' won't need to adapt their code.
This would require a major version bump to the library. It may not be worthy for such a small change.
The small change "adding createProcess_" is the start of a messy path.
The future proof solution would be to make the CreateProcess record opaque and provide setter functions for it. I don't think there is a need to read fields of CreateProcess. This change would also justify a major version bump. In a first step, the 'process' package could only provide new setter functions (without underscores).
I read this as saying createProcess_ is not a good long-term solution, and instead we should strive for the most ideal long-term goal. The problem with this is that it makes it very difficult to have a clean migration path. Every time we have a breaking change- especially in a package like process which cannot be easily upgraded[1]- it makes it incredibly difficult to maintain software against multiple versions. So I'd argue that: 1. If we want to have a discussion about radically changing the API, let's have that discussion, but include a very clear set of plans for migration. Henning: have you given that any thought? 2. Let's not let such a discussion derail the possibility of a one-line patch which is very much backwards-compatible and simply avoids the need for people to reach for the .Internal module. Michael [1] https://www.fpcomplete.com/blog/2014/05/lenient-lower-bounds

FWIW- I have no problem with exporting anything you like from the
`.Internals` module, especially if it is being exported form there rather
than System.Process.
Ultimately, that is what such modules are for, they can let you violate
overly enthusiastic attempts at encapsulation until a better design can be
locked in.
So, +1.
I also have no problem with having a longer term discussion about a better
design for the main System.Process API, but I'd not let that block the two
line patch that just fixes the problem for now.
-Edward
On Wed, Nov 26, 2014 at 12:26 AM, Michael Snoyman
On Tue Nov 25 2014 at 9:33:19 PM Henning Thielemann < lemming@henning-thielemann.de> wrote:
On Tue, 25 Nov 2014, Felipe Lessa wrote:
On 25-11-2014 08:39, Henning Thielemann wrote:
I think the cleanest solution would be to add a new field (e.g. leaveOpen) to the CreateProcess record. (Btw. why are the other fields named with underscores?) Otherwise, every new switch to createProcess multiplies the number of its variants. Users who create the CreateProcess structure using 'shell' and 'proc' won't need to adapt their code.
This would require a major version bump to the library. It may not be worthy for such a small change.
The small change "adding createProcess_" is the start of a messy path.
The future proof solution would be to make the CreateProcess record opaque and provide setter functions for it. I don't think there is a need to read fields of CreateProcess. This change would also justify a major version bump. In a first step, the 'process' package could only provide new setter functions (without underscores).
I read this as saying createProcess_ is not a good long-term solution, and instead we should strive for the most ideal long-term goal. The problem with this is that it makes it very difficult to have a clean migration path. Every time we have a breaking change- especially in a package like process which cannot be easily upgraded[1]- it makes it incredibly difficult to maintain software against multiple versions.
So I'd argue that:
1. If we want to have a discussion about radically changing the API, let's have that discussion, but include a very clear set of plans for migration. Henning: have you given that any thought? 2. Let's not let such a discussion derail the possibility of a one-line patch which is very much backwards-compatible and simply avoids the need for people to reach for the .Internal module.
Michael
[1] https://www.fpcomplete.com/blog/2014/05/lenient-lower-bounds
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

There seems to be some confusion here. Edward seems to be saying that it should only be exported from Internals: On 26/11/2014 06:58, Edward Kmett wrote:
FWIW- I have no problem with exporting anything you like from the `.Internals` module, especially if it is being exported form there rather than System.Process.
Ultimately, that is what such modules are for, they can let you violate overly enthusiastic attempts at encapsulation until a better design can be locked in.
So, +1.
Whereas Michael seems to be saying:
2. Let's not let such a discussion derail the possibility of a one-line patch which is very much backwards-compatible and simply avoids the need for people to reach for the .Internal module.
And his patch is exactly about exporting this from System.Process, not from System.Process.Internals. Am I missing something? Ganesh

I don't have a particularly strong feeling here.
I confess I _was_ trying to subtly slide in the condition about where the
export-should happen from. Exporting it from .Internals is probably the
right direction in an idealized world from the standpoint of minimizing the
exposure of users to the change and making it easier to update the API with
a more sweeping design change in the future.
On the other hand, if that leads to a baroquely complicated patch because
of, say, .Internals exporting other stuff that is used elsewhere in the
package I'm more than happy to bend in the name of pragmatism.
-Edward
On Fri, Nov 28, 2014 at 7:48 AM, Ganesh Sittampalam
There seems to be some confusion here.
Edward seems to be saying that it should only be exported from Internals:
On 26/11/2014 06:58, Edward Kmett wrote:
FWIW- I have no problem with exporting anything you like from the `.Internals` module, especially if it is being exported form there rather than System.Process.
Ultimately, that is what such modules are for, they can let you violate overly enthusiastic attempts at encapsulation until a better design can be locked in.
So, +1.
Whereas Michael seems to be saying:
2. Let's not let such a discussion derail the possibility of a one-line patch which is very much backwards-compatible and simply avoids the need for people to reach for the .Internal module.
And his patch is exactly about exporting this from System.Process, not from System.Process.Internals.
Am I missing something?
Ganesh

On Wed, 26 Nov 2014, Michael Snoyman wrote:
So I'd argue that:
1. If we want to have a discussion about radically changing the API, let's have that discussion, but include a very clear set of plans for migration. Henning: have you given that any thought?
As I have written, a first step into the right direction could be setter functions for CreateProcess. Another step could be to export createProcess_, but give a warning in the documentation, that it is only a temporary solution. Alternatively you could add CreateProcess_, an opaque equivalent of CreateProcess, according setter functions, and createProcess_ which accepts CreateProcess_ as input. Or you generalize createProcess using a type class, such that it both accepts CreateProcess and CreateProcess_ as input. However this might make CreateProcess_ a long term solution. Alternatively, you could add a new module that provides an opaque CreateProcess record with setter functions and an according createProcess function.

In order to avoid confusion with the short-term discussion, I've renamed the subject. On Wed Nov 26 2014 at 10:16:09 AM Henning Thielemann < lemming@henning-thielemann.de> wrote:
On Wed, 26 Nov 2014, Michael Snoyman wrote:
So I'd argue that:
1. If we want to have a discussion about radically changing the API, let's have that discussion, but include a very clear set of plans for migration. Henning: have you given that any thought?
As I have written, a first step into the right direction could be setter functions for CreateProcess. Another step could be to export createProcess_, but give a warning in the documentation, that it is only a temporary solution. Alternatively you could add CreateProcess_, an opaque equivalent of CreateProcess, according setter functions, and createProcess_ which accepts CreateProcess_ as input. Or you generalize createProcess using a type class, such that it both accepts CreateProcess and CreateProcess_ as input. However this might make CreateProcess_ a long term solution. Alternatively, you could add a new module that provides an opaque CreateProcess record with setter functions and an according createProcess function.
These are good proposals. I think I most like the idea of an alternate module, as it has a number of benefits: 1. We can immediately experiment with it in a separate package that has a release cycle detached from process itself. 2. There's a clear migration path: at some point we deprecate the current module in favor of the new module, and eventually replace the current module with the new one. 3. We can do a fair amount of this in a very backwards-compatible manner. One thing that I think it unclear is what the *goal* of the new module is. I've seen three things mentioned: 1. Making the CreateProcess an abstract type/moving to setters. 2. Changing the behavior of createProcess to not automatically close Handles (controlled by a leaveOpen field/setter). 3. Replace the underscore naming scheme with camel case. I just want to make sure we're explicit about the goals we have here. Given that we take things slowly enough to avoid migration headaches, I'm fully in favor of (1), and if we're already going to be breaking the API with new setters, I'm also behind (3). I'm tentatively in favor of (2) as I think the current default is broken, but I'm concerned about introducing silent breakage for people expecting the current behavior. That said, if we're changing the API anyway, there likely won't be anything silent occurring. Michael

On Wed, 26 Nov 2014, Michael Snoyman wrote:
[1] https://www.fpcomplete.com/blog/2014/05/lenient-lower-bounds
Btw. although adding a new record field to CreateProcess is a breaking change, it would be possible to write code that works with both process-1.0 to 1.2 and newer versions: You just have to use the smart constructors 'proc' and 'shell'. I think, using them is the prefered way, anyway.

Henning Thielemann wrote:
On Tue, 25 Nov 2014, Michael Snoyman wrote:
1. Add clear documentation to `createProcess` indicating that it will close the `Handle` automatically. 2. Add a new function to be exported from `System.Process` with the semantics of `createProcess_`. I'm open to bikeshedding on the name, but perhaps sticking with `createProcess_` makes the most sense.
I think the cleanest solution would be to add a new field (e.g. leaveOpen) to the CreateProcess record.
Wouldn't it be more natural to add a new constructor to the StdStream type? Naming it could be tricky, though. ShareHandle, perhaps? Cheers, Bertram

On Sat, 29 Nov 2014, Bertram Felgenhauer wrote:
Henning Thielemann wrote:
On Tue, 25 Nov 2014, Michael Snoyman wrote:
1. Add clear documentation to `createProcess` indicating that it will close the `Handle` automatically. 2. Add a new function to be exported from `System.Process` with the semantics of `createProcess_`. I'm open to bikeshedding on the name, but perhaps sticking with `createProcess_` makes the most sense.
I think the cleanest solution would be to add a new field (e.g. leaveOpen) to the CreateProcess record.
Wouldn't it be more natural to add a new constructor to the StdStream type?
Yes, this would allow to choose a closing procedure per stream. (Nonetheless, a smart setter function 'leaveOpen' could turn all StdStream fields to ShareHandle.)

On Tue, Nov 25, 2014 at 5:27 AM, Michael Snoyman
When working on a library, I was surprised to find that the `Handle`s that I passed in for `std_in`, `std_out` and `std_err` via `UseHandle` were automatically closed. This is not clear from the documentation, and- at least for the use case I was interested in- the opposite of what I needed.
Can you explain that use case? It's usually an error to try to use the file descriptor in the parent as well as the child, because there's no way to prevent interleaved I/O at the OS level. -- brandon s allbery kf8nh sine nomine associates allbery.b@gmail.com ballbery@sinenomine.net unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

On Tue Nov 25 2014 at 3:20:27 PM Brandon Allbery
On Tue, Nov 25, 2014 at 5:27 AM, Michael Snoyman
wrote: When working on a library, I was surprised to find that the `Handle`s that I passed in for `std_in`, `std_out` and `std_err` via `UseHandle` were automatically closed. This is not clear from the documentation, and- at least for the use case I was interested in- the opposite of what I needed.
Can you explain that use case? It's usually an error to try to use the file descriptor in the parent as well as the child, because there's no way to prevent interleaved I/O at the OS level.
I don't remember the exact situation I was in when I wrote that issue, but one situation would be opening a log file in the parent process and then running multiple child processes sequentially, redirecting all of their output to the same Handle. There is always one process acting on the file descriptor, so interleaving isn't a problem. Michael

On Tue Nov 25 2014 at 12:27:26 PM Michael Snoyman
I originally opened this as a Github issue[1] about this, but due to lack of attention, Herbert recommended that I send it to the libraries list as well. Here's the issue description copied verbatim from Github:
When working on a library, I was surprised to find that the `Handle`s that I passed in for `std_in`, `std_out` and `std_err` via `UseHandle` were automatically closed. This is not clear from the documentation, and- at least for the use case I was interested in- the opposite of what I needed. There are valid cases where we'd want the `Handle` to remain open after the process runs to completion.
The function `createProcess_` in the `.Internals` module has the behavior I was looking for, and for my purpose, I can simply import from there. I'd like to propose two changes:
1. Add clear documentation to `createProcess` indicating that it will close the `Handle` automatically. 2. Add a new function to be exported from `System.Process` with the semantics of `createProcess_`. I'm open to bikeshedding on the name, but perhaps sticking with `createProcess_` makes the most sense.
Note that I do *not* think we should change the existing semantics of `createProcess`: I think it's a large breaking change, and should be avoided.
I'm happy to provide pull requests for both of these, I just wanted to check if there was objection before going ahead with it.
Michael
Since it seems that there's no objection to exposing createProcess_ as a short term solution, I've put together a pull request[1]. The only code change is exposing createProcess_; all other changes are documentation. Michael [1] https://github.com/haskell/process/pull/12
participants (7)
-
Bertram Felgenhauer
-
Brandon Allbery
-
Edward Kmett
-
Felipe Lessa
-
Ganesh Sittampalam
-
Henning Thielemann
-
Michael Snoyman