Proposal: split Chan and TChan into read and write ends

Hi everybody,

Just discovered all the great discussion here, and although my google-fu turned up no prior discussions on this subject, I apologize if it has already been discussed.

I propose that Chan and TChan should be implemented as a pair of read and write ends, initialized as follows:

    newSplitChan :: IO (InChan a, OutChan a)

I've implemented this already here: http://hackage.haskell.org/package/chan-split . You can ignore the type classes I've defined; they're there for my own reasons and not part of the proposal.

Here is my best defense:

1) My own (and I assume others') use of Chans involves some bits of code which do only reads, and others which do only writes; a split implementation lets us use the type-checker to allocate read or write "permissions" when we pass around either end, and generally makes things easier to reason about. Others have independently reached this conclusion as well.

2) The API is simpler (and in the case of TChans *much* simpler) with a split approach; some examples: in TChan the types of the 'duplicate' functions actually suggest (IMHO) the details of how they treat existing messages, and require a less roundabout explanation

    dupTChan :: InTChan a -> STM (OutTChan a)
    cloneTChan :: OutTChan a -> STM (OutTChan a)

another example, 'newBroadcastTChan' is actually just:

    -- | Create a new write end of a TChan. Use dupTChan to get an OutChan that values can be read from.
    newInTChan :: STM (InTChan a)

and doesn't require any explanation beyond that.

3) It's trivial to modify the *Chan libs in this way with a few edits (really, they're already implemented this way, but with some added contortions to return both a read and write end for each operation), so there're no new tricky concurrency issues to reason about.

4) Values written to the write end can be reliably GC'd when readers go away, although recent GHC seems to be able to do this on the current implementation with -O2 on my simple tests.

5) While this is a big API change, I imagine the vast majority of users would only have to change a few type signatures and the Chan initialization action. An `OldChan` module could easily be implemented in terms of the new split version. Alternatively, a *Chan.Split module could be added, and the current Chan module defined in terms of it.

So two weeks of discussion?

Thanks all,
Brandon
http://brandon.si
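To make point 1 concrete, here is a minimal sketch of the "permissions" idea. It is not the chan-split implementation: it assumes the two ends are plain newtype wrappers around the existing Control.Concurrent.Chan, and the `producer` and `consumer` functions are hypothetical names used only for illustration.

```haskell
import Control.Concurrent (forkIO)
import qualified Control.Concurrent.Chan as C
import Control.Monad (replicateM)

-- Assumption: each end simply wraps the same underlying base Chan.
newtype InChan a  = InChan  (C.Chan a)   -- write end
newtype OutChan a = OutChan (C.Chan a)   -- read end

newSplitChan :: IO (InChan a, OutChan a)
newSplitChan = do
  c <- C.newChan
  return (InChan c, OutChan c)

writeChan :: InChan a -> a -> IO ()
writeChan (InChan c) = C.writeChan c

readChan :: OutChan a -> IO a
readChan (OutChan c) = C.readChan c

-- Hypothetical producer: its type says it can only write.
producer :: InChan Int -> IO ()
producer inC = mapM_ (writeChan inC) [1, 2, 3]

-- Hypothetical consumer: its type says it can only read.
consumer :: OutChan Int -> IO [Int]
consumer outC = replicateM 3 (readChan outC)

main :: IO ()
main = do
  (inC, outC) <- newSplitChan
  _ <- forkIO (producer inC)
  xs <- consumer outC
  print xs
```

Calling `readChan inC` or `writeChan outC 0` in this sketch is rejected by the type-checker, which is the property the proposal is after.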

I can't speak to the implications of changing the API, but the concept of separating the reader and writer ends of a channel makes absolute sense. So at the very least, this would be useful as a separate package.

On a side note - I used to code in an old concurrent language where even variables were split into reader and writer ends (at the lowest level, not as a library). That is, each use of "x" was either "the reader of x" or "the writer of x". The compiler automatically inferred which one was used, but manual annotation was possible. The compiler also enforced that only a single occurrence of "the writer of x" existed at any given time in the system.

This allowed doing things that are difficult to do in Haskell right now, without dropping into IO for IVars, such as passing around a deep structure with writers at the leaves to a function that would fill them with values, establishing a trivial two-way communication between multiple threads, and difference lists rather than lists being the most common collection (implemented as a simple pair of a Reader [ T ] and a Writer [ T ]).

Extensive use of difference lists turned out to be very useful - in naive code list concatenation was O(1) in many more cases than in naive Haskell code, and there are all sorts of concurrency control algorithms that can be trivially implemented by them (e.g., detecting/enforcing when a lazy evaluation of multiple threads is done, passing exceptions in a side channel to the real result, etc.).

It is a different approach than the one taken by lazy functional languages, and it has its up-sides; I miss its clarity of expressing the code intent when I am dealing with parallel lazy Haskell code. It would be "very nice" if there were a pure version of IVars that basically allowed direct lower-level access to the lazy evaluation mechanism.
Of course, the language I used had other issues (this was the early 90s, and a lot of work has been done since then on type systems and type inference, efficient implementation, etc.). It was a research language that died with the 5th generation project :-(

At any rate, +1 for separate reader and writer endpoints for channels, at least for those who need it.

Oren Ben-Kiki
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

On Fri, 26 Oct 2012, Brandon Simmons wrote:
Hi everybody, Just discovered all the great discussion here, and although my google-fu turned up no prior discussions on this subject, I apologize if it has already been discussed.
I propose that Chan and TChan should be implemented as a pair of read and write ends, initialized as follows:
newSplitChan :: IO (InChan a, OutChan a)
I've implemented this already here: http://hackage.haskell.org/package/chan-split . You can ignore the type classes I've defined; they're there for my own reasons and not part of the proposal.
I have implemented something similar in plain Haskell 98: http://hackage.haskell.org/package/concurrent-split

When using it I found that often there is a clear distinction between the input and output end of a channel, but sometimes there is not. E.g. a thread might send messages to itself. Of course I could pass around both channel ends in this case.

I don't think it is worth breaking the API. I am happy with separate packages. If the implementors of Control.Concurrent think that the implementations become cleaner then they might add new modules. This would allow for a smooth transition.

On Sat, Oct 27, 2012 at 4:23 AM, Brandon Simmons
I've implemented this already here: http://hackage.haskell.org/package/chan-split . You can ignore the type classes I've defined; they're there for my own reasons and not part of the proposal.
So what's the complete API that we're discussing here? I think the /idea/ is great, but it doesn't feel like a concrete proposal until we have a list of type signatures, so we can discuss the details like whether or not `dupChan` is a good name for that operation (my instinct is that it sounds pretty much the same as `cloneChan`, and it's going to be hard to remember the difference). I think my preferred signature for `newBroadcastTChan` would be `STM (InTChan a, STM (OutTChan a))` or something similar. Then `dupChan` might be unnecessary.
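As a rough sketch of the signature suggested above, the following assumes the split ends are newtype wrappers over the TChan from the stm package. The names `newBroadcastSplit`, `writeIn`, and `readOut` are hypothetical and only serve the illustration; `newBroadcastTChan` and `dupTChan` are the existing stm functions.

```haskell
import Control.Concurrent.STM

-- Assumption: both ends wrap an ordinary stm TChan.
newtype InTChan a  = InTChan  (TChan a)   -- write end
newtype OutTChan a = OutTChan (TChan a)   -- read end

-- Returns the write end together with an action that creates
-- a fresh read end each time it is run.
newBroadcastSplit :: STM (InTChan a, STM (OutTChan a))
newBroadcastSplit = do
  c <- newBroadcastTChan
  return (InTChan c, OutTChan <$> dupTChan c)

writeIn :: InTChan a -> a -> STM ()
writeIn (InTChan c) = writeTChan c

readOut :: OutTChan a -> STM a
readOut (OutTChan c) = readTChan c

main :: IO ()
main = do
  (inC, newReader) <- atomically newBroadcastSplit
  r1 <- atomically newReader
  r2 <- atomically newReader
  atomically (writeIn inC "hello")
  a <- atomically (readOut r1)
  b <- atomically (readOut r2)
  print (a, b)
```

With this shape, every reader is obtained from the returned action, so a separate duplication function on the write end is not needed for the broadcast case.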

On Sat, Oct 27, 2012 at 6:59 AM, Ben Millwood
So what's the complete API that we're discussing here? I think the /idea/ is great, but it doesn't feel like a concrete proposal until we have a list of type signatures, so we can discuss the details like whether or not `dupChan` is a good name for that operation (my instinct is that it sounds pretty much the same as `cloneChan`, and it's going to be hard to remember the difference).
Good point, sorry. Here's what the API for TChan looks like after doing the mechanical transformations into split form (I'm not wedded to these names):

    newSplitTChan :: STM (InTChan a, OutTChan a)
    newSplitTChanIO :: IO (InTChan a, OutTChan a)
    -- this is 'newBroadcastTChan' with a more appropriate name:
    newInTChan :: STM (InTChan a)
    newInTChanIO :: IO (InTChan a)
    writeTChan :: InTChan a -> a -> STM ()
    readTChan :: OutTChan a -> STM a
    peekTChan :: OutTChan a -> STM a
    tryPeekTChan :: OutTChan a -> STM (Maybe a)
    tryReadTChan :: OutTChan a -> STM (Maybe a)
    isEmptyTChan :: OutTChan a -> STM Bool
    unGetTChan :: OutTChan a -> a -> STM ()
    dupTChan :: InTChan a -> STM (OutTChan a)
    cloneTChan :: OutTChan a -> STM (OutTChan a)

If you want to look at implementation, see: http://hackage.haskell.org/packages/archive/chan-split/0.5.0/doc/html/src/Co...

If it would be more helpful for me to provide a proper patch with the bare minimum modifications, I'd be happy to when I get a moment.

Brandon
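The difference between `dupTChan` and `cloneTChan` in this listing can be illustrated with a small sketch. It is not the chan-split source: it assumes thin newtype wrappers over the stm package's TChan (imported qualified as `S`) and covers only a subset of the signatures above.

```haskell
import Control.Concurrent.STM (STM, atomically)
import qualified Control.Concurrent.STM as S

-- Assumption: both ends wrap the same stm TChan.
newtype InTChan a  = InTChan  (S.TChan a)
newtype OutTChan a = OutTChan (S.TChan a)

newSplitTChan :: STM (InTChan a, OutTChan a)
newSplitTChan = do
  c <- S.newTChan
  return (InTChan c, OutTChan c)

writeTChan :: InTChan a -> a -> STM ()
writeTChan (InTChan c) = S.writeTChan c

readTChan :: OutTChan a -> STM a
readTChan (OutTChan c) = S.readTChan c

-- From the write end: the new reader starts empty and sees later writes only.
dupTChan :: InTChan a -> STM (OutTChan a)
dupTChan (InTChan c) = OutTChan <$> S.dupTChan c

-- From a read end: the new reader starts with the same pending messages.
cloneTChan :: OutTChan a -> STM (OutTChan a)
cloneTChan (OutTChan c) = OutTChan <$> S.cloneTChan c

main :: IO ()
main = do
  (i, o) <- atomically newSplitTChan
  atomically (writeTChan i (1 :: Int))
  d <- atomically (dupTChan i)     -- created after the first write
  c <- atomically (cloneTChan o)   -- copies o, which still holds 1
  atomically (writeTChan i 2)
  a <- atomically (readTChan d)
  b <- atomically (readTChan c)
  print (a, b)
```

Here the duplicate reads 2 first while the clone reads 1 first, which matches what the argument types (write end versus read end) suggest.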
I think my preferred signature for `newBroadcastTChan` would be `STM (InTChan a, STM (OutTChan a))` or something similar. Then `dupChan` might be unnecessary.

I had a similar idea earlier this year, and uploaded this package: http://hackage.haskell.org/package/privileged-concurrency

Jeff

On Fri, Oct 26, 2012 at 8:23 PM, Brandon Simmons
Hi everybody, Just discovered all the great discussion here, and although my google-fu turned up no prior discussions on this subject, I apologize if it has already been discussed.
I propose that Chan and TChan should be implemented as a pair of read and write ends, initialized as follows:
newSplitChan :: IO (InChan a, OutChan a)
I've implemented this already here: http://hackage.haskell.org/package/chan-split . You can ignore the type classes I've defined; they're there for my own reasons and not part of the proposal.
I think this makes for a great package or perhaps an extra Unidirectional module nested inside the Chan module. Breaking the well used Chan API doesn't seem necessary in this case.

On Sat, Oct 27, 2012 at 12:28 PM, Johan Tibell
I think this makes for a great package or perhaps an extra Unidirectional module nested inside the Chan module. Breaking the well used Chan API doesn't seem necessary in this case.
In the scenario where an extra module is added to the current libraries, it would be logical to then implement the current non-split Chan modules in terms of the split module. Are there arguments against doing that?
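A minimal sketch of that layering, under stated assumptions: the split channel below is a simplified MVar-based linked list in the style of the classic Chan design, without the exception masking a production implementation would need, and `writeIn`/`readOut` are hypothetical names. The non-split `Chan` is then just a pair of ends.

```haskell
import Control.Concurrent.MVar

-- A channel is a linked list of MVars ending in an empty "hole".
type Stream a = MVar (Item a)
data Item a = Item a (Stream a)

newtype InChan a  = InChan  (MVar (Stream a))  -- points at the hole
newtype OutChan a = OutChan (MVar (Stream a))  -- points at the next item

newSplitChan :: IO (InChan a, OutChan a)
newSplitChan = do
  hole <- newEmptyMVar
  w <- newMVar hole
  r <- newMVar hole
  return (InChan w, OutChan r)

-- Fill the current hole and advance the write pointer to a new hole.
-- Simplification: not safe against asynchronous exceptions.
writeIn :: InChan a -> a -> IO ()
writeIn (InChan w) x = do
  newHole <- newEmptyMVar
  oldHole <- takeMVar w
  putMVar oldHole (Item x newHole)
  putMVar w newHole

-- Block until the next item is filled, then advance the read pointer.
readOut :: OutChan a -> IO a
readOut (OutChan r) = do
  readEnd <- takeMVar r
  Item x next <- readMVar readEnd
  putMVar r next
  return x

-- The non-split API, defined purely in terms of the split one.
data Chan a = Chan (InChan a) (OutChan a)

newChan :: IO (Chan a)
newChan = uncurry Chan <$> newSplitChan

writeChan :: Chan a -> a -> IO ()
writeChan (Chan i _) = writeIn i

readChan :: Chan a -> IO a
readChan (Chan _ o) = readOut o

main :: IO ()
main = do
  c <- newChan
  mapM_ (writeChan c) "abc"
  s <- mapM (const (readChan c)) "abc"
  putStrLn s
```

Existing callers of `newChan`, `writeChan`, and `readChan` would keep their types in this arrangement, while new code could use the two ends directly.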

On 10/27/12 12:28 PM, Johan Tibell wrote:
I think this makes for a great package or perhaps an extra Unidirectional module nested inside the Chan module. Breaking the well used Chan API doesn't seem necessary in this case.
I agree. I like the idea, and I'd like to see it gain traction, but I'm not so keen on breaking the Chan/TChan API.

Also, there seem to be some API issues to be ironed out (e.g., someone proposed newBroadcastTChan :: STM (InTChan a, STM (OutTChan a)) which has an obvious meaning, over newInTChan :: STM (InTChan a) which is going to take some explaining.)

It sounds like many people have implemented versions of this and uploaded them to Hackage already. Perhaps the best approach would be to (1) unify all those packages into a single "blessed" package, to reduce redundancy; (2) advertise that package as the way to go for new projects; (3) add it to the Platform if it gains traction; (4) maybe get around to deprecating the current Chan/TChan API some years down the road when it's no longer popular.

-- Live well, ~wren

* wren ng thornton
It sounds like many people have implemented versions of this and uploaded them to Hackage already. Perhaps the best approach would be to (1) unify all those packages into a single "blessed" package, to reduce redundancy; (2) advertise that package as the way to go for new projects; (3) add it to the Platform if it gains traction; (4) maybe get around to deprecating the current Chan/TChan API some years down the road when it's no longer popular.
+1 Roman

On 28/10/2012 01:37, wren ng thornton wrote:
On 10/27/12 12:28 PM, Johan Tibell wrote:
I think this makes for a great package or perhaps an extra Unidirectional module nested inside the Chan module. Breaking the well used Chan API doesn't seem necessary in this case.
I agree. I like the idea, and I'd like to see it gain traction, but I'm not so keen on breaking the Chan/TChan API.
That sums up my thoughts too. I am especially reluctant to change the Chan/TChan API since then some parts of my (in progress) book will need to be rewritten to take the changes into account. Extra type safety is good, but it doesn't come completely for free - there is a bit of extra plumbing and hence program noise from the split API.
It sounds like many people have implemented versions of this and uploaded them to Hackage already. Perhaps the best approach would be to (1) unify all those packages into a single "blessed" package, to reduce redundancy; (2) advertise that package as the way to go for new projects; (3) add it to the Platform if it gains traction; (4) maybe get around to deprecating the current Chan/TChan API some years down the road when it's no longer popular.
Yep, +1 Cheers, Simon
participants (9)
- Ben Millwood
- Brandon Simmons
- Henning Thielemann
- Jeff Shaw
- Johan Tibell
- Oren Ben-Kiki
- Roman Cheplyaka
- Simon Marlow
- wren ng thornton