ANNOUNCE: pipes-core 0.0.1

Paolo Capriotti

10 Mar 2012 10 Mar '12

12:36 a.m.

I'm pleased to announce the release of version 0.0.1 of pipes-core, a library for efficient, safe and compositional IO, similar in scope to iteratees and conduits. http://hackage.haskell.org/package/pipes-core This is a reimplementation of the original 'Pipe' concept by Gabriel Gonzales. The package documentation contains an introduction to Pipes and a detailed tutorial, so here I'll just outline some of the differences from the original version: - A single composition operator. "Strict" composition has been removed, since its role wasn't actually that clear, and its purpose for resource finalization has been replaced by better constructs (see below). - A new primitive 'tryAwait', which makes it possible to recover from termination of an upstream pipe. Using 'tryAwait' you can write pipes that are able to consume all their input and return a value (like iteratees or Conduit's sinks can), and stateful pipes that can make use of their final state before terminating. - Full exception safety and exception-handling primitives. Pipes have been augmented with 'catch' and 'finally' primitives, which allow you to recover from exceptions and ensure finalization of resources within the Pipe monad. There's no more need to use resource-simple or monadic regions together with Pipes. - Multi-channel pipes. This is a generalization of Arrows using sums instead of pairs for contexts. We provide a number of combinators that allow you to combine Pipes in much the same way as Arrows, although there is not unfortunately an alternative for the Arrow syntax. Together with pipes-core, I also released a number of accessory packages with various utilities. Here is a complete list: * pipes-extra: File readers and writers, chunk pipes. http://hackage.haskell.org/package/pipes-extra * pipes-attoparsec: Utilities to convert a parser into a pipe. http://hackage.haskell.org/package/pipes-attoparsec * pipes-conduit: Conduit adapters. http://hackage.haskell.org/package/pipes-conduit * pipes-network: Utilities to deal with sockets. Ported from conduit. http://hackage.haskell.org/package/pipes-network * pipes-zlib: Pipes to deal with zipped data. Ported from conduit. http://hackage.haskell.org/package/pipes-zlib This is an experimental release, but it should be equipped with all the functionality needed to write serious applications. I encourage people to try it out and send feedback if anything looks too simplistic or too limiting. BR, Paolo Capriotti

Show replies by date

Mario Blažević

10 Mar 10 Mar

4:21 a.m.

On 12-03-09 07:36 PM, Paolo Capriotti wrote:

...

I'm pleased to announce the release of version 0.0.1 of pipes-core, a library for efficient, safe and compositional IO, similar in scope to iteratees and conduits.

I like your design, it seems to strike a good balance between elegance and practicality. The only thing missing for the latter is a deeper support for chunking. Of course, that would probably destroy some of the elegance [1]. I don't think that problem has been solved in any of the enumerator/iteratee/pipe/wire/conduit libraries so far. Did you consider adding some stream-splitting and merging pipes, like those in the SCC package [2] or those described in the last Monad.Reader issue [3]? Your arrow-like combinators seem well thought out, but they don't go very far. [1] http://www.haskell.org/pipermail/haskell-cafe/2010-August/082540.html [2] http://hackage.haskell.org/packages/archive/scc/0.7.1/doc/html/Control-Concu... [3] http://themonadreader.files.wordpress.com/2011/10/issue19.pdf

Paolo Capriotti

10:16 a.m.

On Sat, Mar 10, 2012 at 4:21 AM, Mario Blažević wrote:

...

I like your design, it seems to strike a good balance between elegance and practicality. The only thing missing for the latter is a deeper support for chunking. Of course, that would probably destroy some of the elegance [1]. I don't think that problem has been solved in any of the enumerator/iteratee/pipe/wire/conduit libraries so far.

Chunking is supported but not by primitive constructs. The way you implement chunked streams is to simply use some form of "container" representing a chunk as your input/output type. Of course, that means that the abstraction is now operating at the level of chunks instead of elements, which may be inconvenient, but I doubt that there exists a way to "lift" element operations to chunks in an efficient and general way. Another issue is how to deal with unconsumed input. For that, there is a ChunkPipe type (in pipes-extra) with a specialized monad instance that threads unconsumed input along. You can see an example of ChunkPipe in action in this prototype http server by Jeremy Shaw: http://src.seereason.com/pipes-http-2/pipes-http-2/. Note that this is based on a old version of pipes-core, however.

...

Did you consider adding some stream-splitting and merging pipes, like those in the SCC package [2] or those described in the last Monad.Reader issue [3]? Your arrow-like combinators seem well thought out, but they don't go very far.

I'm not sure why you say that they don't go very far. I looked at Splitter and Join in Monad.Reader 19, and they seem equivalent to 'splitP' and 'joinP' in pipes-core. There shouldn't be any problem implementing all the other combinators there in terms of monoidal primitives (e.g. 'not' is just 'swap'). There is also a 'zip' combinator in pipes-extra, that synchronizes two Producers on their respective outputs. Can you elaborate on use cases that seem to be missing? BR, Paolo

Mario Blažević

7:59 p.m.

On 12-03-10 05:16 AM, Paolo Capriotti wrote:

...

On Sat, Mar 10, 2012 at 4:21 AM, Mario Blažević wrote:

...
I like your design, it seems to strike a good balance between elegance and practicality. The only thing missing for the latter is a deeper support for chunking. Of course, that would probably destroy some of the elegance [1]. I don't think that problem has been solved in any of the enumerator/iteratee/pipe/wire/conduit libraries so far.

Chunking is supported but not by primitive constructs. The way you implement chunked streams is to simply use some form of "container" representing a chunk as your input/output type.

Of course, that means that the abstraction is now operating at the level of chunks instead of elements, which may be inconvenient, but I doubt that there exists a way to "lift" element operations to chunks in an efficient and general way.

Another issue is how to deal with unconsumed input. For that, there is a ChunkPipe type (in pipes-extra) with a specialized monad instance that threads unconsumed input along. You can see an example of ChunkPipe in action in this prototype http server by Jeremy Shaw: http://src.seereason.com/pipes-http-2/pipes-http-2/. Note that this is based on a old version of pipes-core, however.

The only sane way I've found to deal with chunks is to move the responsibility into the glue logic, which would be your (>+>) and (>>) combinators. The upstream argument of (>+>) could then produce chunks of any size it finds suitable, while the downstream argument would specify exactly how much input it needs without having to worry about the upstream chunk boundaries. How these requests are phrased is an open question. I've developed the incremental-parser package specifically for this purpose. Other approaches are possible, but I'm convinced that chunking should not be left to individual components. The chunk type probably shouldn't be reflected even in their types.

...

...
Did you consider adding some stream-splitting and merging pipes, like those in the SCC package [2] or those described in the last Monad.Reader issue [3]? Your arrow-like combinators seem well thought out, but they don't go very far.

I'm not sure why you say that they don't go very far. I looked at Splitter and Join in Monad.Reader 19, and they seem equivalent to 'splitP' and 'joinP' in pipes-core.

The main purpose of the Splitter type is to act as a conditional, sending each input item *either* into the left or into the right output, marking is as either true or false. Your splitterP sends each item to *both* left and right output. It's a tee, not a split. If your point is that the Splitter a m r type is isomorphic to Pipe a (Either a a) m r, that is true. There is a benefit to the abstraction, though. Once you introduce chunking into the picture, however, the Splitter type can be changed under the hood to send an entire chunk to its left or right output. The corresponding efficient chunked Pipe type would be Pipe [a] (Either [a] [a]) m r, which is not at all the same as Pipe [a] [Either a a] m r -- if you have to pack every single item of the input chunk into an Either value, you've lost all performance benefits of chunking. The former type is efficient but I'm not sure if it would allow you to abstract the chunking logic out of the individual components.

...

There shouldn't be any problem implementing all the other combinators there in terms of monoidal primitives (e.g. 'not' is just 'swap').

I agree, with the chunking reservation above. Anyway, consider adding the Boolean combinators to the library. I find them quite intuitive.

Twan van Laarhoven

10:19 p.m.

On 2012-03-10 11:16, Paolo Capriotti wrote:

...

Another issue is how to deal with unconsumed input. For that, there is a ChunkPipe type (in pipes-extra) with a specialized monad instance that threads unconsumed input along. You can see an example of ChunkPipe in action in this prototype http server by Jeremy Shaw: http://src.seereason.com/pipes-http-2/pipes-http-2/. Note that this is based on a old version of pipes-core, however.

A nice way to deal with unconsumed input (from a user's perspective) would be a function -- | Pass some unconsumed input back upstream. -- The next @await@ will return this input without blocking. unawait :: Monad m => a -> Pipe a b m () Twan

Mario Blažević

11:09 p.m.

On 12-03-10 05:19 PM, Twan van Laarhoven wrote:

...

On 2012-03-10 11:16, Paolo Capriotti wrote:

...
Another issue is how to deal with unconsumed input. For that, there is a ChunkPipe type (in pipes-extra) with a specialized monad instance that threads unconsumed input along. You can see an example of ChunkPipe in action in this prototype http server by Jeremy Shaw: http://src.seereason.com/pipes-http-2/pipes-http-2/. Note that this is based on a old version of pipes-core, however.

A nice way to deal with unconsumed input (from a user's perspective) would be a function

-- | Pass some unconsumed input back upstream. -- The next @await@ will return this input without blocking. unawait :: Monad m => a -> Pipe a b m ()

The function may be called unawait, but there's nothing stopping you from inserting something into the stream that wasn't in the input to start with. I find that this approach breaks too many invariants.

Twan van Laarhoven

11 Mar 11 Mar

2:05 a.m.

On 2012-03-11 00:09, Mario Blažević wrote:

...

On 12-03-10 05:19 PM, Twan van Laarhoven wrote:

...
-- | Pass some unconsumed input back upstream. -- The next @await@ will return this input without blocking. unawait :: Monad m => a -> Pipe a b m ()

The function may be called unawait, but there's nothing stopping you from inserting something into the stream that wasn't in the input to start with. I find that this approach breaks too many invariants.

Which invariants does it break exactly? I.e. what properties do you expect to hold that fail when you can push arbitrary values back up-stream? Twan

Mario Blažević

4:53 a.m.

On 12-03-10 09:05 PM, Twan van Laarhoven wrote:

...

On 2012-03-11 00:09, Mario Blažević wrote:

...
On 12-03-10 05:19 PM, Twan van Laarhoven wrote:

...
-- | Pass some unconsumed input back upstream. -- The next @await@ will return this input without blocking. unawait :: Monad m => a -> Pipe a b m ()

The function may be called unawait, but there's nothing stopping you from inserting something into the stream that wasn't in the input to start with. I find that this approach breaks too many invariants.

Which invariants does it break exactly? I.e. what properties do you expect to hold that fail when you can push arbitrary values back up-stream?

Are you asking for a written-up set of Pipe laws? I'm not aware of any, and I'd love to see one. The Category law would be broken, though: unawait x >>> id == yield x !== unawait x I suppose the additional Arrow laws, if they were transcribed to the pseudo-Arrow operations that Pipe supports, would be broken as well.

Paolo Capriotti

1:09 p.m.

...

The Category law would be broken, though:

unawait x >>> id == yield x !== unawait x

How did you get this equation? It's not even well-typed: unawait :: a -> Pipe a b m () yield :: b -> Pipe a b m () Someone actually implemented a variation of Pipes with unawait: https://github.com/duairc/pipes/blob/master/src/Control/Pipe/Common.hs (it's called 'unuse' there). I actually agree that it might break associativity or identity, but I don't have a counterexample in mind yet. BR, Paolo

Twan van Laarhoven

1:25 p.m.

On 2012-03-11 14:09, Paolo Capriotti wrote:

...

...
The Category law would be broken, though:

unawait x>>> id == yield x !== unawait x

How did you get this equation? It's not even well-typed:

unawait :: a -> Pipe a b m () yield :: b -> Pipe a b m ()

I would expect that (id >>> unawait x) >>> await !== unawait x >>> await === return x because in the general case of (p >>> unawait x) if is impossible to transform the unawaited value out of the composition. To do that you would need the inverse of p. You can get around this by having a special constructor for the identity. But then id !== arr id IMO, neither of these failed laws would be a big problem in practice, since no one will actually use identity pipes in combination with unawait. Or perhaps we should be satisfied when Pipe is a Semigroupoid? Twan

Paolo Capriotti

1:46 p.m.

On Sun, Mar 11, 2012 at 1:25 PM, Twan van Laarhoven wrote:

...

On 2012-03-11 14:09, Paolo Capriotti wrote:

...
...
The Category law would be broken, though:

unawait x>>> id == yield x !== unawait x

How did you get this equation? It's not even well-typed:

unawait :: a -> Pipe a b m () yield :: b -> Pipe a b m ()

I would expect that

(id >>> unawait x) >>> await !== unawait x >>> await === return x

There are several type errors in this equation, so I'm not exactly sure what you mean. If you intended to write something like: (id >>> (unawait x >> return y)) >>> await !== (unawait x >> return y) >>> await then I disagree, because both sides should evaluate to 'return y'. I'm not sure why you would expect x to be returned, since there is no 'await' after 'unawait' in the middle pipe.

...

because in the general case of

(p >>> unawait x)

if is impossible to transform the unawaited value out of the composition.

Why? The natual definition would be: p >+> (unawait x >> p') == (yield x >> p) >+> p' To

...

do that you would need the inverse of p. You can get around this by having a special constructor for the identity. But then

id !== arr id

IMO, neither of these failed laws would be a big problem in practice, since no one will actually use identity pipes in combination with unawait.

I'm extremely wary of this sort of reasoning, because failure of invariants in simple situations are a symptom of something being conceptually wrong in the abstraction.

...

Or perhaps we should be satisfied when Pipe is a Semigroupoid?

I don't think so, since we can always add formal identities. Usually, though, failure of the identity law implies failure of associativity, by just putting the "failing identity" in the middle of a composition. BR, Paolo

Twan van Laarhoven

8:16 p.m.

On 2012-03-11 14:46, Paolo Capriotti wrote:

...

On Sun, Mar 11, 2012 at 1:25 PM, Twan van Laarhoven wrote:

...
I would expect that

(id>>> unawait x)>>> await !== unawait x>>> await === return x

There are several type errors in this equation, so I'm not exactly sure what you mean. If you intended to write something like:

(id>>> (unawait x>> return y))>>> await !== (unawait x>> return y)>>> await

then I disagree, because both sides should evaluate to 'return y'. I'm not sure why you would expect x to be returned, since there is no 'await' after 'unawait' in the middle pipe.

Oops, I got confused by (>>) and (>>>). The intended semantics of unawait is unawait x >> await === return x So what I was trying to write is (id >+> unawait x) >> await === {by identity law} unawait x >> await === {by behavior of unawait} return x But that this is impossible to implement, and instead what you get is (id >+> unawait x) >> await === return () >> await === await

...

...
because in the general case of

(p >>> unawait x)

if is impossible to transform the unawaited value out of the composition.

Why? The natual definition would be:

p >+> (unawait x >> p') == (yield x >> p) >+> p'

Right, but then take p==id, which gives (id >+> (unawait x >> return ())) >> p' === ((yield x >> id) >+> return ()) >> p' === return () >> p' === p' So the unawait is not seen by p', whereas the identity law would give (id >+> (unawait x >> return ())) >> p' === (unawait x >> return ()) >> p' === unawait x >> p' I would like to get this latter semantics, and if the left side is id, it is fine. However, in (p :: Pipe a b r) >+> (unawait x >> q :: Pipe b c r) :: Pipe a c r x has type b. You can not write this in a form like unawait x' >> q :: Pipe a c r because here x' would have type a. But if p == id, this is exactly what you would like to get.

...

I'm extremely wary of this sort of reasoning, because failure of invariants in simple situations are a symptom of something being conceptually wrong in the abstraction.

...
Or perhaps we should be satisfied when Pipe is a Semigroupoid?

I don't think so, since we can always add formal identities. Usually, though, failure of the identity law implies failure of associativity, by just putting the "failing identity" in the middle of a composition.

A simple way to implement unawait that fails the identity law is by discarding left-over unawaits inside a vertical composition. I.e. unawait x >+> p === return () >+> p q >+> unawait y === q >+> return () As long as you do this consistently for *all* vertical compositions, I don't see how associativity is broken. In the first case, with unawait on the left, you don't need to discard the await. But I think it is probably more consistent if you do. Anway, 'purePipe id' is now a failing identity, in the sense that composing with it discards trailing awaits from the other thing composed with. Twan

Chris Smith

4:23 p.m.

On Sun, Mar 11, 2012 at 7:09 AM, Paolo Capriotti wrote:

...

Someone actually implemented a variation of Pipes with unawait: https://github.com/duairc/pipes/blob/master/src/Control/Pipe/Common.hs (it's called 'unuse' there).

I actually agree that it might break associativity or identity, but I don't have a counterexample in mind yet.

Indeed, on further thought, it looks like you'd run into problems here: unawait x >> await == return x (idP >+> unawait x) >> await == ??? The monadic operation is crucial there: without it, there's no way to observe which side of idP knows about the unawait, so you can keep it local and everything is fine... but throw in the Monad instance, and those pipes are no longer equivalent because they act differently in vertical composition. There is no easy way to fix this with (idP == pipe id). You could kludge the identity pipes and make that law hold, and I *think* you'd even keep associativity in the process so you would technically have a category again. But this hints to me that there is some *other* law you should expect to hold with regard to the interaction of Category and Monad, and now that is being broken. -- Chris Smith

Mario Blažević

4:30 p.m.

On 12-03-11 09:09 AM, Paolo Capriotti wrote:

...

...
The Category law would be broken, though:

unawait x>>> id == yield x !== unawait x How did you get this equation? It's not even well-typed:

unawait :: a -> Pipe a b m () yield :: b -> Pipe a b m ()

You're right, it's completely wrong. I was confused last night.

...

Someone actually implemented a variation of Pipes with unawait: https://github.com/duairc/pipes/blob/master/src/Control/Pipe/Common.hs (it's called 'unuse' there).

I actually agree that it might break associativity or identity, but I don't have a counterexample in mind yet.

It's difficult to say without having the implementation of both unawait and all the combinators in one package. I'll assume the following equations hold: unawait x >> await = return x unawait x >> yield y = yield y >> unawait x (p1 >> unawait x) >>> p2 = (p1 >>> p2) <* unawait x -- this one tripped me up first (unawait (x, y)) = unawait x The first two equations would let us move all the unawaits that are not immediately re-awaited to the end of their monadic pipeline stage: the unawait can always be performed as the last operation in bulk. The third equation let allows us to move these unawaits to the end of the pipeline. If these equations hold, unawait now appears to be law-abiding to me. I apologize for my unsubstantiated smears. The 'unuse' implementation you linked to drops the unmatched Unuse suspensions, so it doesn't follow the third equation.

...

go i True u (Free (Unuse a d)) = go i True u d go True o (Free (Unuse a u)) d@(Free (Await _)) = go True o u d

I think this implemanteation does break the Category law, but I'm having trouble testing it in GHCi.

Chris Smith

4:39 p.m.

On Sun, Mar 11, 2012 at 10:30 AM, Mario Blažević wrote:

...

It's difficult to say without having the implementation of both unawait and all the combinators in one package. I'll assume the following equations hold:

...

(p1 >> unawait x) >>> p2 = (p1 >>> p2) <* unawait x -- this one tripped me up

I don't think this could reasonably hold. For example, you'd expect that for any p, idP >> p == idP since idP never terminates at all. But then let p1 == idP, and you get something silly. The issue is with early termination: if p2 terminates first in the left hand side, you don't want the unawait to occur. -- Chris Smith

Mario Blažević

5:22 p.m.

On 12-03-11 12:39 PM, Chris Smith wrote:

...

...
(p1>> unawait x)>>> p2 = (p1>>> p2)<* unawait x -- this one tripped me up I don't think this could reasonably hold. For example, you'd expect

On Sun, Mar 11, 2012 at 10:30 AM, Mario Blažević wrote: that for any p, idP>> p == idP since idP never terminates at all. But then let p1 == idP, and you get something silly. The issue is with early termination: if p2 terminates first in the left hand side, you don't want the unawait to occur.

No, idP does terminate once it consumes its input. Your idP >> p first reproduces the complete input, and then runs p with empty input.

Chris Smith

5:36 p.m.

On Sun, Mar 11, 2012 at 11:22 AM, Mario Blažević wrote:

...

No, idP does terminate once it consumes its input. Your idP >> p first reproduces the complete input, and then runs p with empty input.

This is just not true. idP consumes input forever, and (idP >> p) = idP, for all pipes p. If it is composed with another pipe that terminates, then yes, the *composite* pipe can terminate, so for example ((q >+> idP) >> p) may actually do something with p. But to get that effect, you need to compose before the monadic bind... so for example (q >+> (idP >> p)) = (q >+> idP) = q. Yes, q can be exhausted, but when it is, idP will await input, which will immediately terminate the (idP >> p) pipe, producing the result from q, and ignoring p entirely. -- Chris Smith

Mario Blažević

12 Mar 12 Mar

2:53 a.m.

On 12-03-11 01:36 PM, Chris Smith wrote:

...

On Sun, Mar 11, 2012 at 11:22 AM, Mario Blažević wrote:

...
No, idP does terminate once it consumes its input. Your idP>> p first reproduces the complete input, and then runs p with empty input.

This is just not true. idP consumes input forever, and (idP>> p) = idP, for all pipes p.

If it is composed with another pipe that terminates, then yes, the *composite* pipe can terminate, so for example ((q>+> idP)>> p) may actually do something with p. But to get that effect, you need to compose before the monadic bind... so for example (q>+> (idP>> p)) = (q>+> idP) = q. Yes, q can be exhausted, but when it is, idP will await input, which will immediately terminate the (idP>> p) pipe, producing the result from q, and ignoring p entirely.

Sorry. I was describing the way it's done in SCC, and I assumed that pipes and pipes-core behaved the same. But GHCi says you're right:

...

:{ | runPipe ((fromList [1, 2, 3] >> return []) | >+> (idP >> fromList [4, 5] >> return []) | >+> consume) | :} [1,2,3]

May I enquire what was the reason for the non-termination of idP? Why was it not defined as 'forP yield' instead? The following command runs the way I expected.

...

:{ | runPipe ((fromList [1, 2, 3] >> return []) | >+> (forP yield >> fromList [4, 5] >> return []) | >+> consume) | :} [1,2,3,4,5]

Chris Smith

3:26 a.m.

On Sun, Mar 11, 2012 at 8:53 PM, Mario Blažević wrote:

...

May I enquire what was the reason for the non-termination of idP? Why was it not defined as 'forP yield' instead? The following command runs the way I expected.

With pipes-core (which, recall, is known to be unsound... just felt this is a good time for a reminder of that, even though I believe the subset that adds tryAwait and forP to be sound), you do get both (pipe id) and (forP yield). So discover which is the true identity, we can try: idP >+> forP yield == forP yield forP yield >+> idP == forP yield Yep, looks like idP is still the identity. Of course, the real reason (aside from the fact that you can check and see) is that forP isn't definable at all in Gabriel's pipes package. -- Chris Smith

Paolo Capriotti

9:26 a.m.

On Mon, Mar 12, 2012 at 3:26 AM, Chris Smith wrote:

...

With pipes-core (which, recall, is known to be unsound... just felt this is a good time for a reminder of that, even though I believe the subset that adds tryAwait and forP to be sound), you do get both (pipe id) and (forP yield).

I wouldn't say it's unsound, more like "not yet proved to be bug-free" :) Note that the latest master fixes all the issues found so far. I agree that it would be nice to have a proof of correctness, but I prefer to wait until it stabilizes a bit before embarking in a long verification of all the cases. Part of the reason to release it now is so that people can try it out and suggest changes and additions. I am pretty confident that, if there are other issues, they can be fixed without significantly altering the interface or the overall approach. BR, Paolo

Chris Smith

3:30 p.m.

On Mon, Mar 12, 2012 at 3:26 AM, Paolo Capriotti wrote:

...

I wouldn't say it's unsound, more like "not yet proved to be bug-free" :)

Note that the latest master fixes all the issues found so far.

I was referring to the released version of pipes-core, for which "known to be unsound" is an accurate description. Good to hear that you've got a fix coming, though. Given the history here, maybe working out the proofs of the category laws sooner rather than later would be a good thing. I'll have a look today and see if I can bang out a proof of the category laws for your new code without ensure. It will then be interesting to see how that compares to Gabriel's approach, which at this point we've heard a bit about but I haven't seen. -- Chris Smith

Paolo Capriotti

9:32 a.m.

On Mon, Mar 12, 2012 at 2:53 AM, Mario Blažević wrote:

...

May I enquire what was the reason for the non-termination of idP? Why was it not defined as 'forP yield' instead? The following command runs the way I expected.

The identity in a homset is unique, and in the case of 'Pipe a b m r', it happens to be 'idP'. 'forP yield' has its uses, but, as Chris showed, it's not a real identity. BR, Paolo

Twan van Laarhoven

11 Mar 11 Mar

8:33 p.m.

On 2012-03-11 17:30, Mario Blažević wrote:

...

It's difficult to say without having the implementation of both unawait and all the combinators in one package. I'll assume the following equations hold:

unawait x >> await = return x unawait x >> yield y = yield y >> unawait x (p1 >> unawait x) >>> p2 = (p1 >>> p2) <* unawait x -- this one tripped me up first (unawait (x, y)) = unawait x

I think you should instead move unwaits in and out of the composition on the left side: unawait x >> (p1 >+> p2) === (unawait x >> p1) >+> p2 This makes idP a left-identity for (>+>), but not a right-identity, since you can't move unawaits in and out of p2. Twan

Chris Smith

10:41 p.m.

On Sun, Mar 11, 2012 at 2:33 PM, Twan van Laarhoven wrote:

...

I think you should instead move unwaits in and out of the composition on the left side:

unawait x >> (p1 >+> p2) === (unawait x >> p1) >+> p2

This makes idP a left-identity for (>+>), but not a right-identity, since you can't move unawaits in and out of p2.

Not sure how we got to the point of debating which of the category laws pipes should break... messy business there. I'm going to be in favor of not breaking the laws at all. The problem here is that composition of chunked pipes requires agreement on the chunk type, which gives the type-level guarantees you need that all chunked pipes in a horizontal composition (by which I mean composition in the category... I think you were calling that vertical? no matter...) share the same chunk type. Paolo's pipes-extra does this by inventing a newtype for chunked pipes, in which the input type appears in the result as well. There are probably some details to quibble with, but I think the idea there is correct. I don't like this idea of implicitly just throwing away perfectly good data because the types are wrong. It shows up in the category-theoretic properties of the package as a result, but it also shows up in the fact that you're *throwing* *away* perfectly good data just because the type system doesn't give you a place to put it! What's become obvious from this is that a (ChunkedPipe a b m r) can NOT be modelled correctly as a (Pipe a b m r). -- Chris Smith

Paolo Capriotti

12 Mar 12 Mar

9:48 a.m.

On Sun, Mar 11, 2012 at 10:41 PM, Chris Smith wrote:

...

On Sun, Mar 11, 2012 at 2:33 PM, Twan van Laarhoven wrote:

...
I think you should instead move unwaits in and out of the composition on the left side:

unawait x >> (p1 >+> p2) === (unawait x >> p1) >+> p2

This makes idP a left-identity for (>+>), but not a right-identity, since you can't move unawaits in and out of p2.

Not sure how we got to the point of debating which of the category laws pipes should break... messy business there. I'm going to be in favor of not breaking the laws at all. The problem here is that composition of chunked pipes requires agreement on the chunk type, which gives the type-level guarantees you need that all chunked pipes in a horizontal composition (by which I mean composition in the category... I think you were calling that vertical? no matter...) share the same chunk type. Paolo's pipes-extra does this by inventing a newtype for chunked pipes, in which the input type appears in the result as well. There are probably some details to quibble with, but I think the idea there is correct. I don't like this idea of implicitly just throwing away perfectly good data because the types are wrong. It shows up in the category-theoretic properties of the package as a result, but it also shows up in the fact that you're *throwing* *away* perfectly good data just because the type system doesn't give you a place to put it! What's become obvious from this is that a (ChunkedPipe a b m r) can NOT be modelled correctly as a (Pipe a b m r).

I completely agree with this. For the subset consisting of pipes that never terminate (basically stream processors), it might be possible to add 'unawait'. However, the example (idP >+> unawait x) >> await shows that it's impossible to implement on general pipes without changing the Pipe type in some deeper way. ChunkPipe can be somewhat awkward to use because of the newtype wrapping/unwrapping, but from my experience, there's no need to use it very much in practice. What you can usually do is insert a 'regularize' Pipe (using ChunkPipe) early in the pipeline which splits chunks that cross "logical boundaries", so that the rest of the pipeline can deal with chunked input without worrying about leftovers. BR, Paolo

Twan van Laarhoven

10:41 a.m.

On 11/03/12 23:41, Chris Smith wrote:

...

On Sun, Mar 11, 2012 at 2:33 PM, Twan van Laarhoven wrote:

...
I think you should instead move unwaits in and out of the composition on the left side:

unawait x>> (p1>+> p2) === (unawait x>> p1)>+> p2

This makes idP a left-identity for (>+>), but not a right-identity, since you can't move unawaits in and out of p2.

Not sure how we got to the point of debating which of the category laws pipes should break... messy business there. I'm going to be in favor of not breaking the laws at all. The problem here is that composition of chunked pipes requires agreement on the chunk type, which gives the type-level guarantees you need that all chunked pipes in a horizontal composition (by which I mean composition in the category... I think you were calling that vertical? no matter...) share the same chunk type. Paolo's pipes-extra does this by inventing a newtype for chunked pipes, in which the input type appears in the result as well. There are probably some details to quibble with, but I think the idea there is correct. I don't like this idea of implicitly just throwing away perfectly good data because the types are wrong. It shows up in the category-theoretic properties of the package as a result, but it also shows up in the fact that you're *throwing* *away* perfectly good data just because the type system doesn't give you a place to put it! What's become obvious from this is that a (ChunkedPipe a b m r) can NOT be modelled correctly as a (Pipe a b m r).

Agreed. There are many things to be said for making sure that Pipe is a real category. I suppose the morally correct thing to do is, as you said, have a separate ChunkedPipe type. That would make the type system guarantee that there are no calls to 'unawait' in the second part of a categorical composition. The API could even look something like this: data Chunk data NoChunk data Pipe chunkiness a b m r await :: Pipe anyChunk a b m a yield :: b -> Pipe anyChunk a b m () unawait :: a -> Pipe Chunk a b m () runChunkedPipe :: Pipe Chunk a b m r -> Pipe NoChunk a b m r -- composition in the category (>+>) :: Pipe NoChunk a b m r -> Pipe NoChunk b c m r -> Pipe NoChunk a c m r The following generalization of category composition is still fine: compose :: Pipe anyChunk a b m r -> Pipe NoChunk b c m r -> Pipe anyChunk a c m r But this would not be: almostEntirelyNotUnlikeCompose :: Pipe anyChunk a b m r -> Pipe discardChunksHere b c m r -> Pipe anyChunk a c m r By the way, a ChunkedPipe can be implemented not only as type ChunkedPipe a b m r = StateT [a] (Pipe a b m) r but also as type ChunkedPipe a b m r = CodensityT (Pipe a b m) r by using the 'feed' function to implement unawait. Twan

4858

Age (days ago)

4860

Last active (days ago)

List overview

Download

25 comments

4 participants

participants (4)

Chris Smith
Mario Blažević
Paolo Capriotti
Twan van Laarhoven