Help requested: naming things in conduit

Hi all,

I'm just about ready to make the 0.5 release of conduit. And as usual, I'm running up against the hardest thing in programming: naming things.

Here's the crux of the matter: in older versions of conduit, functions would have a type signature of Source, Sink, or Conduit. For example:

    sourceFile :: MonadResource m => FilePath -> Source m ByteString

I think most people can guess at what this function does: it produces a stream of ByteStrings, which are read from the given file.

Now the trick: Source (and Sink and Conduit) are all type synonyms wrapping around the same type, Pipe. Ideally, we'd like to be able to reuse functions like sourceFile in other contexts, such as producing a Conduit that calls sourceFile[1]. However, the type synonym Source over-specifies some of the type parameters to Pipe, and therefore `sourceFile` can't be used directly to create a Conduit[2].

To get around this whole problem, I've added a number of type synonyms with rank-2 types that don't over-specify. You can see the type synonyms here[3], and more explanation of the problem here[4]. So my question is: can anyone come up with better names for these synonyms? Just to summarize here:

* All of the generalized types start with a G, e.g., Source becomes GSource.
* For Sinks and Conduits, if leftovers are generated, there's an L after the G (e.g., GLSink).
* For Sinks and Conduits which consume all of their input and then return the upstream result, we tack on an Inf for Infinite (e.g., GInfConduit, GLInfSink).

I think these names are relatively descriptive, and certainly `GSink ByteString m Int` is easier to follow than `Pipe l ByteString o u m Int`, but I was wondering if anyone had some better recommendations.

Michael

[1] For example, maybe we want to produce `conduitFiles :: MonadResource m => Conduit FilePath m ByteString`
[2] This problem exists to a smaller extent in conduit 0.4. This is the purpose of the sinkToPipe function.
[3] https://github.com/snoyberg/conduit/blob/52d7bc0b551b877de92be4c87f933e3ffb1...
[4] https://github.com/snoyberg/conduit/blob/a853141d7b9eed047c7cc790979f73a3467...
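
To make the over-specification problem concrete, here is a minimal sketch using a toy three-parameter `Pipe` (input, output, result). This is an assumption for illustration only: it is not conduit's actual `Pipe`, which has more parameters, and the names `countTo`, `countToS`, `expand`, and `runList` are hypothetical. It shows how a synonym that pins the input type blocks reuse inside a conduit, while a rank-2 synonym that leaves the input polymorphic does not.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Toy pipe (NOT conduit's real type): awaits i, yields o, returns r.
data Pipe i o r
  = Yield o (Pipe i o r)
  | Await (Maybe i -> Pipe i o r)
  | Done r

instance Functor (Pipe i o) where
  fmap f (Yield o k) = Yield o (fmap f k)
  fmap f (Await g)   = Await (fmap f . g)
  fmap f (Done r)    = Done (f r)

instance Applicative (Pipe i o) where
  pure = Done
  pf <*> px = pf >>= \f -> fmap f px

instance Monad (Pipe i o) where
  Yield o k >>= f = Yield o (k >>= f)
  Await g   >>= f = Await (\mi -> g mi >>= f)
  Done r    >>= f = f r

yield :: o -> Pipe i o ()
yield o = Yield o (Done ())

await :: Pipe i o (Maybe i)
await = Await Done

-- Over-specified synonym: the input type is pinned to ().
type Source o = Pipe () o ()

-- Generalized rank-2 synonym: the input type is left polymorphic.
type GSource o = forall i. Pipe i o ()

type Conduit i o = Pipe i o ()

-- Over-specified version: only usable where the input type is ().
countToS :: Int -> Source Int
countToS n = mapM_ yield [1 .. n]

-- Generalized version: usable as a source or inside any conduit.
countTo :: Int -> GSource Int
countTo n = mapM_ yield [1 .. n]

-- A conduit that calls a source for every input it receives.
-- Replacing countTo with countToS here is a type error, because
-- Pipe () Int () does not match the required Pipe Int Int ().
expand :: Conduit Int Int
expand = do
  mi <- await
  case mi of
    Nothing -> return ()
    Just n  -> countTo n >> expand

-- Feed a list of inputs to a pipe and collect its outputs.
runList :: [i] -> Pipe i o r -> [o]
runList _        (Done _)    = []
runList is       (Yield o k) = o : runList is k
runList []       (Await g)   = runList [] (g Nothing)
runList (i : is) (Await g)   = runList is (g (Just i))
```

With these definitions, `runList [2, 3] expand` evaluates to `[1, 2, 1, 2, 3]`, and `countTo 3` can still be run on its own as a plain source.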

On Thu, Jun 28, 2012 at 6:11 PM, Michael Snoyman wrote:
> I think these names are relatively descriptive, and certainly `GSink ByteString m Int` is easier to follow than `Pipe l ByteString o u m Int`, but I was wondering if anyone had some better recommendations.

I ran into this problem myself with my implementation that used 7 type parameters (the extra parameter with respect to conduit was used by Defer), and I couldn't think of any satisfactory solution.

The dilemma here is:

- exposing the full `Pipe` type as the primary API would be really confusing for new users
- creating a bunch of type synonyms adds a lot of conceptual overhead, and it's actually a leaky abstraction, because `Pipe` will probably be shown in error messages, and appears in the signatures of basic combinators

In the end, I gave up the 2 non-essential parameters, built the corresponding lost features on top of `Pipe` using newtypes, and decided to expose a 5-parameter `Pipe` type with no universally quantified synonyms.

I'm not sure how easy this Pipe type is to understand, but at least all parameters have a clear meaning that can be explained in the documentation, whereas the `l` parameter is sort of a hack (like my 'd' parameter).

BR,
Paolo

On Thu, Jun 28, 2012 at 8:36 PM, Paolo Capriotti wrote:
> In the end, I gave up the 2 non-essential parameters, built the corresponding lost features on top of `Pipe` using newtypes, and decided to expose a 5-parameter `Pipe` type with no universally quantified synonyms.
>
> I'm not sure how easy this Pipe type is to understand, but at least all parameters have a clear meaning that can be explained in the documentation, whereas the `l` parameter is sort of a hack (like my 'd' parameter).

I think even five parameters are too many. The original conduit types had either 2 or 3 parameters, and each one was essential and easily explainable. I realize that, for now, type synonyms will not help at all with error messages (which I consider a serious problem), but at least normal API functions like sourceFile will get helpful signatures.

One idea that I've toyed around with, but not really pursued, is creating actual newtypes for Source, Conduit, and Sink, and using Chris's typeclass approach for when we want general functions. After some basic fiddling, the typeclasses just seem to make everything more difficult to work with.

You're correct, by the way, that we need a lot of type synonyms (I got 9 of them). But I still think it helps with the overhead instead of hurting. While it may be important for some cases to understand the difference between GSink and GLSink, for most use cases simply knowing "oh, this thing takes a stream of `a` and gives a single result of `b`" is sufficient.

But I think only real world usage is going to help us determine the best approach here.

Michael
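
For readers who want a feel for the newtype-plus-typeclass alternative mentioned above, here is a rough sketch. It is an assumption about what such a design might look like, not Chris's actual proposal and not conduit's API: the toy `Pipe`, the `Yields` class, and the names `countTo`, `sourceToList`, and `conduitToList` are all hypothetical.

```haskell
-- Toy pipe (NOT conduit's real type): awaits i, yields o, returns r.
data Pipe i o r
  = Yield o (Pipe i o r)
  | Await (Maybe i -> Pipe i o r)
  | Done r

-- Real newtypes instead of synonyms, so error messages would name
-- Source or Conduit rather than the underlying Pipe.
newtype Source o    = Source  (Pipe () o ())
newtype Conduit i o = Conduit (Pipe i  o ())

-- Hypothetical class for "things that can emit a list of outputs".
class Yields p where
  yieldMany :: [o] -> p o

instance Yields Source where
  yieldMany = Source . foldr Yield (Done ())

instance Yields (Conduit i) where
  yieldMany = Conduit . foldr Yield (Done ())

-- A general function: one definition, usable at either newtype.
countTo :: Yields p => Int -> p Int
countTo n = yieldMany [1 .. n]

-- Collect the outputs of a source, answering every await with Nothing.
sourceToList :: Source o -> [o]
sourceToList (Source p) = go p
  where
    go (Yield o k) = o : go k
    go (Await g)   = go (g Nothing)
    go (Done _)    = []

-- Feed a list of inputs to a conduit and collect its outputs.
conduitToList :: [i] -> Conduit i o -> [o]
conduitToList inputs (Conduit p) = go inputs p
  where
    go _        (Done _)    = []
    go is       (Yield o k) = o : go is k
    go []       (Await g)   = go [] (g Nothing)
    go (i : is) (Await g)   = go is (g (Just i))
```

Here `sourceToList (countTo 3)` gives `[1, 2, 3]` and `conduitToList "ab" (countTo 2)` gives `[1, 2]`. The cost is visible even in this small sketch: every general operation needs a class method and one instance per newtype, and every consumer has to unwrap the constructor, which is in line with the remark that the typeclasses make things more difficult to work with.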

participants (2): Michael Snoyman, Paolo Capriotti