packages with orphaned instances only

Hi, referring to previous messages about QuickCheck and Johan's proposal about moving instances I wonder how many of us have created obviously orphaned "Arbitrary" instances for container data types? Maybe indeed separate packages with (orphaned) instances only should be created for re-use. I haven't seen such packages on hackage, yet. And maybe such packages should be clearly recognizable. So is this a bad idea? Cheers Christian

On 7 January 2011 10:22, Christian Maeder
So is this a bad idea?
It sounds like a good idea to me. There might be some details to thrash out, e.g. the granularity of packages - should there be a Quickcheck-orphans with Arbitrary instances for many structures, or separate packages (quickcheck-containers-instances, quickcheck-array-instances, ...) and perhaps a new top-level Hackage category is needed so people know where to find them - "Canonical instances", "Orphan instances", ...?

On 7 January 2011 23:07, Stephen Tetley
On 7 January 2011 10:22, Christian Maeder
wrote: So is this a bad idea?
It sounds like a good idea to me.
There might be some details to thrash out, e.g. the granularity of packages - should there be a Quickcheck-orphans with Arbitrary instances for many structures, or separate packages (quickcheck-containers-instances, quickcheck-array-instances, ...) and perhaps a new top-level Hackage category is needed so people know where to find them - "Canonical instances", "Orphan instances", ...?
My biggest problem with "official" instances such as these is that they may not do what you want. For example, I have a lot of problems with the testsuite for graphviz because the default list (and hence String) instances do not match the behaviour required, so I have to be careful to make sure I use my custom arbString, etc. functions rather than directly calling arbitrary. That said, if these packages have one-instance-per-module (as in separate modules defining the Arbitrary instances for Map, Set, etc.) then this situation is greatly reduced, since if I want a custom Map instance I just have to make sure I don't import the module that provides the default one. -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

On 07.01.2011 14:14, Ivan Lazar Miljenovic wrote:
My biggest problem with "official" instances such as these is that they may not do what you want. For example, I have a lot of problems with the testsuite for graphviz because the default list (and hence String) instances do not match the behaviour required, so I have to be careful to make sure I use my custom arbString, etc. functions rather than directly calling arbitrary.
This is a deficiency of classes in general that needs to be circumvented by other functions (like QuickCheck's forAll with a custom generator), new types (e.g. via newtype), or new classes.
That said, if these packages have one-instance-per-module (as in separate modules defining the Arbitrary instances for Map, Set, etc.) then this situation is greatly reduced, since if I want a custom Map instance I just have to make sure I don't import the module that provides the default one.
It would also be handy (for debugging purposes) if the Show instances could be changed for specific component types, but that's not really possible either. C.
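The forAll workaround mentioned in this message can be sketched as follows. This is only an illustration: the generator genLabel and the property name are hypothetical, and the rule it enforces (lowercase letters only) is an assumption standing in for whatever a real test suite needs.

```haskell
import Data.Char (isDigit)
import Test.QuickCheck

-- Hypothetical generator (not from the thread): non-empty strings of
-- lowercase letters, so a value can never be the text of a number.
genLabel :: Gen String
genLabel = listOf1 (elements ['a' .. 'z'])

-- The property draws its input from genLabel explicitly via forAll,
-- so the default Arbitrary String instance is never consulted.
prop_labelHasNoDigits :: Property
prop_labelHasNoDigits = forAll genLabel (not . any isDigit)
```

Running quickCheck prop_labelHasNoDigits exercises the property. No instance is defined here, so nothing can clash with instances defined elsewhere.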

On 07/01/2011 13:14, Ivan Lazar Miljenovic wrote:
On 7 January 2011 23:07, Stephen Tetley
wrote: On 7 January 2011 10:22, Christian Maeder
wrote: So is this a bad idea?
It sounds like a good idea to me.
There might be some details to thrash out, e.g. the granularity of packages - should there be a Quickcheck-orphans with Arbitrary instances for many structures, or separate packages (quickcheck-containers-instances, quickcheck-array-instances, ...) and perhaps a new top-level Hackage category is needed so people know where to find them - "Canonical instances", "Orphan instances", ...?
My biggest problem with "official" instances such as these is that they may not do what you want. For example, I have a lot of problems with the testsuite for graphviz because the default list (and hence String) instances do not match the behaviour required, so I have to be careful to make sure I use my custom arbString, etc. functions rather than directly calling arbitrary.
That said, if these packages have one-instance-per-module (as in separate modules defining the Arbitrary instances for Map, Set, etc.) then this situation is greatly reduced, since if I want a custom Map instance I just have to make sure I don't import the module that provides the default one.
Attempting to limit instance visibility is folly, unless you control the complete set of modules in your program. In practice you don't control the complete set of modules in your program, because some of them come from libraries, and you have no control over what the author of that library decides to import in the future. It's tempting to think that it's ok to add an orphan instance to a program, on the grounds that it's not a library and nobody can import it. But this is dangerous too: the program might break in the future when a conflicting instance is added to another library. You can't protect against this breakage by being careful with Cabal dependencies, because versions aren't bumped when new instances leak from dependencies lower down (and to try to enforce this in the PVP would be impossible). So, orphan instances are only sensible when we expect that everyone wants to use the same one. So Christian's suggestion is actually reasonable: if we must have orphan instances, make it so that everyone can easily use the same one, by putting them in a separate package. Cheers, Simon

Simon Marlow wrote:
On 07/01/2011 13:14, Ivan Lazar Miljenovic wrote:
On 7 January 2011 23:07, Stephen Tetley
wrote: On 7 January 2011 10:22, Christian Maeder
wrote: So is this a bad idea?
It sounds like a good idea to me.
There might be some details to thrash out, e.g. the granularity of packages - should there be a Quickcheck-orphans with Arbitrary instances for many structures, or separate packages (quickcheck-containers-instances, quickcheck-array-instances, ...) and perhaps a new top-level Hackage category is needed so people know where to find them - "Canonical instances", "Orphan instances", ...?
My biggest problem with "official" instances such as these is that they may not do what you want. For example, I have a lot of problems with the testsuite for graphviz because the default list (and hence String) instances do not match the behaviour required, so I have to be careful to make sure I use my custom arbString, etc. functions rather than directly calling arbitrary.
If there are multiple equally natural instances, then there should not be one at all.
That said, if these packages have one-instance-per-module (as in separate modules defining the Arbitrary instances for Map, Set, etc.) then this situation is greatly reduced, since if I want a custom Map instance I just have to make sure I don't import the module that provides the default one.
Attempting to limit instance visibility is folly, unless you control the complete set of modules in your program. In practice you don't control the complete set of modules in your program, because some of them come from libraries, and you have no control over what the author of that library decides to import in the future.
It's tempting to think that it's ok to add an orphan instance to a program, on the grounds that it's not a library and nobody can import it. But this is dangerous too: the program might break in the future when a conflicting instance is added to another library. You can't protect against this breakage by being careful with Cabal dependencies, because versions aren't bumped when new instances leak from dependencies lower down (and to try to enforce this in the PVP would be impossible).
So, orphan instances are only sensible when we expect that everyone wants to use the same one. So Christian's suggestion is actually reasonable: if we must have orphan instances, make it so that everyone can easily use the same one, by putting them in a separate package.
+1 It cannot be said often enough. Although we have already: http://www.haskell.org/haskellwiki/Multiple_instances

On 9 January 2011 07:20, Henning Thielemann
If there are multiple equally natural instances, then there should not be one at all.
It depends what you mean. In my case, I wanted the String instance for Arbitrary to follow certain rules (can't be the textual representation of a number, can't have certain characters in it, etc.). -- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com

Then you should use a custom generator and not Arbitrary.
On Sat, Jan 8, 2011 at 9:39 PM, Ivan Lazar Miljenovic <ivan.miljenovic@gmail.com> wrote:
On 9 January 2011 07:20, Henning Thielemann
wrote: If there are multiple equally natural instances, then there should not be one at all.
It depends what you mean. In my case, I wanted the String instance for Arbitrary to follow certain rules (can't be the textual representation of a number, can't have certain characters in it, etc.).
-- Ivan Lazar Miljenovic Ivan.Miljenovic@gmail.com IvanMiljenovic.wordpress.com
_______________________________________________ Libraries mailing list Libraries@haskell.org http://www.haskell.org/mailman/listinfo/libraries

Tangential to the discussion, there is a nice idiom for using an alternative typeclass that works well in the Quickcheck case: newtypes. For example, if I have a test function:

    testString :: String -> Bool
    testString s = f1 s == f2 s

and the default instance of String isn't doing what I want, I could define a custom arbitrary method:

    testString = do
        s <- myArbitraryStr
        return (f1 s == f2 s)

But I could also newtype String, define an instance for that newtype, and use either of these (in my opinion) nice alternative spellings:

    testString (MyString s) = ...

    testString = do
        MyString s <- arbitrary

I especially like the first form.
Cheers, Edward
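For reference, the newtype idiom from this message can be spelled out as a complete module. This is a sketch under stated assumptions: the generator rule (letters and spaces only) and the bodies of f1 and f2 are placeholders invented for illustration, and testString' is just a second name so that both spellings can live in one file.

```haskell
import Test.QuickCheck

-- Wrapper named after the MyString in the message above.
newtype MyString = MyString String
  deriving Show

-- Assumed rule, for illustration only: generate letters and spaces.
instance Arbitrary MyString where
  arbitrary = MyString <$> listOf (elements (' ' : ['a' .. 'z']))

-- Placeholders for the two functions being compared (f1 and f2).
f1, f2 :: String -> String
f1 = reverse . reverse
f2 = id

-- First spelling: unwrap the newtype by pattern matching on the argument.
testString :: MyString -> Bool
testString (MyString s) = f1 s == f2 s

-- Second spelling: bind the wrapped value from arbitrary in the Gen monad.
testString' :: Gen Bool
testString' = do
  MyString s <- arbitrary
  return (f1 s == f2 s)
```

Both properties can be passed to quickCheck directly. Because the instance is attached to a locally defined type, it is not an orphan and cannot conflict with an instance for String defined elsewhere.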

Ivan Lazar Miljenovic wrote:
On 9 January 2011 07:20, Henning Thielemann
wrote: If there are multiple equally natural instances, then there should not be one at all.
It depends what you mean. In my case, I wanted the String instance for Arbitrary to follow certain rules (can't be the textual representation of a number, can't have certain characters in it, etc.).
This sounds like extra wishes that require a new type or some preprocessing within a quick check (e.g. filter out digits). You would not be happy with a private instance for Arbitrary String anyway, since different tests may have different requirements for the test strings (if not today, then maybe in future).
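The preprocessing approach suggested here might look like the sketch below. The function render is a made-up stand-in for the code under test, and stripDigits is a hypothetical helper. The point is only that the filtering lives inside one property, so other tests remain free to impose different requirements on their strings.

```haskell
import Data.Char (isDigit)
import Test.QuickCheck

-- Hypothetical preprocessing step: drop digits from whatever the
-- default String generator produced.
stripDigits :: String -> String
stripDigits = filter (not . isDigit)

-- Made-up stand-in for the code under test: wrap text in double quotes.
render :: String -> String
render s = "\"" ++ s ++ "\""

-- The property preprocesses its input locally, then checks that
-- removing the surrounding quotes gives back the original text.
prop_renderKeepsText :: String -> Bool
prop_renderKeepsText raw =
  let s = stripDigits raw
  in  init (tail (render s)) == s
```

An alternative with the same effect is to build the filtering into a generator and pass it to forAll. Preprocessing inside the property keeps shrinking on the raw input, whereas a custom generator makes the constraint visible in the property's type.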

On 07.01.2011 14:07, Stephen Tetley wrote:
On 7 January 2011 10:22, Christian Maeder
wrote: So is this a bad idea?
It sounds like a good idea to me.
There might be some details to thrash out, e.g. the granularity of packages - should there be a Quickcheck-orphans with Arbitrary instances for many structures, or separate packages (quickcheck-containers-instances, quickcheck-array-instances, ...) and
It would need to be fine-grained, potentially combining every class package with every data-type package, to avoid unnecessary dependencies. But quickcheck-containers-instances should (and naturally will) depend on quickcheck-array-instances (and re-export these instances) because containers happens to depend on array.
perhaps a new top-level Hackage category is needed so people know where to find them - "Canonical instances", "Orphan instances", ...?
Right, that must be decided by the community. It would contain many fairly small packages. Christian

Stephen Tetley wrote:
On 7 January 2011 10:22, Christian Maeder
wrote: So is this a bad idea?
It sounds like a good idea to me.
There might be some details to thrash out, e.g. the granularity of packages - should there be a Quickcheck-orphans with Arbitrary instances for many structures, or separate packages (quickcheck-containers-instances, quickcheck-array-instances, ...) and perhaps a new top-level Hackage category is needed so people know where to find them - "Canonical instances", "Orphan instances", ...?
Since instances are always canonical (since you cannot safely have multiple instances lying around), I vote for 'Orphan instances'. We should explain this e.g. in http://www.haskell.org/haskellwiki/How_to_write_a_Haskell_program

On Fri, Jan 7, 2011 at 11:22 AM, Christian Maeder
Hi,
referring to previous messages about QuickCheck and Johan's proposal about moving instances I wonder how many of us have created obviously orphaned "Arbitrary" instances for container data types?
Maybe indeed separate packages with (orphaned) instances only should be created for re-use. I haven't seen such packages on hackage, yet. And maybe such packages should be clearly recognizable.
So is this a bad idea?
Cheers Christian
I wonder if it might be feasible to add some infrastructural support for this, e.g. so-called adapter packages, which get installed automatically when the packages they adapt between are both present? (Or, thinking further, perhaps even automatic importation of adapter modules at the language level?) You'd have a combinatorial explosion in the number of packages/modules, but this is just a consequence of the combinatorial number of instances... (maybe framing it as 'optional' parts of packages rather than adapter packages would be neater?).
-- Work is punishment for failing to procrastinate effectively.

Gábor Lehel wrote:
I wonder if it might be feasible to add some infrastructural support for this, e.g. so-called adapter packages, which get installed automatically when the packages they adapt between are both present? (Or, thinking further, perhaps even automatic importation of adapter modules at the language level?)
This would be analogous to automatic import of instances at a package level. However the adapter package might import other heavyweight packages that someone may not want to have.

2011/1/9 Henning Thielemann
Gábor Lehel wrote:
I wonder if it might be feasible to add some infrastructural support for this, e.g. so-called adapter packages, which get installed automatically when the packages they adapt between are both present? (Or, thinking further, perhaps even automatic importation of adapter modules at the language level?)
This would be analogous to automatic import of instances at a package level.
However the adapter package might import other heavyweight packages that someone may not want to have.
Yes, that's a good analogy. Basically a way to ensure consistent instances not just in a single program, but across all of Hackage (modulo opt-out capabilities). Presumably the only dependencies for adapter packages would be the two packages they are adapting between in almost all cases? Either way, the criterion could be that it gets installed automatically if all of the dependencies are met. (Though to protect against mischief you'd need some safeguards like one of the authors of the adaptee packages having to approve it... or perhaps framing them as 'optional' parts of existing packages would be better? (Maybe that's already possible with cabal configurations / optional dependencies?)) If we had adapter modules at the language level with automatic importation and banned orphan instances outside of them, it would solve the import-unsafeness of OverlappingInstances, but instead would present a problem for people who don't want an instance imported not because they want their own, but simply to prevent accidental usage. (I.e. the various packages which have certain instances split out into separate Instances modules.)

Gábor Lehel wrote:
2011/1/9 Henning Thielemann
: This would be analogous to automatic import of instances at a package level.
However the adapter package might import other heavyweight packages that someone may not want to have.
Yes, that's a good analogy. Basically a way to ensure consistent instances not just in a single program, but across all of Hackage (modulo opt-out capabilities).
Presumably the only dependencies for adapter packages would be the two packages they are adapting between in almost all cases?
I am uncertain. Someone might choose to move an instance to an adapter package just because it needs a critical additional import. Btw. what is the opposite of an orphan instance?

On 09.01.2011 03:18, Gábor Lehel wrote:
2011/1/9 Henning Thielemann
: Gábor Lehel wrote:
I wonder if it might be feasible to add some infrastructural support for this, e.g. so-called adapter packages, which get installed automatically when the packages they adapt between are both present? (Or, thinking further, perhaps even automatic importation of adapter modules at the language level?)
This would be analogous to automatic import of instances at a package level.
However the adapter package might import other heavyweight packages that someone may not want to have.
Yes, that's a good analogy. Basically a way to ensure consistent instances not just in a single program, but across all of Hackage (modulo opt-out capabilities).
Yes, some infrastructural support is certainly needed to ensure at most one instance per type and class. There may be some dispute about which instance to add or if there should be no instance at all for certain combinations (which is hard to enforce). Changing an instance later could be a serious problem (by subtly breaking code). I'm not sure if imports should be automatic, though.
Presumably the only dependencies for adapter packages would be the two packages they are adapting between in almost all cases? Either way, the criterion could be that it gets installed automatically if all of the dependencies are met. (Though to protect against mischief you'd need some safeguards like one of the authors of the adaptee packages having to approve it... or perhaps framing them as 'optional' parts of existing packages would be better? (Maybe that's already possible with cabal configurations / optional dependencies?))
If we had adapter modules at the language level with automatic importation and banned orphan instances outside of them, it would solve the import-unsafeness of OverlappingInstances, but instead would present a problem for people who don't want an instance imported not because they want their own, but simply to prevent accidental usage.
Since instances are always re-exported it is not really possible to prevent an accidental usage (except by asking the compiler what is used). In case you're writing your own instances, you want to be warned if they might clash with instances on hackage. Cheers Christian

Chiming in to the discussion: I'm curious to know why people don't naturally reach for newtypes when an Arbitrary instance for some type isn't what they're looking for. Cheers, Edward

On Fri, Jan 7, 2011 at 11:22 AM, Christian Maeder
Hi,
referring to previous messages about QuickCheck and Johan's proposal about moving instances I wonder how many of us have created obviously orphaned "Arbitrary" instances for container data types?
Maybe indeed separate packages with (orphaned) instances only should be created for re-use. I haven't seen such packages on hackage, yet. And maybe such packages should be clearly recognizable.
So is this a bad idea?
It's often not possible to define instances separately, as you often need to access data constructors internal to the module that defines them. This is the case for the containers package. It's largely accidental that you can define e.g. an NFData instance for Data.Map outside the module, and the instances are usually less efficient if defined outside the module [1].
[1] You often end up converting the data type to another data type (e.g. lists) that exposes its data constructors and can thus be 'seq'ed.
Johan
participants (9): Christian Maeder, Edward Z. Yang, Gábor Lehel, Henning Thielemann, Ivan Lazar Miljenovic, Johan Tibell, Lennart Augustsson, Simon Marlow, Stephen Tetley