
I thought class 'Data' was in 'Data.Generics.Basics' because it provides generic access to 'data'-definitions. SYB's generic programming library code (strategies for queries and transformations) builds on that, and so does (one version of) Uniplate. Most other generic programming libraries are based on generic access to _type_ representations, the basics of which would more accurately appear somewhere in 'Types.Generics' - no conflict with SYB here.

One could move the actual generic libraries into 'Generics.*', but until there is an actual need for that, I'd prefer things to stay stable, with libraries building on generic data access in 'Data.Generics' and libraries building on generic type access appearing in 'Types.Generics'.

One could rename some of the SYB modules, e.g., 'Data.Generics.Schemes' -> 'Data.Generics.SybSchemes' and so forth, but as long as other 'data'-based libraries are not deprived of namespace there, and other 'type'-based libraries either don't provide general traversal schemes or live in 'Types.Generics', there is no immediate need for such renaming, beyond putting the modules in a 'syb' package, is there? (Note that 'Data.Typeable' is outside 'Data.Generics', though it is part of the basics that SYB depends on.)

Claus says that "half the instances of Data are controversial". Is that really right Claus? Isn't it just functions and IO?
As an ideal, I'd like 'Data.Generics.Instances.Dubious' to be empty - 'Data' instances should either be standard, or not exist at all (at least not in library code). What I did was simply to take anything that looked dubious and move it into a separate module, to facilitate further discussion and more control over imports. The discussion I had hoped for didn't happen, so that is still where my code stands, but I do hope it isn't the final state.

As for numbers, I currently have 32 instances in 'Data.Generics.Instances.Standard' and 11 instances in 'Data.Generics.Instances.Dubious' [1]. My initial split was mostly on the basis of 'gfold'/'gmapT' not traversing substructures, so some more of the 'Standard' instances are actually incomplete, and some of the 'Dubious' instances could possibly be declared "safe (with side conditions)", but then someone would still have to look at making the instances more complete/less certain to generate runtime errors.

The current 'Dubious' list has things like

- 'Ratio a': while values of type 'a' actually exist here, they are not meant to be visible in a concrete way, only via the abstract interface; and the abstract interface can support a 'data'-like view

- various 'Ptr a': here the 'a' is a phantom type, there are no objects of type 'a' to be traversed; but neither is there much 'data'-like about these pointers..

- 'b->a', 'IO a', 'ST s a', 'STM a': these are thoroughly un-'data'-like; though the instances could be improved to provide transformation access to the 'a' values, the same doesn't work for queries, and the '(->)b' context is completely out of range for 'Data'.

The current 'Standard' list has various instances that just bomb on some operations, including the 'Array a b' instance, which otherwise nicely demonstrates how to handle abstract types.

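To make the distinction concrete, here is a minimal sketch (not code from syb or from my instance modules) contrasting the two kinds of instance. 'SortedList', 'Callback', 'Widget', 'everywhereT', 'mkT' and 'negateWeights' are hypothetical names invented for this illustration; only the 'Data.Data' API from base is assumed. 'SortedList' gets a 'data'-like view through its abstract interface ('fromList'/'toList'), while 'Callback' wraps a function and can only be given a dummy instance: it is skipped by traversals, and some operations bomb.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
import Data.Data
import Data.List (sort)
import Data.Maybe (fromMaybe)

-- Hypothetical abstract type; invariant: the list is kept sorted.
newtype SortedList a = SortedList [a] deriving (Eq, Show)

fromList :: Ord a => [a] -> SortedList a
fromList = SortedList . sort

toList :: SortedList a -> [a]
toList (SortedList xs) = xs

-- 'data'-like view via the abstract interface: the only "constructor"
-- is the smart constructor 'fromList', its only child the element list.
instance (Data a, Ord a) => Data (SortedList a) where
  gfoldl k z s  = z fromList `k` toList s
  gunfold k z c = case constrIndex c of
                    1 -> k (z fromList)
                    _ -> error "gunfold: SortedList"
  toConstr _    = fromListConstr
  dataTypeOf _  = sortedListType

fromListConstr :: Constr
fromListConstr = mkConstr sortedListType "fromList" [] Prefix

sortedListType :: DataType
sortedListType = mkDataType "SortedList" [fromListConstr]

-- Hypothetical opaque type: nothing 'data'-like about a function.
newtype Callback = Callback (Int -> Int)

-- Dummy instance: no children are exposed, and two operations bomb.
instance Data Callback where
  gfoldl _ z c  = z c
  gunfold _ _ _ = error "gunfold: Callback"
  toConstr _    = error "toConstr: Callback"
  dataTypeOf _  = mkNoRepType "Callback"

-- 'deriving Data' needs a 'Data' instance for every field type,
-- which is the only reason the dummy instance above exists.
data Widget = Widget { weights :: SortedList Int, onClick :: Callback }
  deriving Data

-- Bottom-up transformation, written out here to avoid depending on syb.
everywhereT :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhereT f x = f (gmapT (everywhereT f) x)

-- Lift a type-specific transformation to a generic one.
mkT :: (Typeable a, Typeable b) => (b -> b) -> a -> a
mkT f = fromMaybe id (cast f)

-- Negates every Int reachable through 'Data'; the callback is untouched.
negateWeights :: Widget -> Widget
negateWeights = everywhereT (mkT (negate :: Int -> Int))
```

Because the 'SortedList' instance rebuilds values through 'fromList', the sortedness invariant survives the transformation. The 'Callback' instance lets the traversal pass by silently, but anything that calls 'toConstr' or 'gunfold' on it fails at runtime, and the instance is visible to every module that imports the type.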
Moving the more stable and standard 'Data' instances into base might not hinder development/debugging of the remaining instances, but right now, I don't think it will include sufficiently many instances to avoid dependencies on syb. As I explained in my previous email, the implicit presence of instances is itself a source of bugs, due to the propagation of instances in Haskell, not to mention ghc bug #2182.

Since very few of the current 'Data' instance importers actually need those imports (they just happen to be included if one imports 'Data.Generics'), I'd prefer to remove the implicit imports (and implied re-exports), making the remaining real dependencies explicit by depending on syb (again, see previous email).

I'd really like to see the real issues addressed before we start worrying about names, as this has turned out to be a rat's nest of bugs, including:

- incomplete 'Data' instances (operations that bomb now, but might be given better implementations)

- incompleteable 'Data' instances (operations that cannot be implemented, suggesting that these instances shouldn't exist)

- 'deriving Data' depending on 'Data' instances for everything, instead of skipping substructure types that cannot be handled anyway (smarter deriving could avoid dumb instances, by annotating types that should not be traversed instead of traversing these types via dummy instances that are then globally available/irreplaceable)

- unnecessary 'Data' instance import/export (Data.IntMap has absolutely no business bringing 'instance Data (IO a)' into scope)

- ghc sessions retaining instances (#2182), leading to build errors even in separate module hierarchies

- ghc listing "orphan instances" as a performance issue, re-emphasized recently by warnings turned into errors, which has led some to believe they are a design fault, rather than a representation of a valid design decision

- it doesn't help that Haskell doesn't support instance import/export control (yes, the instances are unnamed, but naming class, type, and module would seem sufficient to block instance imports/exports where they are not wanted)

Claus

[1] http://www.cs.kent.ac.uk/~cr3/toolbox/haskell/#syb-utils