
The issue is: SYB is being moved out of base into its own package. However, the Data class is, in a way, tied to base since it depends on the deriving mechanism.
My understanding is that the deriving mechanism would still work if class 'Data' was moved into 'syb', but changes in 'Data' would still need to be matched in the deriving mechanism (which isn't auto-generated from 'base', either). As long as 'syb' remains a core library, we can thus focus on assigning modules to 'syb' or 'base' by functionality.
Therefore, it was suggested that the entire Data.Generics.Basics module [2] should remain in base. This module defines the Data class and several associated functions and datatypes. I don't think anyone objected to this so far: please correct me if I'm wrong, or object now.
Assuming this is based on 'Data.Generics.Basics' and 'Data.Typeable' being of more general use than the rest of 'syb' (justifying a preferred dependency on 'base' rather than 'syb'), and not on any implementation constraints, I don't object in general. It does suggest a separate 'data-reflect' package for these two modules, but that could be left for later.

However, if 'Data' is in 'base', and the 'data' types are in 'base', then the 'Data' instances for those 'data' types should probably also be in 'base' (*) (the instance for 'Array a b' ought to move to 'array'). The short-term issue with this is that these instances, their location, and their importers need some revision, while 'base' wants to be stable. The hope was that splitting off 'syb' from 'base' would contain the changes in a package with a named maintainer, outside 'base'. Wouldn't it be easier to have all of 'Data' in 'syb', at least until 'Data' and 'Typeable' move into their own package? But if you can find a way to make the 'Data'-in-'base' route work, I'm not going to object.
Then it was also suggested that Data.Generics.Instances [3] could stay in base (perhaps inside Basics as well). This, however, would prevent dealing with the "dubious" Data instances [4], and this was one of the motivating factors to split SYB from base. This refers concretely to the instances:
Rearranging the list slightly, for easier reference:

    -- these have (or produce) substructures of type 'a', which aren't
    -- traversed by the current Data instances (contrary to what one
    -- would expect, say, from a generic 'fmap' over these types)
    instance (Data a, Data b) => Data (b -> a)
    instance Typeable a => Data (IO a)
    instance (Typeable s, Typeable a) => Data (ST s a)
    instance Typeable a => Data (STM a)
    instance Typeable a => Data (IORef a)
    instance Typeable a => Data (TVar a)
    instance Typeable a => Data (MVar a)

    -- here, the 'a' is a phantom type, without matching substructures
    instance Typeable a => Data (Ptr a)
    instance Typeable a => Data (StablePtr a)
    instance Typeable a => Data (ForeignPtr a)

    -- here, the 'a' corresponds to substructures that should only
    -- be visible through the abstract interface, on top of which a
    -- 'data'-like view can be provided
    instance (Data a, Integral a) => Data (Ratio a)
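To make the 'Ratio' case concrete, here is a minimal sketch, assuming a recent 'base' in which the 'Data' class and the 'Ratio' instance are exported from 'Data.Data'. The helper 'incInt' and the names 'half', 'ratioChildren' and 'bumped' are made up for this illustration. It suggests that a generic traversal sees numerator and denominator as ordinary substructures, rather than going through the abstract interface:

```haskell
import Data.Data (gmapQ, gmapT)
import Data.Maybe (fromMaybe)
import Data.Ratio (Ratio, (%))
import Data.Typeable (Typeable, cast)

-- Hypothetical helper: increment Ints, leave every other type alone.
incInt :: Typeable a => a -> a
incInt = fromMaybe id (cast ((+ 1) :: Int -> Int))

half :: Ratio Int
half = 1 % 2

-- Number of immediate substructures the Data instance exposes.
ratioChildren :: Int
ratioChildren = length (gmapQ (const ()) half)

-- One-layer transformation: both components are visited.
bumped :: Ratio Int
bumped = gmapT incInt half
```

Under these assumptions, 'ratioChildren' should be 2 and 'bumped' should equal 2 % 3, i.e. the transformation was applied to both components individually.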
In addition, a longer list of instances offers only runtime errors for some 'Data' operations (most notably for 'gunfold', though abstract types in general have a problem with reflection support). Are these necessary or would they profit from closer investigation? If the latter, those instances should probably not be in 'base'.
These instances are defined in such a way that they do not traverse the datatype. In fact, there is no other possible implementation, and this implementation at least allows for datatypes which contain both "regular" and "dubious" elements to still have their "regular" elements traversed.
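A minimal sketch of that behaviour, using only 'base': the types 'Hook' and 'Config' and the helpers 'incInt', 'everywhereT' and 'runHook' are hypothetical, and the hand-written instance only imitates the style of the instances under discussion; it is not copied from them. 'everywhereT' has the same shape as syb's 'everywhere', redefined here to avoid depending on 'syb'.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
import Data.Data (Data (..), mkNoRepType)
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)

-- Hypothetical stand-in for a "dubious" type: it wraps a function.
newtype Hook = Hook (Int -> Int)

-- Non-traversing instance: no children are exposed, and reflection
-- on constructors fails at runtime.
instance Data Hook where
  gfoldl _ z h  = z h
  gunfold _ _ _ = error "gunfold: Hook is opaque"
  toConstr _    = error "toConstr: Hook is opaque"
  dataTypeOf _  = mkNoRepType "Hook"

-- A type mixing a "regular" field with a "dubious" one.
data Config = Config { retries :: Int, hook :: Hook }
  deriving (Data)

-- Hypothetical helper: increment Ints, leave every other type alone.
incInt :: Typeable a => a -> a
incInt = fromMaybe id (cast ((+ 1) :: Int -> Int))

-- Bottom-up traversal: transform the children first, then the node.
everywhereT :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhereT f x = f (gmapT (everywhereT f) x)

runHook :: Hook -> Int -> Int
runHook (Hook g) = g

result :: Config
result = everywhereT incInt (Config 3 (Hook (* 2)))
```

Here the traversal should reach the 'retries' field (giving 4) while passing the 'Hook' through untouched, so 'runHook (hook result) 5' should still be 10. Calling 'toConstr' or 'gunfold' on a 'Hook' would raise a runtime error, which is the reflection problem mentioned above.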
Well, there are alternative instances that would at least improve traversal support [3], but that wouldn't work for queries, I think.
However, this implies that a user cannot redefine such instances even in the case where s/he knows extra information about these types that would allow for a more useful instance definition.
Indeed, the implicit presence of these instances is the main issue, and reducing their presence and propagation affects 'base' and other core and extra libraries, so it needs to happen soon.
Claus, please correct me if I'm wrong, but if the 11 "dubious" instances (or perhaps fewer, given your message in [5]) go in the syb package and the remaining, "standard" ones stay in base, we:

- Maintain backwards compatibility regarding SYB in 6.10, and
- Can still deal with the issue by releasing a new version of the syb package later, independently of GHC.
issues to consider, off the top of my head:

- to what extent can core libraries be updated independently of 'base'?
- unless 'base' can now be updated (there are two versions of 'base' in ghc head), 'base' must not depend on 'syb'
- which other core libraries depend on 'syb'? are they updateable?
- the current importers of (parts of) 'Data.Generics' need to be revised [1]
- instances cannot be deprecated
- since all instances are in one module, one could deprecate the module, but are module deprecations propagated to their importers automatically?
- would 'Data.Generics' need to be deprecated, in favour of a version that does not implicitly re-export any/all instances? [2]

Maintaining strict backwards-compatibility in 6.10 while still allowing for changes in 'syb' is going to be difficult, if only because clients might depend on 'Data.IntSet' and the like to re-export all current 'Data' instances, which we certainly want to stop. My 'syb-utils' [2] has alternatives to 'Data.Generics' that export either only standard instances or no instances. That would allow deprecating all 'Data.Generics*' modules that are less specific about their instance exports, but would require the use of alternative module names.
Since the deadline for 6.10 is approaching I'm assuming that we should try to minimize the changes there, while keeping future development in the syb package as open as possible.
Definitely. But some choices need to be made now. Mainly what goes where, how to handle deprecation, and how to reduce implicit instance propagation.

Claus

[1] http://article.gmane.org/gmane.comp.lang.haskell.libraries/9957
[2] http://www.cs.kent.ac.uk/~cr3/toolbox/haskell/#syb-utils
[3] http://www.haskell.org/pipermail/libraries/2008-July/010319.html

(*) this isn't a firm rule, either: recently, it was decided to keep the 'Data' instances for 'ghc' types out of 'ghc'.