Syb Renovations? Issues with Data.Generics

Calling all Syb/Data.Generics users!-)

I keep running into problems with Data.Generics, mostly because I actually want to use it (no claims that it is the best or final solution, or that other approaches aren't equally in need of support, just that it is the best-supported working approach right now). Some tricky issues are (sometimes against published expectations) solvable, suggesting useful additions to the library, but some seemingly trivial things have me stumped, suggesting (to me, at least;-) a need for improvements either in the library or in its documentation.

Part of the reason I'm interested in this now is that Data/Typeable instances seem likely (I hope:-) to be added to the GHC Api, where Thomas Schilling is working on improvements

    http://hackage.haskell.org/trac/ghc/wiki/GhcApiStatus
    http://hackage.haskell.org/trac/ghc/wiki/GhcApiAstTraversals

Also, the old question of porting HaRe to the GHC Api is currently being looked into again, by Chaddaï Fouché, and crucially depends on Syb's generic traversals.

As it is still holiday season, it is a bit early for proposal deadlines, but I'd like to start a discussion of Syb/Data.Generics and collect the issues and solutions arising, in the hope of following up with concrete proposals for improvements.

To start the discussion, a simple item:

1. inconvenient convenience instances of Data for non-"data" types

Data.Generics.Instances defines instances of Data for many types, including some abstract types that don't really fit into the concrete value based model of Data, like 'IO a' and 'a->b'.
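To make this concrete, here is a minimal sketch of how such an instance shows up in a traversal. The type Rec, its fields, and the names original, rewritten and visibleInts are made up for illustration; the sketch assumes the syb package, with the function instance coming from Data.Generics.Instances.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
module Main where

import Data.Generics              -- from the syb package
import Data.Generics.Instances () -- the convenience instances, including 'a->b'

-- Hypothetical record mixing ordinary data with a function field.
-- Deriving Data only goes through because an instance for
-- 'Int -> Int' is in scope.
data Rec = Rec { count :: Int, step :: Int -> Int }
  deriving (Data, Typeable)

original :: Rec
original = Rec { count = 1, step = (+ 1) }

-- Multiply every Int the traversal can reach by 10.
rewritten :: Rec
rewritten = everywhere (mkT ((* 10) :: Int -> Int)) original

-- Collect every Int a query can reach.
visibleInts :: [Int]
visibleInts = listify (const True :: Int -> Bool) original

main :: IO ()
main = do
  print (count rewritten)   -- the Int field has been rewritten
  print (step rewritten 1)  -- the function field passes through untouched
  print visibleInts         -- only the Int field is found
```

The traversal silently treats the function field as a leaf with no children, so nothing goes wrong here; it is only methods like toConstr or gunfold on such a field that would fail at runtime.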
Those instances give runtime errors for some class methods, and mainly offer faked (no-op) gmap traversals, serving as a convenience/enabler for 'deriving instance Data':

    http://www.haskell.org/pipermail/generics/2008-June/000346.html

A list of the odd instances in Data.Generics.Instances, with examples of their oddities, can be found here:

    http://www.haskell.org/pipermail/generics/2008-June/000347.html

My suggestion is to split this module into two, and stop the implicit import/export of the incomplete instances from Data.Generics. Reactions to this suggestion have been muted so far (Simon PJ was as surprised as I was about the existence of these instances, but has no strong opinion about the issue, Alexey Rodriguez supports the suggestion, Ian Lynagh points out the difficulty of transition), which is one reason why I'll try to move the discussion to libraries@.

Pro:

- the instances are still available, and only one explicit import away, so 'deriving instance Data' for types containing uninteresting functions is still convenient

- the problematic instances are no longer implicitly imported, so applications that don't want these instances can now avoid them completely, or define their own instances

- these convenience instances are not just inconvenient for some applications, due to the way instances are handled in Haskell; they actually violate some "natural" invariants like "everything queries every substructure of the specified type", "everywhere applies a transformation at every substructure of matching type"

- the situation is similar to Text.Show.Functions, as the convenience instances don't provide the full expected functionality, just barely enough for deriving to get by

Cons:

- due to the implicit import and use of these instances, there is no obvious transition scheme; it seems that the least painful process would be to make the change without transition/deprecation period and to document the explicit import option [it would be useful to have a way of
deprecating instance imports, so that any deriving scheme depending on imports from a deprecated location would trigger a warning, in this case suggesting the new import location]

As I said, I'd like to wait until at least the Syb authors are back from holidays before setting any proposal deadlines, but I'd like to invite feedback from Syb users on this and other Syb issues.

Here is a preview of other items I'd like to raise later on, please add your own:

2. Data.Generics.Utils

Since Data/Typeable are compiler-derivable (in GHC) while other classes like Functor/Traversable/etc are not, it would be useful if generic instances for those other classes could be defined in terms of Data/Typeable. The Uniplate library already does this for its own classes via Data.Generics.PlateData, and it appears that at least Functor is definable as well (code exists, proof is only informal at this stage, and those invariant violations and runtime errors in the implicitly imported dummy instances from (1) really get in the way):

    http://www.haskell.org/pipermail/generics/2008-June/000343.html
    http://www.haskell.org/pipermail/generics/2008-July/000349.html
    http://www.haskell.org/pipermail/generics/2008-July/000351.html

What other classes can be defined in this way? Traversable (traverse) seems very nearly possible, what else?

3. Performance

Naive use of Syb traversal schemes can lead to huge performance losses.
Experienced users tend to write their own traversal schemes, using Syb's low-level Api directly, but we can take inspiration from some Uniplate/PlateData optimization techniques and generalise them for use with Syb's high-level traversal scheme Api, yielding similar performance gains for everywhere/everything:

    http://www.haskell.org/pipermail/generics/2008-July/000353.html

Another direction that might be worth exploring is to use Maps instead of nested generic extensions to define adhoc-overloaded transformations and queries (I've actually started playing with that, but am currently stuck on GHC ticket #2463).

4. Usability

There is probably nothing one can do to make the types of Syb's low-level Api less of a brain hazard, but not all of the stumbling blocks seem to be necessary consequences of the carefully crafted edifice of interactions between nearly polymorphic types, runtime type checks and type reflection.

Examples:

- there doesn't seem to be a way to get hold of a type's constructors, only of constructor representations, structure scaffolds, and structure generators

- the actual domain on which a transformation/query acts is hidden behind the near-polymorphic default type of generic extensions

- I can't seem to figure out how to use typeOf1, when the other Syb operations only give me 'forall a . Data a => a'; instead, I seem to be forced to use something like:

    [ mkTyConApp tyCon (init tyArgs) | not (null tyArgs) ]
      where (tyCon,tyArgs) = splitTyConApp typeRep

- others?

What are your personal gripes with Syb/Data/Typeable, and for which of them do you see a chance of addressing them by changing/adding code?

Claus
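
PS. To illustrate the first usability example, here is a small sketch of what one does get today, using only Data.Data from base. The helper names constrNames and nullaryValues are my own, not library functions, and the second helper is only sensible for enumeration-like types.

```haskell
module Main where

import Data.Data  -- in base; also re-exported by Data.Generics

-- Names of all constructors of the argument's type. The argument only
-- selects the type; derived instances do not inspect it. This works for
-- algebraic data types only (not for primitive types such as Int).
constrNames :: Data a => a -> [String]
constrNames = map showConstr . dataTypeConstrs . dataTypeOf

-- Rebuild values from constructor representations. For nullary
-- constructors this yields the actual values; for constructors with
-- fields, fromConstr would only give a scaffold whose children are
-- error thunks, not the constructor function itself.
nullaryValues :: Data a => a -> [a]
nullaryValues proxy = map fromConstr (dataTypeConstrs (dataTypeOf proxy))

main :: IO ()
main = do
  print (constrNames (Nothing :: Maybe Int))
  print (nullaryValues False)
```

So the representations and generators are there, but nothing here hands back 'Just' as a function of type 'a -> Maybe a'.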
Claus Reinke