
Dear Generic folk
[I'm spamming libraries@haskell.org too, in case anyone interested in generics is not on generics@haskell.org.]
As you know, Claus has offered a somewhat-detailed proposal for changes to the SYB library (below). But I don't think that we have an active maintainer for any of the generic-programming libraries (esp SYB) apart from Uniplate. Then there's the related question of what generic-programming technology to promote for clients of the GHC API.
The obvious candidates are Claus himself, or Alexey Rodriguez, or Thomas Schilling; but perhaps there are others too? Maybe no one has stepped forward because you all think that I'm on the job! But I'm not... I'm busy with GHC itself, and would love a maintainer for SYB and associated gubbins. I fear that otherwise we may lose the benefits of Claus's homework.
Simon
| -----Original Message-----
| From: generics-bounces@haskell.org [mailto:generics-bounces@haskell.org] On Behalf Of Claus Reinke
| Sent: 21 July 2008 14:16
| To: generics@haskell.org
| Subject: [Hs-Generics] Data.Generics with GPS (using Maps to avoid getting lost in Data)
|
| summary: speed up Syb with Uniplate-inspired techniques
|
| == Performance issues in Syb ==
|
| As it stands, classic Syb, while well-supported in GHC isn't exactly the
| fastest generic programming option. In fact, it routinely seems to come
| out last in performance comparisons, sometimes by a substantial factor.
|
| That isn't always a problem in practice, because the gains in
| expressiveness/conciseness/maintainability outweigh the loss in
| performance, because traversal performance is not the bottleneck, or
| because practical use often involves hand-tuned traversal schemes
| instead of the example schemes typically used in benchmarks.
|
| Be that as it may, performance is a consideration, negative results have
| been published, and it seems that performance of experimental code has
| led to Syb being abandoned in at least one project. While it would be
| unrealistic to expect Syb traversals to compete with hand-written ones
| with current compilers, there are several obvious areas where Syb
| traversal performance could be subjected to improvements:
|
| (a) traversing irrelevant parts of the structure (because everything is
|     treated generically)
| (b) combining results in simple, but inefficient ways ((++) nesting with
|     the structure)
| (c) repeated runtime type checks to determine which function in a
|     generically extended transformation/query to apply
|
| While runtime type checks are inherent in SYB (and alternatives have
| been proposed that claim to avoid them), all of a/b/c can be addressed
| to some extent in defining tuned traversals using the basic library.
| There tends to be a trade-off for a vs c, as avoiding senseless generic
| traversals of specific substructures implies additional runtime type
| checks to identify those substructures [...rest snipped...]
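[Editorial illustration of point (b) in the quoted list. This is not code from Claus's proposal, only a minimal sketch. It assumes the `syb` package provides `Data.Generics`; the names `intsAppend`, `intsCompose` and `sample` are made up for the example. The first query combines child results with `(++)`, which nests the appends along the structure; the second combines list-building functions with `(.)` and builds the list once at the end, which may avoid some repeated copying.]

```haskell
import Data.Generics (Data, everything, mkQ)

-- Naive query: collect every Int, combining child results with (++).
intsAppend :: Data a => a -> [Int]
intsAppend = everything (++) ([] `mkQ` (\n -> [n :: Int]))

-- Same query, but each node contributes a function [Int] -> [Int]
-- (a difference list); functions are composed during the traversal
-- and the result list is only built when applied to [] at the end.
intsCompose :: Data a => a -> [Int]
intsCompose x = everything (.) (id `mkQ` ((:) :: Int -> [Int] -> [Int])) x []

-- Hypothetical test value: Ints mixed with Strings.
sample :: [(Int, String)]
sample = [(1, "a"), (2, "b")]
```

[Both queries visit the same nodes in the same order, so `intsAppend sample` and `intsCompose sample` should agree; only the way results are combined differs. Points (a) and (c) are untouched by this sketch.]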

[I'm spamming libraries@haskell.org too, in case anyone interested in generics is not on generics@haskell.org.]
Given the low number of responders on generics@, it may well be easier to continue on libraries@, cc-ing anyone on generics@ who isn't on libraries@.
As you know, Claus has offered a somewhat-detailed proposal for changes to the SYB library (below). But I don't think that we have an active maintainer for any of the generic-programming libraries (esp SYB) apart from Uniplate. Then there's the related question of what generic-programming technology to promote for clients of the GHC API.
Thanks for raising this, Simon. I've actually been holding an email summarizing several issues (not just performance of default traversal schemes) that I'd like to see addressed in Syb (holding because the Syb authors were/are away, and because my performance improvement experiments are currently stuck on a GHC optimization issue). I'll send that email separately now.
The obvious candidates are Claus himself, or Alexey Rodriguez, or Thomas Schilling; but perhaps there are others too? Maybe no one has stepped forward because you all think that I'm on the job! But I'm not... I'm busy with GHC itself, and would love a maintainer for SYB and associated gubbins. I fear that otherwise we may lose the benefits of Claus's homework.
I'm quite willing to continue pursuing the issues I've raised until I can make concrete suggestions for improving Syb, including summing up the code changes I've been adding to my various messages (certainly there should be patches to accompany proposal tickets in the library process, and I should collect all the strands of text into a single document).
I have been waiting for the original Syb authors to return from their well-earned summer camps, but there should probably be a Wiki page somewhere specifically for discussing Syb-related issues and solutions (meanwhile, I've started collecting links/info related to GHC Api type traversals here, including the main Syb issues: http://hackage.haskell.org/trac/ghc/wiki/GhcApiAstTraversals , please feel free to copy stuff from there to a Syb-specific page).
But I wouldn't want to take on ownership of Syb at this point, for two reasons, both motivational:
(a) it helps to have someone else to "blame" when the consequences of gfoldl's type once again hurt my brain;-),
(b) it is really frustrating to get so little interest in these issues, well, we haven't even managed to start a proper discussion on any of the lists I've tried, and as long as there is a Syb owner other than myself, at least I won't be talking entirely to myself!-)
Claus

Hi
As you know, Claus has offered a somewhat-detailed proposal for changes to the SYB library (below). But I don't think that we have an active maintainer for any of the generic-programming libraries (esp SYB) apart from Uniplate. Then there's the related question of what generic-programming technology to promote for clients of the GHC API.
Thanks for raising this, Simon. I've actually been holding an email summarizing several issues (not just performance of default traversal schemes) that I'd like to see addressed in Syb (holding because the Syb authors were/are away, and because my performance improvement experiments are currently stuck on a GHC optimization issue). I'll send that email separately now.
I think SYB would best be maintained by someone who does not already maintain some kind of boilerplate removal library. There are lots of experiments into GADTs and other mechanisms, but SYB1+2 is a very useful design point - and one that should be preserved. I think the maintainer should do three things in addition to general release management:
1) Speed improvements, if possible
2) API tweaks, maybe a few extra functions (universe equivalent would be nice)
3) Make it work with Hugs - I've always been surprised that SYB doesn't work with Hugs, and I don't think it's that much work.
As a result of these points, I think Claus is probably the perfect person to take over as maintainer.
(a) it helps to have someone else to "blame" when the consequences of gfoldl's type once again hurt my brain;-),
You can still blame those that went before - I don't think an SYB maintainer should be changing the type of gfoldl - it's too fundamental.
(b) it is really frustrating to get so little interest in these issues, well, we haven't even managed to start a proper discussion on any of the lists
I am interested. I have starred your emails, and will respond in the next few days. I've been in a tent without electricity for the last few days!
Thanks
Neil

Hi Neil,
personally, I think that historical preservation of a reference approach is quite a different issue (hosting a copy of the Syb page at haskell.org, and a copy of the library on hackage, would be a start), which might also need a maintainer at some point, but Syb1+2 is part of base right now, and available from Ralf's old pages.
So far, it does seem as if most of the items I'm interested in can be done on top of Data.Generics, simply bypassing the higher-level API and providing an alternative, but very close, higher-level API on top of the same low-level API (though I sometimes wonder whether gfoldl's second parameter should be generic rather than polymorphic).
But what it comes down to is that I'd like to start with a working, well supported approach (Syb 1+2), and see whether we can close any of the known gaps without starting from scratch with yet-another-generics-library. And if that means morphing what is there into something even more useful, by taking inspirations from Syb 4 or Uniplate or .., I'm all for it, as long as it is a continuous evolution supported by evidence, not a heart-liver-and-lung transplant supported by hope.
1) Speed improvements, if possible
What I'm working on is mostly more convenient access to better performance in the higher-level API (traversal schemes), reducing the need for hand-tuned traversals using the low-level API directly. The part inspired by Uniplate's PlateData seems to be working, the part about replacing nested typecases with Map lookup is currently buried in other effects.
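[Editorial illustration of the "nested typecases vs Map lookup" idea. Claus's actual code is not shown in this thread, so the following is only a guess at the general shape, under stated assumptions: the `syb` package provides `Data.Generics`, and `handlers`, `lookupQ`, `describeChain`, `describeMap` and the `show*` helpers are hypothetical names. A chain of `extQ` tries one runtime type test per case in turn; the table version does a single lookup keyed on the value's `TypeRep`, followed by one `fromDynamic` check.]

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.Generics (GenericQ, everything, extQ)
import qualified Data.Map as Map
import Data.Typeable (TypeRep, Typeable, typeOf)

showInt :: Int -> [String]
showInt n = ["Int " ++ show n]

showBool :: Bool -> [String]
showBool b = ["Bool " ++ show b]

showChr :: Char -> [String]
showChr c = ["Char " ++ show c]

-- Nested typecase: every extQ adds one more type test to the chain.
describeChain :: GenericQ [String]
describeChain = const [] `extQ` showInt `extQ` showBool `extQ` showChr

-- Hypothetical handler table: one entry per type of interest.
handlers :: Map.Map TypeRep Dynamic
handlers = Map.fromList
  [ (typeOf (undefined :: Int),  toDyn showInt)
  , (typeOf (undefined :: Bool), toDyn showBool)
  , (typeOf (undefined :: Char), toDyn showChr)
  ]

-- Look up a handler by the argument's TypeRep; fall back to a default
-- when there is no entry (or the entry has an unexpected type).
lookupQ :: forall a r. (Typeable a, Typeable r)
        => Map.Map TypeRep Dynamic -> r -> a -> r
lookupQ table def x =
  case Map.lookup (typeOf x) table >>= fromDynamic of
    Just (f :: a -> r) -> f x
    Nothing            -> def

describeMap :: GenericQ [String]
describeMap = lookupQ handlers []

-- Hypothetical test value.
sample :: (Int, Bool, Char)
sample = (1, True, 'x')
```

[Both `everything (++) describeChain sample` and `everything (++) describeMap sample` should produce the same list. Whether the table is actually faster depends on how many cases there are and on what the compiler does with either form; this sketch makes no performance claim.]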
2) API tweaks, maybe a few extra functions (universe equivalent would be nice)
It seems we've got fmap and traverse definable in terms of Data/Typeable, so one could derive the latter two, then get the former for free, so to speak. But should these be in some Data.Generics.Utils, or should they move into Data.Traversable, etc (which already has some default functions for defining instances of one class in terms of another)?
And if you mean
    {-# LANGUAGE ScopedTypeVariables #-}
    {-# LANGUAGE RankNTypes #-}
    import Data.Generics

    universe :: forall a . Data a => a -> [a]
    universe = everything (++) ([] `mkQ` child)
      where child = return :: a -> [a]
then we're probably just talking about better performance from naive definitions (1) again, or is anything else wrong with that definition?
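[Editorial illustration: a small usage sketch for the `universe` definition quoted above, assuming the `syb` package provides `Data.Generics`. The `Expr` type, `literals` and `sample` are hypothetical and exist only for this example.]

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Generics (Data, Typeable, everything, mkQ)

-- Hypothetical expression type, used only for illustration.
data Expr = Lit Int | Add Expr Expr | Neg Expr
  deriving (Show, Eq, Data, Typeable)

-- The definition from the message: every subterm of the same type as
-- the argument, including the argument itself.
universe :: forall a. Data a => a -> [a]
universe = everything (++) ([] `mkQ` child)
  where
    child = return :: a -> [a]

-- Collect all literals by filtering the universe.
literals :: Expr -> [Int]
literals e = [n | Lit n <- universe e]

sample :: Expr
sample = Add (Lit 1) (Neg (Lit 2))
```

[Here `universe sample` lists the whole term first, then `Lit 1`, `Neg (Lit 2)` and `Lit 2`, so `literals sample` picks out the two literals in traversal order.]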
3) Make it work with Hugs - I've always been surprised that SYB doesn't work with Hugs, and I don't think it's that much work.
Hmm, I'm still fond of Hugs, but I haven't used it much recently, so I'd be the wrong person for that job. On casual glance, I can't even think of any Syb-essential language features that aren't supported in Hugs (apart from deriving Data/Typeable), and my old WinHugs doesn't seem to have/support a cpp, which gets in the way of just loading the code - why doesn't Syb work with Hugs?
4) Improve usability
Things like 'typeOf1' not working with 'Data a => a', or 'gzipWithT' giving fun type errors unless we eta-expand its first parameter, Syb as a general test-bed for the quality of type error messages, etc. Here, I'm not thinking of immediate cure-alls, but of collecting the various issues, creating tickets and/or a Wiki page and looking for ways out, step by step.
One item I haven't mentioned yet: can't we replace the gensym-based TypeRepKeys with something more systematic/standardised? I've wondered about this on previous occasions, to make TypeRepKeys more portable, but it would also be nice just to get rid of that IO tag (reminding us that the keys may change with each program run).
Thanks for your confidence, but I'll probably just collect feedback here, contribute my code/docs when I've got everything together (would a separate syb-utils package be preferred, or direct changes to base?) and move on. I look forward to your comments, though, when you get out of that tent!-)
Claus
participants (3)
- Claus Reinke
- Neil Mitchell
- Simon Peyton-Jones