
On Tue, Jul 29, 2008 at 08:27:00PM +0100, Claus Reinke wrote:
>>>> My suggestion is to split this module into two, and stop the implicit import/export of the incomplete instances from Data.Generics.
>>> I don't think that this is a good idea.
>> Could you please elaborate on your reasons?
> That's what the rest of that e-mail was supposed to be.

You explained why the change would not give as much flexibility as one might think, or at least not as easily, but you didn't explain why you think it is a bad idea to gain at least the flexibility to choose between instance and no instance for the problematic cases.

>> True. But the current situation is even worse: we either get all instances (good and bad) or none.
> I think you should always get all:

That would be fine if all instances were completely defined. They aren't. So the partial instances are imposed globally and irrevocably, and instead of compile-time type errors, we get runtime errors.
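
To make the compile-time versus runtime point concrete, here is a minimal sketch. It assumes a recent GHC where base itself ships no Data instance for functions, so the incomplete instance is imitated on a hypothetical newtype Fn. The names Fn, Config, childConstrs and probe are illustrative only, and the instance body is a stand-in in the spirit of the dummies, not the text of Data.Generics.Instances.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
module Main where

import Control.Exception (ErrorCall, evaluate, try)
import Data.Data

-- Hypothetical stand-in for a function-typed component.
newtype Fn = Fn (Int -> Int)

-- Deliberately incomplete instance (an assumption made for illustration):
-- it exists, so deriving below is accepted, but two methods just call error.
instance Data Fn where
  gfoldl _ z x  = z x                   -- treat the value as having no children
  gunfold _ _ _ = error "gunfold: Fn"
  toConstr _    = error "toConstr: Fn"
  dataTypeOf _  = mkNoRepType "Main.Fn"

-- Accepted by the type checker only because some Data Fn instance is in scope.
data Config = Config Int Fn deriving Data

-- A small generic query: the constructor name of each immediate child.
childConstrs :: Data a => a -> [String]
childConstrs = gmapQ (showConstr . toConstr)

-- Force the whole result and report whether an error call was hit.
probe :: IO (Either ErrorCall Int)
probe = try (evaluate (length (concat (childConstrs (Config 1 (Fn (+ 1)))))))

main :: IO ()
main = do
  r <- probe
  case r of
    Left e  -> putStrLn ("runtime failure: " ++ show e)
    Right n -> print n
```

The program type-checks, and the problem only surfaces when the generic query reaches the Fn component. If the instance for Fn is deleted, the deriving clause on Config is rejected at compile time instead.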

> By the way, is there something somewhere describing the alternate instance that you want to define?

That is the whole point, isn't it? The Data framework isn't designed to cope with things like (a->b) or (IO a), so there are no good instances one could define for these types (if anyone can suggest better instances, please do!-). Hence the incomplete instances mixed in with the standard ones in Data.Generics.Instances. Mostly, I don't want those instances at all (the incomplete ones), so that the typechecker will complain if I try to use Data on something it can't really handle.

Scenario 1: We want to use deriving Data on types that have components of types (a->b) or (IO a), but we don't care about those components. This is the case for which the incomplete instances are provided.

Scenario 2: Any attempt to use Data (a->b) or Data (IO a) indicates an error. If we want to derive Data for complex structures containing those types, we need to define Data instances for the immediately enclosing structures, or wrap those types in newtypes and define Data instances for those. This is the case for which the incomplete instances get in the way.

Scenario 3: We want to use deriving Data on types that have components of types (a->b) or (IO a), and we do care about what happens in those components. It is surprisingly tricky to come up with sensible Data instances for these types that do anything more than the current dummies, so this scenario isn't as likely as I thought at first.

Scenario 4: We want to handle Data for (a->b) or (IO a) differently, depending on context. Unless we can wrap those types in newtypes, this is very nearly impossible, due to the way instances propagate through projects.

The status quo supports only (1), and gives a mixture of runtime errors and wrong results for (2). In particular, the type checker does not help us to find the cases we need to cover to keep our programs from "going wrong". With selective import, we can support (1), or get compile-time errors consistently for (2).

We cannot usually support both (1) and (2) in one program, but splitting the Instances module so that we can be more selective in imports seems a worthwhile improvement.

Claus

PS. The situation is not improved by the current reexports of Data.Generics.Instances from unexpected places. I have a package splitting Data.Generics.Instances into Standard and Dubious, and a Data.Generics.Alt that only reexports Standard instances. But as soon as I use this with, eg, an IntMap, I get duplicate instance errors.

A quick grep shows that the following re-export all instances (sometimes deliberately, sometimes accidentally, by importing Data.Generics for other reasons):

    libraries/array/Data/Array.hs
    -- libraries/base/Data/Generics/Instances.hs
    libraries/base/Data/Generics.hs
    libraries/bytestring/Data/ByteString/Internal.hs
    libraries/bytestring/Data/ByteString/Lazy/Internal.hs
    libraries/containers/Data/IntMap.hs
    libraries/containers/Data/IntSet.hs
    libraries/containers/Data/Map.hs
    libraries/containers/Data/Set.hs
    libraries/containers/Data/Tree.hs
    libraries/haskell-src/Language/Haskell/Syntax.hs
    libraries/network/Network/URI.hs
    libraries/packedstring/Data/PackedString.hs
    libraries/template-haskell/Language/Haskell/TH/Quote.hs
    libraries/template-haskell/Language/Haskell/TH/Syntax.hs

As far as I can see, none of these depends on the incomplete instances, so these instances get re-exported by accident. If Data.Generics.Instances was split into Data.Generics.Instances.Standard and Data.Generics.Instances.Dubious, and if Data.Generics.Alt only reexported the former, those modules could be more selective in their imports and the leaking of instances could be avoided.
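
For illustration, the newtype route mentioned under Scenario 2 might look roughly like the sketch below. It assumes no Data instance for (a->b) is in scope, which is the case with only base's Data.Data imported on a recent GHC. Opaque, Handler, ints and sample are hypothetical names chosen for this example and are not part of any library.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
module Main where

import Data.Data

-- Hypothetical wrapper marking a component that generic code must not enter.
newtype Opaque a = Opaque a

-- Hand-written instance for the wrapper only: a node with no children.
instance Typeable a => Data (Opaque a) where
  gfoldl _ z x  = z x
  -- A wrapped value cannot be rebuilt from nothing, so this stays partial;
  -- the partiality is now local to a type we chose to wrap.
  gunfold _ _ _ = error "gunfold: Opaque values cannot be constructed generically"
  toConstr _    = opaqueConstr
  dataTypeOf _  = opaqueDataType

opaqueConstr :: Constr
opaqueConstr = mkConstr opaqueDataType "Opaque" [] Prefix

opaqueDataType :: DataType
opaqueDataType = mkDataType "Main.Opaque" [opaqueConstr]

-- Deriving works because the function is wrapped. With a bare (Int -> Int)
-- field and no instance for functions in scope, this would be a type error.
data Handler = Handler
  { name     :: String
  , priority :: Int
  , action   :: Opaque (Int -> Int)
  } deriving Data

-- Collect every Int reachable through the Data instances.
ints :: Data a => a -> [Int]
ints x = maybe [] pure (cast x) ++ concat (gmapQ ints x)

sample :: Handler
sample = Handler "log" 3 (Opaque (+ 1))

main :: IO ()
main = do
  print (ints sample)
  putStrLn (showConstr (toConstr (action sample)))
```

The design choice here is that the decision to skip a component is visible in the type of the enclosing structure, rather than being imposed on every use of (a->b) throughout a project.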