[GHC] #13065: Prohibit user-defined Generic and Generic1 instances

#13065: Prohibit user-defined Generic and Generic1 instances -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature | Status: new request | Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.0.1 Keywords: Generics | Operating System: Unknown/Multiple Architecture: | Type of failure: Other Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- User-defined `Generic` and `Generic1` instances are problematic. === They are susceptible to breakage === Some details of the classes may change between GHC versions, and indeed have done so in the past. User-defined instances are likely to break in the face of various such "internal" changes. This is one reason why `Data.Sequence`, for example, does not have a `Generic` instance. === They require potentially-expensive consistency checks === GHC cannot assume that every type has at most one `Generic` and `Generic1` instance, so it needs to look for possible alternatives at instance resolution time. According to Simon (and maybe also Simon), this may be partly responsible for the performance regressions seen in Phab:D2899. === Downsides === Prohibiting user-defined instances does have some costs. Suppose a type was originally defined concretely, exposing its constructors and a `Generic` instance. The implementer may decide later to make the type abstract, and export pattern synonyms to retain the same interface. But the `Generic` instance will either change or disappear. Someone relying on that instance could be in trouble. If the instance disappears, they'll be forced to write code by hand that they didn't need to before. If it changes, their code may change its behavior unexpectedly. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13065 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13065: Prohibit user-defined Generic and Generic1 instances -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature request | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Generics Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Other | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonpj): * cc: RyanGlScott (added) Comment: I'd like to know what Ryan Scott thinks of this idea. We have some classes (like `Typeable`) where you can't give user instances, but it seems less clear-cut for `Generic`. The efficiency issue is to do with the costs of doing the type-family consistency checks for all the `Generic` instances. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13065#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13065: Prohibit user-defined Generic and Generic1 instances -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature request | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Generics Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Other | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): Yes, sorry. I forgot the real troubles were `Rep` and `Rep1` rather than `Generic` and `Generic1`. One possible variation might be to disassociate the families from the classes, and only restrict the former. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13065#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13065: Prohibit user-defined Generic and Generic1 instances -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature request | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Generics Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Other | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): I believe I support the general idea in this proposal, although I'd like to add some clarifications to the reasons you've listed. Clarification number one is that hand-written `Generic` instances are already forbidden as of GHC 7.10, provided you have the `Safe` extension enabled. ----- Replying to [ticket:13065 dfeuer]:
User-defined `Generic` and `Generic1` instances are problematic.
=== They are susceptible to breakage === Some details of the classes may change between GHC versions, and indeed have done so in the past. User-defined instances are likely to break in
Agreed. `Generic` is truly unique in that any given `Generic` instance can and should be determined solely by the algebraic structure of the datatype. Any other use of `Generic` is an abuse of its intended purpose, in my opinion. the face of various such "internal" changes. This is one reason why `Data.Sequence`, for example, does not have a `Generic` instance. It's true that we've changed the internal details of `GHC.Generics` before, but believe me when I say that we tried to preserve backwards compatibility as much as possible :) There are a good many other breaking changes we //could// make, but we haven't yet (see #7492). With CPP, it's actually quite feasible to maintain fancy type-level generics hackery all the way back to GHC 7.2. That being said, I would argue that `Data.Sequence` shouldn't have a `Generic` instance for a different reason: it's an abstract type. A `Generic` instance necessarily leaks every implementation detail about your datatype, so having a `Generic` instance for an abstract type is nonsense.
=== They require potentially-expensive consistency checks === GHC cannot assume that every type has at most one `Generic` and `Generic1` instance, so it needs to look for possible alternatives at instance resolution time. According to Simon (and maybe also Simon), this may be partly responsible for the performance regressions seen in Phab:D2899.
=== Downsides === Prohibiting user-defined instances does have some costs. Suppose a type was originally defined concretely, exposing its constructors and a `Generic` instance. The implementer may decide later to make the type abstract, and export pattern synonyms to retain the same interface. But
If we're pursing this goal in the name of compiler performance (see also #12731, which I believe is very relevant here), we need to be careful in stating our goals. As I stated [https://ghc.haskell.org/trac/ghc/ticket/12731#comment:3 here], it wouldn't be enough to disallow hand-written `Generic` instances. You'd also need to check that there's only one `Generic` instance per datatype, lest you end up with a scenario like this (which is currently possible in GHC 8.0): {{{#!hs {-# LANGUAGE DeriveGeneric #-} {-# LANGUAGE FlexibleInstances #-} {-# LANGUAGE StandaloneDeriving #-} import GHC.Generics data Foo a = Foo a deriving instance {-# OVERLAPPABLE #-} Generic (Foo a) deriving instance {-# OVERLAPPING #-} Generic (Foo Int) }}} the `Generic` instance will either change or disappear. Someone relying on that instance could be in trouble. If the instance disappears, they'll be forced to write code by hand that they didn't need to before. If it changes, their code may change its behavior unexpectedly. I don't follow this argument. If the definition of a datatype changes, then either it's `Generic` instance must change, or you should just remove the `Generic` instance altogether. Full stop. Anything else would be a lie. ----- To contribute one more "downside" of this proposal, drawing from my own experience, there is only one scenario where I've found hand-written `Generic` instances to be necessary: backwards compatibility. In old versions of GHC, there were numerous bugs that prevented you from deriving `Generic` instances for sufficiently polykinded datatypes, forcing you to resort to manual implementation. But these bugs have all been squashed, AFAICT, so I don't see any reason why we'd need to worry about this going forward. In fact, manually implementing `Generic` instances is precisely how the `generic-deriving` library allows you to define `Generic` instances via Template Haskell (in case `deriving Generic` is buggy on an old GHC). So if we did implement this proposal, that part of `generic-deriving` would no longer work. But that's a sacrifice I think I'd be OK with. ----- If we do decide to press on with this propsal, let's not let this Trac ticket be the end of the discussion. Let's advertise this change as a concrete GHC proposal first. I bet there's someone out there who is relying on hand-written `Generic` instances who hasn't spoken up, and I'd hate to break their code without warning. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13065#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13065: Prohibit user-defined Generic and Generic1 instances -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature request | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Generics Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Other | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton): Can you comment on why this is different from our stance on `Data`, which addresses largely similar problems but for which custom instances are apparently approved (e.g., `Data.Map.Map` has one)? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13065#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13065: Prohibit user-defined Generic and Generic1 instances -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature request | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Generics Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Other | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): I'm not as experienced with `Data.Data` as I am with `GHC.Generics`, so take my comments with a grain of salt. But the impression that I've gathered after working with it a little bit is that `Data.Data` has slightly different goals than `GHC.Generics`. With the former (`Data`), you mostly care about generically walking over data structures. That seems to explain why `Data.Data` gives you only a bare-minimum amount of metadata for each datatype and constructor, as they're only useful insofar as they serve as signposts when you're deep in a datatype traversal. Moreover, the documentation in `Data.Data` mentions the phrase "public representation" in several places. It seems acceptable to lie a little in certain `Data` instances—e.g., in primitive datatypes like `Int`, or abstract types like `Map`—in order to give programmers //some// interface that they can walk over. With the latter (`Generic`), there's an expectation that the representation type provides a faithful view of the datatype's structure and metadata. It occurs to me that we really don't state this expectation anywhere in the documentation, but this is nonetheless the reasoning that David Terei alluded to when he made the change to disallow hand-written `Generic` instances in 578fbeca31dd3d755e24e910c3a7327f92bc4ee3. If we allow users to hand-write their `Generic` instances, then I feel that we are allowing them to shoot themselves in the foot. Of course, the other thing that `Data` and `Generic` have in common is that they facilitate the ability to reduce boilerplate code. But even if you can't write `Generic` instances for your abstract types, you can still use `GHC.Generics` to help reduce boilerplate: just use the `Generic` instance for an isomorphic type. For instance, if you want to think of `Data.Map` as a list of key-value pairs, then you have the power to do so: {{{#!hs module Main (main) where import Data.Map import Generics.Deriving.Eq import GHC.Generics eqMapKVList :: (GEq k, GEq v) => Map k v -> Map k v -> Bool eqMapKVList x y = geq (toListRep x) (toListRep y) where toListRep :: Map k v -> Rep [(k, v)] () toListRep = from . toList main :: IO () main = do print $ eqMapKVList (fromList [('a', 'b')]) (fromList [('a', 'b')]) print $ eqMapKVList (fromList [('a', 'b')]) (fromList [('a', 'c')]) }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13065#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13065: Prohibit user-defined Generic and Generic1 instances -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature request | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Generics Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Other | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): One place I think we still can't derive (seemingly reasonable) `Generic` instances is with GADTs. For example, today I can write {{{#!hs {-# LANGUAGE DeriveGeneric, StandaloneDeriving, GADTs, DataKinds, KindSignatures, FlexibleInstances, TypeFamilies #-} module GenGADT where import GHC.Generics data Foo :: Bool -> * -> * where X :: a -> Foo 'False a Y :: a -> Foo 'True a instance Generic (Foo 'False a) where type Rep (Foo 'False a) = D1 ('MetaData "Foo 'False" "GenGADT" "" 'False) (C1 ('MetaCons "X" 'PrefixI 'False) (Rec0 a)) to (M1 (M1 (K1 a))) = X a from (X a) = M1 (M1 (K1 a)) instance Generic (Foo 'True a) where type Rep (Foo 'True a) = D1 ('MetaData "Foo 'True" "GenGADT" "" 'False) (C1 ('MetaCons "Y" 'PrefixI 'False) (Rec0 a)) to (M1 (M1 (K1 a))) = Y a from (Y a) = M1 (M1 (K1 a)) }}} but I don't think GHC is able to derive such instances. I'm more concerned about backwards compatibility issues, though. As soon as a library chooses to derive a `Generic` instance for a type, that instance becomes part of the library API. Users may well come to rely on the existence of that instance, and also some of its details. If we prohibit custom instances, won't that strongly discourage libraries from deriving `Generic` for any but the most trivial exposed types? Let me get back to pattern synonyms. Suppose we have {{{#!hs data Tree a = Tree a [Tree a] deriving Generic }}} and we decide we want to play around with bifunctors, so we redefine this as {{{#!hs --newtype Fix p a = In {out :: p (Fix p a) a} --instance Bifunctor p => Functor (Fix p) where ... data TreeF t a = TreeF a [t] instance Bifunctor TreeF where ... newtype Tree a = Tree (Fix Tree) deriving Functor -- et cetera }}} We can recover most of the original interface using bidirectional pattern synonyms to work around the newtypes. But if we can't write our own `Generic` instance, we'll break everything. Existing library users won't define instances for `TreeF`, so their `Generic`-derived instances for `Tree` will no longer pass the type checker. Ouch. A recent `containers` version added `Generic` and `Generic1` for `Data.Tree`; had the prohibition been under discussion at the time, I'd have thought twice and thrice about whether that was wise. I don't think this issue was quite as significant before the rise of pattern synonyms; in that era, any structural change to an exported transparent datatype was necessarily a breaking one. With pattern synonyms, there are more places we ''can'' change representations, and therefore more reasons to avoid preventing such changes. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13065#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13065: Prohibit user-defined Generic and Generic1 instances -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature request | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Generics Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Other | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): GHC isn't able to derive `Generic` instances for GADTs since as it stands today, there's no sensible way to do it. The ones you hand-defined are bogus, since they would have you believe that there's //two// separate datatypes named `Foo`, each with different constructors. And then there's the separate issue that they don't capture the `'True` and `'False` existential equality information. I don't see what pattern synonyms change here. The issue is fundamentally about what promises `Generic` instances make, and if we want them to be honest, then we cannot allow hand-defined `Generic` instances at all, even if they might be temptingly convenient. `GHC.Generics` is a feature which necessarily mirrors the definition of datatype, so if a datatype's internal structure changes, then its `Generic` instance must change too. Full stop. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13065#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13065: Prohibit user-defined Generic and Generic1 instances -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature request | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Generics Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Other | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): About two weeks have passed since this was first proposed, and there doesn't seem to be any definite consensus. In fact, I'm not sure if you even want this to be implemented anymore, since you seem reticent to give up the ability to define custom `Generic` instances. If you do wish to pursue this further, I'd highly recommend creating a GHC proposal for it, since this seems to be a topic of contention. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13065#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

About two weeks have passed since this was first proposed, and there doesn't seem to be any definite consensus. In fact, I'm not sure if you even want this to be implemented anymore, since you seem reticent to give up the ability to define custom `Generic` instances. If you do wish to
#13065: Prohibit user-defined Generic and Generic1 instances -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature request | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Generics Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Other | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:8 RyanGlScott]: pursue this further, I'd highly recommend creating a GHC proposal for it, since this seems to be a topic of contention. I am of two minds. I see very good reasons to make this change, and I see very good reasons to be wary of it. I'm not the main person pushing for it; I'm just the one who wrote it up. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13065#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13065: Prohibit user-defined Generic and Generic1 instances -------------------------------------+------------------------------------- Reporter: dfeuer | Owner: Type: feature request | Status: new Priority: normal | Milestone: 8.4.1 Component: Compiler | Version: 8.0.1 Resolution: | Keywords: Generics Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Other | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): OK, well, since there are good reasons to be wary let's not rush to make a change. The status quo is by definition cheaper. While the idea was ''precipitated'' by the question of tuples (and the overhead of reading interface files etc), it should not be ''driven'' by that. Maybe there's another way to address the perf question. It was only that, if there was a quick consensus, it seemed like an easy way. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13065#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC