[GHC] #13251: Must perform family consistency check on non-imported identifiers

#13251: Must perform family consistency check on non-imported identifiers -------------------------------------+------------------------------------- Reporter: ezyang | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.1 (Type checker) | Keywords: TypeFamilies | Operating System: Unknown/Multiple Architecture: | Type of failure: GHC accepts Unknown/Multiple | invalid program Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- Currently, the family consistency check checks pairs of *imported* modules (and the modules they transitively import) for consistency. However, there are a number of mechanisms by which we can refer to an identifier from a module without explicitly importing it. Here is one example from Template Haskell: {{{ -- A.hs {-# LANGUAGE TypeFamilies #-} module A where type family F a -- B.hs {-# LANGUAGE TypeFamilies #-} module B where import A type instance F Bool = String g :: F Bool g = "af" -- C.hs {-# LANGUAGE TypeFamilies #-} module C where import A type instance F Bool = Int h :: F Bool -> IO () h = print -- D.hs {-# LANGUAGE TemplateHaskell #-} import C import Language.Haskell.TH.Syntax main = h $( return (VarE (Name (OccName "g") (NameG VarName (PkgName "main") (ModName "B")))) ) }}} This does an unsafe coerce: {{{ ezyang@sabre:~/Dev/labs/T13102$ ghc-head --make B.hs D.hs -fforce-recomp [1 of 4] Compiling A ( A.hs, A.o ) [2 of 4] Compiling B ( B.hs, B.o ) [3 of 4] Compiling C ( C.hs, C.o ) [4 of 4] Compiling Main ( D.hs, D.o ) Linking D ... ezyang@sabre:~/Dev/labs/T13102$ ./D 8070450533355229282 }}} Clearly, checking consistency on imports is not enough: we must also check up on original names that come by other mechanisms. (Other ways we can end up with identifiers without imports include overloading, see #13102. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13251 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13251: Must perform family consistency check on non-imported identifiers -------------------------------------+------------------------------------- Reporter: ezyang | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.1 checker) | Resolution: | Keywords: TypeFamilies Operating System: Unknown/Multiple | Architecture: Type of failure: GHC accepts | Unknown/Multiple invalid program | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Description changed by ezyang: @@ -54,0 +54,25 @@ + + A few things to note about how to fix this: + + * Currently, type family instances are checked for consistency as we + process imports. Template Haskell splices can occur much later in a + Haskell file, so we must correspondingly do these checks later. + + * If we refer to an identifier by synthesizing a name manually, it is as + if we imported it. This also means that a reference of this sort implies + an implicit import of the defining module (#13102) and we should consider + instances from it visible (at the moment, it's not considered visible.) + (Actually, with TH, this is a bit tricky, because if we take these + semantics, an instance might be visible below a top-level splice, but + invisible above it.) + + * It is probably simplest if the type family compatibility check happens + at the end. So we should go ahead and revive idea (2) from + https://ghc.haskell.org/trac/ghc/ticket/11062#comment:9 ; if there are + overlapping families we should just not reduce the type family. + + * For wired in things, it's pretty easy to find out if we have an implicit + import: if we bang on `checkWiredInTyCon`, that means we intended for the + instance to visible; so we should collect all of the TyCons we banged on + this way. For TH, this isn't exactly going to work, but maybe we can just + track when NameGs get synthesized. New description: Currently, the family consistency check checks pairs of *imported* modules (and the modules they transitively import) for consistency. However, there are a number of mechanisms by which we can refer to an identifier from a module without explicitly importing it. Here is one example from Template Haskell: {{{ -- A.hs {-# LANGUAGE TypeFamilies #-} module A where type family F a -- B.hs {-# LANGUAGE TypeFamilies #-} module B where import A type instance F Bool = String g :: F Bool g = "af" -- C.hs {-# LANGUAGE TypeFamilies #-} module C where import A type instance F Bool = Int h :: F Bool -> IO () h = print -- D.hs {-# LANGUAGE TemplateHaskell #-} import C import Language.Haskell.TH.Syntax main = h $( return (VarE (Name (OccName "g") (NameG VarName (PkgName "main") (ModName "B")))) ) }}} This does an unsafe coerce: {{{ ezyang@sabre:~/Dev/labs/T13102$ ghc-head --make B.hs D.hs -fforce-recomp [1 of 4] Compiling A ( A.hs, A.o ) [2 of 4] Compiling B ( B.hs, B.o ) [3 of 4] Compiling C ( C.hs, C.o ) [4 of 4] Compiling Main ( D.hs, D.o ) Linking D ... ezyang@sabre:~/Dev/labs/T13102$ ./D 8070450533355229282 }}} Clearly, checking consistency on imports is not enough: we must also check up on original names that come by other mechanisms. (Other ways we can end up with identifiers without imports include overloading, see #13102. A few things to note about how to fix this: * Currently, type family instances are checked for consistency as we process imports. Template Haskell splices can occur much later in a Haskell file, so we must correspondingly do these checks later. * If we refer to an identifier by synthesizing a name manually, it is as if we imported it. This also means that a reference of this sort implies an implicit import of the defining module (#13102) and we should consider instances from it visible (at the moment, it's not considered visible.) (Actually, with TH, this is a bit tricky, because if we take these semantics, an instance might be visible below a top-level splice, but invisible above it.) * It is probably simplest if the type family compatibility check happens at the end. So we should go ahead and revive idea (2) from https://ghc.haskell.org/trac/ghc/ticket/11062#comment:9 ; if there are overlapping families we should just not reduce the type family. * For wired in things, it's pretty easy to find out if we have an implicit import: if we bang on `checkWiredInTyCon`, that means we intended for the instance to visible; so we should collect all of the TyCons we banged on this way. For TH, this isn't exactly going to work, but maybe we can just track when NameGs get synthesized. -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13251#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13251: Must perform family consistency check on non-imported identifiers -------------------------------------+------------------------------------- Reporter: ezyang | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.1 checker) | Resolution: | Keywords: TypeFamilies Operating System: Unknown/Multiple | Architecture: Type of failure: GHC accepts | Unknown/Multiple invalid program | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton): Not thoroughly thought out, but how about in `checkWiredInTyCon` (or something which wraps it) and in `tcLookupGlobal`, we check whether the thing lives in a module that was actually imported and, if not, add the relevant import information to `tcg_imports`, and either do the ensuing type family consistency checks immediately, or perhaps add them to `tcg_pending_fam_checks`? I mean, we also have this behavior: {{{#!hs {-# LANGUAGE TemplateHaskell #-} import Language.Haskell.TH.Syntax main = print $( return (VarE (Name (OccName "sigFPE") (NameG VarName (PkgName "unix-2.7.0.1") (ModName "System.Posix.Signals")))) ) }}} {{{ rwbarton@morphism:~/ghc2$ ghc x/TH -fforce-recomp [1 of 1] Compiling Main ( x/TH.hs, x/TH.o ) Loading package ghc-prim ... linking ... done. Loading package integer-gmp ... linking ... done. Loading package base ... linking ... done. Loading package array-0.5.0.0 ... linking ... done. Loading package deepseq-1.3.0.2 ... linking ... done. Loading package containers-0.5.5.1 ... linking ... done. Loading package pretty-1.1.1.1 ... linking ... done. Loading package template-haskell ... linking ... done. Linking x/TH ... x/TH.o:(.text+0x4b): undefined reference to `unixzm2zi7zi0zi1_SystemziPosixziSignals_sigFPE_closure' x/TH.o: In function `S4hM_srt': (.data+0x48): undefined reference to `unixzm2zi7zi0zi1_SystemziPosixziSignals_sigFPE_closure' collect2: error: ld returned 1 exit status }}} If we updated the `ImportAvails` when we type checked the spliced thing, wouldn't that also create a dependency on the `unix` package at link time? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13251#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13251: Must perform family consistency check on non-imported identifiers -------------------------------------+------------------------------------- Reporter: ezyang | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.1 checker) | Resolution: | Keywords: TypeFamilies Operating System: Unknown/Multiple | Architecture: Type of failure: GHC accepts | Unknown/Multiple invalid program | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Description changed by rwbarton: @@ -53,1 +53,2 @@ - up with identifiers without imports include overloading, see #13102. + up with identifiers without imports include overloading, see + ticket:13102#comment:13. New description: Currently, the family consistency check checks pairs of *imported* modules (and the modules they transitively import) for consistency. However, there are a number of mechanisms by which we can refer to an identifier from a module without explicitly importing it. Here is one example from Template Haskell: {{{ -- A.hs {-# LANGUAGE TypeFamilies #-} module A where type family F a -- B.hs {-# LANGUAGE TypeFamilies #-} module B where import A type instance F Bool = String g :: F Bool g = "af" -- C.hs {-# LANGUAGE TypeFamilies #-} module C where import A type instance F Bool = Int h :: F Bool -> IO () h = print -- D.hs {-# LANGUAGE TemplateHaskell #-} import C import Language.Haskell.TH.Syntax main = h $( return (VarE (Name (OccName "g") (NameG VarName (PkgName "main") (ModName "B")))) ) }}} This does an unsafe coerce: {{{ ezyang@sabre:~/Dev/labs/T13102$ ghc-head --make B.hs D.hs -fforce-recomp [1 of 4] Compiling A ( A.hs, A.o ) [2 of 4] Compiling B ( B.hs, B.o ) [3 of 4] Compiling C ( C.hs, C.o ) [4 of 4] Compiling Main ( D.hs, D.o ) Linking D ... ezyang@sabre:~/Dev/labs/T13102$ ./D 8070450533355229282 }}} Clearly, checking consistency on imports is not enough: we must also check up on original names that come by other mechanisms. (Other ways we can end up with identifiers without imports include overloading, see ticket:13102#comment:13. A few things to note about how to fix this: * Currently, type family instances are checked for consistency as we process imports. Template Haskell splices can occur much later in a Haskell file, so we must correspondingly do these checks later. * If we refer to an identifier by synthesizing a name manually, it is as if we imported it. This also means that a reference of this sort implies an implicit import of the defining module (#13102) and we should consider instances from it visible (at the moment, it's not considered visible.) (Actually, with TH, this is a bit tricky, because if we take these semantics, an instance might be visible below a top-level splice, but invisible above it.) * It is probably simplest if the type family compatibility check happens at the end. So we should go ahead and revive idea (2) from https://ghc.haskell.org/trac/ghc/ticket/11062#comment:9 ; if there are overlapping families we should just not reduce the type family. * For wired in things, it's pretty easy to find out if we have an implicit import: if we bang on `checkWiredInTyCon`, that means we intended for the instance to visible; so we should collect all of the TyCons we banged on this way. For TH, this isn't exactly going to work, but maybe we can just track when NameGs get synthesized. -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13251#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13251: Must perform family consistency check on non-imported identifiers -------------------------------------+------------------------------------- Reporter: ezyang | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.1 checker) | Resolution: | Keywords: TypeFamilies Operating System: Unknown/Multiple | Architecture: Type of failure: GHC accepts | Unknown/Multiple invalid program | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by ezyang): Yes, I think that's along the ideas of what we ought to do, although Simon thinks it would be simpler if we didn't do the family instance checks until the very end of the entire module. Template Haskell NameG on an external package which we don't actually depend on is an interesting problem. One possibility is that we should validate these references, and complain if the identifier comes from something we didn't depend on (I guess this is the preload set?) I haven't thought too carefully about this. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13251#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13251: Must perform family consistency check on non-imported identifiers -------------------------------------+------------------------------------- Reporter: ezyang | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.1 checker) | Resolution: | Keywords: TypeFamilies Operating System: Unknown/Multiple | Architecture: Type of failure: GHC accepts | Unknown/Multiple invalid program | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Here's a summary. === Instance providers === * Define an '''instance provider''' thus: a module A is an instance provider for M if the class and family instances in A, and those of A's instance providers, are visible in M. * An '''orphan instance provider''' for M is an instance provider for M that is also an orphan module. Instance providers come from three places: * M's direct imports are clearly import providers for M. * Less obviously, the use of a wired-in `TyCon` to deal with built-in syntax (e.g. overloaded lists) may add a new instance provider that was not (transitively) imported. A particular case is `GHC.Exts`. * Also less obviously, if Template Haskell splices in a definition like {{{ f = P.Q.g }}} then `P.Q` becomes an instance provider, because the family instances available there may have been used to typecheck `P.Q.g`. === Consistency checking === We must do family-instance consistency checks for the transitive closure of all M's instance providers. The easiest place to do this is right at the end of type checking, when we have a complete set of instance providers to check. The only downside is that, in the meantime, we could have inconsistent type-family instances, and we would need to account for that in type-family lookup (for example by returning not-found). === Instance lookup === Generally, we vigorously load instances into a single instance environment in the External Package State. But which ones are visible? Answer: the instance providers. We can use this to filter out the instance(s) we match on. But we need only check against the orphan instance providers because if the instance is in a no-orphan module X, then X must be in the instance providers else how could we be looking up that particular predicate? === Tracking instance providers === We can track the instance providers by accumulating * M's direct imports * Modules loaded by `checkWiredInTyCon` * Ditto for Template Haskell -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13251#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13251: Must perform family consistency check on non-imported identifiers -------------------------------------+------------------------------------- Reporter: ezyang | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.1 checker) | Resolution: | Keywords: TypeFamilies Operating System: Unknown/Multiple | Architecture: Type of failure: GHC accepts | Unknown/Multiple invalid program | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton): This plan sounds generally good. Nitpick:
Less obviously, the use of a wired-in `TyCon` to deal with built-in syntax (e.g. overloaded lists) may add a new instance provider that was not (transitively) imported. A particular case is `GHC.Exts`.
The things involved in overloaded lists are `fromListN` and the `IsList` type class, both of which are known-key things, not wired-in. (The built- in list type is not the issue here.) Then "Tracking instance providers" needs to include this case as well. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13251#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13251: Must perform family consistency check on non-imported identifiers -------------------------------------+------------------------------------- Reporter: ezyang | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.1 checker) | Resolution: | Keywords: TypeFamilies Operating System: Unknown/Multiple | Architecture: Type of failure: GHC accepts | Unknown/Multiple invalid program | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj):
Then "Tracking instance providers" needs to include this case as well
Yes, good point. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13251#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#13251: Must perform family consistency check on non-imported identifiers -------------------------------------+------------------------------------- Reporter: ezyang | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 8.1 checker) | Resolution: | Keywords: TypeFamilies Operating System: Unknown/Multiple | Architecture: Type of failure: GHC accepts | Unknown/Multiple invalid program | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by rwbarton): GHCi lets you use a fully-qualified name to refer to an identifier without importing it (similar to TH), so I suppose we ought to treat a module used in this way as an instance provider, as well. Example from slyfox, involving an orphan instance. {{{ ghci -ignore-package=regex-tdfa-rc GHCi, version 8.0.2: http://www.haskell.org/ghc/ :? for help Loaded GHCi configuration from /home/slyfox/.ghci Prelude C TP CM> "a" Text.Regex.TDFA.=~ "^(a)|(a)$" ::[[ String ]] <interactive>:1:1: error: • No instance for (Text.Regex.Base.RegexLike.RegexMaker Text.Regex.TDFA.Common.Regex Text.Regex.TDFA.Common.CompOption Text.Regex.TDFA.Common.ExecOption [Char]) arising from a use of ‘Text.Regex.TDFA.=~’ • In the expression: "a" Text.Regex.TDFA.=~ "^(a)|(a)$" :: [[String]] In an equation for ‘it’: it = "a" Text.Regex.TDFA.=~ "^(a)|(a)$" :: [[String]] Prelude C TP CM> :m Text.Regex.TDFA Prelude Text.Regex.TDFA> "a" Text.Regex.TDFA.=~ "^(a)|(a)$" ::[[ String ]] [["a","a",""]] }}} (Actually, we could in principle decide freely whether or not to provide the instances here in the sense of making them visible; but we definitely have to do the family instance consistency checks.) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13251#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC