[GHC] #10270: inconsistent semantics of type class instance visibility outside recursive modules

#10270: inconsistent semantics of type class instance visibility outside recursive modules -------------------------------------+------------------------------------- Reporter: skilpat | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Keywords: | Operating System: MacOS X Architecture: x86_64 | Type of failure: GHC rejects (amd64) | valid program Test Case: | Blocked By: Blocking: | Related Tickets: Differential Revisions: | -------------------------------------+------------------------------------- When you have an instance defined in a module that's part of a recursive module loop, when should modules outside the loop see it? Right now, GHC behaves differently depending on whether it's in batch or in single-shot compilation mode. Which one is correct? Dunno! Here's the example (code at the bottom). The gist is that you have two modules A and B that define data types T and U, respectively, which depend on each other, so A and B are recursive modules and we need a boot file -- say, for A. Now suppose we define an instance Eq T in the implementation of A but not in the boot file. B imports the boot file, so it doesn't know about this instance. And then I have some third module, Main, outside the loop, which imports only B. And now the central question: ''Does Main know about the Eq T defined in A?'' We can test how GHC answers this question by defining an (orphan) instance for Eq T in Main. In batch compilation mode, Main is rejected for defining a duplicate instance. In single-shot compilation mode, however, Main is accepted and any equality test in Main uses the locally defined instance; i.e., B doesn't know about A's Eq T and so neither does Main. So which is correct? If you ask me, the latter semantics is correct, but I can see why the former might be argued as well (e.g., according to the fixed-point semantics of import/export described in (1)). In any case, the semantics between the two compilation modes should probably agree, right? {{{#!hs -- A.hs-boot module A where data T -- a mutually recursive data type, along with B.U t :: T -- some value to test for Eq in Main -- B.hs module B(module A, module B) where -- export A.{T,t} since Main doesn't import A import {-# SOURCE #-} A -- mutually recursive data type across modules data U = U | UT T -- A.hs module A where import B -- mutually recursive data type across modules data T = T | TU U -- the true instance instance Eq T where _ == _ = True -- some value to test Eq instance in Main t :: T t = T -- Main.hs module Main where import B -- no import of A -- an orphan instance for Eq T. -- okay in one-shot mode; not okay in batch (--make) mode instance Eq T where _ == _ = False -- in one-shot mode, this prints False main = putStrLn $ show $ t == t }}} Commands and output for batch mode: {{{ $ ghc --make Main [1 of 4] Compiling A[boot] ( A.hs-boot, A.o-boot ) [2 of 4] Compiling B ( B.hs, B.o ) [3 of 4] Compiling A ( A.hs, A.o ) [4 of 4] Compiling Main ( Main.hs, Main.o ) A.hs:8:10: Duplicate instance declarations: instance Eq T -- Defined at A.hs:8:10 instance Eq T -- Defined at Main.hs:7:10 }}} Commands and output for single-shot mode: {{{ $ ghc -c A.hs-boot $ ghc -c B.hs $ ghc -c A.hs $ ghc -c Main.hs $ ghc A.o B.o Main.o -o main $ ./main False }}} Tested on GHC 7.6.3, 7.8.3, and 7.10.1. (1) ''A Formal Specification for the Haskell 98 Module System''. Iavor S. Diatchki, Mark P. Jones, and Thomas Hallgren. Haskell '02. http://web.cecs.pdx.edu/~mpj/pubs/hsmods.pdf -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10270 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10270: inconsistent semantics of type class instance visibility outside recursive modules -------------------------------------+------------------------------------- Reporter: skilpat | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: MacOS X | Architecture: x86_64 Type of failure: GHC rejects | (amd64) valid program | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by ezyang): This is probably due to the retypecheck loop: {{{ See bug #930. This code fixes a long-standing bug in --make. The problem is that when compiling the modules *inside* a loop, a data type that is only defined at the top of the loop looks opaque; but after the loop is done, the structure of the data type becomes apparent. The difficulty is then that two different bits of code have different notions of what the data type looks like. The idea is that after we compile a module which also has an .hs-boot file, we re-generate the ModDetails for each of the modules that depends on the .hs-boot file, so that everyone points to the proper TyCons, Ids etc. defined by the real module, not the boot module. Fortunately re-generating a ModDetails from a ModIface is easy: the function TcIface.typecheckIface does exactly that. Picking the modules to re-typecheck is slightly tricky. Starting from the module graph consisting of the modules that have already been compiled, we reverse the edges (so they point from the imported module to the importing module), and depth-first-search from the .hs-boot node. This gives us all the modules that depend transitively on the .hs-boot module, and those are exactly the modules that we need to re-typecheck. Following this fix, GHC can compile itself with --make -O2. }}} As a side effect, the instances also get dragged in. In one-shot mode, we use a different mechanism, the `tcg_type_env_var`, to ensure we can deal with the recursive case. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10270#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10270: inconsistent semantics of type class instance visibility outside recursive modules -------------------------------------+------------------------------------- Reporter: skilpat | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: MacOS X | Architecture: x86_64 Type of failure: GHC rejects | (amd64) valid program | Test Case: Blocked By: | Blocking: Related Tickets: #9562 | Differential Revisions: -------------------------------------+------------------------------------- Changes (by goldfire): * related: => #9562 Comment: I think #9562 is related. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10270#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10270: inconsistent semantics of type class instance visibility outside recursive modules -------------------------------------+------------------------------------- Reporter: skilpat | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: MacOS X | Architecture: x86_64 Type of failure: GHC rejects | (amd64) valid program | Test Case: Blocked By: | Blocking: Related Tickets: #9562 | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): I rather agree that the one-shot story is the right one. And yes, the two should agree. Edward I recall that we had extensive discussion about how to arrange that the right instances are in scope. We had a New Plan; but (a) I have forgotten the plan and (b) I'm not sure whether it was ever implemented. Can you remind me? Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10270#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10270: inconsistent semantics of type class instance visibility outside recursive modules -------------------------------------+------------------------------------- Reporter: skilpat | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: MacOS X | Architecture: x86_64 Type of failure: GHC rejects | (amd64) valid program | Test Case: Blocked By: | Blocking: Related Tickets: #9562 | Differential Revisions: -------------------------------------+------------------------------------- Comment (by ezyang): The plan was for orphan instances, to check if any given instance was in the set of orphan modules which are transitively imported from this module. We did implement it. Unfortunately, this doesn't directly help here because the instance in A.hs is NOT an orphan, so we just assume that it's in scope. So in some sense, this is the same problem that we were grappling with hsigs: sometimes, a non-orphan instance should NOT be visible, because it was not defined in the hs-boot/hsig file. We ''also'' spent a bit of time thinking about how to solve this, but all of our solutions boiled down to some variant of "put the set of transitively imported modules (including non- orphan modules)" into the interface, with various schemes for optimizing this since this results in quadratic growth of interface files. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10270#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10270: inconsistent semantics of type class instance visibility outside recursive modules -------------------------------------+------------------------------------- Reporter: skilpat | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: MacOS X | Architecture: x86_64 Type of failure: GHC rejects | (amd64) valid program | Test Case: Blocked By: | Blocking: Related Tickets: #9562 | Differential Revisions: -------------------------------------+------------------------------------- Comment (by ezyang): Actually, the situation is a bit better here, because hs-boot operates strictly locally. Here is a fix proposal: For non-orphan instances, we can't easily tell if an instance is supposed to be in scope unless we know whether or not the defining module is transitively reachable from the imports of a module. Fortunately, in `--make`, we do know this information from the import graph. So, for each module we compile, calculate a set of local hs/hs-boot modules which should be visible, and pass that along to the type-checker. Then, like for orphans, we only treat an instance as visible if it is in this set. (Note: we can't do this strategy for external instances, because we don't have the full dependency graph anymore.) I thought of this technique when I noticed that `--make` behavior was inconsistent, depending on whether or not a module was compiled before or after we retypecheck an hs-boot loop; if you always make sure modules which transitively have a `{-# SOURCE #-}` import on a booted module are type-checked BEFORE you typecheck the real implementation (even if there isn't a dependency), you will ensure that only the correct interfaces are in scope. But this is awfully delicate and wouldn't work well in the parallel make case, so just calculating the reachable set of nodes seems better. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10270#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10270: inconsistent semantics of type class instance visibility outside recursive modules -------------------------------------+------------------------------------- Reporter: skilpat | Owner: ezyang Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: MacOS X | Architecture: x86_64 Type of failure: GHC rejects | (amd64) valid program | Test Case: Blocked By: | Blocking: Related Tickets: #9562 | Differential Revisions: -------------------------------------+------------------------------------- Changes (by ezyang): * owner: => ezyang -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10270#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC