[GHC] #8995: When generalising, use levels rather than global tyvars

#8995: When generalising, use levels rather than global tyvars ------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Keywords: | Operating System: Unknown/Multiple Architecture: Unknown/Multiple | Type of failure: None/Unknown Difficulty: Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | ------------------------------------+------------------------------------- A long time ago Didier Remy described how to use 'ranks' or 'levels' to make the test "is this type variable free in the environment" when generalising a type. Oleg gives a [http://okmij.org/ftp/ML/generalization.html great explanation here]. GHC still uses the old "find the free type variables in the environment" method. But in fact GHC ''does'' maintain levels, just as Didier explains. The level is called `Untouchables` and is used to make sure we don't infer non-principal types for things involving GADTs. The same level stuff is used to prevent skolem-escape. Details with `Note [Untouchable type variables]` and `Note [Skolem escape prevention]` in `TcType`. So it ought to be straightforward to use the same levels for generalisation. I had a go, and came across the following tricky points: * We need to bump the level (`pushUntouchables`) when starting to typecheck a let RHS. Currently we don't. That will be a bit less efficient, because we'll end up deferring unification constraints that currently get solved immediately. * However the current set up is actually wrong. In particular, we generate a constraint where the implications do not have monotonically increasing level numbers (invariant `(ImplicInv)` in `TcType`). Try this program with `-ddump-tc-trace`. {{{ {-# LANGUAGE GADTs #-} module Bar where data T where MkT :: a -> (a->Int) -> T f x z = case x of MkT _ _ -> let g y = (y, [z,Nothing]) in (g True, g 'c') }}} I don't know how to make this bug cause an actual problem, but it's very unsettling. Bumping the level before doing the RHS would solve this. * Once we have bumped the level, we then have to do some more type- variable promotion, especially of type variables free in the finally- chosen type of the binder. This came out rather awkwardly in the code. * Because we now get more deferred equality constraints, there's a real risk that we'll end up quantifying over them. If we get, say `(gbl ~ Maybe lcl)`, where `gbl` is a type var free in the envt, we '''do not''' want to quantify over it! (It would be sound, but would lead to horrible types.) A long time ago Didier Remy described how to use 'ranks' or 'levels' to make the test "is this type variable free in the environment" when generalising a type. Oleg gives a [http://okmij.org/ftp/ML/generalization.html great explanation here]. GHC still uses the old "find the free type variables in the environment" method. But in fact GHC ''does'' maintain levels, just as Didier explains. The level is called `Untouchables` and is used to make sure we don't infer non-principal types for things involving GADTs. The same level stuff is used to prevent skolem-escape. Details with `Note [Untouchable type variables]` and `Note [Skolem escape prevention]` in `TcType`. So it ought to be straightforward to use the same levels for generalisation. I had a go, and came across the following tricky points: * We need to bump the level (`pushUntouchables`) when starting to typecheck a let RHS. Currently we don't. That will be a bit less efficient, because we'll end up deferring unification constraints that currently get solved immediately. * However the current set up is actually wrong. In particular, we generate a constraint where the implications do not have monotonically increasing level numbers (invariant `(ImplicInv)` in `TcType`). Try this program with `-ddump-tc-trace`. {{{ {-# LANGUAGE ExistentialQuantification #-} module Bar where data T = forall a. MkT a f x z = case x of MkT y -> let g y = [y,Nothing] in g True }}} Look for the `reportUnsolved` trace and you'll nested implications, both with level 1. I don't know how to make this bug cause an actual problem, but it's very unsettling. Bumping the level before doing the RHS would solve this. * Once we have bumped the level, we then have to do some more type- variable promotion, especially of type variables free in the finally- chosen type of the binder. This came out rather awkwardly in the code. * Because we now get more deferred equality constraints, there's a real risk that we'll end up quantifying over them. If we get, say `(gbl ~ Maybe lcl)`, where `gbl` is a type var free in the envt, we '''do not''' want to quantify over it! (It would be sound, but would lead to horrible types.) Rather we want to promote `lcl`, and not quantify. Currently we don't get many deferred equality constraints, becuase they all get done "on the fly". But if we do get one, we do quantify over it, which yields a rather stupid type. Try this: {{{ {-# LANGUAGE ExistentialQuantification #-} module Bar where data T = forall a. MkT a f x z = case x of MkT _ -> let g y = [z,Nothing] in (g True, g 'c') }}} We get this `g :: forall t a2. Maybe a ~ Maybe a2 => t -> [Maybe a2]`, where `x::a` is in the envt. Not very clever! This only shows up with existentials and In general I think we want to find all the type variables reachable by equality constraints from the environment, and not quantify over them. * We also use the type environment `getGlobalTyVars` in ''kind'' generalisation, and kind inference currently does not use the untoucahble story. Adapting it to use levels might be tricky because we currently assume we solve all kind equality constraints on-the-fly, and don't gather any deferreed constraints. Maybe it's easy, but needs thought. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8995 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8995: When generalising, use levels rather than global tyvars -------------------------------------+------------------------------------ Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by simonpj): I've pushed my work-in-progress to branch `wip/T8995-level- generalisation`. Validate says this {{{ Unexpected failures: callarity/should_run StrictLet [exit code non-0] (normal) codeGen/should_run cgrun010 [exit code non-0] (normal) ghci.debugger/scripts break026 [bad stdout] (ghci) ghci.debugger/scripts print020 [bad stderr] (ghci) ghci/scripts ghci046 [bad stderr] (ghci) perf/haddock haddock.base [stat not good enough] (normal) polykinds T6068 [bad stderr] (ghci) }}} Pretty good really. A couple of these are Lint failures, though. The big reason I can't now get rid of `getGlobalTyVars` is the kind- generalisation point. Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8995#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8995: When generalising, use levels rather than global tyvars -------------------------------------+------------------------------------ Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by remy): Hi Simon, Oleg pointed me to your implementation issue in case I could help... I do not understand all the details because I do not know the internals of GHC, but I do believe that you only need one notion of ranks. However, I am a bit confused by your description. First, because GHC does not generalizes unanotated local let-binding, then you should not have to play with ranks around (unnanotaed) let-bindings or perhaps the case you are describing is that of annotated local-let bindings. So below, I'll asume that you do generalize then as in ML, and I'll describe how ranks can work for ML (formally, you can see [1], which has later be generalized by typing constraints [2]). You say that you want to increase the rank on RHS of a Let-bindings, but I do not see why. Or is it a typo and you meant LHS? In ML, we increase the rank on LHS of a let-bindings, because this is the type that we will have to generalize. So when generalizing we just have to pick variables of higher-rank (i.e. those introduced during the type checking of the LHS that haven't be downgraded during resolution of the constraint). More precisely, "let x = a1 in a2" is typechecked at rank n as follows: 1) typecheck a1 at rank (n+1): this generates constraint C with fresh variables/nodes introduced at rank (n+1). 2) solve the fresh part of the constraint (that at rank n+1); this may downgrade some fresh nodes to rank n or lower. 3) generalize nodes that remain at rank (n+1); this returns a type scheme S. 4) typecheck a2 at rank n in the environment extended with x : S. In particular, I do not understand why you would increase the level when typechecking the RHS. You just return to the level n at which the whole let-biding is being typechecked. In step 2, variables may be downgraded to lower ranks in two cases: 1) when they have to be unified with a type of a lower rank (either one that has to be of a lower rank, e.g. a type variable introduced at a lower rank, 2) when they are equal to a term whose variables are all of lower rank. My understanding is that Step 2 is what you describe as one of the problem. Steps 1) and 2) can also be explained in term of typing constraints as presented in [2]. At generalization points it is useful to remove useless quantifications (which would be correct but unnecessarily copy too much of the type scheme). This is done by rule C-LetApp (p. 32) that transforms a constraints: let x : forall (Xs, Ys | C1) T in C2 into exists (Ys) let x : forall (Xs | C1) T in C2 provided "Ys" are disjoint from "ftv (C2)" and "exists (Xs) C1 _dertermines_ Ys". Here, turning "all (Ys)" into "exist (YS)" amounts to decreasing the rank of "Ys". The definition of "determines" is semantical at this point, but we later give syntactic sufficient conditions in the case of equality constraints (lemma 1.8.7 on page 82) which, as explained on p. 83, includes the two cases corresponding the ones above: 1) a variable X may be moved to Ys when it is dominated by a node of lower rank (a free variable exists (Xs) C1). 2) a variable X may be moved to Ys when all variables it dominates are already in Ys. So it does not harm at all to keep delayed constraints in type schemes, but 1) the generic part of the constraint should be simplified, so as to ensure that the (generic part of the) type scheme is solvable and 2) delayed constraints must be (carefully) taken into account at generalisation time to avoid generalizing too many type variables (those that are "determined" from the context) I hope I haven't completely misunderstood your problem... Didier [1] http://hal.inria.fr/docs/00/07/70/06/PS/RR-1766.ps) [2] ATTAPL, the essence of ML. (Page numbers refers to the online draft version http://cristal.inria.fr/attapl/preversion.ps.gz) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8995#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8995: When generalising, use levels rather than global tyvars -------------------------------------+------------------------------------ Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Unknown/Multiple Type of failure: None/Unknown | Difficulty: Unknown Test Case: | Blocked By: Blocking: | Related Tickets: -------------------------------------+------------------------------------ Comment (by simonpj): Thanks Didier. First, two confusions: * I think of let-bindings of form `let x = <rhs> in <body>`. So when I say "RHS" I mean the bit that you called `a2` above. So I think we are in agreement here. * In principle GHC does not do local let-generalisation, as our papers advertise. But GHC must compile Haskell 98, so it must somehow support local let-generalisation. So in fact: * Local let-gen is enabled by `-XNoMonoLocalBinds` and disabled by `-XMonoLocalBinds`. * By default, we have `-XNoMonoLocalBinds`. * But with `-XGADTs` or `-XTypeFamilies` we also switch on `-XMonoLocalBinds`. * But that can again be overridden, so `-XGADTs -XNoMonoLocalBings` would attempt to generalise local let-bindings despite the problems with doing so. The other thing is that the "monomorphism restriction" can mean that even a top-level binding can have an environment with free type variables. `-XNoMonomorphismRestriction` switches this off. So this ticket is all about the "best-efforts" generalisation you get when you have `-XNoMonoLocalBinds`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8995#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8995: When generalising, use levels rather than global tyvars -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 7.8.2 checker) | Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Changes (by thomie): * component: Compiler => Compiler (Type checker) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8995#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#8995: When generalising, use levels rather than global tyvars -------------------------------------+------------------------------------- Reporter: simonpj | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: Component: Compiler (Type | Version: 7.8.2 checker) | Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by nfrisby): * cc: nfrisby (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8995#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC