[GHC] #11475: Lint should check for inexhaustive alternatives

#11475: Lint should check for inexhaustive alternatives -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.3 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- GHC's simplifier prunes inaccessible alternatives from case expressions. E.g. {{{ case x of A y -> e1 _ -> ....(case x of A y -> r1 B -> r2 C -> r3) ... }}} It'll prune the `A` alternative from the inner case to get {{{ case x of A y -> e1 _ -> ....(case x of B -> r2 C -> r3) ... }}} BUT, there is no independent check from Lint that this pruning does the Right Thing. Trac #11172 turned out to be a bug in the pruning, which led to something like {{{ case x of Left y -> e1 }}} but in fact `x` ended up being bound to `(Right v)` for some value `v`. But because there is only a single alternative left, GHC does not test the tag, but instead goes ahead and accesses it the `Right` value as if it was a `Left` value. And because of pointer-tagging, it'll assume the wrong tag on the pointer, and very soon you are in seg-fault land. So this ticket is to suggest: make Core Lint do it's best to check that all cases are in fact exhaustive. There are two ways in which we prune cases: * Enclosing pattern matches (as above) * GADT pattern matches may be exhaustive even when they don't mention all constructors. For the latter, it's impossible for Lint to reproduce all the reasoning of the type checker, but I bet that in 99.5% of the cases the existing function `dataConCantMatch` will do the job. So if Lit * Maintains an environment saying what each variable can't match (as the simplifier does) * Does a simple-minded `dataConCantMatch` test based on the typ of the scrutinee I think we'd be a long way forward. It's possible that a valid program could trigger a Lint report, but let's see. Any volunteers? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11475 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11475: Lint should check for inexhaustive alternatives -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Description changed by simonpj: @@ -20,2 +20,6 @@ - Right Thing. Trac #11172 turned out to be a bug in the pruning, which led - to something like + Right Thing. Yet, lacking such a check, a program can pass Lint and then + seg-fault, which is Very Bad. + + + Trac #11172 is an example. It turned out to be a bug in the pruning, + which led to something like New description: GHC's simplifier prunes inaccessible alternatives from case expressions. E.g. {{{ case x of A y -> e1 _ -> ....(case x of A y -> r1 B -> r2 C -> r3) ... }}} It'll prune the `A` alternative from the inner case to get {{{ case x of A y -> e1 _ -> ....(case x of B -> r2 C -> r3) ... }}} BUT, there is no independent check from Lint that this pruning does the Right Thing. Yet, lacking such a check, a program can pass Lint and then seg-fault, which is Very Bad. Trac #11172 is an example. It turned out to be a bug in the pruning, which led to something like {{{ case x of Left y -> e1 }}} but in fact `x` ended up being bound to `(Right v)` for some value `v`. But because there is only a single alternative left, GHC does not test the tag, but instead goes ahead and accesses it the `Right` value as if it was a `Left` value. And because of pointer-tagging, it'll assume the wrong tag on the pointer, and very soon you are in seg-fault land. So this ticket is to suggest: make Core Lint do it's best to check that all cases are in fact exhaustive. There are two ways in which we prune cases: * Enclosing pattern matches (as above) * GADT pattern matches may be exhaustive even when they don't mention all constructors. For the latter, it's impossible for Lint to reproduce all the reasoning of the type checker, but I bet that in 99.5% of the cases the existing function `dataConCantMatch` will do the job. So if Lit * Maintains an environment saying what each variable can't match (as the simplifier does) * Does a simple-minded `dataConCantMatch` test based on the typ of the scrutinee I think we'd be a long way forward. It's possible that a valid program could trigger a Lint report, but let's see. Any volunteers? -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11475#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11475: Lint should check for inexhaustive alternatives -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Transferred from ghc-defs list I am currently giving #11475 a try. However I hit a few problems and wanted to ask for a bit of advice about how to proceed. I already can handle the following: {{{ case x of A y -> e1 _ -> ....(case x of B -> r2 C -> r3) ... }}} My first problem then was the following construct: {{{ case (case ... of ... -> Left x) of Left y -> ... }}} Or more simple: {{{ case Left x of Left y -> ... }}} So I have to check which constructors the scrutinee may return. I know the simplifier has to implement such logic somewhere but did not want to use the implementation for two reasons: - I didn't know where to find it - If GHC uses the same test to prune impossible alternatives and to check if they were pruned correctly, extending the linter is useless. A bug in the test code will cause also cause a bug in the linter. So I implemented such a test myself, but now I am stuck at something like this: {{{ foo a b = a : b bar = ... case foo a b of (:) x y -> ... }}} I cannot see through `foo`, so I assume it may return `(:)` and `[]`. The simplifier on the other hand is able to see through foo and did remove the `[]` case, causing my current implementation to throw an incorrect lint error. How can I get a list/set of all possible `AltCons` returned by a function/constant? Or a list of the impossible ones? I tried to get some information from the unfolding of a variable, but it seems like unfoldings are only attached to top-level names. Jonas -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11475#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11475: Lint should check for inexhaustive alternatives -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): I suggest being less ambitious! Do the simple things first. {{{ case (case ... of ... -> Left x) of Left y -> ... }}} GHC itself is unlikely to prune the alternative in the outer case, precisely because it's not so obvious that the inner case can only return Left. {{{ bar = ... case foo a b of (:) x y -> ... }}} Same here. In short, dealing with in-scope pattern matches and GADTs is worth doing first. Then when you get a false positive (which you say you are) we can think about them. (For `foo`, the `CprResult` part of the strictness signature will tell that `foo` only returns `(:)` but I'm a bit surprised that GHC actually uses this to prune the case alternatives. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11475#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11475: Lint should check for inexhaustive alternatives -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by simonpj): * cc: anselm.scholl@… (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11475#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11475: Lint should check for inexhaustive alternatives -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by jscholl): Using the {{{CRPResult}}} did indeed help, I can now compile a few more modules. However, the next false positive is a little bit more complicated as there is no evidence I could use attached to it as far as I can tell. Suppose there is an expression {{{ foo x = case x of A -> ... B -> ... y -> error $ "foo" ++ (case y of C -> "C" D -> "D") }}} This is fine and works as expected. But now GHC decides to float out the expression, yielding: {{{ lvl y = error $ "foo" ++ (case y of C -> "C" D -> "D") foo x = case x of A -> ... B -> ... y -> lvl y }}} I don't see where GHC would record that {{{y}}} can only match {{{C}}} and {{{D}}}. I think it doesn't do it anywhere, most likely because no one needed that information until today. A possible solution might be to look for unexported local functions which are non-exhaustive and record the set of allowed constructors for their arguments. All calls to such a function would then be required to only included the correct possible constructors. Or is there a better way? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11475#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#11475: Lint should check for inexhaustive alternatives -------------------------------------+------------------------------------- Reporter: simonpj | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.10.3 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Ah yes, that is tricky. I'm really not sure how far it is profitable to go here. That is, when does keeping Lint happy become burdensome? But let me suggest a possible approach for this particular case. * Ids have Unfoldings inside them. One such is `OtherCon [AltCon]` (see `CoreSyn`) which says "this Id does not match any of these constructors". * Since they are Ids even lambda-bound variables may have unfoldings. I'm in two minds about whether this is a good idea, but you could give the `\y` in the defn of `lvl` above a `OtherCon[A,B]` unfolding. Then you wouldn't get a Lint error from the `case` in `lvl`. * But in exchange you'd have a new thing to check: that `lvl` was indeed applied to arguments that were not `A` or `B`. So I'm not sure we are any further forward. Maybe this can only be done with a better flow analysis... a "super-Lint" if you like. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11475#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC