
#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by simonpj): The `dataToTag#` saga has revealed several things, which I'd like to summarise here. * We have been guilty of mixing up two things: * A ''semantic'' property, namely that an expression, or variable, is guaranteed to be non-bottom. For example, given {{{ data T = MkT !Int f (MkT y) = ...y... }}} in the body of `f`, we know that `y` is non-bottom. * A ''representational'' propery, that a variable is bound to a heap pointer that (a) points directly to an evaluated heap-allocated value (e.g. a cons cell), and (b) is properly tagged, for data constructors with small enough tags. The representational property implies the semantic property, ''but not vice versa''. For example, in the above code, are we guaranteed that `y` (being the strict field of `MkT` is bound to a properly-tagged pointer to an `Int`? I used to think yes, but [https://ghc.haskell.org/trac/ghc/ticket/15696#comment:36 this example] (look at the bullets in that comment) shows that the answer is NO. My conclusion: we should keep the two properties separate. The Simplifier can maintain the first; but the second is much more fragile, and is the business of the code generator alone. * We have never implemented #14626, but it's pretty easy to do so provided we focus our attention on the code generator. The example there is {{{ f x = case x of y Red -> Green DEFAULT -> y }}} The resturn contract of `case` specifies that `y` is bound to a properly-tagged pointer directly to the value; The code generator can remember this fact, in its environment; and simply ''return'' `y` in the DEFAULT branch rather than ''entering'' it. * `exprOkForSpeculation` finishes with this case (in `app_ok`) {{{ _other -> isUnliftedType (idType fun) -- c.f. the Var case of exprIsHNF || idArity fun > n_val_args -- Partial apps || (n_val_args == 0 && isEvaldUnfolding (idUnfolding fun)) -- Let-bound values }}} I think the final disjunct is wrong, becuase it is fragile. Consider {{{ case x of y { DEFAULT -> ...y.... } }}} In that branch, is `y` ok-for-speculation? It looks so, because it has an evald unfolding (`OtherCon`). But the binder-swap transformation performed by FloatOut transforms it to {{{ case x of y { DEFAULT -> ...x.... } }}} and `x` is NOT ok-for-spec. So the binder-swap transformation might invalidate invariants. This is pretty obscure in practice, but I think we can make the compiler simpler and more correct by deleting those two lines. I think it will have zero effect because `exprOkForSpecuation` is seldom if ever called on expressions of lifted type. * Only two primops (currently) evaluate a lifted-type argument: `SeqOp` and `DataToTagOp`. The former has a special case in `app_ok` (in `exprOkForSpeculation` and the latter should too. * Once all this is done, the hack of making `DataToTagOp` have `can_fail=True` can be dispensed with. * We have related loose ends in #14677 and #15155. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:60 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler