[GHC] #15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: normal | Milestone: 8.6.1 Component: Compiler | Version: 8.6.1 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- Sometimes. Here is the thread that explains the bug: https://github.com/haskell/containers/issues/568 I originally reported this as a bug on `containers` issue tracker, but we seem to have concluded that this is probably a bug in the GHC optimizer itself. I think the shortest repro so far is this: {{{#!hs import qualified Data.Set as S main = print $ let {-# noinline f #-} f () = T2 in S.fromList [f (), f ()] data T = T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 deriving (Show, Read, Eq, Ord, Bounded, Enum) }}} which prints {{{#!hs fromList [T2,T2] }}} The person who derived this from my original repro says:
And as I said earlier, comment out the T9 constructor => prints fromList [T2] as it should.
Another interesting quote:
Can confirm. Tested with ghc-8.6.1, containers-0.6.0.1 and leancheck-0.7.5 (so it does not seem to depend on the testing framework). Error occurs:
* with ghc -O1 and -O2 (but not with -O0) * and if data type has at least 9 elements
So, likely a bug in ghc's optimizer.
in some cases, input has duplicates, but not always.
This is a bad one, makes GHC 8.6.1 totally unusable for me. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by dfeuer): * priority: normal => highest * cc: dfeuer (added) * failure: None/Unknown => Incorrect result at runtime -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): That's terrible! The following all produce the same (wrong) restul {{{ S.fromList (map f [ () | _ <- [1..10] ]) -- T2,T2 S.fromList [f (), f ()] -- T2,T2 S.fromList [f (), f (), f()] -- T2,T2 }}} However this works fine {{{ f () `S.insert` (f () `S.insert` S.empty) }}} The comparison operations generated by `deriving` look ok to me. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by osa1): * cc: osa1 (added) Comment: The problem is in this Core generated for this program: {{{ -- RHS size: {terms: 5, types: 2, coercions: 0, joins: 0/0} f_r6li f_r6li = \ ds_d3M6 -> case ds_d3M6 of { () -> T2 } -- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0} lvl_r6lj lvl_r6lj = f_r6li () main2 main2 = case dataToTag# lvl_r6lj of a#_a2rY { __DEFAULT -> case dataToTag# lvl_r6lj of b#_a2rZ { __DEFAULT -> ... }}} We get the tag of a CAF (`lvl_r6lj`) before evaluating it, so we get tag of a thunk. The need for evaluating argument of `dataToTag#` is explained in `Note [dataToTag#]` in primops.txt.pp. It seems like we're inlining `getTag`, and then somehow eliminating the case expression in `getTag` (to evaluate its argument). If I change the `INLINE` annotation of `getTag` to `NOINLINE` this works as expected. I don't know why we're elminating the `case` in `getTag` after inlining it yet. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): Ah, so it turns out we have a special case in CorePrep (which runs after simplifications) about `dataToTag#`, and we generate a case expression around its argument after all the simplifications. It's explained in `Note [dataToTag magic]` in CorePrep, and I can see in STG that it works as expected: {{{ lvl_r6ru = \u [] f_r6rt (); lvl1_r6rv = CCS_DONT_CARE :! [lvl_r6ru []]; main2 = \u [] case case lvl_r6ru of sat_s6xD { __DEFAULT -> dataToTag# [sat_s6xD]; } of a#_s6xE { __DEFAULT -> case case lvl_r6ru of sat_s6xF { __DEFAULT -> dataToTag# [sat_s6xF]; } of ... }}} So perhaps this is not because we get tag of a thunk. I don't know why not inlining `getTag` fixes this. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by j.waldmann): * cc: j.waldmann (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): Here's an example which doesn't depend on any code from `containers`. It also makes the derived `Ord` code explicit: {{{#!hs {-# LANGUAGE MagicHash #-} module Main where import qualified Data.Foldable as Foldable import GHC.Exts (dataToTag#, tagToEnum#, (==#), (<#)) main :: IO () main | not_ordered a b = print $ Foldable.foldl' (flip wumbo) (singleton a) b | otherwise = pure () where {-# NOINLINE f #-} f () = T2 {-# NOINLINE a #-} a = f () {-# NOINLINE b #-} b = [f ()] data T = T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 deriving (Eq, Show) instance Ord Main.T where compare a b = case dataToTag# a of a' -> case dataToTag# b of b' -> if tagToEnum# (a' <# b') :: Bool then LT else if tagToEnum# (a' ==# b') :: Bool then EQ else GT data Set a = Bin !a !(Set a) !(Set a) | Tip deriving Show not_ordered :: Ord a => a -> [a] -> Bool not_ordered _ [] = False not_ordered x (y : _) = x >= y wumbo :: Ord a => a -> Set a -> Set a wumbo x0 = go x0 x0 where go :: Ord a => a -> a -> Set a -> Set a go orig _ Tip = singleton orig go orig x t@(Bin y l r) = case compare x y of LT -> error "not used here" GT -> Bin y l (go orig x r) EQ -> t {-# INLINE wumbo #-} singleton :: a -> Set a singleton x = Bin x Tip Tip }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): That's really helpful Ryan, thank you. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): Apologies, I meant to mention in comment:6 what `wumbo` actually is: it's a stripped down version of `insert` intended to highlight that it appears to behave differently between GHC 8.4.3 and 8.6.1. The semantics of `wumbo` differs from that of `insert`, but here is the important bit: {{{ $ /opt/ghc/8.4.3/bin/ghc -O2 -fforce-recomp Bug.hs && ./Bug [1 of 1] Compiling Main ( Bug.hs, Bug.o ) Linking Bug ... Bin T2 Tip Tip $ /opt/ghc/8.6.1/bin/ghc -O2 -fforce-recomp Bug.hs && ./Bug [1 of 1] Compiling Main ( Bug.hs, Bug.o ) Linking Bug ... Bin T2 Tip (Bin T2 Tip Tip) }}} GHC 8.4.3's answer is definitely the correct one, since the only way you'd get 8.6.1's answer is by hitting the `GT` case of `wumbo` (which shouldn't happen if you're comparing `T2` to `T2`). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.1 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by monoidal): I'm minimizing this. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by bgamari): * milestone: 8.6.1 => 8.6.2 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): Didn't we just recently start using some pointer tagging for types with more than 7 constructors? I'm thinking something could be a drop off in that code. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by monoidal): Here's a smaller version. `ghc T15696 && ./T15696` prints `EQ` correctly, `ghc -O T15696 && ./T15696` prints `LT`. {{{ #!hs {-# LANGUAGE MagicHash #-} module Main where import GHC.Exts (dataToTag#, tagToEnum#, (==#), (<#)) main :: IO () main = print $ compare a T2 where {-# NOINLINE f #-} f = T2 {-# NOINLINE a #-} a = f data T = T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 deriving (Eq, Show, Ord) {- instance Ord Main.T where compare a b = case dataToTag# a of a' -> case dataToTag# b of b' -> if tagToEnum# (a' <# b') :: Bool then LT else if tagToEnum# (a' ==# b') :: Bool then EQ else GT -} }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Avoiding the use of type classes: {{{ main :: IO () main = print $ cmpT a T2 where {-# NOINLINE f #-} f = T2 {-# NOINLINE a #-} a = f data T = T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 -- deriving (Eq, Show, Ord) cmpT a b = case dataToTag# a of a' -> case dataToTag# b of b' -> if tagToEnum# (a' <# b') :: Bool then LT else if tagToEnum# (a' ==# b') :: Bool then EQ else GT }}} With `-O` we get `LT` for GHC 8.4 and 8.2 and earlier versions. Without `-O` it returns `EQ` as it should. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be
incorrect
-------------------------------------+-------------------------------------
Reporter: mrkkrp | Owner: (none)
Type: bug | Status: new
Priority: highest | Milestone: 8.6.2
Component: Compiler | Version: 8.6.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Incorrect result | Unknown/Multiple
at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Rev(s):
Wiki Page: |
-------------------------------------+-------------------------------------
Comment (by simonpj):
Omer, might you look at this. With `-ddump-stg` I see
{{{
Main.cmpT :: forall a1 a2. a1 -> a2 -> GHC.Types.Ordering
[GblId, Arity=2, Caf=NoCafRefs, Str=, Unf=OtherCon []] =
[] \r [a2_s3tf b_s3tg]
case
case a2_s3tf of sat_s3th [Occ=Once] {
__DEFAULT -> dataToTag# [sat_s3th];
}
of
a'_s3ti
{ __DEFAULT ->
case
case b_s3tg of sat_s3tj [Occ=Once] {
__DEFAULT -> dataToTag# [sat_s3tj];
}
of
b'_s3tk
{ __DEFAULT ->
case <# [a'_s3ti b'_s3tk] of {
__DEFAULT ->
case ==# [a'_s3ti b'_s3tk] of {
__DEFAULT -> GHC.Types.GT [];
1# -> GHC.Types.EQ [];
};
1# -> GHC.Types.LT [];
};
};
};
}}}
which looks right. In another variant (I made `dataToTag#` lazy) I saw
{{{
Main.cmpT :: forall a1 a2. a1 -> a2 -> GHC.Types.Ordering
[GblId,
Arity=2,
Caf=NoCafRefs,
Str=,
Unf=OtherCon []] =
[] \r [a2_s3tf b_s3tg]
case a2_s3tf of x1_s3th [Occ=Once] {
__DEFAULT ->
case dataToTag# [x1_s3th] of a'_s3ti {
__DEFAULT ->
case b_s3tg of x2_s3tj [Occ=Once] {
__DEFAULT ->
case dataToTag# [x2_s3tj] of b'_s3tk {
__DEFAULT ->
case <# [a'_s3ti b'_s3tk] of {
__DEFAULT ->
case ==# [a'_s3ti b'_s3tk] of {
__DEFAULT -> GHC.Types.GT [];
1# -> GHC.Types.EQ [];
};
1# -> GHC.Types.LT [];
};
};
};
};
};
}}}
But both stubbornly return `LT` instead of `EQ`. This must be a code-gen
or RTS issue. I have not looked at the Cmm. Might you do so?
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:14
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): Weirdly enough, I get different answers than the ones simonpj reported for the program in comment:13. To be explicit, if I'm using this program: {{{#!hs {-# LANGUAGE MagicHash #-} module Main where import GHC.Exts (dataToTag#, tagToEnum#, (==#), (<#)) main :: IO () main = print $ compare a T2 where {-# NOINLINE f #-} f = T2 {-# NOINLINE a #-} a = f data T = T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 deriving (Eq, Show) instance Ord Main.T where compare a b = case dataToTag# a of a' -> case dataToTag# b of b' -> if tagToEnum# (a' <# b') :: Bool then LT else if tagToEnum# (a' ==# b') :: Bool then EQ else GT }}} Then I consistently get `LT` regardless of optimization level: {{{ $ /opt/ghc/8.6.1/bin/ghc -O0 -fforce-recomp Bug.hs && ./Bug [1 of 1] Compiling Main ( Bug.hs, Bug.o ) Linking Bug ... LT $ /opt/ghc/8.6.1/bin/ghc -O2 -fforce-recomp Bug.hs && ./Bug [1 of 1] Compiling Main ( Bug.hs, Bug.o ) Linking Bug ... LT }}} If I replace all uses of `dataToTag#` with `getTag`, however: {{{#!hs {-# LANGUAGE MagicHash #-} module Main where import GHC.Base (getTag) import GHC.Exts (tagToEnum#, (==#), (<#)) main :: IO () main = print $ compare a T2 where {-# NOINLINE f #-} f = T2 {-# NOINLINE a #-} a = f data T = T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 deriving (Eq, Show) instance Ord Main.T where compare a b = case getTag a of a' -> case getTag b of b' -> if tagToEnum# (a' <# b') :: Bool then LT else if tagToEnum# (a' ==# b') :: Bool then EQ else GT }}} Only then do I get `EQ` without optimization: {{{ $ /opt/ghc/8.6.1/bin/ghc -O0 -fforce-recomp Bug.hs && ./Bug [1 of 1] Compiling Main ( Bug.hs, Bug.o ) Linking Bug ... EQ $ /opt/ghc/8.6.1/bin/ghc -O2 -fforce-recomp Bug.hs && ./Bug [1 of 1] Compiling Main ( Bug.hs, Bug.o ) Linking Bug ... LT }}} What's more, I consistently get the same sets of results in each version of GHC dating back to 8.2.2. This makes me believe that the bug that was exposed here has actually been lurking for quite a while (but perhaps a difference in inlining behavior in 8.6 only just recently exposed it). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): The fact that using `dataToTag#` directly produces incorrect results is perhaps not terribly surprising, giving that it [http://git.haskell.org/ghc.git/blob/21efbc7599e39ec93b8b13b7d7b84811226e6f6f... must always be applied to an evalauted argument] (see `Note [dataToTag#]`). The fact that the version using `getTag` breaks is more worrisome, since `getTag` actually forces its argument (with a bang pattern). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Changes (by maoe): * cc: maoe (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by RyanGlScott): It's also worth noting that you can trim `T` down to just two constructors: {{{#!hs {-# LANGUAGE MagicHash #-} module Main where import GHC.Base (getTag) import GHC.Exts (tagToEnum#, (==#), (<#)) main :: IO () main = print $ compare a T2 where {-# NOINLINE f #-} f = T2 {-# NOINLINE a #-} a = f data T = T1 | T2 deriving (Eq, Show) instance Ord Main.T where compare a b = case getTag a of a' -> case getTag b of b' -> if tagToEnum# (a' <# b') :: Bool then LT else if tagToEnum# (a' ==# b') :: Bool then EQ else GT }}} And the bug will still trigger. (If you define `data T = T2`, then the bug will go away.) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:18 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

The fact that using `dataToTag#` directly produces incorrect results is
#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:16 RyanGlScott]: perhaps not terribly surprising... Except that we have special code to make sure that always happens regardless. The STG Simon pasted looks right from that standpoint: it always applies `dataToTag#` under a case on its argument. So my bet is that Simon is right: something is going wrong in code generation or the RTS. Side note: it looks like I was wrong about tagging large types. I believe there was some talk of that, but it doesn't look like it's happened as yet. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:19 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

Didn't we just recently start using some pointer tagging for types with more than 7 constructors? I'm thinking something could be a drop off in
#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: new Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by heisenbug): Replying to [comment:11 dfeuer]: that code. Nope. That code is not ripe yet. This summer was very busy for me. Another thing: With Peter I discovered last Haskell eXchange (Oct. 2017) that enumerated types with derived `Ord` instances could use `getTag` to derive a constant-time `compare` (and `(==)` too). We were not aware of this optimization already being in the codebase. We'll surely revisit this. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:20 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Changes (by osa1): * status: new => patch * differential: => Phab:D5196 Comment: Simon, please review. There are a few ways to fix this and I probably did the most conservative thing by always introducing a case around `dataToTag#`. Perhaps we want to change how we record evaluated-ness of CAFs in their Ids instead. We may also want to pay more attention to #14677 if we rely on strictness annotations of data constructor fields to decide on evaluatedness of values elsewhere. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:21 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): One other thing to note is, as other commenters above already mentioned, this is _not_ a new bug! The bug was there since years. I think something else (maybe changes in simplifier) revealed it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:22 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Changes (by alpmestan): * cc: alpmestan (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:23 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): Thinking about this more. I think the only case where we can actually avoid entering the argument of `dataToTag#` is when the argument is a static closure. For anything else (a dynamic closure, a CAF) we need to enter it. I don't know how often people do `dataToTag# C` (for some constructor `C`), perhaps it's fine to always enter it. If it is then we can remove the special case for `dataToTag#` in CorePrep and in the codegen (in Cmm) we can just enter the argument. If we decide to do that then perhaps we should also revisit the `getTag` and remove the bang pattern on its argument. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:24 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): Replying to [comment:24 osa1]:
Thinking about this more. I think the only case where we can actually avoid entering the argument of `dataToTag#` is when the argument is a static closure. For anything else (a dynamic closure, a CAF) we need to enter it.
I'm a tad confused. If the tag bits are non-zero, can't we just use them and avoid dereferencing the pointer? Or is that already folded into "enter it"? Or am I just wrong? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:25 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): Ah... I am not entirely wrong, but I am partly wrong, because types with many constructors get tagged 1 when evaluated. I think we should at least be able to add a rule that recognizes when `dataToTag#` is used with a type with only a few constructors: see the logic in the `cgAlts gc_plan bndr (AlgAlt tycon) alts` case in `compiler/codeGen/StgCmmExpr.hs`. That way we only have to call `getConstrTag` when 1. The type is unknown, 2. The type has too many constructors, or 3. The tag bits are 0. I imagine you can do this quite easily by copying a bit of the `splitTyConApp_maybe` logic from `tagToEnumRule` to `dataToTagRule` in `PrelRules.hs` and adding a `DataToTagUsingTagOp` to supplement `DataToTagOp`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:26 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): Aha! Actually, I believe even one fall-back should be able to use tag bits, as long as they're greater than 1. So: === When the type is recognized as having a small number of constructors Check the tag bits. If they '''are 0''', enter the closure. Otherwise, subtract 1 to get the tag. === When the type is unknown Check the tag bits. If they '''are less than or equal to 1''', enter the closure. Otherwise, subtract 1 to get the tag. === When the type has too many constructors Enter the closure. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:27 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Changes (by sgraf): * cc: sgraf (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:28 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): I guess what I'm suggesting would be implemented most easily by replacing the `dataToTag#` primop a little bit. {{{#!hs unKnownType, smallType, largeType :: Int# unKnownType = 0# smallType = 1# largeType = 2# dataToTag# x = dataToTagUsing# unknownType x -- Takes the "strategy" to use dataToTagUsing# :: Int# -> a -> Int# }}} This arrangement is designed so we shouldn't have to do any painful restructuring of the `caseRules` or `dataToTagRule`. The only potential problem I see is if any user-written `RULES` match on `dataToTag#`, because that won't work anymore. Should we worry about that? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:29 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Before deciding on a ''solution'', let's record what the ''problem'' is. The original thinking was this. * `dataToTag#` is a primop, so it has no business doing anything as complicated as evaluating its argument; we already have `case` expressions that the code-gen knows how to compile. * So `dataToTag#` expects an evaluated argument; in fact, stronger than that, it expects a pointer ''to the data value itself'', so that it can extract the tag directly from the info table. * But if it so happens that the value in question already ''is'' evaluated, that's a waste. Example {{{ f x = case x of y -> ...(dataToTag# y)... }}} It seems a bit silly for `dataToTag#` to re-evaluate `y`. * Hence, in `CorePrep` (see `Note [dataToTag magic]`) we add a `case` around the argument to `dataToTag#` ''unless'' the argument is already evaluated. * The "already-evaluated" test is `exprIsHNF`. * '''But alas''', while `exprIsHNF` guarantees that the thing will evaluate in O(1) time, it does ''not'' guarantee that we have a pointer to the data value itself. Omer accurately diagnoses this problem in the Note in his draft Phab:D5196: {{{ Note [Always introduce case around dataToTag# arg] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We used to only generate a case expression around dataToTag# argument when it's not known to be evaluated (checked with `exprIsHNF`), but this is incorrect. Here's an example: data T = T1 | T2 main = print (I# (dataToTag# a)) where {-# NOINLINE f #-} f = T2 {-# NOINLINE a #-} a = f `f` is a static closure f_r1zz :: Main.T [GblId, Caf=NoCafRefs, Unf=OtherCon []] = CCS_DONT_CARE Main.T2! []; but `a` is an _updatable_ CAF! a_r1zA :: Main.T [GblId, Unf=OtherCon []] = [] \u [] f_r1zz; An updatable CAF is not in HNF (entering the closure does the `newCAF()` stuff and updates the closure with an IND_STATIC, the usual CAF evaluation routines), but according to `exprIsHNF` `a` is! }}} * I thought of having a more conservative test in `CorePrep`. But the rot goes further. Consider this, from `Note [dataToTag magic]` {{{ data T = MkT !Bool f v = case v of MkT y -> dataToTag# y }}} We certainly know that `y` will be evaluated, because `MkT` is a strict constructor. But does it guarantee to point directly to the data value? No! The case-to-let transformation in the Simplifier (`Simplify.doCaseToLet`) uses `exprIsHNF` and hence will drop the eval on `MkT`'s argument for things like Omer's `a` binding. And that is right in a way: the argument to `MkT a` certainly isn't bottom! But nor does it point to the data value. So that is a problem. But I think there is another. Look in comment:13 and comment:14. Here we have the extra `case` expressions correctly inserted, but we ''still'' get the wrong answer. '''So I think there may be ''two'' bugs'''. I'd like to understand the second before looking for a fix. Omer: could you investigate why comments 13 and 14 go wrong? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:30 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): Sure. The problem in comment:13 is exactly what I explained in my patch (and you also copied it here in comment:30). I get this STG with GHC HEAD, -O0: {{{ f_r1C3 :: Main.T [GblId, Caf=NoCafRefs, Unf=OtherCon []] = CCS_DONT_CARE Main.T2! []; a_r1C4 :: Main.T [GblId, Unf=OtherCon []] = [] \u [] f_r1C3; sat_s1OH :: GHC.Types.Ordering [LclId] = [] \u [] case dataToTag# [a_r1C4] of a'_s1OD { __DEFAULT -> case <# [a'_s1OD 1#] of sat_s1OE [Occ=Once] { __DEFAULT -> case tagToEnum# [sat_s1OE] of { GHC.Types.False -> case a'_s1OD of { __DEFAULT -> GHC.Types.GT []; 1# -> GHC.Types.EQ []; }; GHC.Types.True -> GHC.Types.LT []; }; }; }; }}} Notice that we do `dataToTag# [a_r1C4]` and `a_r1C4` is an updatable CAF. The result is I get `LT` instead of `EQ`. (this is one of the regression tests I added) I get very similar STG (with the same bug) and same results with these configurations: GHC HEAD, GHC 8.2.2, GHC 8.0.2. All tried with -O0, -O1, -O2. I'm guessing that you added a NOINLINE for `cmpT` in comment:14. When I do that I get the right answer with all optimisation settings (GHC HEAD), and that makes sense becuase the arguments are now known to be evaluated and `exprIsHNF` correctly returns `False` for the arguments, so we do case on the args. I don't know how you get incorrect result in the STG shown in comment:14. Could it be that you used an older binary of the test program or something like that? If you give more detailed instructions to reproduce I can take a look. I agree that if the STG in comment:14 is producing wrong result then there's at least one more bug. However I get identical STG when I add NOINLINE aroung `cmpT` (with GHC HEAD, with -O2) and it works as expected. You said you also made `dataToTag#` lazier, but because I get identical STG as you I haven't tried to change the primop laziness (it shouldn't matter in STG). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:31 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj):
I get this STG with GHC HEAD, -O0:
That's what you get for monoidal's code in comment:18. Note the missing eval before the `dataToTag#`. But it is ''not'' what you get for the code in comment:13. The evals are there, as comment:14 shows. And yet we get the wrong answer. But why? That is the second bug. Can you see why that happens? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:32 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): I'm confused
That's what you get for monoidal's code in comment:18
I didn't even look at comment:18. I tried your code in comment:13 and what happens there is exactly what I explained in my patch.
The evals are there, as comment:14 shows.
comment:14 shows that evals are there if I don't inline `cmpT`, and that makes sense as I explained in comment:31. We can't assume that args are evaluated so we eval them.
And yet we get the wrong answer.
I get the right answer when I get the STG in comment:14. Could you try to reproduce the error in comment:14 and give me more detailed instructions (showing the source and invoked GHC commands)? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:33 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): Oops sorry. I'd modified the code slightly! Try this {{{ {-# LANGUAGE MagicHash #-} module Main where import GHC.Exts (dataToTag#, tagToEnum#, (==#), (<#)) import GHC.Base ( getTag ) main :: IO () main = print $ cmpT a T2 where {-# NOINLINE f #-} f = T2 {-# NOINLINE a #-} a = f data T = T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 -- deriving (Eq, Show, Ord) cmpT a b = case getTag a of a' -> case getTag b of b' -> if tagToEnum# (a' <# b') :: Bool then LT else if tagToEnum# (a' ==# b') :: Bool then EQ else GT }}} Compile with -O and run. It prints `LT` when it should print `EQ`. But the evals are there! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:34 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): OK. This is the STG I get: {{{ Main.main_f [InlPrag=NOINLINE] :: Main.T [GblId, Caf=NoCafRefs, Unf=OtherCon []] = CCS_DONT_CARE Main.T2! []; Main.main_a [InlPrag=NOINLINE] :: Main.T [GblId, Unf=OtherCon []] = [] \u [] Main.main_f; Main.main1 :: GHC.Base.String [GblId] = [] \u [] case dataToTag# [Main.main_a] of a'_s3tm { __DEFAULT -> case <# [a'_s3tm 1#] of { __DEFAULT -> case a'_s3tm of { __DEFAULT -> GHC.Show.$fShowOrdering1; 1# -> GHC.Show.$fShowOrdering3; }; 1# -> GHC.Show.$fShowOrdering5; }; }; }}} Exactly the same problem, no evals around the CAF (`main_a`). Are you maybe confused because in the STG dump you also see the STG for `cmpT`? It's actually inlined in `main` so the top-level for `cmpT` is not used. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:35 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj):
Are you maybe confused because in the STG dump you also see the STG for cmpT? It's actually inlined in main so the top-level for cmpT is not used.
That is exactly what I was doing. Thanks!
I backed up some more to look at comment:6, which does not have a top-
level no-inline CAF. I see this:
{{{
$sgo_r5Tk :: Main.T -> Main.T -> Main.Set Main.T -> Main.Set Main.T
[GblId, Arity=3, Str=, Unf=OtherCon []] =
sat-only [] \r [orig_s5Wj ds_s5Wk ds1_s5Wl]
case ds1_s5Wl of wild_s5Wm [Occ=Once] {
Main.Bin y_s5Wn l_s5Wo [Occ=Once] r_s5Wp [Occ=Once] ->
case
case ds_s5Wk of sat_s5Wq [Occ=Once] {
__DEFAULT -> dataToTag# [sat_s5Wq];
}
of
a'_s5Wr
{ __DEFAULT ->
case dataToTag# [y_s5Wn] of b'_s5Ws {
__DEFAULT ->
case <# [a'_s5Wr b'_s5Ws] of {
__DEFAULT ->
...
}}}
Here we take apart a `Bin`, and call `dataToTag#` on the contents; and
because of the `exprIsHNF` stuff there is
no guarantee that the argument to `Bin` points directly to the data value.
But in looking at this I found something else! In coment:6 there is no
top-level CAF with a NOINLINE, so
how do things go wrong. Here's how.
* We start with
{{{
thk = f ()
g x = ...(case thk of v -> Bin v Tip Tip)...
}}}
So far so good; the argument to `Bin` (which is strict) is evaluated.
* Now we do float-out. And in doing so we do a reverse binder-swap (see
`Note [Binder-swap during float-out]` in `SetLevels`) thus
{{{
g x = ...(case thk of v -> Bin thk Nil Nil)...
}}}
The goal of the reverse binder-swap is to allow more floating -- and it
does! We float the `Bin` to top level:
{{{
lvl = Bin thk Tip Tip
g x = ...(case thk of v -> lvl))...
}}}
* Now you can see that the argument of `Bin`, namely `thk`, points to the
thunk, not to the value as it did before; and that gives rise to the bug.
Is this wrong? Not really. We are still guaranteed that the argument to
`Bin` in `lvl` will be evaluated (by that `case thk`) before `lvl` is
used. But we are no longer guaranteed that the argument to `Bin` points
directly to the evaluated value.
So now I understand.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:36
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): OK, moving on to solutions. Your patch seems OK. But I wonder if `CorePrep` is the right place to do it. After all, some other STG manipulation might break it again. I think the right thing is for the code generator to do the job; that is, in effect implement `dataToTag#` properly. That is, in `StgCmmPrim`, in {{{ cgOpApp (StgPrimOp primop) args res_ty = do }}} add a special case for `DataToTagOp`, when we are compiling `dataToTag# x`. Then behave exactly as if we'd seen `case x of y -> dataToTag## y`, where by `dataToTag## y` I mean generate the code the looks in the info table. (We have that code here {{{ -- #define dataToTagzh(r,a) r=(GET_TAG(((StgClosure *)a)->header.info)) -- Note: argument may be tagged! emitPrimOp dflags [res] DataToTagOp [arg] = emitAssign (CmmLocal res) (getConstrTag dflags (cmmUntag dflags arg)) }}} ). And by "behave exactly as if we'd see case ..." I roughly mean call `StgCmmExpr.cgCase`. But that need some `alts` which we don't conveniently have. The easiest thing would be to take `-- the general case` equation for `cgCase` and split off the bit that does the eval, so that we can call it from `dataToTag#`. Doing this is not trivial, but it feels like the Right Thing, and will remove the magic from `dataToTag#`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:37 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): Right, fixing this in Cmm is also what I suggested in comment:24. I think two key questions are: - Can we optimise this? That is, can we know for certain that the argument of a `dataToTag#` points directly to a value (without going through blackholes or other indirections like IND_STATIC)? - Is this worth optimising? Unless the answer to both of those is yes, then I think it makes sense to do this in Cmm (when compiling the primop). Then we can remove a bunch of notes about `getTag` and hacks in CorePrep, and simplify `getTag` to `getTag = dataToTag#`. I think we can also update the primop as `can_fail = False`. It seems to me that Core is not the right place to answer (1). Even in STG I don't know if we can answer that question with certainty. I think it's doable in Cmm where we know about "lambda form" of ids (`LambdaFormInfo`) which tells us about how to enter an object. If "lambda form" of an argument is `LFCon` then we optimise, otherwise we enter. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:38 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): What I've been suggesting is that even when we don't statically know if it's been evaluated or tagged, we should surely be using tag bits when we find them. It looks like an oversight that we haven't done so in the past, and I foresee substantial benefits to fixing that. Perhaps we can win back more performance than we lose. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:39 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by osa1): Right, agreed that looking at tag bits would work for small types. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:40 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by simonpj): I agree with comment:38. Yes, the code generator is the right place to do it. Yes, we could ooptimise for `LFCon`, but the simplifier will have done that already. The only case I think we could reliably optimise, that would not be done already, is {{{ case x of y A -> blah DEFAULT -> ...(dataToTag# y)... }}} In this case we really do know that `y` points to the value. It would not be hard to let the code gen spot this; but I doubt it would happen much.
Right, agreed that looking at tag bits would work for small types.
This should happen automatically if we use the code for `cgCase`. It already has a fast-path for the case when the scrutinee is evaluated. But you point is perhaps that for small types we don't need to index the info table: the tag is in the bits. Yes, that's a good idea. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:41 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196 Wiki Page: | -------------------------------------+------------------------------------- Comment (by dfeuer): Ah, I see what you're saying. The 0 tag is handled by the `case` wrapper (or C-- equivalent). So for small types, we don't need any extra test; we can just use the tag bits directly. I suppose also that my attempt to use tag bits for unknown types is more trouble than it's worth; we should just assume that those have too many constructors. As for the type-driven rewriting, I think we basically have two options: 1. Do it in PrelRules. I think this requires something like the `dataToTagWith#` primop I mentioned earlier. 2. Do it sometime later (tidy core, core prep, or lowering to STG). This lets us stick with `dataToTag#` throughout core2core, which is nice, but I don't know enough to know if any of those make sense or where/how to slot in the rule. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:42 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Changes (by osa1): * differential: Phab:D5196 => Phab:D5196, Phab:D5201 Comment:
As for the type-driven rewriting
This is not possible unless we somehow distinguish "stuff that directly points to a value (without going through indirections)" from other at the type level. ---- I submitted another diff that fixes this bug in Cmm. It also does a bunch of simplifications in other parts of the compiler (removes notes, special cases, and hacks for `dataToTag#`). Here are some example code we generate: Program: {{{ data T = T1 | T2 main = do print (I# (dataToTag# f)) print (I# (dataToTag# a)) where {-# NOINLINE f #-} f = T2 {-# NOINLINE a #-} a = f }}} For the first `dataToTag#` we generate: {{{ _s1yo::I64 = 1; // CmmAssign }}} for the second {{{ // ======= DATA TO TAG ======== Hp = Hp - 16; // CmmAssign I64[Sp - 24] = c1yE; // CmmStore R1 = a_r1lF_closure; // CmmAssign Sp = Sp - 24; // CmmAssign if (R1 & 7 != 0) goto c1yE; else goto c1yF; // CmmCondBranch c1yF: // global call (I64[R1])(R1) returns to c1yE, args: 8, res: 8, upd: 24; // CmmCall, this is where we evalaute the arg c1yE: // global _c1yD::I64 = R1; // CmmAssign // ======= DATA TO TAG SMALL FAMILY ======== _s1yr::I64 = _c1yD::I64 & 7 - 1; // CmmAssign, read the tag bits }}} If I make this type a "big family", then we generate (for the second `dataToTag#`) {{{ // ======= DATA TO TAG ======== Hp = Hp - 16; // CmmAssign I64[Sp - 24] = c1A0; // CmmStore R1 = a_r1mL_closure; // CmmAssign Sp = Sp - 24; // CmmAssign if (R1 & 7 != 0) goto c1A0; else goto c1A1; // CmmCondBranch c1A1: // global call (I64[R1])(R1) returns to c1A0, args: 8, res: 8, upd: 24; // CmmCall, enter the argument c1A0: // global _c1zZ::I64 = R1; // CmmAssign // ======= DATA TO TAG GENERAL CASE ======== _s1zN::I64 = %MO_UU_Conv_W32_W64(I32[I64[_c1zZ::I64 & (-8)] - 4]); // CmmAssign, read info table }}} The patch is not ready (I'll need to update some notes, I also left some TODOs and questions in the code) but it fixes the bug and demonstrates that this is possible (and even easy) to do in Cmm. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:43 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): So we mention this "optimisation" without describing what it is, and I realized that that may cause some confusion so let's define it. By "optimisation" I mean that in some cases when compiling `dataToTag#` we can avoid entering the argument, because we already know enough about it. - In compile time we sometimes know that the argument is definitely some constructor `C`. For example, in the example in comment:43, we know that `f` is `T2`, so we should be able to compile `dataToTag# f` to `1#`. Another example is `case foo of X -> ... dataToTag# foo ...`. Note that we currently (with GHC 8.6 and HEAD) don't do this for `f` in comment:43, even with -O2. However with Phab:D5201 `f` in comment:43 is optimised to `1#` in Cmm level. The case expression is optimised in all versions. (both diffs preserve Core transformations and rules, so anything optimised at Core level by GHC HEAD will still be optimised) (I don't know why `dataToTag# f` in comment:43 is not optimised in Core level, but perhaps that can be tracked in another ticket) - In compile time we sometimes know that a value is already evaluated, but we don't know specifics (i.e. which constructor it is). For example, in a pattern match, a variable binding a strict field should be evaluated. Another example is `let !foo = f x in ... dataToTag# foo ...`. In these cases we should be able to avoid entering the argument. There are a few problems with this: First, as #14677 demonstrates, we sometimes fail to tag values properly. So for example if a strict field's type is small, we still can't look at tag bits, we need to read the info table. Second, the analysis for this should be more strict than the current `exprIsHNF` or `isEvaldUnfolding` (see Note [Always introduce case around dataToTag# arg] in comment:30 for what goes wrong otherwise). However, we currently don't have an analysis for this, and I don't know how easy it'd be to implement in Core (remember that we don't know about CAFs in Core until CorePrep, and this information is necessary for this analysis). I don't know if going into the trouble of implementing a new analysis in Core just for this purpose is worth it. - For all other cases we need to enter the argument in runtime. If the argument is a small type then we can look at the tag bits, otherwise we read the info table. In other words, case (1) is when we know the constructor of the argument in compile time, case (2) is when we don't know the constructor, but we know that the argument is already evaluated, case (3) is when we don't know anything about the argument. Looking at the tag bits in (3) is currently only done by Phab:D5201, but it could be easily added to other implementations. (2) is currently only done by GHC 8.6 (and HEAD), but it's done incorrectly (thus this ticket). I think we should avoid this at least for now. (1) Is done best in Phab:D5201, because we check lambda form of the argument in Cmm, which optimises some code that escapes Core optimisations. However, I think we should be able to improve simplifier somehow to optimise `dataToTag# f` in comment:43. We can track this in another ticket. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:44 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): Here's my (hopefully final) proposal: (1) is best done in Core. I'll remove the relevant code from Phab:D5201. For cases that should be optimised but currently aren't, we can open a new ticket and track the progress there. (2) is best avoided. It opens a can of worms, and it's what caused this ticket. (3) Should be done in Cmm. Phab:D5201 already does this, but we need to do some refactoring to be able to reliably check whether the argument type is a large or small family (to decide whether to check tag bits of read the info table). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:45 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by simonpj): Comments about Phab:D5201. I'm not looking at the details yet (it's just a draft), but * I think this is the Right Place to deal with `dataToTag#`. Bravo. * The known-constructor case surely will be handled in the simplifier; if not now then soon. Handling it here is not wrong, but probably unnecesssary. * The thing that we CAN ONLY handle here is {{{ ...(case x of y { A -> blah DEFAULT -> ....(dataToTag# y)... }}} Here we know that `y` really points to the value, so `dastaToTag#` does not need to do a redundant eval. However I have just realised that this optimisation is available for '''any''' case expression, not just `dataToTag#`. Consider {{{ ...(case x of y { A -> blah DEFAULT -> ....(case y of B -> blah2 C -> blah3 )... }}} Here we know that `y` is fully evaluated and points to the final value; that's the promise of the outer case expression. So we can directly test y's tag bits without worrying that it might be unevaluated, need to build a return point and info table etc. Ha! I'm not sure how common this is. But perhaps it is worth a separate ticket. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:46 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by simonpj):
In compile time we sometimes know that a value is already evaluated, but we don't know specifics
This ticket has convinced me that the property that "y is a correctly tagged pointer directly to an evaluated value" is ''extremely delicate''. The only time we are really sure of this is in the case-binder of a case expression: {{{ case e of y <alts> }}} In `<alts>` we know that `y` really is a tagged pointer and points to the value. I used to think that this was also true of the strict fields of a data constructor, but not so! See comment:36. Moreover, as comment:36 shows, the Simplifier (for good reasons) does not guaranteed to maintain the Delicate Property, even it if it holds at some point. Only the code generator knows for sure. Lets ''not'' attempt to do this in Core. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:47 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): Great, I think we're on the same page. I'm finalizing Phab:D5201. Currently stuck on a let/app invariant invalidation (Core lint error) when I mark `dataToTag#` as can't fail (`can_fail = False`). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:48 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): Simon, what do you think about marking `dataToTag#` as can't fail? I thought it makes sense, but when I do that I get this lint error: (add type annotation on scrutinee) {{{ <no location info>: warning: In the expression: tagToEnum# @ Bool (case (ds :: Instr) of lwild { __DEFAULT -> 0#; C_ALabel_1 ipv -> 1# }) This argument does not satisfy the let/app invariant: case ds of lwild { __DEFAULT -> 0#; C_ALabel_1 ipv -> 1# } }}} Note that the expression does not actually have `dataToTag#`. Apparently making `dataToTag#` can't fail enables some transformations, which leads to this. (Type of `ds` is a lifted sum type named `Instr`) If I don't make it "can't fail", I get this expression instead {{{ case GHC.Prim.dataToTag# @ Instr ds of b# { __DEFAULT -> GHC.Prim.tagToEnum# @ Bool (case b# of { __DEFAULT -> 0#; 0# -> 1# }) }; }}} The version with "can't fail" is better because we eliminate a redundant `dataToTag#` call, but apparently the resulting expression is not "OK for speculation". I think this expression is not OK for speculation because the scrutinee is lifted, but I'm not sure. I also don't know why lifted scrutinee is a problem for speculation.. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:49 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by simonpj):
Simon, what do you think about marking dataToTag# as can't fail?
It makes sense to me to do so -- but it's an unforced change. It's not so marked today, primops.txt.pp even says why {{{ {- Note [dataToTag#] ~~~~~~~~~~~~~~~~~~~~ The dataToTag# primop should always be applied to an evaluated argument. The way to ensure this is to invoke it via the 'getTag' wrapper in GHC.Base: getTag :: a -> Int# getTag !x = dataToTag# x But now consider \z. case x of y -> let v = dataToTag# y in ... To improve floating, the FloatOut pass (deliberately) does a binder-swap on the case, to give \z. case x of y -> let v = dataToTag# x in ... Now FloatOut might float that v-binding outside the \z. But that is bad because that might mean x gets evaluated much too early! (CorePrep adds an eval to a dataToTag# call, to ensure that the argument really is evaluated; see CorePrep Note [dataToTag magic].) Solution: make DataToTag into a can_fail primop. That will stop it floating (see Note [PrimOp can_fail and has_side_effects] in PrimOp). It's a bit of a hack but never mind. }}} Are we good to go if you don't make this change? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:50 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by simonpj):
In compile time we sometimes know that the argument is definitely some constructor C
The simplifier does optimise this: {{{ data T = T1 | T2 | T3 | T4 Char boo = T4 'x' f x = case x of T1 -> getTag x T2 -> getTag boo _ -> getTag x }}} produces `-ddump-simpl` {{{ f :: T -> GHC.Prim.Int# f = \ (x_aXJ :: T) -> case x_aXJ of wild_Xf { __DEFAULT -> GHC.Prim.dataToTag# @ T wild_Xf; T1 -> 0#; T2 -> 3# } }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:51 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): I understand why it's currently marked as `can_fail`, but if my understanding is correct we should be able to mark it as `can_fail = False` now. I'm curious why that's causing problems. Also, as shown in comment:48, the code is actually better with `can_fail = False`.
Are we good to go if you don't make this change?
I tried to validate now without `can_fail = False`. I'm getting a segfault in haddock (when building docs during validate), then in the test suite haddock perf tests are failing. Other tests are passing. I'll debug. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:52 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by simonpj):
if my understanding is correct we should be able to mark it as can_fail = False now
Why do you think the Note is now not relevant? It looks every bit as pertinent now as before. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:53 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): Well because this transformation \z. case x of y -> let v = dataToTag# y in ... ==> \z. case x of y -> let v = dataToTag# x in ... is not a problem anymore. Although I now realize that floating `v` outsize of `\z` is still a problem, because that'd mean evaluating `x` more eagerly. So I think first part of the note isn't relevant anymore, but second part (floating out) is. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:54 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

bt #0 evacuate1 (p=p@entry=0x420fe54720) at rts/sm/Evac.c:642 #1 0x00007fdcea628c35 in scavenge_block1 (bd=0x420fe01500) at rts/sm/Scav.c:541 #2 0x00007fdcea648f5f in scavenge_find_work () at rts/sm/Scav.c:2067 #3 scavenge_loop1 () at rts/sm/Scav.c:2130 #4 0x00007fdcea6502e2 in scavenge_until_all_done () at rts/sm/GC.c:1090 #5 0x00007fdcea650c3f in GarbageCollect (collect_gen=collect_gen@entry=1, do_heap_census=do_heap_census@entry=false, gc_type=gc_type@entry=2, cap=cap@entry=0x7fdcea686ac0 <MainCapability>, idle_cap=idle_cap@entry=0x1f14dd0) at rts/sm/GC.c:421 #6 0x00007fdcea635f17 in scheduleDoGC (pcap=pcap@entry=0x7ffdb2c2fea0, task=task@entry=0x1f109b0, force_major=force_major@entry=false) at rts/Schedule.c:1798 #7 0x00007fdcea6368ec in schedule (initialCapability=initialCapability@entry=0x7fdcea686ac0 <MainCapability>, task=task@entry=0x1f109b0) at rts/Schedule.c:546 #8 0x00007fdcea637e41 in scheduleWaitThread (tso=0x4200006388, ret=ret@entry=0x0, pcap=pcap@entry=0x7ffdb2c2ff38) at rts/Schedule.c:2537 #9 0x00007fdcea642a98 in rts_evalLazyIO (cap=cap@entry=0x7ffdb2c2ff38,
#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): I did a little bit of debugging for the segfault problem. I'm wondering if I'm triggering some other bug, because the segfault is happening during GC: {{{ p=p@entry=0x756150, ret=ret@entry=0x0) at rts/RtsAPI.c:530 #10 0x00007fdcea6433ec in hs_main (argc=<optimized out>, argv=<optimized out>, main_closure=0x756150, rts_config=...) at rts/RtsMain.c:72 #11 0x0000000000749560 in main () }}} The info table pointer is wrong: {{{
print info $1 = (const StgInfoTable *) 0x72 }}}
-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:55 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): Interestingly, if I revert the Cmm bits and always introduce a case on dataToTag# args in CorePrep I can validate. Patch is [https://github.com/osa1/ghc/commit/1b9bd4769c2c70b23ef8739aba116f9d18628b00 here]. I'll compare the differences in generated code in this branch and Phab:D5201. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:56 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): Phab:D5201 currently validates and is ready for reviews. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:57 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by simonpj): I'm confused. You report a set-fault in comment:55; but now you say it's ready for review. How did you fix it? What is the new design? It's hard to review the patch without understanding the thinking. Thanks! -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:58 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): Sorry it's a bit hard to follow -- I asked Simon Marlow about the segfault, and he correctly guessed that the problem is that primops are assumed to be non-allocating and I have to update `StgCmmExpr.isSimpleOp`. This change is included in the diff and it fixes the segfault. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:59 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by simonpj): The `dataToTag#` saga has revealed several things, which I'd like to summarise here. * We have been guilty of mixing up two things: * A ''semantic'' property, namely that an expression, or variable, is guaranteed to be non-bottom. For example, given {{{ data T = MkT !Int f (MkT y) = ...y... }}} in the body of `f`, we know that `y` is non-bottom. * A ''representational'' propery, that a variable is bound to a heap pointer that (a) points directly to an evaluated heap-allocated value (e.g. a cons cell), and (b) is properly tagged, for data constructors with small enough tags. The representational property implies the semantic property, ''but not vice versa''. For example, in the above code, are we guaranteed that `y` (being the strict field of `MkT` is bound to a properly-tagged pointer to an `Int`? I used to think yes, but [https://ghc.haskell.org/trac/ghc/ticket/15696#comment:36 this example] (look at the bullets in that comment) shows that the answer is NO. My conclusion: we should keep the two properties separate. The Simplifier can maintain the first; but the second is much more fragile, and is the business of the code generator alone. * We have never implemented #14626, but it's pretty easy to do so provided we focus our attention on the code generator. The example there is {{{ f x = case x of y Red -> Green DEFAULT -> y }}} The resturn contract of `case` specifies that `y` is bound to a properly-tagged pointer directly to the value; The code generator can remember this fact, in its environment; and simply ''return'' `y` in the DEFAULT branch rather than ''entering'' it. * `exprOkForSpeculation` finishes with this case (in `app_ok`) {{{ _other -> isUnliftedType (idType fun) -- c.f. the Var case of exprIsHNF || idArity fun > n_val_args -- Partial apps || (n_val_args == 0 && isEvaldUnfolding (idUnfolding fun)) -- Let-bound values }}} I think the final disjunct is wrong, becuase it is fragile. Consider {{{ case x of y { DEFAULT -> ...y.... } }}} In that branch, is `y` ok-for-speculation? It looks so, because it has an evald unfolding (`OtherCon`). But the binder-swap transformation performed by FloatOut transforms it to {{{ case x of y { DEFAULT -> ...x.... } }}} and `x` is NOT ok-for-spec. So the binder-swap transformation might invalidate invariants. This is pretty obscure in practice, but I think we can make the compiler simpler and more correct by deleting those two lines. I think it will have zero effect because `exprOkForSpecuation` is seldom if ever called on expressions of lifted type. * Only two primops (currently) evaluate a lifted-type argument: `SeqOp` and `DataToTagOp`. The former has a special case in `app_ok` (in `exprOkForSpeculation` and the latter should too. * Once all this is done, the hack of making `DataToTagOp` have `can_fail=True` can be dispensed with. * We have related loose ends in #14677 and #15155. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:60 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by dfeuer): I believe that `spark#` and `par#` also evaluate lifted-type arguments. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:61 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Changes (by osa1): * related: => #14677, #15155 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:62 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: (none) Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by bgamari): We will move ahead merging Phag:D5201 to move ahead with 8.6.2 but will continue to pursue the alternatives in Phab:60. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:63 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Changes (by bgamari): * owner: (none) => osa1 Comment: Omer will investigate. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:64 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by simonpj):
I believe that spark# and par# also evaluate lifted-type arguments.
They do have a lifted-type arg, but they don't evalauate it: they put it in a spark queue. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:65 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be
incorrect
-------------------------------------+-------------------------------------
Reporter: mrkkrp | Owner: osa1
Type: bug | Status: patch
Priority: highest | Milestone: 8.6.2
Component: Compiler | Version: 8.6.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Incorrect result | Unknown/Multiple
at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196,
Wiki Page: | Phab:D5201
-------------------------------------+-------------------------------------
Comment (by Ömer Sinan Ağacan

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): The bug fix is now merged. I'll be investigating the loose ends mentioned in comment:60. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:67 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): Simon, you said that we can remove `can_fail=True` for `dataToTag#` after removing various hacks (in comment:60), and I also used to think that this should be the case, but thinking about this more I think `dataToTag#` should stay as `can_fail=True`. The reason is because `dataToTag#` evaluates the argument and returns an unboxed value, so I think these two expressions calls should have the same value: {{{ exprOkForSpeculation `dataToTag# foo` exprOkForSpeculation `case foo of x -> 1#` -- note: foo is lifted }}} Looking at the code I see that the latter returns `False` (because the scrutinee is lifted), so for the former to return false we need to mark `dataToTag#` as `can_fail = True`. Does this make sense? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:68 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by simonpj): It's tricky, I agree; and I agree that's the result that `exprOkForSpeculation` should return False in these cases. But as I say in item (6) of comment:60, `app_ok` already has a special case for `SeqOp` for this very reason, and I think we should just extend that to `DataToTagOp`. In fact I got as far as writing the Note to accompany the change to `app_ok`: {{{ Note [PrimOps that evaluate their arguments] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Most primops to not evaluate their arguments, even lifted arguments. But two do: DataToTagOp and SeqOp (rembember the latter is monadic; it is not the primop corresponding to 'seq'). Now, is (dataToTag# x) ok-for-speculation? You might say "yes, if x is ok-for-speculation", because then it'll obey the rules for ok-for-speculation. But it's very fragile. Consider: \z. case x of y { DEFAULT -> let v = dataToTag# y in ... } This looks OK: we ask if 'y' is ok-for-spec, and say yes because it is evaluated. But if we do the binder-swap operation (which happens in FloatOut) we have \z. case x of y { DEFAULT -> let v = dataToTag# x in ... } and now it is /not/ ok-for-spec. This becomes even clearer if we float it to give let v = dataToTag# x in \z. case x of y { DEFAULT -> ... } Conclusion: always return False for ok-to-spec on SeqOp and DataToTagOp. }}} Does that makes sense? The can-fail thing is a hack, and we don't do it for `SeqOp`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:69 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be
incorrect
-------------------------------------+-------------------------------------
Reporter: mrkkrp | Owner: osa1
Type: bug | Status: patch
Priority: highest | Milestone: 8.6.2
Component: Compiler | Version: 8.6.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Incorrect result | Unknown/Multiple
at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196,
Wiki Page: | Phab:D5201
-------------------------------------+-------------------------------------
Comment (by osa1):
OK, that makes sense, thanks. Moving on, I'm now trying to understand an
assertion in `FloatIn` which is triggered when I make the `can_fail`
change and
update `ok_app` so that `dataToTag#` is never OK for speculation.
The assertion that fails is this: (in FloatIn.hs)
{{{
fiExpr _ to_drop (_, AnnLit lit) = ASSERT( null to_drop ) Lit lit
}}}
The definition that triggers it:
{{{
$wclosureTypeHeaderSize_s75y [InlPrag=NOUSERINLINE[2]]
:: ClosureType -> Int#
[LclId, Arity=1, Str=]
$wclosureTypeHeaderSize_s75y
= \ (w_s75u :: ClosureType) ->
case dataToTag# @ ClosureType w_s75u of a#_a4k2 [Dmd=]
{ __DEFAULT ->
join {
$j_s6vr [Dmd=] { __DEFAULT ->
case <# 20# a#_a4k2 of lwild_s6vu {
__DEFAULT -> 2#;
1# -> jump $j_s6vr
}
};
1# -> jump $j_s6vr
}
}
}}}
At one point `fiExpr` comes across this expression:
{{{
case a#_a4k2 :: Int# of a#_X4Mq [Dmd=

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by simonpj):
What's wrong with seeing a literal when floating stuff inwards? Why not just introduce case/let around it?
Because the float-in pass floats bindings inwards ''towards their use sites''. If there is no use of the variable bound by a floating binding, then we should not be floating that binding into that part of the code. Literals are an extreme case.
There should be some code somewhere that's supposed to ensure that this assertion is not triggered, but I can't find it.
It's `FloatIn.sepBindsByDropPoint`.
Why did simplifier not remove this case expression?
That is indeed a mystery. Is the code you are showing the output of the simplifier? Can you give instructions to reproduce (maybe make a `wip/` branch)? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:71 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1):
That is indeed a mystery. Is the code you are showing the output of the simplifier?
The code I showed is the input of `FloatIn` when a file is compiled with `-O`. The full command to reproduce (after building stage 1): {{{ "inplace/bin/ghc-stage1" -hisuf hi -osuf o -hcsuf hc -static -O0 -H64m -Wall -this-unit-id ghc-heap-8.7 -hide-all-packages -i -ilibraries /ghc-heap/. -ilibraries/ghc-heap/dist-install/build -Ilibraries/ghc-heap /dist-install/build -ilibraries/ghc-heap/dist-install/build/./autogen -Ilibraries/ghc-heap/dist-install/build/./autogen -Ilibraries/ghc-heap/. -optP-include -optPlibraries/ghc-heap/dist- install/build/./autogen/cabal_macros.h -package-id base-4.12.0.0 -package- id ghc-prim-0.5.3 -package-id rts -Wall -XHaskell2010 -XNoImplicitPrelude -O -no-user-package-db -rtsopts -Wno-deprecated-flags -Wnoncanonical-monad-instances -odir libraries/ghc-heap/dist- install/build -hidir libraries/ghc-heap/dist-install/build -stubdir libraries/ghc-heap/dist-install/build -dynamic-too -c libraries/ghc- heap/./GHC/Exts/Heap/ClosureTypes.hs -o libraries/ghc-heap/dist- install/build/GHC/Exts/Heap/ClosureTypes.o -dyno libraries/ghc-heap/dist- install/build/GHC/Exts/Heap/ClosureTypes.dyn_o }}}
Can you give instructions to reproduce (maybe make a wip/ branch)?
Committed code to `wip/T15696`. Note that -ddump-simpl-iterations is not printing Core right before `FloatIn` so you need some extra prints to be able to trace this. I'm currently using [https://gist.githubusercontent.com/osa1/9755f7d7238a3757cfee0b37a183a6d8/raw... this] patch for that. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:72 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be
incorrect
-------------------------------------+-------------------------------------
Reporter: mrkkrp | Owner: osa1
Type: bug | Status: patch
Priority: highest | Milestone: 8.6.2
Component: Compiler | Version: 8.6.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Incorrect result | Unknown/Multiple
at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196,
Wiki Page: | Phab:D5201
-------------------------------------+-------------------------------------
Comment (by simonpj):
I took a look
1. We should give `dataToTag#` a strictness signatures in
`primops.txt.pp`. After all, it's strict. If we do that then,
{{{
case x of y { DEFAULT -> daataToTag# y }
}}}
will correctly optimise to `dataToTag# x`.
2. We should remove the bang from `GHC.Base.getTag`. Currenlty it says
{{{
getTag !x = dataToTag# x
}}}
But now the bang is unnecessary, and has to be optimised away by (1).
3. That's also the source of the extra stuff like
{{{
case a#_a4k2 :: Int# of a#_X4Mq [Dmd=

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): Thanks Simon, Re 1, the primop is already marked as strict: {{{ strictness = { \ _arity -> mkClosedStrictSig [evalDmd] topRes } }}} Re 2, I already removed the bang pattern in Phab:D5201. I think you need a git pull. Re 3, both (1) and (2) are already done so this should've worked? I'll implement 4 and update. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:74 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by simonpj): Sorry, yes, I was looking in the wrong tree re (1) and (2). Re (3) I now understand. The output of the simplifier is this {{{ $j_s6vr = case w_s75u of lwild_s6yB { __DEFAULT -> case GHC.Prim.dataToTag# @ ClosureType lwild_s6yB of { __DEFAULT -> case lwild_s6yB of lwild_s6yA { __DEFAULT -> case GHC.Prim.dataToTag# @ ClosureType lwild_s6yA of { __DEFAULT -> case lwild_s6yA of lwild_s6vm { __DEFAULT -> case GHC.Prim.dataToTag# @ ClosureType lwild_s6vm of { __DEFAULT -> 1# }; AP_STACK -> 2# } }; AP -> 2# } }; THUNK_SELECTOR -> 2# } } in }}} Are those `case lwild_s6yB of lwild_s6yq { ...}` evals actually redundant? No: they are checking for `AP_STACK` and `THUNK_SELECTOR` resp. But the one you originally asked about was {{{ case a#_a4k2 :: Int# of a#_X4Mq { __DEFAULT -> 1# } }}} This one was ''introduced'' by CSE, so it has not yet had a simplifer run to eliminate it. Before CSE it looked like {{{ case dataToTag# lwild of a#_X4Mq { __DEFAULT -> 1# } }}} In short, all is well. Just do (4). -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:75 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be
incorrect
-------------------------------------+-------------------------------------
Reporter: mrkkrp | Owner: osa1
Type: bug | Status: patch
Priority: highest | Milestone: 8.6.2
Component: Compiler | Version: 8.6.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Incorrect result | Unknown/Multiple
at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196,
Wiki Page: | Phab:D5201
-------------------------------------+-------------------------------------
Comment (by osa1):
I did (4) and tried to validate. There are 4 regressions. 2 of them are
perf,
the other 2 are tests with checks on -ddump-simpl output. They're all
basically
the same problem: worker/wrapper results change slightly after the patch,
and
worker functions take more boxed params. Here's an example (T10482).
Original
program is this:
{{{
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE TypeFamilies #-}
module T10482 where
data family Foo a
data instance Foo (a, b) = FooPair !(Foo a) !(Foo b)
newtype instance Foo Int = Foo Int
foo :: Foo ((Int, Int), Int) -> Int -> Int
foo !f k =
if k == 0 then 0
else if even k then foo f (k-1)
else case f of
FooPair (FooPair (Foo n) _) _ -> n
}}}
`$wfoo` originally:
{{{
Rec {
-- RHS size: {terms: 19, types: 4, coercions: 0, joins: 0/0}
T10482.$wfoo [InlPrag=NOUSERINLINE[2], Occ=LoopBreaker]
:: GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int#
[GblId,
Arity=2,
Caf=NoCafRefs,
Str=,
Unf=OtherCon []]
T10482.$wfoo
= \ (ww_s2Qy :: GHC.Prim.Int#) (ww1_s2QG :: GHC.Prim.Int#) ->
case ww1_s2QG of wild_X1v {
__DEFAULT ->
case GHC.Prim.remInt# wild_X1v 2# of {
__DEFAULT -> ww_s2Qy;
0# -> T10482.$wfoo ww_s2Qy (GHC.Prim.-# wild_X1v 1#)
};
0# -> 0#
}
end Rec }
}}}
after the patch (`wip/T15696`):
{{{
Rec {
-- RHS size: {terms: 22, types: 7, coercions: 3, joins: 0/0}
T10482.$wfoo [InlPrag=NOUSERINLINE[2], Occ=LoopBreaker]
:: Foo Int -> GHC.Prim.Int# -> GHC.Prim.Int#
[GblId,
Arity=2,
Caf=NoCafRefs,
Str=,
Unf=OtherCon []]
T10482.$wfoo
= \ (ww_s2Qs
:: Foo Int
Unf=OtherCon [])
(ww1_s2Qz :: GHC.Prim.Int#) ->
case ww1_s2Qz of wild_X1l {
__DEFAULT ->
case GHC.Prim.remInt# wild_X1l 2# of {
__DEFAULT ->
case ww_s2Qs
`cast` (T10482.D:R:FooInt0[0] ; T10482.N:R:FooInt[0]
:: Foo Int ~R# Int)
of
{ GHC.Types.I# ww3_s2QD ->
ww3_s2QD
};
0# -> T10482.$wfoo ww_s2Qs (GHC.Prim.-# wild_X1l 1#)
};
0# -> 0#
}
end Rec }
}}}
The difference is that the first argument is now boxed `Int` instead of
`Int#`
as before.
I think the reason for this change is this: originally evaluation of
`ww_s2Qs`
happens after `remInt# wild_X1l 2#`, but if we pass `ww_s2Qs` unboxed
that'd
mean evaluating it before `remInt#`. This is possible if at least one of
those
expressions are OK-for-speculation. `remInt#` is not OK-for-speculation
because
it can fail. `ww_s2Qs` was previously (without `wip/T15696`) OK-for-
speculation,
but we just made `app_ok` more strict by removing a disjunct which was
actually
holding for this expression:
- `n_val_args == 0` holds because it has no args
- `isEvaldUnfolding (idUnfolding fun)` apparently also somehow holds. I
don't
know how exactly, but I'm guessing that because of the strictness
annotation
on the data con field we give this id `evaldUnfolding`.
Other 3 regressions are also of similar nature (worker function with less
unboxing).
What to do? I think it's correct to consider the data con field as strict
here
and pass an unboxed value to the worker. While we may not actually have a
tagged
value there (or even an evaluated one! e.g. #14677, #15155), the
programmer
indicated that it's fine to evaluate the field eagerly. As long as we
don't
expect (in the generated code) evaluated or tagged value it's fine.
So perhaps we should not do the change in `ok_app`.
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:76
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by simonpj): The rabbit hole gets deeper. In `T10482` We have a data constructor `FooPair` that is strict in both arguments, and an expression like {{{ foo f k = case f of FooPair x y -> case burble of True -> case x of FooPair p q -> ... False -> ... }}} Previously we floated that inner `case x` out to get {{{ foo f k = case f of FooPair x y -> case x of FooPair p q -> case burble of True -> ... False -> ... }}} which is nice and strict. '''So firstly''': the reasoning in item (3) of comment:60 is right for case-binders, but not for the binders of a constructor pattern (the binder-swap stuff doesn't apply to them). So rather than make the change in item (3), we could instead just refrain from giving an "evald-unfolding" to the case binder. This happens here, in `Simplify.simplAlts` {{{ ; (env1, case_bndr1) <- simplBinder env0 case_bndr ; let case_bndr2 = case_bndr1 `setIdUnfolding` evaldUnfolding env2 = modifyInScope env1 case_bndr2 -- See Note [Case binder evaluated-ness] }}} We could try (a) undoing the change in item (3), and (b) removing the above meddling with `case_bndr`. NB: `Note [Case binder evaluated-ness]` is, I believe, out of date; we now skip the lifted args of primpos in `app_ok`. '''Secondly''' I think we can improve w/w, even for the case where that case-expression is not floated out. It's all to do with `Note [Add demands for strict constructors]` in `DmdAnal`, and the function `addDataConStrictness`. This special treatment is ineffective here because (in the first code for foo above), `x` is really used lazily in the alternative. And yet it'd be sound to unpack it. So here's an idea: * Delete all the stuff about `addDataConStrictness` from `DmdAnal`. * Instead, add it into `WwLib` thus: {{{ | isStrictDmd dmd , Just cs <- splitProdDmd_maybe dmd -- See Note [Unpacking arguments with product and polymorphic demands] , not (has_inlineable_prag && isClassPred arg_ty) -- See Note [Do not unpack class dictionaries] , Just (data_con, inst_tys, inst_con_arg_tys, co) <- deepSplitProductType_maybe fam_envs arg_ty , cs `equalLength` inst_con_arg_tys -- See Note [mkWWstr and unsafeCoerce] = do { (uniq1:uniqs) <- getUniquesM ; let cs' = addDataConStrictness data_con cs <---------------- NEW unpk_args = zipWith3 mk_ww_arg uniqs inst_con_arg_tys cs' unbox_fn = mkUnpackCase (Var arg) co uniq1 data_con unpk_args arg_no_unf = zapStableUnfolding arg }}} That is, add one new line, and transfer the defn of `addDataConStrictness` from `DmdAnal`. This actually works, even without implementing "Firstly" -- I tried it. '''And thirdly''', I noticed that `addDataConStrictness` makes the demand stricter like this {{{ add dmd str | isMarkedStrict str , not (isAbsDmd dmd) = dmd `bothDmd` seqDmd | otherwise = dmd }}} But `bothDmd seqDmd` messes up the cardinality information! I doubt this is important but better to define (in `Demand`) {{{ strictifyDmd :: Demand -> Demand strictifyDmd dmd@(JD { sd = str }) = dmd { sd = str `bothArgStr` Str VanStr HeadStr } }}} and call it from `addDataConStrictness`. I think either Firstly or Secondly should fix the regressions, but all three should be good. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:77 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): I'm probably missing the big picture, but I think it makes sense to give evaluated things `evaldUnfolding` and if that breaks something else then we should fix that something else. In (3) of comment:60, perhaps the thing to fix is the binder-swapping transform rather than the id unfolding, so that it avoids swapping binders with different evaluated-ness (or maybe only avoid it if doing so will break the let/app invariant -- not sure how hard would it be to check this though). I don't know enough about demand analysis to comment on Secondly. I'll try it. Re Thirdly, the demand type does not make too much sense to me (how can an id have strict demand but unused cardinality? Sounds like a hack to make some transformations possible) but I can see how the current code changes cardinality incorrectly. I'll update. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:78 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201 -------------------------------------+------------------------------------- Comment (by osa1): Submitted Phab:D5225 for `addDataConStrictness` fix. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:79 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201, Phab:D5226 -------------------------------------+------------------------------------- Changes (by bgamari): * differential: Phab:D5196, Phab:D5201 => Phab:D5196, Phab:D5201, Phab:D5226 Comment: I gave Secondly from comment:77 a try while waiting for GHC 8.4.4 builds. The result is Phab:D5226. However, the approach presented does seem a bit odd since the demand resulting from demand analysis is still conservative; we merely make up for this conservatism in worker-wrapper. However, this doesn't help other (hypothetical) consumers of demand analysis. Perhaps it would be good to open a ticket to ensure we don't forget about this limitation. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:80 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201, Phab:D5226 -------------------------------------+------------------------------------- Comment (by simonpj): We decided ''not'' to attempt to put Phab:D5226 on GHC 8.6. It's not necessary to fix 8.6. Instead Omer will include it on his `wip/T15696` branch -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:81 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201, Phab:D5226 -------------------------------------+------------------------------------- Comment (by osa1): Just an update before I forget where I had left: the performance regression in `wip/T15696` is fixed by any of these: - Revert change in `app_ok` general case, don't give case binders `evaldUnfolding` in `simplAlts` (first suggestion in comment:77) - Revert change in `app_ok` general case, disable binder swapping of case bidners and scrutinee. Both validate, but I don't have nofib results yet. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:82 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201, Phab:D5226 -------------------------------------+------------------------------------- Comment (by simonpj):
Revert change in app_ok general case, don't give case binders evaldUnfolding in simplAlts (first suggestion in comment:77)
This is my preferred option; I don't like disabling binder swapping - it's there for a good reason! It would be illuminating to know (perhaps via `-ticky`) what code is improved by reverting the `app_ok` general case. If we knew, we could add an example to the code so that we had a concrete reason for that `isEvaldUnfolding` case. But it's only curiosity. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:83 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201, Phab:D5226 -------------------------------------+------------------------------------- Comment (by osa1):
I don't like disabling binder swapping - it's there for a good reason!
I don't feel strongly about this, but just out of curiosity I compared nofib outputs of GHC HEAD with and without this patch: (disables swapping case binder swapping) {{{ diff --git a/compiler/simplCore/SetLevels.hs b/compiler/simplCore/SetLevels.hs index b8212c72f3..8fabf98179 100644 --- a/compiler/simplCore/SetLevels.hs +++ b/compiler/simplCore/SetLevels.hs @@ -469,15 +469,15 @@ lvlCase env scrut_fvs scrut' case_bndr ty alts -- Always float the case if possible -- Unlike lets we don't insist that it escapes a value lambda do { (env1, (case_bndr' : bs')) <- cloneCaseBndrs env dest_lvl (case_bndr : bs) - ; let rhs_env = extendCaseBndrEnv env1 case_bndr scrut' - ; body' <- lvlMFE rhs_env True body + -- ; let rhs_env = extendCaseBndrEnv env1 case_bndr scrut' + ; body' <- lvlMFE env1 True body ; let alt' = (con, map (stayPut dest_lvl) bs', body') ; return (Case scrut' (TB case_bndr' (FloatMe dest_lvl)) ty' [alt']) } | otherwise -- Stays put = do { let (alts_env1, [case_bndr']) = substAndLvlBndrs NonRecursive env incd_lvl [case_bndr] - alts_env = extendCaseBndrEnv alts_env1 case_bndr scrut' - ; alts' <- mapM (lvl_alt alts_env) alts + -- alts_env = extendCaseBndrEnv alts_env1 case_bndr scrut' + ; alts' <- mapM (lvl_alt alts_env1) alts ; return (Case scrut' case_bndr' ty' alts') } where ty' = substTy (le_subst env) ty @@ -1496,6 +1496,7 @@ floatTopLvlOnly le = floatToTopLevelOnly (le_switches le) incMinorLvlFrom :: LevelEnv -> Level incMinorLvlFrom env = incMinorLvl (le_ctxt_lvl env) +{- -- extendCaseBndrEnv adds the mapping case-bndr->scrut-var if it can -- See Note [Binder-swap during float-out] extendCaseBndrEnv :: LevelEnv @@ -1507,6 +1508,7 @@ extendCaseBndrEnv le@(LE { le_subst = subst, le_env = id_env }) = le { le_subst = extendSubstWithVar subst case_bndr scrut_var , le_env = add_id id_env (case_bndr, scrut_var) } extendCaseBndrEnv env _ _ = env +-} -- See Note [Join ceiling] placeJoinCeiling :: LevelEnv -> LevelEnv }}} There's one -0.3% allocation in one program ("pic") with this patch, the rest is the same. No change in binary sizes. So perhaps this is not as important as we thought.
It would be illuminating to know (perhaps via -ticky) what code is improved by reverting the app_ok general case
Have you seen comment:76? I show one example there, showing Core with and without the change in `app_ok`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:84 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be
incorrect
-------------------------------------+-------------------------------------
Reporter: mrkkrp | Owner: osa1
Type: bug | Status: patch
Priority: highest | Milestone: 8.6.2
Component: Compiler | Version: 8.6.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Incorrect result | Unknown/Multiple
at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196,
Wiki Page: | Phab:D5201, Phab:D5226
-------------------------------------+-------------------------------------
Comment (by osa1):
Making progress. We want to handle dataToTag# and seq# uniformly, as
they're
two exceptional primops that evaluate their arguments. So I added this to
`app_ok`:
{{{
app_ok primop_ok fun args
= case idDetails fun of
...
| SeqOp <- op ------------------- OLD CODE
-> all (expr_ok primop_ok) args
| DataToTagOp <- op ------------- NEW CODE
-> all (expr_ok primop_ok) args
}}}
However this causes let/app invariant errors. Here's an example:
{{{
$cpred_a3m9
= \ (a_a2Ix :: VecElem) ->
case dataToTag# @ VecElem a_a2Ix of a#_a2Iy { __DEFAULT ->
case eqInt (GHC.Types.I# 0#) (GHC.Types.I# a#_a2Iy) of {
False -> tagToEnum# @ VecElem (+# a#_a2Iy -1#);
True ->
error
@ 'LiftedRep
@ VecElem
($dIP_s4eG
`cast` (Sym (GHC.Classes.N:IP[0]
<"callStack">_N
] {
__DEFAULT ->
tagToEnum# @ VecElem (+# (dataToTag# @ VecElem a_a2Ix) -1#);
Int8ElemRep -> lvl_s4yY
}
}}}
which is not OK-for-spec beucase `a_a2Ix` doesn't have evaldUnfolding. The
error message:
{{{
*** Core Lint errors : in result of Float out(FOS {Lam = Just 0,
Consts = True,
OverSatApps = True})
***
<no location info>: warning:
In the expression: +# (dataToTag# @ VecElem a_a2Ix) -1#
This argument does not satisfy the let/app invariant:
dataToTag# @ VecElem a_a2Ix
}}}
I don't understand why we substitute `dataToTag# @ VecElem a_a2Ix` for the
case
binder. Simon, any ideas?
(Committed the code to wip/T15696, the error happens during validate)
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:85
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201, Phab:D5226 -------------------------------------+------------------------------------- Changes (by asr): * cc: asr (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:86 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201, Phab:D5226 -------------------------------------+------------------------------------- Comment (by bgamari): The core problem here was fixed with comment:66. There are several additional tasks that can be done, however, as described in comment:77. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:87 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: patch Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Incorrect result | Unknown/Multiple at runtime | Test Case: Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201, Phab:D5226 -------------------------------------+------------------------------------- Comment (by simonpj): Simon will look at the lint error on `wip/T15696` -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:88 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be
incorrect
-------------------------------------+-------------------------------------
Reporter: mrkkrp | Owner: osa1
Type: bug | Status: patch
Priority: highest | Milestone: 8.6.2
Component: Compiler | Version: 8.6.1
Resolution: | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Incorrect result | Unknown/Multiple
at runtime | Test Case:
Blocked By: | Blocking:
Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196,
Wiki Page: | Phab:D5201, Phab:D5226
-------------------------------------+-------------------------------------
Comment (by Simon Peyton Jones

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: closed Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Incorrect result | Test Case: at runtime | codeGen/should_run/T15696_1, | T15696_2, T15696_3 Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201, Phab:D5226 -------------------------------------+------------------------------------- Changes (by simonpj): * status: patch => closed * testcase: => codeGen/should_run/T15696_1, T15696_2, T15696_3 * resolution: => fixed Comment: Ben, Omer: I believe I have finally finished this ticket, as promised. Do check my work! But, optimistically, I'm going to close it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:90 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be
incorrect
-------------------------------------+-------------------------------------
Reporter: mrkkrp | Owner: osa1
Type: bug | Status: closed
Priority: highest | Milestone: 8.6.2
Component: Compiler | Version: 8.6.1
Resolution: fixed | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Incorrect result | Test Case:
at runtime | codeGen/should_run/T15696_1,
| T15696_2, T15696_3
Blocked By: | Blocking:
Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196,
Wiki Page: | Phab:D5201, Phab:D5226
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be
incorrect
-------------------------------------+-------------------------------------
Reporter: mrkkrp | Owner: osa1
Type: bug | Status: closed
Priority: highest | Milestone: 8.6.2
Component: Compiler | Version: 8.6.1
Resolution: fixed | Keywords:
Operating System: Unknown/Multiple | Architecture:
| Unknown/Multiple
Type of failure: Incorrect result | Test Case:
at runtime | codeGen/should_run/T15696_1,
| T15696_2, T15696_3
Blocked By: | Blocking:
Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196,
Wiki Page: | Phab:D5201, Phab:D5226
-------------------------------------+-------------------------------------
Comment (by Simon Peyton Jones for Int#
rather than for Int
}}}
--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:92
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: closed Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: fixed | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Incorrect result | Test Case: at runtime | codeGen/should_run/T15696_1, | T15696_2, T15696_3 Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201, Phab:D5226 -------------------------------------+------------------------------------- Comment (by simonpj): These two patches (comment:91 and comment:92) finally close this surprisingly rich ticket. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:93 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#15696: Derived Ord instance for enumerations with more than 8 elements seems to be incorrect -------------------------------------+------------------------------------- Reporter: mrkkrp | Owner: osa1 Type: bug | Status: closed Priority: highest | Milestone: 8.6.2 Component: Compiler | Version: 8.6.1 Resolution: fixed | Keywords: | DemandAnalysis Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: Incorrect result | Test Case: at runtime | codeGen/should_run/T15696_1, | T15696_2, T15696_3 Blocked By: | Blocking: Related Tickets: #14677, #15155 | Differential Rev(s): Phab:D5196, Wiki Page: | Phab:D5201, Phab:D5226 -------------------------------------+------------------------------------- Changes (by simonpj): * keywords: => DemandAnalysis -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/15696#comment:94 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC