[GHC] #10218: GHC creates incorrect code which throws <<loop>>

#10218: GHC creates incorrect code which throws <<loop>>
-------------------------------------+-------------------------------------
Reporter: yongqli | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.10.1
Keywords: | Operating System: Unknown/Multiple
Architecture: | Type of failure: None/Unknown
Unknown/Multiple | Blocked By:
Test Case: yes | Related Tickets:
Blocking: | Differential Revisions:
|
-------------------------------------+-------------------------------------
My co-worker and I have spent almost two weeks tracking down this bug. We
have finally produced a reasonably small test case... please take a look:
https://github.com/yongqli/ghctestcase

Under certain circumstances GHC generates incorrect code which goes into
<<loop>>.

When you compile and run the project without profiling and with eager
black-holing, it will throw <<loop>> and exit. If you compile either
without eager black-holing or with profiling, it will run correctly. See
setup.hs for further details.

This bug can be delicate to trigger; inlining certain things will prevent
it from occurring. However, we've encountered this in production, and it
is much harder to work around in an actual project. We've had to resort to
inlining every function in the affected module, which slows down
compilation considerably.

We've found that this bug triggers on 7.10.1 and 7.8.4. The test case will
only compile on 7.10.1.

Please let me know if there are any questions.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10218
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

Comment (by jstolarek):

I can confirm this happens on my 64-bit Debian Wheezy with GHC 7.10.1.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10218#comment:1

Comment (by AlexET):

I have managed to reproduce on Windows 64-bit on GHC 7.10.1.

I have shrunk the example down. It requires linear, vector and
transformers. It needs to be compiled with -O1 and -feager-blackholing.

The redundant constraint in the context of the type of guessStates is
required for the bug to trigger. Each of the following changes prevents
the bug from triggering:

* Inlining calc_zs.
* Adding sharing between xs and ys in guessStates.
* Changing the monad to the Identity monad.
* Removing the VectorSpace class and just adding the constraints to
  LinAlg.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10218#comment:2

Comment (by shachaf):

I simplified AlexET's example some more. This bug is very touchy.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10218#comment:3

Comment (by nomeata):

Did you systematically try enabling/disabling the various `-fsomething`
options?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10218#comment:4

Comment (by AlexET):

I have simplified the example further. The difference between the working
and non-working cases is now really small.

I think the simplifier Core is mostly equal (modulo alpha-renaming).
However, the demand analysis is different. The core-prep output also has a
case in one which is a let in the other. The STG differs similarly. The
Cmm differs more (but the difference between a case and a let would cause
a larger difference there).

In terms of flags, currently I know that specialise,
funbox-strict-fields and static-argument-transformation are unneeded; cse
and strictness are needed. I will try more later.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10218#comment:5

Comment (by AlexET):
Looking at the difference in usage demands it seems that there is a
variable (specifically a dictionary for A) with demand `

Changes (by thoughtpolice):

* priority: normal => highest
* milestone: => 7.10.2

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10218#comment:7

Changes (by simonpj):

* priority: highest => normal
* milestone: 7.10.2 =>

Comment:

Could you try `-flate-dmd-anal` (7.10 only)? Thanks!

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10218#comment:8

Comment (by AlexET):

Running late demand analysis does seem to fix this.

I am now pretty sure this is a problem with the interaction between CSE
and demand information. There seem to be two ways to fix it: either drop
demand information on variables involved in CSE, or somehow infer the
correct demand information.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10218#comment:9
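
Since the reporters mention having to inline every function in the
affected module, one possibly cheaper stopgap is to request the late
demand-analysis pass for that module alone. The sketch below assumes the
problem is confined to one module; the module contents and the name
`scale` are hypothetical placeholders, and nothing in this thread
establishes that the flag cures the production case.

```haskell
-- Per-module workaround sketch: ask GHC for the extra, late
-- demand-analysis pass in this module only, leaving the project's
-- other flags untouched.
{-# OPTIONS_GHC -flate-dmd-anal #-}
module Main where

-- Hypothetical placeholder for the code that triggers the bug.
scale :: Int -> Int
scale x = x * 2

main :: IO ()
main = print (scale 21)
```

The pragma must sit above the `module` header to take effect.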

Comment (by simonpj):

OK good. Can you also try `-fno-cse`? I guess that too will fix it.

I would still ''really'' like to know how the incorrect demand info leads
to a loop. I still don't understand that. I wonder if adding some `trace`
calls might help show what is going on, without making the bug disappear?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10218#comment:10
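
As a rough illustration of the `trace` suggestion, a thunk can be
wrapped with `Debug.Trace.trace` so that each entry is reported on
stderr. This is only a sketch: `expensive` is a hypothetical stand-in
for the suspect value, not code from the test case, and adding `trace`
may itself change how GHC optimises the surrounding code.

```haskell
module Main where

import Debug.Trace (trace)

-- Hypothetical stand-in for a value suspected of being entered
-- more often than the demand information claims.
expensive :: Int -> Int
expensive x = trace "thunk entered" (x * 2)

main :: IO ()
main = do
  let y = expensive 3   -- one shared thunk
  print (y + y)         -- y is demanded twice; the result is 12
```

With ordinary lazy evaluation one would expect the message once, since
`y` is updated with its value after the first entry.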

Comment (by simonpj):
OK I think I have it.
* There is a dictionary thunk thus (in STG syntax):
{{{
let {
$dA_s7fS [Dmd=

Comment (by simonpj):

OK here is a nice small test case, which does not (unlike the test above)
depend on `lens`!
{{{
module Main where

{-# NOINLINE foo #-}
foo :: Bool -> Int -> Int -> Int
foo True  _ x = 1
foo False _ x = x+1

{-# NOINLINE bar #-}
bar :: Int -> (Int,Int)
bar x = let y1 = x * 2
            y2 = x * 2
        in (foo False y1 y2,foo False y2 y1)

main = print (fst p + snd p)
  where
    p = bar 3
}}}
Compile with `-O -feager-blackholing` and you get `<<loop>>`. Adding
`-fno-cse` or `-flate-dmd-anal` restores correct behaviour.

Points to note
* `foo` uses its second argument zero times, and its third argument
  exactly once.
* So the two calls to `foo` in `bar` use `y1` exactly once and `y2`
  exactly once.
* But when `y1` and `y2` are CSE'd, the usage goes up to twice; and that
  is the problem.

I'm validating a simple fix in CSE, which zaps the demand-info on binders
which are potentially shared. It's a bit brutal. But another run of the
demand analyser (which has other advantages) restores everything again.

I'm validating now; will commit next week.

Really sorry to have taken two weeks of your time to find this bug. But
at least your efforts will be rewarded by a real fix.

Simon

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10218#comment:12
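
To make the third point concrete, here is a hand-written approximation
of what `bar` looks like once CSE has merged `y1` and `y2`. The name
`barCSE` is hypothetical, and this is not GHC's actual Core output.
Because the sharing is visible in the source, the demand analyser sees
both uses of `y`, so this version is not expected to reproduce the bug;
it only shows why a "used once" annotation computed before CSE no longer
holds afterwards.

```haskell
module Main where

{-# NOINLINE foo #-}
foo :: Bool -> Int -> Int -> Int
foo True  _ _ = 1
foo False _ x = x + 1

-- Hand-written approximation of `bar` after CSE: y1 and y2 have
-- become a single binding.
{-# NOINLINE barCSE #-}
barCSE :: Int -> (Int, Int)
barCSE x =
  let y = x * 2                      -- the one shared thunk
  in (foo False y y, foo False y y)  -- y reaches foo's used argument twice

main :: IO ()
main = print (fst p + snd p)         -- 7 + 7, prints 14
  where
    p = barCSE 3
```

Both the original `bar 3` and this version should print 14 when
compiled correctly.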

Comment (by Simon Peyton Jones

Changes (by simonpj):

* status: new => merge
* testcase: yes => strianal/should_run/T10218

Comment:

Right, I've fixed this. Thank you for identifying it. I'm really sorry it
cost you so much to track down.

Simon

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10218#comment:14

Changes (by simonpj):

* milestone: => 7.10.2

Comment:

Merge to 7.10.2

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10218#comment:15

Changes (by thoughtpolice):

* status: merge => closed
* resolution: => fixed

Comment:

Merged to `ghc-7.10` via 5c10c69849c8029db5b5e7ee540df308d8941957

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10218#comment:16