[GHC] #10491: Simplifier explosion with Accelerate

#10491: Simplifier explosion with Accelerate -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: None/Unknown Unknown/Multiple | Blocked By: Test Case: | Related Tickets: Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- In [https://github.com/AccelerateHS/accelerate Accelerate] we are encountering problems with the simplifier exploding the number of terms, which consequently cause huge compile times. This has only happened since the release of 7.10.1 and it appears to still be happening in the ghc-7.10 branch. In the attached Slice.hs you can see what is about the worst case example. Trying to compile this with `-O2` yields {{{ [1 of 1] Compiling Data.Array.Accelerate.CUDA.Array.Slice ( Slice.hs, Slice.o ) ghc: panic! (the 'impossible' happened) (GHC version 7.10.1.20150602 for x86_64-apple-darwin): Simplifier ticks exhausted When trying UnfoldingDone $j_s1FRp To increase the limit, use -fsimpl-tick-factor=N (default 100) If you need to do this, let GHC HQ know, and what factor you needed To see detailed counts use -ddump-simpl-stats Total ticks: 1668604 }}} Bumping the tick factor to 150 stops this from happening, but at that point the simplifier blows up. See output.txt for the output of `ghc -v`. After half an hour we end up with ~4 million terms. At this point ghc takes up about 9GB of memory and I have to kill it. In order to reproduce this, you will need the `accelerate` package and all its various dependencies. The one on hackage works fine. I will try to come up with a test case that is not so heavy on dependencies, but I wanted to ensure this was raised before 7.10.2 is released. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Simplifier explosion with Accelerate -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: None/Unknown | Unknown/Multiple Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- Description changed by robertce: Old description:
In [https://github.com/AccelerateHS/accelerate Accelerate] we are encountering problems with the simplifier exploding the number of terms, which consequently cause huge compile times. This has only happened since the release of 7.10.1 and it appears to still be happening in the ghc-7.10 branch.
In the attached Slice.hs you can see what is about the worst case example. Trying to compile this with `-O2` yields
{{{ [1 of 1] Compiling Data.Array.Accelerate.CUDA.Array.Slice ( Slice.hs, Slice.o ) ghc: panic! (the 'impossible' happened) (GHC version 7.10.1.20150602 for x86_64-apple-darwin): Simplifier ticks exhausted When trying UnfoldingDone $j_s1FRp To increase the limit, use -fsimpl-tick-factor=N (default 100) If you need to do this, let GHC HQ know, and what factor you needed To see detailed counts use -ddump-simpl-stats Total ticks: 1668604 }}}
Bumping the tick factor to 150 stops this from happening, but at that point the simplifier blows up. See output.txt for the output of `ghc -v`. After half an hour we end up with ~4 million terms. At this point ghc takes up about 9GB of memory and I have to kill it.
In order to reproduce this, you will need the `accelerate` package and all its various dependencies. The one on hackage works fine. I will try to come up with a test case that is not so heavy on dependencies, but I wanted to ensure this was raised before 7.10.2 is released.
New description: In [https://github.com/AccelerateHS/accelerate Accelerate] we are encountering problems with the simplifier exploding the number of terms, which consequently cause huge compile times. This has only happened since the release of 7.10.1 and it appears to still be happening in the ghc-7.10 branch. In the attached Slice.hs you can see what is about the worst case example. Trying to compile this with `-O2` yields {{{ [1 of 1] Compiling Data.Array.Accelerate.CUDA.Array.Slice ( Slice.hs, Slice.o ) ghc: panic! (the 'impossible' happened) (GHC version 7.10.1.20150602 for x86_64-apple-darwin): Simplifier ticks exhausted When trying UnfoldingDone $j_s1FRp To increase the limit, use -fsimpl-tick-factor=N (default 100) If you need to do this, let GHC HQ know, and what factor you needed To see detailed counts use -ddump-simpl-stats Total ticks: 1668604 }}} Bumping the tick factor to 150 stops this from happening, but at that point the simplifier blows up. See output.txt for the output of `ghc -v`. After half an hour we end up with ~4 million terms. At this point ghc takes up about 9GB of memory and I have to kill it. In order to reproduce this, you will need the `accelerate` package and all its various dependencies. It doesn't have to be the latest from github. The version on hackage also works with the example I have given. I will try to come up with a test case that is not so heavy on dependencies, but I wanted to ensure this was raised before 7.10.2 is released. -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Simplifier explosion with Accelerate -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: None/Unknown | Unknown/Multiple Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by George): This doesn't seem to be a problem on the 7.10.2 alpha Haskell Platform: ghc --version The Glorious Glasgow Haskell Compilation System, version 7.10.1.20150601 cabal install --ghc-options="-O2" accelerate --reinstall cabal install --ghc-options="-O2" accelerate --reinstall Resolving dependencies... In order, the following will be installed: accelerate-0.15.1.0 (reinstall) Warning: Note that reinstalls are always dangerous. Continuing anyway... Configuring accelerate-0.15.1.0... Building accelerate-0.15.1.0... Installed accelerate-0.15.1.0 Updating documentation index /Users/gcolpitts/Library/Haskell/share/doc/x86_64-osx- ghc-7.10.1.20150601/index.html fwiw the following didn't give an error but not sure if it did the "right" thing, in any case the above should be definitive cabal install -j3 -O2 accelerate cabal install -j3 -O2 accelerate Resolving dependencies... Configuring fclabels-2.0.2.2... Configuring hashtables-1.2.0.2... Building hashtables-1.2.0.2... Building fclabels-2.0.2.2... Installed fclabels-2.0.2.2 Installed hashtables-1.2.0.2 Configuring accelerate-0.15.1.0... Building accelerate-0.15.1.0... Installed accelerate-0.15.1.0 Updating documentation index /Users/gcolpitts/Library/Haskell/share/doc/x86_64-osx- ghc-7.10.1.20150601/index.html -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Simplifier explosion with Accelerate -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: None/Unknown | Unknown/Multiple Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by robertce): Accelerate itself compiles without issue. The issue arises when compiling the attached `Slice.hs`, which is not part of the `accelerate` package. Sorry if that was unclear. It's actually part of an experimental version of the `accelerate-cuda` package, but I pulled it out to reduce the the number of dependencies required. Are you able to compile `Slice.hs` with the Haskell platform? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Simplifier explosion with Accelerate -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: None/Unknown | Unknown/Multiple Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- Description changed by robertce: Old description:
In [https://github.com/AccelerateHS/accelerate Accelerate] we are encountering problems with the simplifier exploding the number of terms, which consequently cause huge compile times. This has only happened since the release of 7.10.1 and it appears to still be happening in the ghc-7.10 branch.
In the attached Slice.hs you can see what is about the worst case example. Trying to compile this with `-O2` yields
{{{ [1 of 1] Compiling Data.Array.Accelerate.CUDA.Array.Slice ( Slice.hs, Slice.o ) ghc: panic! (the 'impossible' happened) (GHC version 7.10.1.20150602 for x86_64-apple-darwin): Simplifier ticks exhausted When trying UnfoldingDone $j_s1FRp To increase the limit, use -fsimpl-tick-factor=N (default 100) If you need to do this, let GHC HQ know, and what factor you needed To see detailed counts use -ddump-simpl-stats Total ticks: 1668604 }}}
Bumping the tick factor to 150 stops this from happening, but at that point the simplifier blows up. See output.txt for the output of `ghc -v`. After half an hour we end up with ~4 million terms. At this point ghc takes up about 9GB of memory and I have to kill it.
In order to reproduce this, you will need the `accelerate` package and all its various dependencies. It doesn't have to be the latest from github. The version on hackage also works with the example I have given. I will try to come up with a test case that is not so heavy on dependencies, but I wanted to ensure this was raised before 7.10.2 is released.
New description: In [https://github.com/AccelerateHS/accelerate Accelerate] we are encountering problems with the simplifier exploding the number of terms, which consequently cause huge compile times. This has only happened since the release of 7.10.1 and it appears to still be happening in the ghc-7.10 branch. In the attached Slice.hs you can see what is about the worst case example. Trying to compile this with `-O2` yields {{{ [1 of 1] Compiling Data.Array.Accelerate.CUDA.Array.Slice ( Slice.hs, Slice.o ) ghc: panic! (the 'impossible' happened) (GHC version 7.10.1.20150602 for x86_64-apple-darwin): Simplifier ticks exhausted When trying UnfoldingDone $j_s1FRp To increase the limit, use -fsimpl-tick-factor=N (default 100) If you need to do this, let GHC HQ know, and what factor you needed To see detailed counts use -ddump-simpl-stats Total ticks: 1668604 }}} Bumping the tick factor to 150 stops this from happening, but at that point the simplifier blows up. See output.txt for the output of `ghc -v`. After half an hour we end up with ~4 million terms. At this point ghc takes up about 9GB of memory and I have to kill it. In order to reproduce this, you will need the `accelerate` package and all its various dependencies. It doesn't have to be the latest from github. The version on hackage also works with the example I have given. I will try to come up with a test case that is not so heavy on dependencies, but I wanted to ensure this was raised before 7.10.2 is released. Steps to reproduce: {{{ cabal install accelerate ghc -O2 -fsimpl-tick-factor=150 Slice.hs }}} -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Simplifier explosion with Accelerate -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: None/Unknown | Unknown/Multiple Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by George): Nope, I cannot successfully compile, I reproduce the problem: ghc -O2 Slice.hs [1 of 1] Compiling Data.Array.Accelerate.CUDA.Array.Slice ( Slice.hs, Slice.o ) ghc: panic! (the 'impossible' happened) (GHC version 7.10.1.20150601 for x86_64-apple-darwin): Simplifier ticks exhausted When trying RuleFired Class op bound To increase the limit, use -fsimpl-tick-factor=N (default 100) If you need to do this, let GHC HQ know, and what factor you needed To see detailed counts use -ddump-simpl-stats Total ticks: 1668404 Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Simplifier explosion with Accelerate, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: None/Unknown | Unknown/Multiple Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: None/Unknown | Unknown/Multiple Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: None/Unknown | Unknown/Multiple Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- Changes (by tmcdonell): * cc: tmcdonell@… (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: None/Unknown | Unknown/Multiple Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- Changes (by George): * cc: george (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression: Simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: None/Unknown | Unknown/Multiple Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:10 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: None/Unknown | Unknown/Multiple Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: normal | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Changes (by gidyn): * cc: gidyn (added) * failure: None/Unknown => Compile-time performance bug -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Changes (by thoughtpolice): * priority: normal => highest Comment: Okay, I'm bumping this to high priority, and me and Ben can investigate and talk with Simon about this tomorrow in our meeting. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:13 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): George, if I understand correctly you were able to compile `Slice.hs` with GHC 7.10.1 if passed `-fsimpl-tick-factor=150`. Is this correct? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): For the record, my notes on this bug can be found here, https://github.com/bgamari/ghc-T10491/blob/master/notes.mkd -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:15 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by George): Replying to [comment:14 bgamari]: I was able to compile it with -fsimpl-tick-factor=150, not sure if memory use was reasonable, cpu usage was close to 100%. As I wrote above: when I do ghc -O2 -fsimpl-tick-factor=150 Slice.hs it quickly uses over 11G of memory. I'm not sure what the peak memory use is or how long it took to compile as I let it run overnight (I have 12G of RAM on my Mac). The resulting binary of the 321 line source file is 51 MB -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:16 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): Replying to [comment:16 George]:
I was able to compile it with -fsimpl-tick-factor=150, not sure if
memory use was reasonable, cpu usage was close to 100%. As I wrote above:
when I do
ghc -O2 -fsimpl-tick-factor=150 Slice.hs
it quickly uses over 11G of memory. I'm not sure what the peak memory
use is or how long it took to compile as I let it run overnight (I have 12G of RAM on my Mac). The resulting binary of the 321 line source file is 51 MB What I'm a bit unclear on is that the bug description seems to suggest this behavior began at some point since the release of 7.10.1,
This has only happened since the release of 7.10.1 and it appears to still be happening in the ghc-7.10 branch.
However I am finding that the commits prior to the 7.10.1 release also exhibit the behavior. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

What I'm a bit unclear on is that the bug description seems to suggest
#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by robertce): Replying to [comment:17 bgamari]: this behavior began at some point since the release of 7.10.1,
This has only happened since the release of 7.10.1 and it appears to
still be happening in the ghc-7.10 branch.
However I am finding that the commits prior to the 7.10.1 release also
exhibit the behavior. I'm not surprised it's also happening with commits prior to 7.10.1. By saying it's only happened since the release of 7.10.1, I was speaking in release terms. It works with 7.8.4, but not 7.10.1 or the current 7.10 branch. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:18 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): Replying to [comment:18 robertce]:
I'm not surprised it's also happening with commits prior to 7.10.1. By saying it's only happened since the release of 7.10.1, I was speaking in release terms. It works with 7.8.4, but not 7.10.1 or the current 7.10 branch.
Ahh, I see. English can be a frustratingly ambiguous language at times. In other news, it appears that 7.10's behavior begins to diverge from that of 7.8.3 at float out, * after the initial simplifier phase both 7.8 and 7.10 have roughly 3000 terms, 7000 types * after float out the program has 3000 terms, 8000 types in 7.8 whereas it has 12000 terms and 60000 types in 029a296a770addbd096bbfd6de0936327ee620d4 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:19 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): I've just compiled `Slice.hs` with HEAD quite easily {{{ bash$ ghc-stage2 -dshow-passes Slice.hs -package accelerate -O2 -fforce- recomp .... 4,168,610,960 bytes allocated in the heap 973,775,616 bytes copied during GC 67,207,400 bytes maximum residency (14 sample(s)) 2,684,880 bytes maximum slop 185 MB total memory in use (0 MB lost due to fragmentation) }}} The program never ets vast {{{ [1 of 1] Compiling Data.Array.Accelerate.CUDA.Array.Slice ( Slice.hs, Slice.o ) *** Parser: *** Renamer/typechecker: *** Desugar: Result size of Desugar (after optimization) = {terms: 2,214, types: 3,698, coercions: 26} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 2,183, types: 3,841, coercions: 20} Result size of Simplifier iteration=2 = {terms: 2,182, types: 3,835, coercions: 20} Result size of Simplifier = {terms: 2,182, types: 3,835, coercions: 20} *** Specialise: Result size of Specialise = {terms: 6,196, types: 26,056, coercions: 470} *** Float out(FOS {Lam = Just 0, Consts = True, OverSatApps = False}): Result size of Float out(FOS {Lam = Just 0, Consts = True, OverSatApps = False}) = {terms: 7,738, types: 35,962, coercions: 470} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 9,756, types: 45,944, coercions: 10,471} Result size of Simplifier iteration=2 = {terms: 10,295, types: 49,489, coercions: 21,914} Result size of Simplifier iteration=3 = {terms: 9,724, types: 46,649, coercions: 14,825} Result size of Simplifier iteration=4 = {terms: 9,694, types: 46,444, coercions: 14,376} Result size of Simplifier = {terms: 9,694, types: 46,444, coercions: 14,376} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 9,301, types: 45,776, coercions: 13,996} Result size of Simplifier iteration=2 = {terms: 9,267, types: 45,696, coercions: 13,846} Result size of Simplifier iteration=3 = {terms: 9,268, types: 45,686, coercions: 13,744} Result size of Simplifier iteration=4 = {terms: 9,268, types: 45,681, coercions: 13,682} Result size of Simplifier = {terms: 9,268, types: 45,681, coercions: 13,682} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 21,154, types: 76,735, coercions: 30,042} Result size of Simplifier iteration=2 = {terms: 18,674, types: 69,540, coercions: 22,805} Result size of Simplifier iteration=3 = {terms: 18,479, types: 67,598, coercions: 19,578} Result size of Simplifier iteration=4 = {terms: 17,773, types: 64,887, coercions: 16,920} Result size of Simplifier = {terms: 17,773, types: 64,887, coercions: 16,920} *** Float inwards: Result size of Float inwards = {terms: 17,773, types: 64,887, coercions: 16,920} *** Called arity analysis: Result size of Called arity analysis = {terms: 17,773, types: 64,887, coercions: 16,920} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 17,458, types: 63,723, coercions: 15,764} Result size of Simplifier iteration=2 = {terms: 17,528, types: 63,751, coercions: 15,458} Result size of Simplifier iteration=3 = {terms: 17,560, types: 63,758, coercions: 15,272} Result size of Simplifier iteration=4 = {terms: 17,551, types: 63,748, coercions: 15,182} Result size of Simplifier = {terms: 17,551, types: 63,748, coercions: 15,182} *** Demand analysis: Result size of Demand analysis = {terms: 17,551, types: 63,748, coercions: 15,182} *** Worker Wrapper binds: Result size of Worker Wrapper binds = {terms: 21,047, types: 77,338, coercions: 15,182} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 19,533, types: 69,686, coercions: 15,210} Result size of Simplifier iteration=2 = {terms: 18,528, types: 67,708, coercions: 15,346} Result size of Simplifier iteration=3 = {terms: 18,293, types: 67,029, coercions: 15,164} Result size of Simplifier = {terms: 18,293, types: 67,029, coercions: 15,164} *** Float out(FOS {Lam = Just 0, Consts = True, OverSatApps = True}): Result size of Float out(FOS {Lam = Just 0, Consts = True, OverSatApps = True}) = {terms: 19,134, types: 72,375, coercions: 15,164} *** Common sub-expression: Result size of Common sub-expression = {terms: 17,353, types: 61,395, coercions: 5,875} *** Float inwards: Result size of Float inwards = {terms: 17,353, types: 61,395, coercions: 5,875} *** Liberate case: Result size of Liberate case = {terms: 17,353, types: 61,395, coercions: 5,875} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 15,872, types: 53,291, coercions: 6,049} Result size of Simplifier = {terms: 15,745, types: 52,388, coercions: 5,875} *** SpecConstr: Result size of SpecConstr = {terms: 16,083, types: 52,968, coercions: 5,875} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 16,074, types: 52,968, coercions: 5,875} Result size of Simplifier iteration=2 = {terms: 16,075, types: 53,003, coercions: 5,875} Result size of Simplifier = {terms: 16,075, types: 53,003, coercions: 5,875} *** Tidy Core: Result size of Tidy Core = {terms: 10,807, types: 36,367, coercions: 5,684} *** CorePrep: Result size of CorePrep = {terms: 13,102, types: 39,418, coercions: 5,688} *** Stg2Stg: *** CodeGen: *** Assembler: }}} The -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:20 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): The blow-up in 7.10 appears to be due to the introduction of silent superclass parameters, a notion which has been replaced in `master` (namely a6f0f5ab45b2643b561e0a0a54a4f14745ab2152 and surrounding commits). I'm working on backporting these to the 7.10 branch. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:21 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): Unfortunately it is a bit tricker bringing a6f0f5ab45b2643b561e0a0a54a4f14745ab2152 to `ghc-7.10` than I was hoping it would be, * We want a6f0f5ab45b2643b561e0a0a54a4f14745ab2152 (Eliminate so-called "silent superclass parameters") * Unfortunately this breaks the constraint solver (#10335, manifested when compiling `Data.Data` as missing `Typeable` instances) * This appears to be fixed in `master` by 646866ff318d6eb8beeed98032644182dd9d997b (Fix superclass generation in an instance) * Unfortunately an approximation of this appears to already be present as 7f24cdd63fbfd69a83f81e85384dc8cb7ef57704 I haven't yet taken a careful look at able to the patches fixing superclass generation. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:22 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): I looked at this a bit too, but I'm now in an all-day meeting. One idea. I think that 95% of the pain comes from cross-module specialisation, dramatically exacerbated by the silent superclasses thing. If I'm right, a much quicker fix would be `-fno-specialise`. A less draconian fix would be an extra flag `-fno-cross-module- specialise`, which turns `Specialise.specImports` into a no-op. GHC never ''used'' to do cross-module specialisation. Maybe we should even disable it by-default in 7.10? Anyway implementing such a flag would take only a minute or two, and would be a Good Thing anyway. Moving HEAD's fix for silent userclasses is a pretty complicated solution. Ben, suggestions: * See if `-fno-specialise` solves the problem * Add the flag and see if that too solves the problem * Add some stats gathering in `Specialise` that reports how many imported functions are specialised, at how many different types, and how big the specialised code is. The idea is so we can more easily see when there's a specialisation blow-up. * If so we can switch off cross-module specialisation by default (although I'd prefer it to be on in 7.12) Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:23 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by George): -fno-specialise does not seem to solve the problem ( time ghc -v -fsimpl-tick-factor=150 -O2 Slice.hs ) > t.out 2>&1 & tail t.out {{{ Result size of Float inwards = {terms: 4,218,587, types: 4,087,814, coercions: 113,719} *** Called arity analysis: Result size of Called arity analysis = {terms: 4,218,587, types: 4,087,814, coercions: 113,719} *** Simplifier: Result size of Simplifier iteration=1 = {terms: 4,217,640, types: 4,085,011, coercions: 111,731} Result size of Simplifier iteration=2 = {terms: 4,217,311, types: 4,083,869, coercions: 110,905} }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:24 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): George, don't worry about the edits; I am quite guilty of long strings of edits myself and as far as I can tell Trac only produces a message on the initial submission. Simon, I am sad to say that I can confirm that `-fno-specialise` does not help this issue. That being said, I'll see to it that we have a enabled- by-default `-fcross-module-specialise` option and some diagnostics output going forward. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:25 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): Well I suggest you use `-fno-specialise` in any further investigations. It doesn't fix it, but it'll remove distracting clutter from what you see. The silent-superclass thing is not really the ''cause''. (And hence I don't think it's worth a heroic effort to port HEAD to 7.10. It certainly generates a lot more dictionaries, so perhaps there might be twice as much code as without. But you are seeing MUCH more than that! So maybe there is a bug lurking in HEAD too; it just happens that the code generated by silent superclasses in 7.10 tickles it. So the hunt goes on. Monday is a washout for me. Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:26 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): George, what GHC version did you test `-fno-specialise` on? While yesterday I was able to confirm that `-fno-specialise` seemed to make no difference on a test machine running what should have been 7.10.1 I am now having trouble replicating this on my laptop. Unfortunately I no longer have access to the test environment on which I tested this yesterday but clearly something was inconsistent. I am now seeing on multiple machines that `-fno-specialise` indeed allows things to compile, {{{ $ ghc -V The Glorious Glasgow Haskell Compilation System, version 7.10.1 $$ time ghc Slice.hs -fforce-recomp -O2 -fno-specialise [1 of 1] Compiling Slice ( Slice.hs, Slice.o ) real 0m3.759s user 0m1.688s sys 0m0.044s $ time ghc Slice.hs -fforce-recomp -O2 [1 of 1] Compiling Slice ( Slice.hs, Slice.o ) ^C real 0m51.103s user 0m44.336s sys 0m0.948s }}} I am now looking at whether disabling only cross-module specialisation is enough to eliminate the blow-up. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:27 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): Phab:D999 adds a `-fno-cross-module-specialisation` option which works around the blow-up will likely still be present in 7.10.2. I'm still trying to work out exactly what is happening. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:28 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by George): ghc -V The Glorious Glasgow Haskell Compilation System, version 7.10.1.20150612 It seems that the order of the args determines if it succeeds or not. {{{ time ghc -fno-specialise -fforce-recomp -O2 Slice.hs [1 of 1] Compiling Data.Array.Accelerate.CUDA.Array.Slice ( Slice.hs, Slice.o ) ghc: panic! (the 'impossible' happened) (GHC version 7.10.1.20150612 for x86_64-apple-darwin): Simplifier ticks exhausted When trying UnfoldingDone $j_s1FS5 To increase the limit, use -fsimpl-tick-factor=N (default 100) If you need to do this, let GHC HQ know, and what factor you needed To see detailed counts use -ddump-simpl-stats Total ticks: 1668604 Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug real 0m19.542s user 0m18.451s sys 0m0.901s bash-3.2$ time ghc -O2 -fno-specialise -fforce-recomp Slice.hs [1 of 1] Compiling Data.Array.Accelerate.CUDA.Array.Slice ( Slice.hs, Slice.o ) real 0m3.299s user 0m3.085s sys 0m0.158s }}} -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:29 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): Having recently seen how the `-O` option is implemented, it's actually not terribly surprising that it's order dependent. Essentially `-O2` is equivalent to a passing a bunch of individual `-f` options explicitly. `-f` options are themselves just flipping flags in the parser state and are therefore order dependent. I'll admit that this behavior is a bit surprising, but then again I'm not entirely sure how to improve the situation without sacrificing expressive power. I wonder how other compilers manage this. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:30 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by George): I think something needs to be changed. AFAIK, there is no documentation that it is order dependent. A language that is not order dependent should not have command line options that are! As a minimum if -O2 and -fno<something> means -f<something> and -fno<something> and some remaining -f<options> then the user should get an error message if the order is "wrong" and no error message if the order is right, i.e. -fno<something> wins. At least, this is what I think off the top of my head from my naive viewpoint. I believe there is a ticket for an ER or task to improve command line options. I'll try to find it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:31 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): Sure, let's continue the command line parsing discussion on another issue. On the issue of the specialiser, it seems that cross-module specialisation generates a fairly large number of `Shape` dictionaries (along with its superclasses). `Shape` is defined thusly, {{{ class (Elt sh, Elt (Any sh), Repr.Shape (EltRepr sh)) => Shape sh where dim :: sh -> Int size :: sh -> Int ignore :: sh intersect :: sh -> sh -> sh toIndex :: sh -> sh -> Int fromIndex :: sh -> Int -> sh bound :: sh -> sh -> Boundary a -> Either a sh iter :: sh -> (sh -> a) -> (a -> a -> a) -> a -> a iter1 :: sh -> (sh -> a) -> (a -> a -> a) -> a rangeToShape :: (sh, sh) -> sh shapeToRange :: sh -> (sh, sh) shapeToList :: sh -> [Int] listToShape :: [Int] -> sh sliceAnyIndex :: sh -> Repr.SliceIndex (EltRepr (Any sh)) (EltRepr sh) () (EltRepr sh) }}} The most noticeable dictionaries one finds in the Core are for types of the form `(((), Int), Int)`. For instance, {{{ $s$fShape(,)_sby9 :: Data.Array.Accelerate.Array.Representation.Shape (((), Int), Int) $s$fShape(,)_sby9 = Data.Array.Accelerate.Array.Representation.D:Shape @ (((), Int), Int) $dEq_a9Zd $dSlice_a9Ze (Data.Array.Accelerate.Array.Representation.$fShape(,)_$cdim @ ((), Int) $dEq_a9Zd $dSlice_a9Ze $dShape_a9YB) ... $s$fShape(,)_$cdim_sbxD :: (((), Int), Int) -> Int $s$fShape(,)_$cdim_sbxD = \ _ -> case Data.Array.Accelerate.Array.Representation.dim @ ((), Int) $dShape_a9YB (undefined @ ((), Int)) of _ { GHC.Types.I# x_abe0 -> GHC.Types.I# (GHC.Prim.+# x_abe0 1) } -- along with implementations for the remaining functions of Shape }}} Along with the dictionary the specialiser also generates implementations for `Eq` and `ArrayElt` for these types. All of these are generated for all types up to `((((((((((), Int), Int), Int), Int), Int), Int), Int), Int), Int)`. In addition, one also finds `ArrayElt` dictionaries for tuples of the same shape but with `s/Int/()`, e.g. `(((), ()), ())`. One then finds `Show`, `Elt`, and `Shape` implementations for types of the form `(((Z :. Int) :. Int) :. Int)` for all types up to `(((((((((Z :. Int) :. Int) :. A) :. A) :. A) :. A) :. A) :. A)`. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:32 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): * There is, so far as I can see, no documentation of the fact that command-line arguments are applied in-order (so `-fspecialise -fno- specialise` has the same effect as `-fno-specialise`). '''Ben''': could you add documentation of this, to the preamble of 4.10 in the user manual? In 4.10.1, when discussing `-O`, stress the order-dependence with individual flags. * Let's debate whether or not this order dependence is a good design on another ticket. * I think we have a workaround for this particular problem (`-fno- specialise`) * But as Ben mentions, we still don't know why it leads to such a stunning blow-up, so let's keep looking for that. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:33 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): I've opened #10560 to discuss the command-line parser behavior. George, perhaps you want to write down some thoughts there (or just paste Comment 31 from above for reference)? Simon, documentation is on its way. I'll be looking at the simplifier blow-up shortly. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:34 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): One thing in the post-explosion Core that I've noticed is some extremely large bindings of type, {{{#!hs lvl_scb5 :: forall a_aaJ9. (((((((Z :. Int) :. Int) :. A) :. A) :. A) :. A) :. A) :. Int -> (((((((Z :. Int) :. Int) :. A) :. A) :. A) :. A) :. A) :. Int -> Data.Array.Accelerate.Type.Boundary a_aaJ9 -> Either a_aaJ9 ((((((((Z :. Int) :. Int) :. A) :. A) :. A) :. A) :. A) :. Int) }}} with variously sized indices. The size of the binding (determined very roughly by the number of lines in the textual Core representation) scales very strongly with the size of the index, * `((((Z :. Int) :. Int) :. A) :. A) :. Int`: 11777 lines * `(((((Z :. Int) :. Int) :. A) :. A) :. A) :. Int`: 34246 lines * `((((((Z :. Int) :. Int) :. A) :. A) :. A) :. A) :. Int`: 108745 lines * `(((((((Z :. Int) :. Int) :. A) :. A) :. A) :. A) :. A) :. Int`: 351252 lines -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:35 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

I've opened #10560 to discuss the command-line parser behavior. George,
#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by George): Replying to [comment:34 bgamari]: Ben, Simon thanks for explaining this. I've added my comments to #10560. Sorry to distract you from the bug here. I was just surprised by the command line option issue. Adding the documentation will definitely be a big help. perhaps you want to write down some thoughts there (or just paste Comment 31 from above for reference)?
Simon, documentation is on its way.
I'll be looking at the simplifier blow-up shortly.
-- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:36 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): At this point I'm pretty certain that `bound` for the deep `(:.)` instances is the culprit here. For instance, the implementation for `((((((((Z :. Int) :. Int) :. A) :. A) :. A) :. A) :. A) :. A)` grows from `{terms: 10, types: 59, coercions: 147}` to `{terms: 1,375,016, types: 1,282,534, coercions: 14,151}` in one simplifier iteration (phase 0). Let's follow the specialized implementation of `bound` for `((Z .: Int) :. Int) :. Int` which clearly exhibits this explosion, but on a much smaller scale (from `{terms: 10, types: 23, coercions: 33}` to `{terms: 2,789, types: 2,743, coercions: 78}` over the same simplifier iteration; it's not clear that the smaller types are exploding quite as much, namely they still have 0 coercions after the simplifier does its work). It starts simple enough, {{{#!hs $s$fShape:._$cbound_sbQv :: forall a_aaK6. ((Z :. Int) :. Int) :. Int -> ((Z :. Int) :. Int) :. Int -> Data.Array.Accelerate.Type.Boundary a_aaK6 -> Either a_aaK6 (((Z :. Int) :. Int) :. Int) $s$fShape:._$cbound_sbQv = \ (@ a142_abqH) (w4_abqI :: ((Z :. Int) :. Int) :. Int) (w5_abqJ :: ((Z :. Int) :. Int) :. Int) (w6_abqK :: Data.Array.Accelerate.Type.Boundary a142_abqH) -> Data.Array.Accelerate.Array.Sugar.$w$cbound @ ((Z :. Int) :. Int) $s$fElt:._sbKS ($s$fShape(,)_sbz9 `cast` ...) @ a142_abqH w4_abqI w5_abqJ w6_abqK }}} This gets simplified to this, https://gist.github.com/bgamari/c6e360478b0e98df62e3. More analysis shortly. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:37 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): Let's look at `bound` more closely, {{{#!hs -- |Apply a boundary condition to an index. bound :: sh -> sh -> Boundary a -> Either a sh }}} It takes the size of an array, an index, and a `Boundary a` value describing how lookups falling outside of the array should be handled. The result is either the index that should be looked up or a fixed value. In light of this it seems clear where the deep case analyses appearing in the simplified `$s$fShape:._$cbound_sbQv` are coming from: the compiler is expanding tests for whether each index falls within the array. This test being replicated can be found here https://github.com/AccelerateHS/accelerate/blob/release/0.15/Data/Array/Acce.... Interestingly enough, the `master` branch (5803c5e566a0647139523f2aad06081dc2818ef2) decides not to inline any of the `Data.Array.Accelerate.Array.Sugar.$w$cbound` occurrences, which is why it is able to finish. Interestingly enough the default implementation `Data.Array.Accelerate.Array.Sugar.bound` is not marked as `INLINEABLE` and marking it as `NOINLINE` allows GHC 7.10.2 to compile. Likewise, marking `bound` as `INLINE` on `master` produces a blow-up similar to what we see with 7.10 (actually it does even worse, running out of simplifier ticks even with `-fsimpl-tick-factor=150`). I suppose that GHC 7.10 probably decides on its own that it should be `INLINEABLE` but `master` does not, resulting in the observed difference. Regardless, it does seem as though the Accelerate folks would want a function like `bound` to be inlined (robertce, perhaps you could comment here?). Perhaps the real bug here is our inability to produce a reasonable amount of optimized code for this function? That being said, the definition of `bound` isn't exactly conducive to efficient implementation. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:38 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by robertce): Replying to [comment:38 bgamari]:
Regardless, it does seem as though the Accelerate folks would want a function like `bound` to be inlined (robertce, perhaps you could comment here?). Perhaps the real bug here is our inability to produce a reasonable amount of optimized code for this function? That being said, the definition of `bound` isn't exactly conducive to efficient implementation.
It's actually not terribly important that `bound` is inlined for Accelerate. We achieve high-performance by generating CUDA C or LLVM IR from our DSL rather than having fast Haskell code. The functions in `Shape`, like `bound`, are primarily used in our (slow) reference interpreter. If necessary, we could just put a `NOINLINE` pragma on `bound` or add an `-fno-specialse` to the top of any files affected. I guess what we really would like is to just not have such an explosion when something like `bound` is defined in the way it is. Having GHC figure out optimised code for it is not something I see us having any immediate need for. By the way, thanks for looking at this and tracking it down so quickly. Sorry I couldn't have been more help. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:39 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): Replying to [comment:39 robertce]:
It's actually not terribly important that `bound` is inlined for
Accelerate. We achieve high-performance by generating CUDA C or LLVM IR from our DSL rather than having fast Haskell code. The functions in `Shape`, like `bound`, are primarily used in our (slow) reference interpreter. If necessary, we could just put a `NOINLINE` pragma on `bound` or add an `-fno-specialse` to the top of any files affected. Adding a `NOINLINE` would be the least intrusive solution. We also will have a `-fno-cross-module-specialise` in 7.10.2 which also works around the issue but it obviously won't be available in earlier releases.
I guess what we really would like is to just not have such an explosion when something like `bound` is defined in the way it is. Having GHC figure out optimised code for it is not something I see us having any immediate need for.
There are "simple" steps I can imagine we could take to avoid these explosions (e.g. stop inlining into an expression after it has reached some size) but there are likely Good Reasons (TM) why we don't currently do this. Simon, perhaps you could comment here?
By the way, thanks for looking at this and tracking it down so quickly. Sorry I couldn't have been more help.
No worries! I'm glad I could help. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:40 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): Interesting! Looking at [https://github.com/AccelerateHS/accelerate/blob/release/0.15/Data/Array/Acce... this code] (linked in comment:38), we see that if `bound` is inlined, we get ''seven'' new calls to `bound`. And each of those seven will yield seven new ones, and so on. Very exponential! Some thoughts: * One could rewrite the linked code to say {{{ ... where bound1 = bound sh ix bndy }}} and use `bound1` for all seven occurrences of `bound sh ix bndy`. Does that help? * If it does, one might have hoped that CSE would catch it. But GHC's current CSE is not very clever. I have an idea for how to improve it so that it ''would'' catch it, if anyone interested in optimisation would like to try it out. RSVP. * Use `-ddump-inlinings` to see why HEAD is choosing not to inline it. * In general this kind of thing is quite hard to prevent. The classic case is {{{ f x = f (x-1) + f (x-2) }}} but GHC doesn't inline recursive functions, so we are ok. This case is more like {{{ f1 x = f2 (x-1) + f2 (x-2) f2 x = f3 (x-1) + f3 (x-2) f3 x = f4 (x-1) + f4 (x-2) ...etc... }}} But the successive f's are generated by the nested instance declarations. And I tried hard NOT to prevent inlining in these cases becuase it can be dramatically better to allow it. (E.g. look at the definition of `fromIndex` at the same link. I don't know a good way to trim off unproductive exponential inlinings without killing off good inlinings too. In short, I don't know a general solution. But any definition that has a seven-way (or even two-way) expansion over a recursive type index is going to cause trouble. Worth looking out for these. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:41 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): Simonpj, indeed sharing `bound sh ix bndy` between the alternatives does help, but only if the binding is marked as `NOINLINE`, otherwise it seems that the compiler is eager to inline it right back in. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:42 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by simonpj): Aha, yes you are right. If we have {{{ let x = bound sh ix bndy in case v of A -> case x of ... B -> case x of ... etc }}} then GHC will inline `x` because * Doing so eliminates a thunk * Doing so does no more work Hmm. I really don't see a good general solution here. As an ad-hoc solution, using NOINLINE will be fine. It's probably simpler to use it on the entire `bound` method (with a prominent Note to explain) rather than creating an auxiliary NOINLINE binding. That would be an upstream change in `accelerate`. For posterity, here is the offending instance in `Data.Array.Accelerate.Array.Representation`: {{{ instance Shape sh => Shape (sh, Int) where ... bound (sh, sz) (ix, i) bndy | i < 0 = case bndy of Clamp -> bound sh ix bndy `addDim` 0 Mirror -> bound sh ix bndy `addDim` (-i) Wrap -> bound sh ix bndy `addDim` (sz+i) Constant e -> Left e | i >= sz = case bndy of Clamp -> bound sh ix bndy `addDim` (sz-1) Mirror -> bound sh ix bndy `addDim` (sz-(i-sz+2)) Wrap -> bound sh ix bndy `addDim` (i-sz) Constant e -> Left e | otherwise = bound sh ix bndy `addDim` i where Right ds `addDim` d = Right (ds, d) Left e `addDim` _ = Left e }}} Note the seven recursive calls; plus the fact that we use instances with deeply-nested tuples Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:43 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Comment (by bgamari): I've opened this pull-request addressing this issue upstream. https://github.com/AccelerateHS/accelerate/pull/276 -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:44 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: bgamari Type: bug | Status: new Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Changes (by bgamari): * owner: => bgamari -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:45 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile, increasing tick factor is not a workaround -------------------------------------+------------------------------------- Reporter: robertce | Owner: bgamari Type: bug | Status: closed Priority: highest | Milestone: 7.10.2 Component: Compiler | Version: 7.10.1 Resolution: wontfix | Keywords: Operating System: Unknown/Multiple | Architecture: Type of failure: Compile-time | Unknown/Multiple performance bug | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -------------------------------------+------------------------------------- Changes (by bgamari): * status: new => closed * resolution: => wontfix Comment: As Simon described above, there isn't much we can do to safely optimize this sort of program in general. Moreover, it seems that 7.11 already decides on its own not to inline the offending function. Closing as WONTFIX. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10491#comment:46 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#10491: Regression, simplifier explosion with Accelerate, cannot compile,
increasing tick factor is not a workaround
-------------------------------------+-------------------------------------
Reporter: robertce | Owner: bgamari
Type: bug | Status: closed
Priority: highest | Milestone: 7.10.2
Component: Compiler | Version: 7.10.1
Resolution: wontfix | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Compile-time | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Revisions:
-------------------------------------+-------------------------------------
Comment (by Ben Gamari

#10491: Regression, simplifier explosion with Accelerate, cannot compile,
increasing tick factor is not a workaround
-------------------------------------+-------------------------------------
Reporter: robertce | Owner: bgamari
Type: bug | Status: closed
Priority: highest | Milestone: 7.10.2
Component: Compiler | Version: 7.10.1
Resolution: wontfix | Keywords:
Operating System: Unknown/Multiple | Architecture:
Type of failure: Compile-time | Unknown/Multiple
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Revisions:
-------------------------------------+-------------------------------------
Comment (by Ben Gamari
participants (1)
-
GHC