[GHC] #9675: Unreasonable memory usage on large data structures

#9675: Unreasonable memory usage on large data structures -------------------------------------+------------------------------------- Reporter: Polarina | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Keywords: | Operating System: Linux Architecture: x86_64 (amd64) | Type of failure: Compile- Difficulty: Unknown | time performance bug Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- I have attached a file that, when compiled, results in very large amount of memory usage by GHC. The memory used is in the order of tens of gigabytes, but I've not been able to successfully compile the file due to lack of physical memory. This is on Debian GNU/Linux (Jessie) using Haskell Platform 2014.2.0.0 with GHC 7.8.3. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9675 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9675: Unreasonable memory usage on large data structures -------------------------------------+------------------------------------- Reporter: Polarina | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 (amd64) Type of failure: Compile- | Difficulty: Unknown time performance bug | Blocked By: Test Case: | Related Tickets: Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by schyler): It's intriguing to know that this file only contains a single record and no 'actual' code. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9675#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9675: Unreasonable memory usage on large data structures -------------------------------------+------------------------------------- Reporter: Polarina | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 (amd64) Type of failure: Compile- | Difficulty: Unknown time performance bug | Blocked By: Test Case: | Related Tickets: Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by rwbarton): Well, the data declaration does define ~3000 record accessors, each of which pattern matches on a record with ~3000 fields... it's easy for some things to become quadratic (intermediate program code size, strictness/demand info) even if in the end, the code size is only linear. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9675#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9675: Unreasonable memory usage on large data structures -------------------------------------+------------------------------------- Reporter: Polarina | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 (amd64) Type of failure: Compile- | Difficulty: Unknown time performance bug | Blocked By: Test Case: | Related Tickets: Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): Interesting extreme benchmark, maybe it will reveal some low-hanging fruit of algorithmic improvement. But is there actually a chance that the resulting code will be usable? -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9675#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9675: Unreasonable memory usage on large data structures -------------------------------------+------------------------------------- Reporter: Polarina | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 (amd64) Type of failure: Compile- | Difficulty: Unknown time performance bug | Blocked By: Test Case: | Related Tickets: Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by Polarina): The idea behind the code is to load all the OpenGL function pointers at program startup and store them in a data structure, rather than load them at runtime as needed; and to support multiple OpenGL contexts as there is no guarantee that two different contexts provide the same set of function pointers for the same set of functions. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9675#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9675: Unreasonable memory usage on large data structures -------------------------------------+------------------------------------- Reporter: Polarina | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 (amd64) Type of failure: Compile- | Difficulty: Unknown time performance bug | Blocked By: Test Case: | Related Tickets: Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): I created a plot to see what kind of complexity problem we have to look for. I gotta run now, so it’s not as polished as it should, and I didn’t have time yet to actually try to make sense of it. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9675#comment:5 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9675: Unreasonable memory usage on large data structures -------------------------------------+------------------------------------- Reporter: Polarina | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 (amd64) Type of failure: Compile- | Difficulty: Unknown time performance bug | Blocked By: Test Case: | Related Tickets: Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): Did some GHC profiling, and the problem is clearly the huge number of useless variables in the pattern match in the Core for the record selector functions, here for a three-field-record (I’ll spare us the 3000-field- record): {{{ Test.field3 :: Test.Foo -> GHC.Types.Int -> GHC.Types.Int [GblId[[RecSel]], Arity=1, Caf=NoCafRefs, Str=DmdType] Test.field3 = \ (ds_dm2 :: Test.Foo) -> case ds_dm2 of _ [Occ=Dead] { Test.Foo ds1_dm3 ds2_dm4 ds3_dm5 -> ds3_dm5 } }}} See the attached heap profile, `IdInfo`, `Var`, `IntMap` and `[]` are the biggest chunks. The problem is that a data type declaration like {{{ data R = R { field1 :: T1, field2 :: T2 } }}} generates Haskell code of the form {{{ field3 (R {field3 = x}) = x }}} which is then turned in to the form above by general code that also handles multiple fields, field puns, and further patterns in the field. While the submitter’s code above may or may not be reasonable it migth be worth improving the situation for the benefot of modules with medium-sized records (including GHC’s own DynFlags module). I see a two approaches: 1. Avoid generating actual Core for these selector. Make them implicitly availailbe, and only generated them in STG or later. I am not sure if there is precedence for such implicit things. 2. Extend the this code somehow: {{{ AltCon = DataAlt DataCon | LitAlt Literal | DEFAULT }}} I have two ideas here. Either special-case the single-field-match, i.e. a new constructor {{{ | FieldSelAlt DataCon Int }}} which binds only one variable, the field indicated by the `Int` variable. This would get rid of the quadraticness of the representation. But quite a bit of code would have to handle two cases, and there would be no unique representation of the same match. The other one is to extend `DataAlt` by a bit field: {{{ = DataAlt DataCon [Bool] }}} (with a more suitable data structure for `[Bool]`, e.g. some kind of `BitField`). The idea is that this field specifies what fields of the constructor are actually used by the match. The number of variables bound by such an alternative would equal to the number of `True`s in the list. This would improve every alternative where not all variables are used. It might make some code more complicated – I could imagine that some compiler phases expect that within a case alternative, there is a locally bound name for every field – but maybe it’s not too bad. This would still be quadratic (but with a much lower factor – one bit instead of whatever `Id` + `IdInfo` + `(:)` + whatnot use). We can even get rid of that by making the `BitField` a data type like {{{ data BitField = Empty | Singleton Int | ManyBits ByteString | All }}} (`Empty` would be useful for `Con {}` patterns. Not sure if `All` is useful. Maybe `data BitField = Empty | Singleton Int | All` is actually enough, if few-but-more-than-one-field-patterns are rare.) At this point I’m hoping for someone experienced (e.g. SPJ, of course) to tell us which of these ideas are totally whacky, and which are worth pursuing, and if so, what problems to expect. But this is not a promise that I’ll implement it :-) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9675#comment:6 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9675: Unreasonable memory usage on large data structures -------------------------------------+------------------------------------- Reporter: Polarina | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 (amd64) Type of failure: Compile- | Difficulty: Unknown time performance bug | Blocked By: Test Case: | Related Tickets: Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by simonpj): I agree that it is tantalising that so much `Core` is necessary to generate so little object code. Nevertheless, I am ''deeply'' reluctant to fiddle with the representation of `Core`. Doing so will impose overheads on every pass that deals with Core, and I'm far from convinced that it'll solve the problem. Anything along these lines would risk breaking optimisations. For example, here is how GHC optimises repeated evaluation of the same data constructor. Consider {{{ case x of K a _ _ _ -> ... (case x of K _ _ c _ -> ...a..c..) ... }}} Clearly we'd like to optimise this to {{{ case x of K a _ c _ -> ... ( ...a..c.. ) ... }}} And GHC does so by (effecitvely) transforming to {{{ case x of K a bb cc dd -> let x = K a bb cc dd in ... (case x of K _ _ c _ -> ...a..c..) ... }}} The `let` shadows the outer binding of `x`. Now the inner `case` can see that `x` is bound to `(K a bb cc dd)`, and so it's easy to do case-of- known-constructor to the inner case, giving {{{ case x of K a bb cc dd -> let x = K a bb cc dd in ... (let c = cc in ...a..c..) ... }}} (The `let` doesn't really exist, of course, it's only in the Simplifier's head.) All this is threatened if the outer case binds only `a`. I think a decent alternative approach would be not to generate code for selectors (at least when there are many fields) but rather to inline them at usage sites, via a so-called "compulsory unfolding". That too risks lots of code if you use a lot of record selectors, so it's not exactly a solid fix. The other thing is that I think it's quite likely that GHC has accumulated a space leak. The compiler makes repeated passes over the `Core` program. To avoid a leak, all vestiges of the input of each pass should be gone by the time we have computed the output of that pass. But I would not be amazed if that was not true at all -- that vesiges of pass 1 were still in memory when we were on pass 10. I would '''love''' someone to really look at this for us. It's not easy, but it could give a huge payoff for every GHC compilation, not just these weird corner cases. I certainly think we should be sure this is ''not'' happening before investing effort in optimising particular cases Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9675#comment:7 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9675: Unreasonable memory usage on large data structures -------------------------------------+------------------------------------- Reporter: Polarina | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 (amd64) Type of failure: Compile- | Difficulty: Unknown time performance bug | Blocked By: Test Case: | Related Tickets: Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): Thanks for your input. The way it breaks optimisation is what I meant with “making some code more complicated”: With a sparse list of bound variables, it would have to check which fields the optimization would like to use and then generated them on demand. I see that that might make the code very ugly.. The compulsory unfolding you mention would still be of the the shape of a huge pattern match, right? So the quadratic behaviour wouldn’t be eliminated. A comparison of the heap profile with and without `-dverbose-core2core` (using this as a poor man’s deepseq after each phase) shows that there might be a space leak, as you guess. (Uploading both graphs.) @Polarina: A work-around for you might be to not use a data constructor, but a newtype around a `Vector (Int -> Int)` (inner type mostly irrelevant). You store the functions therein and your accessor functions would use `unsafeIndex` and `unsafeCoerce` to get them out again. You can wrap that highly unsafe code in a module that can be used in a totally safe way. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9675#comment:8 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9675: Unreasonable memory usage on large data structures -------------------------------------+------------------------------------- Reporter: Polarina | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 (amd64) Type of failure: Compile- | Difficulty: Unknown time performance bug | Blocked By: Test Case: | Related Tickets: Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): A space leak seems to be hidden in `DmdAnal`. Maybe strictness annotation calculation for dead variables, which are then never used? I’m preparing a compiler perf test case so that we can track possible improvements. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9675#comment:9 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9675: Unreasonable memory usage on large data structures
-------------------------------------+-------------------------------------
Reporter: Polarina | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.8.3
Resolution: | Keywords:
Operating System: Linux | Architecture: x86_64 (amd64)
Type of failure: Compile- | Difficulty: Unknown
time performance bug | Blocked By:
Test Case: | Related Tickets:
Blocking: |
Differential Revisions: |
-------------------------------------+-------------------------------------
Comment (by Joachim Breitner

#9675: Unreasonable memory usage on large data structures -------------------------------------+------------------------------------- Reporter: Polarina | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 (amd64) Type of failure: Compile- | Difficulty: Unknown time performance bug | Blocked By: Test Case: | Related Tickets: Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by nomeata): Heap profiling tells me that most of it is retained by `dmdAnalRhs`, produced by `annotateBndrs/dmdAnalAlt/dmdAnal'`. But even after cluttering most of the module with {{{`seq`}}} and bang patterns, the leak is still there.... and just while I was about to give up I found the problem: The environment part of a DmdType is not seq’ed. Doing that does great good. Patch coming.... :-) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9675#comment:11 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9675: Unreasonable memory usage on large data structures -------------------------------------+------------------------------------- Reporter: Polarina | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 (amd64) Type of failure: Compile- | Difficulty: Unknown time performance bug | Blocked By: Test Case: | Related Tickets: Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by simonpj): That sounds very plausible -- thank you! Do de-clutter when you are done :-). In fact if you grep for `seq` you'll find a variety of `seqs`, some of which probably do nothing useful while costing execution time. You are in an excellent position to find out, if you feel up to doing so. Really each should be accompanied with a story for why it is there. Simon -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9675#comment:12 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9675: Unreasonable memory usage on large data structures
-------------------------------------+-------------------------------------
Reporter: Polarina | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.8.3
Resolution: | Keywords:
Operating System: Linux | Architecture: x86_64 (amd64)
Type of failure: Compile- | Difficulty: Unknown
time performance bug | Blocked By:
Test Case: | Related Tickets:
Blocking: |
Differential Revisions: |
-------------------------------------+-------------------------------------
Comment (by Joachim Breitner

#9675: Unreasonable memory usage on large data structures -------------------------------------+------------------------------------- Reporter: Polarina | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Resolution: | Keywords: Operating System: Linux | Architecture: x86_64 (amd64) Type of failure: Compile- | Difficulty: Unknown time performance bug | Blocked By: Test Case: | Related Tickets: Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by ekmett): * cc: ekmett (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9675#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9675: Unreasonable memory usage on large data structures
-------------------------------------+-------------------------------------
Reporter: Polarina | Owner:
Type: bug | Status: new
Priority: normal | Milestone:
Component: Compiler | Version: 7.8.3
Resolution: | Keywords:
Operating System: Linux | Architecture: x86_64
Type of failure: Compile-time | (amd64)
performance bug | Test Case:
Blocked By: | Blocking:
Related Tickets: | Differential Revisions:
-------------------------------------+-------------------------------------
Comment (by Ben Gamari
participants (1)
-
GHC