[GHC] #13586: ghc --make seems to leak memory

#13586: ghc --make seems to leak memory
-------------------------------------+-------------------------------------
           Reporter:                 |             Owner:  (none)
  MikolajKonarski                    |
               Type:  bug            |            Status:  new
           Priority:  normal         |         Milestone:
          Component:  Compiler       |           Version:  8.0.1
           Keywords:                 |  Operating System:  Linux
       Architecture:  x86_64         |   Type of failure:  Compile-time
  (amd64)                            |  performance bug
          Test Case:                 |        Blocked By:
           Blocking:                 |   Related Tickets:  #13379
Differential Rev(s):                 |         Wiki Page:
-------------------------------------+-------------------------------------
(This is probably not reproducible with a small example.)

When I build this project with `cabal build`

https://github.com/LambdaHack/LambdaHack/commit/138123ab13edd4db6c8143720af6...

the peak memory, as observed with `top`, is 10G*. When I instead interrupt
the compilation with `^C` at the following point (compilation of this file
takes a couple of minutes, so it's easy to interrupt):

{{{
[123 of 123] Compiling
Game.LambdaHack.SampleImplementation.SampleMonadServer (
Game/LambdaHack/SampleImplementation/SampleMonadServer.hs,
dist/build/Game/LambdaHack/SampleImplementation/SampleMonadServer.o )
}}}

and then restart and continue to the end, peak memory usage in either of
the two compilation parts is 5G*. So it seems `ghc --make` keeps some data
that is either never used in the end or could just as well be read on
demand instead of being kept in memory.

Confirmed with 8.2.1-rc1 as well, but the project is not trivial to compile
there due to restrictive upper bounds of many packages.

*exact numerical values are made up

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13586
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler

Comment (by jstolarek):

Mikołaj, does that problem happen with earlier versions of GHC? Or: to
what extent does this happen with earlier versions?

I have run into the same problem on my old laptop with 2GB of RAM. With
larger projects I often had to kill the build because the system ran out
of memory, but then restarting the build led to successful completion.
One symptom I also experienced was a very long linking time, something
that did not happen with GHC 7.10 or 7.8. I speculate that memory was
cluttered with unnecessary data left over from compilation, so that
running the linker led to swapping. Again, restarting the build just to
finish linking would solve the problem, i.e. linking would finish in
reasonable time.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13586#comment:1

@@ -24,0 +24,3 @@
+
+ Edit: this is a regression, GHC 7.10.3 uses < 3G for compilation without
+ even interrupting

Comment (by MikolajKonarski):

Jan, that was a great hunch: indeed, this is a regression. GHC 7.10.3
uses < 3G for compilation without even interrupting.

It's possible that different versions of some packages I use under 7.10.3
contribute, but I'd be surprised if the cause wasn't almost entirely the
change of GHC version.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13586#comment:2

Description changed by MikolajKonarski:

@@ -27,0 +27,5 @@
+
+ Edit2: which actually doesn't prove the `--make` leak is a regression.
+ There may just be some other regression that makes the difference between
+ interrupted and non-interrupted compilation under 7.10.3 much smaller and
+ harder to measure.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13586#comment:3

Comment (by rwbarton):

GHC certainly retains information about modules it has finished compiling
in `--make` mode, by design--the information it wrote to the interface
file, and would read back from the interface file if needed. It has
"always" worked this way, though of course it's possible that the space
usage of this retained data has increased, or that there is other data
being retained unintentionally.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13586#comment:4

Changes (by MikolajKonarski):

 * related:  #13379 => #13379 13564

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13586#comment:5

Changes (by MikolajKonarski):

 * related:  #13379 13564 => #13379 #13564

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13586#comment:6

Comment (by MikolajKonarski):

I've just made a change that lowered RAM usage a lot (it no longer
thrashes my computer; perhaps it uses the same total amount of RAM+swap, I
didn't measure). Originally most of the specialization was occurring in a
single module, and now I'm specializing some of that in another module.
This would indicate that RAM usage of the compiler is not linear in the
number of specializations occurring in a single module. Perhaps it really
uses non-linear amounts of memory, or perhaps it just looks through the
list of specializations from the current module from time to time, so they
can't simply get swapped out and stay that way, but are brought back into
RAM too often, causing swap thrashing.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13586#comment:7

Comment (by bgamari):

Do you have a small-ish example which exhibits this? We should profile it
if so.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13586#comment:8

Comment (by MikolajKonarski):

Nope, and reducing the example reduces the problem, because it needs lots
of specializations, and so lots of code, to trigger. However, I guess one
could construct a cheaper, artificial example with n simple functions
specialized to m types and thus get n*m specializations (I only have 1--2
types for each function in my example, so I need lots of code). I wonder
if we already have such an example in the GHC test suite. If so, we'd only
need a variant where the specializations are split between 2 modules, and
then compare the time/heap as n and m grow.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13586#comment:9
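
The artificial test case suggested in comment 9 (n simple functions, each
specialized to m types, with the specializations either in one module or
split between two) could be generated mechanically. The Python script below
is only a rough sketch under stated assumptions; it is not from the ticket
or from the GHC test suite. The module names `Defs`, `SpecsAll`, `SpecsA`
and `SpecsB`, the function names `f0`, `f1`, ..., and the choice of five
Prelude types are all made up for illustration. It relies on GHC accepting
`SPECIALIZE` pragmas for imported functions that are marked `INLINABLE`.
Whether the generated modules actually reproduce the memory behaviour
reported here is untested.

```python
"""Generate synthetic Haskell modules with n functions specialized to m types.

Hypothetical helper for experimenting with the idea from comment 9.
All module and function names are invented for this sketch.
"""
from pathlib import Path

# Prelude types with Num and Ord instances; m is capped by this list.
TYPES = ["Int", "Integer", "Double", "Float", "Rational"]


def gen_defs(n):
    """Return module Defs with n overloaded functions marked INLINABLE."""
    lines = ["module Defs where", ""]
    for i in range(n):
        lines += [
            f"f{i} :: (Num a, Ord a) => a -> a -> a",
            f"f{i} x y = if x > y then x * {i + 1} + y else y * {i + 1} - x",
            f"{{-# INLINABLE f{i} #-}}",
            "",
        ]
    return "\n".join(lines)


def gen_specs(module_name, func_ids, m):
    """Return a module that specializes each listed function to m types."""
    if not 1 <= m <= len(TYPES):
        raise ValueError(f"m must be between 1 and {len(TYPES)}")
    lines = [f"module {module_name} where", "", "import Defs", ""]
    for i in func_ids:
        for ty in TYPES[:m]:
            lines.append(f"{{-# SPECIALIZE f{i} :: {ty} -> {ty} -> {ty} #-}}")
    return "\n".join(lines) + "\n"


def write_case(out_dir, n, m):
    """Write Defs plus the single-module and two-module variants to out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    half = n // 2
    files = {
        "Defs.hs": gen_defs(n),
        "SpecsAll.hs": gen_specs("SpecsAll", range(n), m),   # all n*m in one module
        "SpecsA.hs": gen_specs("SpecsA", range(half), m),    # first half
        "SpecsB.hs": gen_specs("SpecsB", range(half, n), m), # second half
    }
    for name, text in files.items():
        (out / name).write_text(text)
    return sorted(files)


if __name__ == "__main__":
    print(write_case("spec-case", n=200, m=5))
```

One would then compile `SpecsAll.hs` on the one hand, and `SpecsA.hs`
together with `SpecsB.hs` on the other, and compare time and heap as n and
m grow.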

Comment (by MikolajKonarski):

Just one more data point, showing (I guess) how big .hi files loaded into
memory and never freed can lock up lots of RAM during compilation: this
commit

https://github.com/LambdaHack/LambdaHack/commit/9adc5ee93ab32a9a1ba949362371...

lowers the maximum resident set size of GHC during compilation from 4.5G
to 2.8G, as measured with `/usr/bin/time -v cabal build -j1` on Ubuntu
with GHC 8.4.3. As reported in other comments, it's also the case that
interrupting the compilation and then restarting it lowers the resident
set size considerably.

Before the commit, two RAM usage peaks coincide when compiling the library
section of the .cabal file: one peak from 120 large .hi files loaded into
memory (I guess ~2G) and another from an excessive amount of
specialisation performed when compiling a single module that provides a
concrete implementation of a certain monad. The commit just moves the
specialization to the executable section of the .cabal file, thus
separating the peaks.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13586#comment:10
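
The measurement in comment 10 uses `/usr/bin/time -v`. For repeated
before/after comparisons, a similar number can be collected from a script.
The sketch below is an illustration only, assuming Linux, where
`ru_maxrss` is reported in kilobytes; the function name `peak_rss_kib` is
invented here. Note that for child processes the kernel reports the
resident set size of the largest waited-for descendant, and the value is a
high-water mark over every child the script has run, so each measurement
should use a fresh interpreter.

```python
"""Report the peak resident set size of a command, similar to /usr/bin/time -v."""
import resource
import subprocess
import sys


def peak_rss_kib(cmd):
    """Run cmd to completion and return (exit code, peak RSS of children in KiB).

    Assumes Linux units (kilobytes). The figure covers the largest
    waited-for descendant of this process, not the sum over the process tree.
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


if __name__ == "__main__":
    # Example: python peak_rss.py cabal build -j1
    code, kib = peak_rss_kib(sys.argv[1:])
    print(f"exit code {code}, peak RSS {kib / 1024 / 1024:.2f} GiB")
    sys.exit(code)
```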

Changes (by watashi):

 * cc: watashi (added)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13586#comment:11

Comment (by ulysses4ever):

> GHC certainly retains information about modules it has finished
> compiling in `--make` mode, by design--the information it wrote to the
> interface file

Question: would it be reasonable to add a flag limiting how much memory is
allowed for this kind of buffering? Upon reaching the limit, GHC could
clear the buffer and resort to fetching `.hi` files on demand.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13586#comment:12

Comment (by bgamari):

Would it be reasonable? Perhaps. Would it be easy to implement? I don't
believe so. You would need to somehow walk the heap looking for references
to freed interfaces so that they can be GC'd (or make any reference that
might refer to something in another module weak, which would come at quite
some cost).

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13586#comment:13
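
The difficulty described in comment 13 can be illustrated with a toy
model. The sketch below is a Python analogy and says nothing about GHC's
actual data structures; the `Interface` class and the `survives_eviction`
function are invented for the illustration. It shows that emptying a cache
frees nothing as long as some other object still holds a strong reference
to the cached value, which is why an eviction scheme would need either a
heap walk or weak references.

```python
"""Toy model: evicting a cache entry does not free it if it is referenced elsewhere."""
import gc
import weakref


class Interface:
    """Stand-in for the in-memory contents of one loaded interface file."""

    def __init__(self, module_name):
        self.module_name = module_name


def survives_eviction(referenced_elsewhere):
    """Return True if the cached object is still alive after the cache is cleared."""
    cache = {"Defs": Interface("Defs")}
    # Some other structure may hold a strong reference to the same object.
    elsewhere = [cache["Defs"]] if referenced_elsewhere else []
    # A weak reference observes liveness without keeping the object alive.
    probe = weakref.ref(cache["Defs"])
    cache.clear()  # "evict" everything from the cache
    gc.collect()
    alive = probe() is not None
    del elsewhere  # drop the outside reference only after checking
    return alive


if __name__ == "__main__":
    print("referenced elsewhere:", survives_eviction(True))
    print("cache was sole owner:", survives_eviction(False))
```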