
#9669: Long compile time/high memory usage for modules with many deriving clauses -------------------------------------+------------------------------------- Reporter: snoyberg | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 Keywords: | Operating System: Linux Architecture: x86_64 (amd64) | Type of failure: Compile- Difficulty: Unknown | time performance bug Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- I've seen many different people complain about this in various different contexts. One of the most common I personally see is when using Persistent with a large number of entities, which results in multi-minute build times for that single module. Usually, these reports have come from private repositories, which makes for a difficult repro. Also, the usage of Template Haskell in those modules tends to confuse the issue. So I'd like to report this issue from a completely separate project, with the guess that this represents an issue that is affecting many other users. Steps to reproduce: * `cabal unpack fpco-api-1.2.0.1` * `cd fpco-api-1.2.0.1` * `cabal install` * `ghc --make -isrc/library src/library/FP/API/Types.hs` * `touch src/library/FP/API/Types.hs` Then, to see the time and memory usage of compiling just the one module, run: * `time ghc -O2 --make -isrc/library src/library/FP/API/Types.hs +RTS -s` On my system (16GB RAM, quadcore i7), the results are: {{{ $ time ghc -O2 --make -isrc/library src/library/FP/API/Types.hs +RTS -s [3 of 3] Compiling FP.API.Types ( src/library/FP/API/Types.hs, src/library/FP/API/Types.o ) 51,846,533,568 bytes allocated in the heap 6,799,246,288 bytes copied during GC 389,297,584 bytes maximum residency (22 sample(s)) 11,330,472 bytes maximum slop 1041 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 2041 colls, 0 par 9.19s 9.19s 0.0045s 0.0998s Gen 1 22 colls, 0 par 4.33s 4.33s 0.1969s 0.5436s TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1) SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled) INIT time 0.00s ( 0.00s elapsed) MUT time 34.21s ( 37.74s elapsed) GC time 13.52s ( 13.53s elapsed) EXIT time 0.07s ( 0.07s elapsed) Total time 47.81s ( 51.34s elapsed) Alloc rate 1,515,662,937 bytes per MUT second Productivity 71.7% of total user, 66.8% of total elapsed gc_alloc_block_sync: 0 whitehole_spin: 0 gen[0].sync: 0 gen[1].sync: 0 real 0m51.375s user 0m50.864s sys 0m0.456s }}} By contrast, with optimizations turned off: {{{ $ time ghc -O0 --make -isrc/library src/library/FP/API/Types.hs +RTS -s [3 of 3] Compiling FP.API.Types ( src/library/FP/API/Types.hs, src/library/FP/API/Types.o ) 12,767,593,936 bytes allocated in the heap 1,078,202,664 bytes copied during GC 179,551,768 bytes maximum residency (13 sample(s)) 6,684,544 bytes maximum slop 439 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 623 colls, 0 par 1.24s 1.24s 0.0020s 0.0651s Gen 1 13 colls, 0 par 0.76s 0.76s 0.0586s 0.2397s TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1) SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled) INIT time 0.00s ( 0.00s elapsed) MUT time 8.28s ( 9.23s elapsed) GC time 2.00s ( 2.00s elapsed) EXIT time 0.05s ( 0.05s elapsed) Total time 10.34s ( 11.28s elapsed) Alloc rate 1,542,443,897 bytes per MUT second Productivity 80.6% of total user, 73.9% of total elapsed gc_alloc_block_sync: 0 whitehole_spin: 0 gen[0].sync: 0 gen[1].sync: 0 real 0m11.304s user 0m11.108s sys 0m0.172s }}} and -O1: {{{ $ time ghc -O1 --make -isrc/library src/library/FP/API/Types.hs +RTS -s [3 of 3] Compiling FP.API.Types ( src/library/FP/API/Types.hs, src/library/FP/API/Types.o ) 45,550,443,664 bytes allocated in the heap 5,721,700,512 bytes copied during GC 358,036,456 bytes maximum residency (21 sample(s)) 9,167,176 bytes maximum slop 906 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 1642 colls, 0 par 7.70s 7.70s 0.0047s 0.1031s Gen 1 21 colls, 0 par 3.68s 3.69s 0.1756s 0.4968s TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1) SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled) INIT time 0.00s ( 0.00s elapsed) MUT time 30.21s ( 33.09s elapsed) GC time 11.38s ( 11.39s elapsed) EXIT time 0.07s ( 0.07s elapsed) Total time 41.68s ( 44.55s elapsed) Alloc rate 1,507,571,106 bytes per MUT second Productivity 72.7% of total user, 68.0% of total elapsed gc_alloc_block_sync: 0 whitehole_spin: 0 gen[0].sync: 0 gen[1].sync: 0 real 0m44.577s user 0m44.164s sys 0m0.368s }}} Note that this module is essentially just 900 lines of data declarations, with a large number of derived classes, and a few manually written helper functions and instances. This high memory usage hasn't just resulted in user complaints: automated Jenkins and Travis jobs will often times fail without optimizations disabled, which can be problematic for proper testing and production code. In the case of fpco-api, I've worked around this by adding `-O0` to the cabal file, but it would be much nicer to not have to do that. For some other examples of complaints along these lines: * https://groups.google.com/d/msg/yesodweb/MX1bnOFu8Hc/VLQXhnvpIkMJ * https://groups.google.com/d/msg/yesodweb/XPWixNjuOnM/FN26bmkudgwJ I believe there are a few other threads discussing this, if it would be helpful. I did my testing on GHC 7.8.3 64-bit, Ubuntu 12.04. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9669 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler