
#14697: Redundant computation in fingerprintDynFlags when compiling many modules
-------------------------------------+-------------------------------------
        Reporter:  niteria           |                Owner:  (none)
            Type:  bug               |               Status:  new
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:
      Resolution:                    |             Keywords:
Operating System:  Unknown/Multiple  |         Architecture:
 Type of failure:  Compile-time      |  Unknown/Multiple
  performance bug                    |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):
       Wiki Page:                    |
-------------------------------------+-------------------------------------
Changes (by niteria):

 * cc: simonmar (added)

Old description:

I profiled a build of a production code base with thousands of modules and computing `fingerprintDynFlags` is `7%` of time and `14%` of allocations.

Here's a synthetic test case inspired by what I observed:
{{{
SIZE=1000
for i in $(seq -w 1 $SIZE); do
  echo "module A$i where" > A$i.hs
  echo "data A$i = A$i" >> A$i.hs
done
}}}

This generates 1000 modules, each with one datatype. Compiling them with:
{{{
inplace/bin/ghc-stage2 A*.hs -optP-D__F{1..10000}__
}}}
results in `fingerprintDynFlags` being the top cost centre in the profile. AFAICT there's only one module-dependent piece that goes into computing `fingerprintDynFlags`, and the rest is the same across those 1000 modules.

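To illustrate the redundancy described above, here is a minimal sketch in Python. It is not GHC's code: the function names, the use of MD5, and the way parts are combined are all assumptions made for illustration. It contrasts re-hashing every shared flag for each module with hashing the module-independent flags once and combining that digest with the single per-module piece.

```python
import hashlib


def fingerprint(parts):
    """Hash a list of strings into a hex digest (hypothetical helper)."""
    h = hashlib.md5()
    for p in parts:
        data = p.encode()
        # Length-prefix each part so ["ab", "c"] and ["a", "bc"] differ.
        h.update(str(len(data)).encode() + b":" + data)
    return h.hexdigest()


def naive_fingerprints(shared_flags, module_names):
    # Re-hashes all shared flags for every module:
    # work grows with (number of modules) * (number of flags).
    return [fingerprint(shared_flags + [m]) for m in module_names]


def cached_fingerprints(shared_flags, module_names):
    # Hash the module-independent flags once, then combine the
    # resulting digest with the one module-dependent piece.
    shared = fingerprint(shared_flags)
    return [fingerprint([shared, m]) for m in module_names]


if __name__ == "__main__":
    flags = [f"-D__F{i}__" for i in range(1, 10001)]
    modules = [f"A{i:04d}" for i in range(1, 1001)]
    print(len(cached_fingerprints(flags, modules)))
```

The two functions produce different digests for the same input, but each still changes whenever a shared flag or the module name changes, which is the property a recompilation check would rely on.
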
Now, why would I have so many preprocessor flags? This is how the Buck build system currently works. If a Haskell library depends on a C++ library then the GHC invocation gets the C++ library's directory as include path (`-optP-I -optP-I some/library/path`). This can grow quite big.

New description:

I profiled a build of a production code base with thousands of modules and computing `fingerprintDynFlags` is `7%` of time and `14%` of allocations.

Here's a synthetic test case inspired by what I observed:
{{{
SIZE=1000
for i in $(seq -w 1 $SIZE); do
  echo "module A$i where" > A$i.hs
  echo "data A$i = A$i" >> A$i.hs
done
}}}

This generates 1000 modules, each with one datatype. Compiling them with:
{{{
inplace/bin/ghc-stage2 A*.hs -optP-D__F{1..10000}__
}}}
results in `fingerprintDynFlags` being the top cost centre in the profile. AFAICT there's only one module-dependent piece that goes into computing `fingerprintDynFlags`, and the rest is the same across those 1000 modules.

Now, why would I have so many preprocessor flags? This is how the Buck build system currently works. If a Haskell library depends on a C++ library then the GHC invocation gets the C++ library's directory as include path (`-optP -I -optP some/library/path`). This can grow quite big.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14697#comment:1
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler