[GHC] #11285: Static linking is really slow sometimes

#11285: Static linking is really slow sometimes
-------------------------------------+-------------------------------------
 Reporter:  ezyang                   |  Owner:
 Type:  bug                          |  Status:  new
 Priority:  normal                   |  Milestone:
 Component:  Compiler                |  Version:  7.10.3
 Keywords:                           |  Operating System:  Unknown/Multiple
 Architecture:  Unknown/Multiple     |  Type of failure:  None/Unknown
                                     |  Test Case:
                                     |  Blocked By:
 Blocking:                           |  Related Tickets:
 Differential Rev(s):                |  Wiki Page:
-------------------------------------+-------------------------------------

I'm testing on Cabal `Setup.hs`, which links against Cabal.

On a beefy machine with many cores and lots of RAM, I see a x2 regression in linking time from GHC 7.10.2 to GHC 8.0 (a recent HEAD) using GNU ld (not gold):

{{{
[ezyang@hs01 ezyang]$ rm A; time ghc --make A.hs -fforce-recomp
[1 of 1] Compiling Main             ( A.hs, A.o )
Linking A ...

real    0m1.273s
user    0m0.990s
sys     0m0.210s
[ezyang@hs01 ezyang]$ rm A; time ghc-8.0/usr/bin/ghc --make A.hs -fforce-recomp
[1 of 1] Compiling Main             ( A.hs, A.o )
Linking A ...

real    0m3.270s
user    0m2.727s
sys     0m0.523s
}}}

On a puny eight year-old laptop, I see a x2 regression from 7.6 to 7.10 (with not much change with 8.0)

{{{
ezyang@sabre:~$ rm A; time ghc --make -O0 A.hs -fforce-recomp
rm: cannot remove ‘A’: No such file or directory
[1 of 1] Compiling Main             ( A.hs, A.o )
Linking A ...

real    0m3.058s
user    0m1.860s
sys     0m1.164s
ezyang@sabre:~$ rm A; time ghc-7.10 --make -O0 A.hs -fforce-recomp
[1 of 1] Compiling Main             ( A.hs, A.o )
Linking A ...

real    0m7.139s
user    0m4.616s
sys     0m2.488s
}}}

There must be something which is causing the linker to run slowly in one case, and quickly in the other. It would be really good to figure out what this is. Slow linking is NOT NICE.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11285
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
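For reference, the slowdown implied by the `real` times in the report above can be worked out with a small shell helper. This is only a sketch: the function name `slowdown` is made up for illustration, and the inputs are the wall-clock seconds copied from the two transcripts.

```shell
# Slowdown factor = new wall-clock time / old wall-clock time.
# `slowdown` is a hypothetical helper, not part of GHC or Cabal.
slowdown() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", new / old }'
}

slowdown 1.273 3.270   # hs01:  GHC 7.10.2 vs GHC 8.0
slowdown 3.058 7.139   # sabre: GHC 7.6 vs GHC 7.10
```

On these numbers the factors come out at roughly 2.6 and 2.3, so "x2" is, if anything, a slight understatement.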

#11285: Static linking is really slow sometimes

Changes (by ezyang):

 * priority:  normal => high
 * failure:  None/Unknown => Compile-time performance bug
 * version:  7.10.3 => 7.11
 * component:  Compiler => Compiler (Linking)

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11285#comment:1

#11285: Split objects makes static linking really slow

Changes (by ezyang):

 * type:  bug => feature request

Old description:

I'm testing on Cabal `Setup.hs`, which links against Cabal.

On a beefy machine with many cores and lots of RAM, I see a x2 regression in linking time from GHC 7.10.2 to GHC 8.0 (a recent HEAD) using GNU ld (not gold):

{{{
[ezyang@hs01 ezyang]$ rm A; time ghc --make A.hs -fforce-recomp
[1 of 1] Compiling Main             ( A.hs, A.o )
Linking A ...

real    0m1.273s
user    0m0.990s
sys     0m0.210s
[ezyang@hs01 ezyang]$ rm A; time ghc-8.0/usr/bin/ghc --make A.hs -fforce-recomp
[1 of 1] Compiling Main             ( A.hs, A.o )
Linking A ...

real    0m3.270s
user    0m2.727s
sys     0m0.523s
}}}

On a puny eight year-old laptop, I see a x2 regression from 7.6 to 7.10 (with not much change with 8.0)

{{{
ezyang@sabre:~$ rm A; time ghc --make -O0 A.hs -fforce-recomp
rm: cannot remove ‘A’: No such file or directory
[1 of 1] Compiling Main             ( A.hs, A.o )
Linking A ...

real    0m3.058s
user    0m1.860s
sys     0m1.164s
ezyang@sabre:~$ rm A; time ghc-7.10 --make -O0 A.hs -fforce-recomp
[1 of 1] Compiling Main             ( A.hs, A.o )
Linking A ...

real    0m7.139s
user    0m4.616s
sys     0m2.488s
}}}

There must be something which is causing the linker to run slowly in one case, and quickly in the other. It would be really good to figure out what this is. Slow linking is NOT NICE.

New description:

Here's a comparison of a few builds of `Setup.hs` using GHC 7.10.3. In the first case, I am building using a version of GHC with split objects disabled on all libraries. In the second, split objects were enabled but Cabal was compiled without split objects. In the third, Cabal was built with split objects.

{{{
[ezyang@hs01 ezyang]$ rm Setup; time ghc-7.10-nosplitobjs/inplace/bin/ghc-stage2 --make Setup.hs -O0
rm: cannot remove ‘Setup’: No such file or directory
[1 of 1] Compiling Main             ( Setup.hs, Setup.o )
Linking Setup ...

real    0m0.950s
user    0m0.757s
sys     0m0.163s
[ezyang@hs01 ezyang]$ rm Setup; time ghc --make Setup.hs -O0
Linking Setup ...

real    0m1.209s
user    0m0.973s
sys     0m0.177s
[ezyang@hs01 ezyang]$ rm Setup; time ghc -no-user-package-db --make Setup.hs -O0
[1 of 1] Compiling Main             ( Setup.hs, Setup.o ) [Distribution.Simple changed]
Linking Setup ...

real    0m3.136s
user    0m2.693s
sys     0m0.407s
}}}

In my experience, Cabal is the MOST expensive library to compile with split objects (on my laptop, this is an x2 difference in link time); among base libraries, ld.gold visibly hitches when it has to link base.

Slow link times make for an unpleasant experience for users, especially since we don't compile executables as dynamic by default. To make matters worse, split-object-compiled boot libraries represent a mandatory tax for anyone using static linking, because it's *not possible* to swap out those static archives for non-split-objects ones.

Could we enhance GHC to support running the linker in a "fast mode", where we ask the linker to treat archives as atomic units and not try to optimize for binary size? We can keep the current slow mode for production executables that people want to ship.

--

Comment:

I've diagnosed that split objects is the problem. I've rewritten the description to reflect this.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11285#comment:2

#11285: Split objects makes static linking really slow

Comment (by rwbarton):

8.0 has the `-ffunction-sections`-style replacement for `-split-objs`, right? Is that better or worse?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11285#comment:3

#11285: Split objects makes static linking really slow

Comment (by ezyang):

Ha! On a quick and dirty test, `-ffunction-sections` is FOUR times worse for compiling `Setup.hs` on `ld.bfd`. However, it is TWO times better with `ld.gold`. (But not using split objects with gold is still the fastest.)

{{{
[ezyang@hs01 ezyang]$ rm Setup; time ghc-8.0-nosplitobjs/inplace/bin/ghc-stage2 -no-user-package-db --make Setup.hs -O0 -optl-fuse-ld=gold
[1 of 1] Compiling Main             ( Setup.hs, Setup.o )
Linking Setup ...

real    0m1.429s
user    0m1.250s
sys     0m0.163s
sys     0m0.583s
[ezyang@hs01 ezyang]$ rm Setup; time ghc-8.0/usr/bin/ghc -no-user-package-db --make Setup.hs -O0 -optl-fuse-ld=gold
Linking Setup ...

real    0m2.537s
user    0m2.310s
sys     0m0.220s
[ezyang@hs01 ezyang]$ rm Setup; time ghc-8.0-nosplitobjs/inplace/bin/ghc-stage2 -no-user-package-db --make Setup.hs -O0
Linking Setup ...

real    0m11.349s
user    0m10.823s
sys     0m0.553s
[ezyang@hs01 ezyang]$ rm Setup; time ghc-8.0/usr/bin/ghc -no-user-package-db --make Setup.hs -O0
[1 of 1] Compiling Main             ( Setup.hs, Setup.o )
Linking Setup ...

real    0m3.380s
user    0m2.867s
sys     0m0.500s
}}}

I don't think we can generally assume people will be using gold, so switching this on by default probably is unacceptable.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11285#comment:4

#11285: Split objects makes static linking really slow

Comment (by olsner):

ghc could very well default to use gold if it's available, I think. There are a few reasons to explicitly need bfd-ld (e.g. when using linker scripts), but for linking normal programs it shouldn't matter either way. To support "special" use cases, we'd just need to make sure `-optl-fuse-ld=bfd` overrides ghc's setting.

> Could we enhance GHC to support running the linker in a "fast mode"

I think this is not entirely up to the linking stage, as both split objects and function-sections are compile-time rather than link-time settings.

Something that could be done at the linking stage is linking against the incrementally linked libraries-for-ghci - both split objects and split sections are undone by the incremental linking step. That might just run into different bottlenecks though :)

Since #8405, `--gc-sections` is sent to the linker too. IIRC my previous experiments didn't find that it affected link times much unless actually using `-split-sections` for the installed libraries, but it could be moved to an explicit flag if need be. The downside of that is that users then have to learn a new flag to get smaller binaries.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11285#comment:5
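The "default to gold if available, but let the user override" idea in the comment above could be prototyped outside GHC with a small wrapper. The following is a sketch under stated assumptions, not how GHC itself picks a linker: the function name `linker_flag` is hypothetical, and it assumes GHC links through a C compiler driver that understands `-fuse-ld=`, as the `-optl-fuse-ld=gold` invocations earlier in this thread do.

```shell
# Print the extra GHC flag that selects the linker (hypothetical helper).
# An explicit choice (bfd or gold) always wins; with no argument, prefer
# gold when an ld.gold executable is on PATH, otherwise print nothing so
# that the toolchain's default linker is used.
linker_flag() {
    case "${1:-auto}" in
        bfd|gold)
            printf '%s\n' "-optl-fuse-ld=$1"
            ;;
        *)
            if command -v ld.gold >/dev/null 2>&1; then
                printf '%s\n' "-optl-fuse-ld=gold"
            fi
            ;;
    esac
}

# Usage, with Setup.hs as in the ticket:
#   ghc --make Setup.hs -O0 $(linker_flag)       # gold if available
#   ghc --make Setup.hs -O0 $(linker_flag bfd)   # force GNU ld.bfd
```

Keeping the explicit argument ahead of the auto-detection mirrors the requirement that `-optl-fuse-ld=bfd` must be able to override the default.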

#11285: Split objects makes static linking really slow

Comment (by ezyang):

olsner: It's not, unless we install both "slow to link" and "quick to link" versions of static libraries. As you mention, this might be helpful anyway for loading statically linked libraries to GHCi.

I *believe* my experiments showed that `--gc-sections` didn't really cost you anything if you weren't using `-split-sections`. So it should be fine to continue to pass it. Disabling it didn't really help with link times anyway.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11285#comment:6

#11285: Split objects makes static linking really slow

Comment (by ezyang):

I gave some bogus numbers, because the build system did not inform me that SplitSections doesn't actually do anything on Linux yet. So someone will have to do this test on Mac OS X and tell us what the difference is.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11285#comment:7

#11285: Split objects makes static linking really slow

Changes (by bgamari):

 * status:  new => closed
 * resolution:  => wontfix

Comment:

Given that (a) split objects will soon be ripped out and replaced with split sections (likely for 8.6, see #13939), and (b) we now use `ld.gold` when possible (#13541) I think this can be safely closed.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/11285#comment:8