
#14031: Linker paths carry substantial N*M overhead when many libraries are used
-------------------------------------+-------------------------------------
            Reporter:  nh2           |                Owner:  (none)
                Type:  bug           |               Status:  new
            Priority:  normal        |            Milestone:
           Component:  Compiler      |              Version:  8.0.2
            Keywords:                |     Operating System:  Unknown/Multiple
        Architecture:                |      Type of failure:  Compile-time
    Unknown/Multiple                 |      performance bug
           Test Case:                |           Blocked By:
            Blocking:                |     Related Tickets:
 Differential Rev(s):                |            Wiki Page:
-------------------------------------+-------------------------------------
**"The linker search path problem"**

It's common these days to have 400 (transitive) Hackage dependencies (even my hobby projects have that many).

When linking, statically or dynamically, GHC passes `-L /path/to/foo/dir` and `-lHSfoo...` to the linker for each Haskell package `foo`. That results in `N` linker search paths (`-L`) and `M` library names (`-l`), which translates into `N * M` `stat()` syscalls (file existence checks) in the linker to find where each library file lives.

For 400 package dependencies, that comes to roughly 160k `stat()` syscalls. On my machine (SSD, hot file system cache in RAM), that takes `0.7 seconds` (measured with `strace -c -w`). That is a substantial overhead for linking (for example, the executable with 400 dependencies with which I found this problem can be linked with `ld.gold` in just 1.5 seconds).

If there is tooling between GHC and the linker that also parses linker flags and checks for file existence (for example Nix's `cc-wrapper` and `ld-wrapper` scripts, or other instrumentation or build-system tooling, which are often not written in compiled languages and are thus [https://github.com/NixOS/nixpkgs/issues/27609 much slower] at `stat()`ing than the linker is), then the problem can easily blow up by another 10x. As a result, you may be waiting a lot longer for your link to finish than necessary.

For static linking, this could be resolved by passing the archive files (`libHSfoo....a`) directly to the linker instead of the `-L`/`-l` combination. For this we would need to check that passing files directly really has the same semantics as `-L` and `-l`, but the man pages suggest so (and I'll check it with the binutils authors).

For dynamic linking, it cannot be fixed this way. It can be fixed at compile time, but with dynamic linking the same problem exists at run time (program startup): there, all the `-L`s become `RPATH`s if the libraries are not installed in a global location (for example, Nix has an `RPATH` entry for each package), and thus program startup can take up to those 0.7 seconds I mentioned.

To solve the runtime problem, the only solution seems to be to place the libraries into the same directory (i.e. to have essentially `N = 1`, a single search path). This approach (putting all libs into one directory) would also solve it for the static linking case, with the drawback that it is not compatible with how cabal sandboxes, stack build and snapshot dirs, Nix, Debian and other distributions deliver and build libraries. For some of them, this approach goes straight against their nature.
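To make the `N * M` arithmetic concrete, here is a minimal Haskell sketch (not GHC's or the linker's actual code; the directory and library names are invented for illustration) of the candidate paths a GNU-style linker has to probe when given `N` search directories via `-L` and `M` library names via `-l`:

{{{#!hs
-- Minimal sketch of the search implied by a `-L dir ... -l name ...`
-- command line: for each library name, every search directory is probed
-- (roughly one stat()-like existence check per candidate) until a hit
-- is found. All paths and names below are made up for illustration.
import Control.Monad (filterM)
import System.Directory (doesFileExist)
import System.FilePath ((</>))

candidates :: [FilePath] -> [String] -> [FilePath]
candidates searchDirs libNames =
  [ dir </> ("lib" ++ name ++ ".a")   -- static case; dynamic linking probes .so first
  | name <- libNames, dir <- searchDirs ]

main :: IO ()
main = do
  let dirs  = [ "/path/to/pkg" ++ show i ++ "/lib" | i <- [1 .. 450 :: Int] ]
      names = [ "HSdep" ++ show j ++ "-1.0"        | j <- [1 .. 350 :: Int] ]
      cs    = candidates dirs names
  -- 450 * 350 = 157,500 candidate paths, i.e. the ~160k stat() calls above
  putStrLn ("candidate paths to check: " ++ show (length cs))
  found <- filterM doesFileExist cs   -- this is the part that costs ~0.7 seconds
  putStrLn ("actually present: " ++ show (length found))
}}}

Passing the archive files directly (one absolute `libHSfoo....a` path per package), as proposed above for static linking, replaces this search with a single known path per library; the single-directory approach instead collapses `N` to 1 so that only `M` checks remain.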
**Detail discussion:**

See below the relevant discussion on #ghc:

{{{
nh2: bgamari-: do you know why ghc uses -L and -l for linking libHS... things instead of passing the .a file directly?
hvr: nh2: btw, fun fact: on IBM AIX, both static and DSOs are stored in .a archives
bgamari-: nh2, I suspect just since it makes the dynamic and static linking codepaths consistent
hvr: nh2: and if you use .a for linking, you end up hardwiring the DSO path into your executable
nh2: bgamari-: I'm looking into whether it would make sense to change it, at least for the platforms that would support it, because the (-L * -l) searching that it does currently seems to dominate link times here
nh2: cc hvr
hvr: nh2: sounds reasonable; but then you end up with longer CLI arguments
hvr: nh2: it could be a problem for when the tools don't support @
bgamari-: nh2, seriously?
bgamari-: the search is really that long?
bgamari-: nh2, it seems like that's a linker bug
hvr: bgamari-: I guess it depends on how many -L there are
bgamari-: nh2, how many libraries are you linking against?
hvr: bgamari-: as each -L and -l multiply w/ each other requiring several stat() calls iirc
bgamari-: hvr, yes
hvr: but otoh, I wouldn't expect this to *dominate* linker times
bgamari-: right
bgamari-: hvr, this is the compile-time equivalent to #11587
bgamari-: which I still think we should do
hvr: right
bgamari-: hmm
bgamari-: although comment:9 sort of suggests that this is already the case for user-installed libraries
***bgamari- forgot about that
hvr: I mean, if you have really *many* -L-paths
hvr: then the quadratic effect becomes probably significant
bgamari-: right
bgamari-: nh2, how many libraries are you linking against?
nh2: bgamari-: I have approximately L=450 and l=350. There is some notoriously slow tooling "in between" that I'm currently investigating, namely nix's ld-wrapper, which does the stat'ing in bash, where statting that many files easily takes 1 second, but I did a quick test with `xargs ls -1` on the "$DIR/lib${name}.a" expansion, and that too takes 1 second, so it might not be entirely its fault
hvr: ok, ~400 is quite a bit
bgamari-: sheesh
bgamari-: yes, that is... impressive
geekosaur: next they'll want this to work on macOS >.>
geekosaur: >.>
hvr: and then windows!
nh2: even my hobby projects have 300 recursive deps by now
nh2: bgamari-: my strace says that `ls` did the 160k stat()s in 0.7 seconds, which is half of the time that `gold` needs to link my binary
nh2: bgamari-: I have now strace'd `ld` itself, that does it slightly faster, 0.5 seconds, but this is just because it's my hobby project and has "only" that many deps; for a real project that'd be 2 seconds just stat()ing
bgamari-: nh2, sheesh
nh2: bgamari-: of course nix's ld wrapper, being written in bash, does the same thing but at least 5x slower
bgamari-: nh2, do you just add all of Hackage to your build-depends? ;)
bgamari-: nh2, instead of passing explicit paths I think I would rather just put all of the archives in LIBDIR
nh2: bgamari-: I blame it on my dependees ;) also, "Haskell has high-quality libraries and encourages code reuse" comes back at us here
bgamari-: although, admittedly, I haven't really thought through the consequences of this yet
bgamari-: this does sound like a worthwhile thing to fix though
bgamari-: that is a significant amount of time
nh2: bgamari-: personally I'd WAY prefer explicit args vs env vars because then I can see them in ps, I can easily replay the specific command myself to debug it etc
hvr: nh2: not every library on Hackage is "high-quality" ;-)
bgamari-: I had always assumed that our large linking times were due to, well, the "linking"
bgamari-: not just the stat'ing
bgamari-: nh2, fair enough, but I'd rather not have to add even more platform-dependent behavior to our interaction with the system toolchain
bgamari-: oh, wait
bgamari-: nh2, I didn't mean to pass via env var
bgamari-: nh2, I just meant pass one -L
bgamari-: nh2, where the static and dynamic libraries can both be
bgamari-: found
nh2: bgamari-: I don't know that, where can I read about that? Is it a flag?
bgamari-: nh2, -L?
bgamari-: it's just the usual --library-path flag
nh2: bgamari-: I got confused by LIBDIR in caps
bgamari-: ah
bgamari-: right, I should have been more specific
nh2: bgamari-: so what do you mean by that, somehow put all the libraries into a single dir?
bgamari-: nh2, treat static archives the same way we treat dynamic libraries
bgamari-: which is to place them in, e.g., $HOME/.cabal/lib/$ghc_version/
bgamari-: instead of where they are currently, $HOME/.cabal/lib/$ghc_version/$library_abi_name
bgamari-: then you can just pass -L$HOME/.cabal/lib/$ghc_version to ld
bgamari-: and a -l for each library
bgamari-: I think this is probably as much a Cabal change as it is a GHC change
nh2: bgamari-: that doesn't seem to go well with approaches that package separately, like debian, nix, cabal sandboxes and stack work dirs
hvr: which includes new-build
bgamari-: I'm not sure it's fundamentally incompatible
hvr: yeah, but it's a lot of work to change everyone :-)
bgamari-: to take the case of new-build, all that matters AFAIK is that the packages are registered in separate package dbs
bgamari-: there's nothing stopping you from throwing all of the libraries into one directory
hvr: bgamari-: well, I kinda exploit that new-build has nicely self-contained folders
hvr: this makes it easier to atomically move around artifacts
nh2: bgamari-: do you have a dislike for passing them explicitly as files beyond cross-platform worries? Because I really find explicit arguments the cleanest, easiest and most debuggable approach
hvr: as I can stage the new artifact, and then move it into place
hvr: atomically
hvr: by simply renaming/moving the folder which contains everything
bgamari-: nh2, I'm just currently really not sure that passing libraries by path is really equivalent to passing via -l
hvr: unfortunately filesystems don't offer many other facilities for atomic transactions
bgamari-: I would need to run some experiments
nh2: bgamari-: gcc's and ld's man pages suggest that to me, looking at their respective `-l`; we certainly should double check (or maybe simpler, ask the linker authors)
bgamari-: if we can really show that the linker treats explicit paths no differently then I'd be okay with it
nh2: suggests that they are equivalent)
bgamari-: but I'm rather wary
bgamari-: nh2, are you only suggesting this for static libraries?
geekosaur: it's supposed to be equivalent. but shared objects have the added issue of runtime discoverability (i.e. passing by full path at link time does not install a full path for runtime access)
geekosaur: except on macos or older aix where it's always a full path
nh2: bgamari-: no, at least the stat()-taking-time problem is the same problem for .so's (though I'm using static linking in many places and would appreciate an improvement even if limited to static linking)
bgamari-: nh2, the problem is that your approach doesn't help dynamic linkage at all
bgamari-: since the runtime linker still needs to search N paths
bgamari-: so I would argue that new-build, stack, etc. really should be putting libraries in the same directory regardless
bgamari-: more work or not
bgamari-: it's the only way to truly solve the problem
mpickering: Wonders if David saw this issue https://github.com/haskell/cabal/issues/4550
alanz: mpickering: what are you planning to use the source plugins for?
nh2: bgamari-: yeah you're right, the start-up cost of the executable will always be there for dynamic linking, I was purely talking from the compiler's perspective
bgamari-: nh2, my point was that I would prefer to solve the problem once and for all
bgamari-: rather than have another sharp edge that we then still need to solve
bgamari-: given that moving the libraries solves both problems, it feels like the right solution
mpickering: alanz: haskell-indexer was the plan
nh2: bgamari-: I agree that for dynamic linking that is the only solution. But I'm not sure there exists a fits-it-all solution across static and dynamic. At least for my work, I'd prefer to have static libs that can be anywhere, in nix-style immutable locations, and give up on dynamic linking (of Haskell libs) to get that. It seems that hvr's thoughts are going in the same direction on the cabal side, and lots of build system and OS movements these days seem to converge away from global shared dirs to dir-prefixes.
nh2: I also suspect that static linking will be even more attractive now that sensible -split-sections is in place, which can give smaller code sizes, which .so's can't deliver.
nh2: So I'm not sure that a big shift to the fits-it-all solution that satisfies dynamic linking needs would be the most practical or used; it'd certainly be nice to have it for dynamic linking, but I'm not sure all the downstreams can be forced to do it that way for non-static if it quite fundamentally dictates how they have to work. If they can, it would probably take pretty long.
}}}

--
Ticket URL: <http://ghc.haskell.org/trac/ghc/ticket/14031>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler