
Niklas Hambüchen
I have some suggestions for low-hanging fruit in this effort.
1. Make ghc print more statistics on what it is spending time on
When I did the linking investigation recently (https://www.reddit.com/r/haskell/comments/63y43y/liked_linking_3x_faster_wit...) I noticed (with strace) that there are lots of interesting syscalls being made that you might not expect. For example, each time TH is used, shared libraries are loaded, and to determine the shared library paths, ghc shells out to `gcc --print-file-name`. Each such invocation takes 20 ms on my system, and I have 1000 invocations in my build. That's 20 seconds (out of 2 minutes build time) just asking gcc for paths.
I recommend that for every call to an external program, GHC measures how long that call took, so that it can be asked to print a summary when it's done.
That might give us lots of interesting things to optimize. For example, this would have made the long linker times totally obvious.
For what it's worth, some of us have recognized for quite some time that BFD ld is a slow spot in the compilation pipeline. However, up until now we have considered this to be more of an issue to be handled at the packaging level than a GHC issue. Currently, GHC relies on `gcc` to link, and gcc has its own idea of the default linker. Many users (e.g. Debian packagers, users who are cross compiling) are quite unhappy when GHC coerces the system toolchain into changing its default behavior. Consequently we have maintained the status quo and silently wished that Linux distributions would start moving to gold by default. Sadly there continues to be little progress on this matter.

However, given that so many users are now relying on GHC binary distributions and that BFD ld is an increasingly significant issue as dependency counts increase, I think the status quo is increasingly untenable. I have proposed (#13541) that we introduce configure logic to use gold when available. There are outstanding questions, however:

* What interface (i.e. configure flags) do we expose to the user to allow them to turn this logic on or off?
* Do we want to enable the logic by default in binary distributions to ensure that users see the benefits of gold if it is available?
* Do we want to enable it in source distributions as well? If yes, then developers may be caught by surprise; if no, then there will be an inconsistency between the default configuration of the source tree and binary distributions (although there is plenty of precedent for this).

Anyways, let me know if you have thoughts on any of this. Hopefully we'll be able to get this done for 8.4.

Cheers,

- Ben