 
            Sergej Jaskiewicz via ghc-devs 
Hello everyone,
Recently I've been exploring ways to improve the LLVM support in GHC.
While doing that, I've encountered an issue that I think is worth looking into. Namely, the compilation times with LLVM are considerably longer than with the native code generator. To be more concrete, on my machine I've managed to build the whole toolchain in just half an hour with NCG, but the same build with LLVM-powered GHC took several hours.
Indeed this is reflective of my experience as well. This is indeed a major problem for architectures like ARM which are only supported via the LLVM backend.
If we want to improve compilation times, we should be able to measure it first. As far as I understand, the only way for profiling the compiler pipeline today is building GHC with special flags: https://gitlab.haskell.org/ghc/ghc/wikis/debugging/profiling-ghc
This page uses "profiling" in the sense of "cost-center profiler". However, as you point out, many performance analysis tasks do not require the cost-center profiler.
This has three important downsides: * We need build the whole compiler ourselves. If I'm just a user interested in why my program is so slow to compile and how I should fix it, this is too much for me to ask to do. * There's just too much irrelevant data in the output. For example, it doesn't tell me which part of my Haskell code is taking so long to compile. It would be nice if the report contained all the metadata that I need to go and fix my code. * If we're using LLVM, we're pretty much out of luck, because we're using an existing LLVM binary which is likely not built for profiling.
There have been the same problem with the Clang compiler until recently. In Clang 9, which was released in the last year, there's a new flag '-ftime-trace' that produces a beautiful flame chart with all the metadata needed to identify bottlenecks. This flag tells Clang to produce a JSON file that can be visualized in Chrome's chrome://tracing page, or on https://www.speedscope.app. This is described in more details here: https://aras-p.info/blog/2019/01/16/time-trace-timeline-flame-chart-profiler...
I'm thinking about implementing similar functionality in GHC. One option would be to utilize the plugin infrastructure.
Maybe I'm missing something and there are better options?
We already have such a flag: -ddump-timings [1]. Moreover, the timings are also emitted in GHC's eventlog if you link the `ghc` executable with `-eventlog`. Alp wrote a blog post [2] describing his experiences using all of this a few months ago. I have also had some amount of luck using the Linux `perf` tool to profile GHC. Cheers, - Ben [1] https://ghc.gitlab.haskell.org/ghc/doc/users_guide/debugging.html#ghc-flag--... [2] https://www.haskell.org/ghc/blog/20190924-eventful-ghc.html