Re: [GHC] #10074: Implement the 'Improved LLVM Backend' proposal

1 May 2017

      #10074: Implement the 'Improved LLVM Backend' proposal
-------------------------------------+-------------------------------------
        Reporter:  thoughtpolice     |                Owner:  angerman
            Type:  task              |               Status:  new
        Priority:  high              |            Milestone:  8.4.1
       Component:  Compiler (LLVM)   |              Version:
      Resolution:                    |             Keywords:  llvm, codegen
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:  #11295, #12470    |  Differential Rev(s):  Phab:D530
       Wiki Page:                    |
  wiki:ImprovedLLVMBackend           |
-------------------------------------+-------------------------------------

Comment (by kavon):
...
Yet again I'd like to stress the point that someone would need to
 seriously take ownership of the opt/llc code, ensure that all the hacks
 are still necessary, that opt and llc flags match up, and ensure that they
 work with new LLVM versions.
I'll take ownership of all of these things!
...
As the Improved LLVM Proposal was about bundling LLVM with GHC, to have
 better control over the LLVM backend, bundling clang (or, if we really
 must, opt+llc) looks to me like the way to go.
Bundling a customized clang is '''not''' the way to go. Clang is just the
 C/C++ frontend for LLVM, and we lose control over LLVM by trying to access
 it through clang... how will we use any of our customizations? If we want
 to write our own IR optimization/analysis passes, we would end up exposing
 the flag to run it via opt... hacking up clang to access the pass is much
 harder!

 I also would like to stress that we cannot rely on the system's version of
 clang. For example, on OS X the default clang is built against Apple's
 LLVM, whose source code is unknown. The opt/llc obtained by package
 managers are always the open-source version.
...
Regarding LTO, I believe we can do LTO at the bitcode level with llvm-link
 and opt.

 Yes, I'm quite certain that's all you do, and I'd be willing to look into
 this. It really shouldn't be too difficult.
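
 As a rough illustration of the bitcode-level LTO discussed here, the
 following is a minimal shell sketch, not a tested GHC integration. The file
 names (`A.bc`, `B.bc`, `whole.bc`, `prog.s`) and the `run`/`lto_link`
 helpers are hypothetical, and the `-O2` levels are an arbitrary choice. A
 real setup would probably also need to internalize symbols before
 optimizing, which is left out. With `DRY_RUN=1` the commands are only
 printed, so the script runs even without LLVM installed.

```shell
#!/bin/sh
# run CMD...: print the command, then execute it unless DRY_RUN=1.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# lto_link OUT.s IN.bc...: merge the modules, optimize the merged module
# once, and emit native assembly. Intermediate file names are hypothetical.
lto_link() {
  asm_out=$1
  shift
  run llvm-link "$@" -o whole.bc          # merge all bitcode modules into one
  run opt -O2 whole.bc -o whole.opt.bc    # optimize across module boundaries
  run llc -O2 whole.opt.bc -o "$asm_out"  # generate native assembly
}

# Dry run: only show what would be executed.
DRY_RUN=1
lto_link prog.s A.bc B.bc
```

 Unsetting `DRY_RUN` makes the same function execute the three tools in
 order, assuming they are on the `PATH`.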

 ---

 Overall, I still don't see good motivation for moving to opt/clang instead
 of opt/llc, other than an attempt to reduce compilation time.  If I missed
 something in the prior discussion please forgive me.

 If compilation time with LLVM is very important, I think there are more
 profitable ways of reducing it than using clang:

 Here are the timings I'm seeing on a GHC-produced 2.6MB LLVM IR file (the
 Move module from the mate benchmark) with a Debug build of LLVM 5 with
 assertions on, so these times ''are'' inflated in unknown ways:

 `opt -time-passes -O1 Move.ll | llc -O1 -time-passes -o blah.s`

 1.08 seconds were spent by `opt` parsing the textual LLVM IR we generated.
 `opt` spent 0.19 seconds emitting bitcode, and `llc` spent 0.16 seconds
 parsing it, so 0.35 seconds between `opt` and `llc`.

 The total time spent to complete that pipeline was 18.15 seconds, so about
 2% of this time is due to bitcode serialization between `opt` and `llc`,
 whereas about 6% is due to parsing textual LLVM IR from GHC. This doesn't
 even include the time GHC itself spends emitting the textual IR!
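
 The percentages can be rechecked from the quoted timings with a few lines
 of shell and awk. The variable names and the `pct` helper are made up for
 this sketch; the numbers are the ones reported above.

```shell
#!/bin/sh
# Timings quoted in this comment, in seconds.
total=18.15     # whole `opt | llc` pipeline
parse_ll=1.08   # opt parsing the textual IR
emit_bc=0.19    # opt emitting bitcode
parse_bc=0.16   # llc parsing that bitcode

# pct PART TOTAL: print PART as a percentage of TOTAL, one decimal place.
pct() {
  awk -v part="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * part / total }'
}

# Time spent handing bitcode from opt to llc.
handoff=$(awk -v a="$emit_bc" -v b="$parse_bc" 'BEGIN { printf "%.2f\n", a + b }')

echo "bitcode handoff:  ${handoff}s, $(pct "$handoff" "$total")% of total"
echo "textual IR parse: ${parse_ll}s, $(pct "$parse_ll" "$total")% of total"
```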

 Thus, to reduce compile times, I think it would make more sense to either
 generate LLVM bitcode, or switch to using Haskell bindings for LLVM to
 access the API directly. Neither of these is a small task, but I think
 they're better for us in the long run.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/10074#comment:29
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler