
Austin Seipp
Joachim, thanks for the forward and discussion.
Just to rehash two points for the people reading at home:
- I *do not* want to ship GHC-specific patches to LLVM in the builds we use, any more than anyone else does. I don't have any plans or even patches I would apply right now. A stock LLVM is ideal - one that's just been picked to work well by us. Even if it has some bugs or workarounds are needed, that's probably OK.
I agree. Shipping a known-stable LLVM is one thing; patching our own LLVM is quite another. Patching LLVM should be avoided if at all possible. Thankfully LLVM is quite modular so this shouldn't be so difficult. [snip]
I'd love to work more with LLVM upstream to fix problems... but the time to do so is pretty limited for most of us, I think, and the current backend has real issues in the design that cooperation just can't fundamentally fix - cooperation can't fix the fact a new release may change IR semantics and break existing GHC releases, for example. Users will simply suffer from that. And some of those changes may not be totally trivial to accommodate (as Ben's recent work shows).
While this is technically true, I wonder whether IR changes will be a persistent problem going forward. I don't have a deep knowledge of the history of the LLVM IR, but my impression is that the maintainers are fairly deliberate in their consideration of semantic changes (despite the arguable `symbol_offset` and `prefix_data` misstep in 3.5). The alias change was an unexpected turn, but frankly our previous use of aliases was a bit odd and was never supposed to work in the first place (something is amiss when you are relying on the optimizer to elide aliases to produce valid code). The alias rework and (hopefully) upcoming TNTC rework make me optimistic that our use of LLVM is moving closer to how the interfaces are designed to be used. Hopefully this will be accompanied by a corresponding improvement in maintainability.

There are other reasons besides IR instability that we might want to distribute our own LLVM. These might include:

 * Decoupling GHC from changes in LLVM's optimization passes
 * Wanting to ship our own optimization passes that need to link against LLVM
 * Wanting to use a library like llvm-general in GHC
 * Wanting to leverage related libraries such as Polly

I'm not sure how these weigh against the maintenance and packaging costs of shipping our own LLVM.

I've not seen evidence that changes in LLVM's optimizer have hurt us in the past; then again, if we are going to be more selective about which optimizations we ask LLVM to perform, perhaps this will become a bigger concern in the future.

Shipping our own passes would be great and is probably necessary to continue to improve performance. It's not clear whether LLVM's analysis pass interface is stable enough to facilitate this without shipping our own LLVM. Max's analysis is now three years old; I'll try dusting off the code and see how bad the damage is.

It's not clear to me that we really want to add a dependency on another library to GHC.
Being able to leverage things like Polly sounds tempting, but adding another moving part to LlvmGen will likely incur a maintenance cost. Moreover, the fact that there still isn't an llvm-general release targeting LLVM 3.5 is a bit worrying.

As a quick note, I spoke briefly with a few Rustaceans and they report that they were hoping to ultimately avoid shipping an LLVM with rustc. At this point Rust doesn't have an active packaging effort, so perhaps the rustc precedent isn't as useful as I originally thought.

Cheers,

- Ben