
On 17/02/10 07:37, Isaac Dupree wrote:
> On 02/16/10 20:13, Roman Leshchinskiy wrote:
>> On 15/02/2010, at 04:58, Don Stewart wrote:
>>> Do we have the blessing of the DPH team, wrt. tight, numeric inner loops?
>>
>> FWIW, I don't think we even use -fvia-C when benchmarking. In general, -fvia-C is a dead end wrt numeric performance because gcc just doesn't optimise well enough. So even if we generated code that gcc could optimise properly (which we don't atm), we still would be way behind highly optimising compilers like Intel's or Sun's. IMO, the LLVM backend is the way to go here.
>
> LLVM and GCC are open-source projects that are improving over time... is there any particular reason we expect GCC to have poor numeric performance forever?

The problem is not the quality of the code generator in gcc vs. LLVM; indeed, gcc is generally regarded as generating better code than LLVM right now, although LLVM is improving. The reason that using gcc is worse than LLVM for us is that when GHC uses gcc as a backend it generates C, whereas the LLVM backend generates code directly from GHC's internal C-- representation. Compiling via C is a tricky business that ultimately leads to not being able to generate as good code as you want(*).

It would be entirely possible to hook into gcc's backend directly from GHC as an alternative to the LLVM backend, though LLVM is really intended to be used this way and has a more polished API.

Even so, LLVM doesn't let us generate exactly the code we'd like: we can't use GHC's tables-next-to-code optimisation. Measurements done by David Terei, who built the LLVM backend, apparently show that this doesn't matter much (~3% slower IIRC), though I'm still surprised that all those extra indirections don't have more of an effect. I think we need to investigate this more closely. It's important because if the LLVM backend is to be a compile-time option, we have to either drop tables-next-to-code, or wait until LLVM supports generating code in that style.

(*) Though the main reason for this is the need to keep accurate GC information; if you're prepared to forgo that (as in JHC) then you can generate much more optimisable C code.

> [now I rehash why to remove -fvia-C anyway. Feel free to ignore me.]
> ...However, we think the native-code backends (and perhaps LLVM) will be good enough within the next few years to catch up with registerized via-C;

I should point out that for most Haskell programs, the NCG is already as fast as (in some cases faster than) via C. The benchmarks showing a difference are all of the small tight loop kind - which are important to some people, I don't dispute that, but I expect that most people wouldn't notice the difference.

Cheers,
Simon
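
For readers unsure what is meant by the "small tight loop kind" of benchmark, here is a minimal sketch of such a loop. It is not taken from the thread or from any benchmark suite; the name `sumSquares` and the loop itself are hypothetical, chosen only to show the shape of code (a strict accumulator over machine integers) where the choice of backend is most likely to be visible.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

-- Hypothetical example: sum of the squares of 1..n, written as a
-- tail-recursive loop with strict arguments so that, with optimisation
-- enabled, the counter and accumulator can stay unboxed.
sumSquares :: Int -> Int
sumSquares n = go 1 0
  where
    go :: Int -> Int -> Int
    go !i !acc
      | i > n     = acc                          -- loop finished
      | otherwise = go (i + 1) (acc + i * i)     -- next iteration

main :: IO ()
main = print (sumSquares 1000000)
```

Almost all of the run time of a program like this is spent in the few instructions of `go`, so register allocation and instruction selection in the backend dominate. In a typical Haskell program that time is spread across allocation, GC and library code instead, which would be consistent with the difference between backends being hard to notice there.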