Re: Removing/deprecating -fvia-c

16 Feb 2010

      marlowsd:
...
I managed to improve this:
Main_mainzuzdszdwfold_info:
.Lc1lP:
        addq $32,%r12
        cmpq 144(%r13),%r12
        ja .Lc1lS
        movq %r14,%rax
        cmpq $1000000000,%rax
        jne .Lc1lV
        movq $ghczmprim_GHCziTypes_Dzh_con_info,-24(%r12)
        movsd %xmm6,-16(%r12)
        movq $ghczmprim_GHCziTypes_Dzh_con_info,-8(%r12)
        movsd %xmm5,(%r12)
        leaq -7(%r12),%rbx
        leaq -23(%r12),%r14
        jmp *(%rbp)
.Lc1lS:
        movq $32,184(%r13)
        movl $Main_mainzuzdszdwfold_closure,%ebx
        addq $-24,%rbp
        movsd %xmm5,(%rbp)
        movsd %xmm6,8(%rbp)
        movq %r14,16(%rbp)
        jmp *-8(%r13)
.Lc1lV:
        addsd .Ln1m2(%rip),%xmm5
        addsd .Ln1m3(%rip),%xmm6
        leaq 1(%rax),%r14
        addq $-32,%r12
        jmp Main_mainzuzdszdwfold_info
from 9 instructions in the last block down to 5 (one instruction fewer
than gcc).  I haven't commoned up the two constant 1's though, that'd
mean doing some CSE.

On my machine with GHC HEAD and gcc 4.3.0, the gcc version runs in 2.0s,
with the NCG at 2.3s.  I put the difference down to a bit of instruction
scheduling done by gcc, and that extra constant load.

But let's face it, all of this code is crappy.  It should be a tiny
little loop rather than a tail-call with argument passing, and that's
what we'll get with the new backend (eventually).  LLVM probably won't
turn it into a loop on its own, that needs to be done before the code
gets passed to LLVM.

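To make the tail-call-versus-loop point concrete, here is a rough C sketch of the two shapes. It is only an illustration read off the listing above, not the program from this thread: the names (pair, fold_tailcall, fold_loop), the limit parameter and the use of 1.0 for both increments (taken from the remark about "the two constant 1's") are all assumptions.

```c
#include <assert.h>

/* Hypothetical result type: the two doubles the listing boxes up
   before it returns. */
typedef struct { double x; double y; } pair;

/* Shape 1: roughly what the listing does. Each step re-enters the
   function with the counter and both accumulators passed as
   arguments, like the jmp back to the entry label. */
pair fold_tailcall(long n, double x, double y, long limit) {
    assert(n <= limit);            /* otherwise n never reaches limit */
    if (n == limit) {
        pair r = { x, y };
        return r;
    }
    /* Both increments are assumed to be 1.0. */
    return fold_tailcall(n + 1, x + 1.0, y + 1.0, limit);
}

/* Shape 2: the "tiny little loop". Same arithmetic, but the state
   stays in locals and there is no re-entry and no argument shuffling. */
pair fold_loop(long n, double x, double y, long limit) {
    assert(n <= limit);
    while (n != limit) {
        x += 1.0;
        y += 1.0;
        n += 1;
    }
    pair r = { x, y };
    return r;
}
```

Both functions compute the same result; with n = 0, x = y = 0.0 and limit = 1000 each returns x = y = 1000.0. Keep the limit small when trying the first form without optimisation, since every step then uses a C stack frame, which the listing above does not.
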
Agreed. Ideally the new backend would be (starting to be?) usable about
the time -fvia-C dies? Otherwise there's always going to be something
that gcc spots that the current codegen won't.

Then again, killing perl from the ghc toolchain, and having a
funeral/dancing on its grave, would be satisfying in itself :-)
...
Have you looked at this example on x86?  It's *far* worse and runs about
5 times slower.

x86 scares me.. :)

Don Stewart