Dear All,

I am new to Haskell, so please forgive me if I am asking about something already well understood.

I was trying to understand the performance of my Haskell program compiled with the LLVM backend. I used -ddump-llvm to dump the LLVM assembly and then ran llc -O3 on the resulting file to look at the native assembly.

One of the generated functions starts off with:

s5BH_info:                              # @s5BH_info
# BB#0:
        subq    $208, %rsp
        movq    %r13, 200(%rsp)
        movq    %rbp, 192(%rsp)
        movq    %r12, 184(%rsp)
        movq    %rbx, 176(%rsp)
        movq    %r14, 168(%rsp)
        movq    %rsi, 160(%rsp)
        movq    %rdi, 152(%rsp)
        movq    %r8, 144(%rsp)
        movq    %r9, 136(%rsp)
        movq    %r15, 128(%rsp)
        movss   %xmm1, 124(%rsp)
        movss   %xmm2, 120(%rsp)
        movss   %xmm3, 116(%rsp)
        movss   %xmm4, 112(%rsp)
        movsd   %xmm5, 104(%rsp)
        movsd   %xmm6, 96(%rsp)

At some point down the line the function makes a tail call to itself, and this is the code generated:

        movq    %r14, 168(%rsp)
        movq    200(%rsp), %r13
        movq    192(%rsp), %rbp
        movq    184(%rsp), %r12
        movq    176(%rsp), %rbx
        movq    128(%rsp), %r15
        movsd   104(%rsp), %xmm5
        addq    $208, %rsp
        jmp     s5BH_info

So it looks like some values are being moved from the stack to registers just before the tail call, only to be immediately moved from the registers back to the stack on entry to the function. It should be possible to eliminate both the loads and the stores.

Is this behaviour due to LLVM or GHC? If it is GHC, is this an optimization a newcomer can attempt to implement, or are there deep issues here?

Jyotirmoy Bhattacharya