[GHC] #9387: LLVM backend: redundant code for functions calling themselves

#9387: LLVM backend: redundant code for functions calling themselves -----------------------------------+--------------------------------------- Reporter: jmoy | Owner: Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler (LLVM) | Version: 7.6.3 Keywords: | Operating System: Linux Architecture: x86_64 (amd64) | Type of failure: None/Unknown Difficulty: Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Revisions: -----------------------------------+--------------------------------------- It seems that the LLVM backed does not recognize when a function calls itself. As a result it generates redundant load and store instructions. I am compiling the following simple recursive Fibonacci function {{{ fib::Int->Int fib n = loop n 0 1 where loop !n !a !b = if n==1 then a else loop (n-1) b (a+b) }}} I am attaching the files produced by {{{ ghc -c -O3 -fllvm -ddump-stg -ddump-cmm -ddump-llvm -ddump-to-file Fib.hs }}} I rename the generated LLVM file with a `.ll` extension and run `llc -O3` on it to generate the attached `.s` file (LLVM version 3.4.2). The actual work gets done in `Fib_zdwloop_info`. It begins as: {{{ Fib_zdwloop_info: # @Fib_zdwloop_info # BB#0: # %css pushq %rax movq %r13, (%rsp) movq %rbp, -8(%rsp) movq %r12, -16(%rsp) movq %rbx, -24(%rsp) movq %r14, -32(%rsp) movq %rsi, -40(%rsp) movq %rdi, -48(%rsp) movq %r8, -56(%rsp) movq %r9, -64(%rsp) movq %r15, -72(%rsp) movss %xmm1, -76(%rsp) movss %xmm2, -80(%rsp) movss %xmm3, -84(%rsp) movss %xmm4, -88(%rsp) movsd %xmm5, -96(%rsp) movsd %xmm6, -104(%rsp) }}} Later on when the function makes a recursive call to itself the code generated is {{{ movq -8(%rsp), %rbp movq -16(%rsp), %r12 movq -24(%rsp), %rbx movq -32(%rsp), %r14 movq -40(%rsp), %rsi movq -72(%rsp), %r15 popq %rax jmp Fib_zdwloop_info # TAILCALL }}} Given that the values are being loaded from the stack to the registers in the second excerpt, it is redundant to re-store the same values into the stack at the entry of the function. It seems to me that this is happening because LLVM uses a general function call instruction to translate a function calling itself. I was wondering if it is possible to recognize this special case in the LLVM backed or earlier so that instead of a function call we can translate this as the LLVM level as a jump to some point after the function entry. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9387 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9387: LLVM backend: redundant code for functions calling themselves -------------------------------------+------------------------------------- Reporter: jmoy | Owner: Type: feature | Status: new request | Milestone: Priority: normal | Version: 7.6.3 Component: Compiler | Keywords: (LLVM) | Architecture: x86_64 (amd64) Resolution: | Difficulty: Unknown Operating System: Linux | Blocked By: Type of failure: | Related Tickets: None/Unknown | Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Description changed by jmoy: Old description:
It seems that the LLVM backed does not recognize when a function calls itself. As a result it generates redundant load and store instructions.
I am compiling the following simple recursive Fibonacci function
{{{ fib::Int->Int fib n = loop n 0 1 where loop !n !a !b = if n==1 then a else loop (n-1) b (a+b) }}}
I am attaching the files produced by
{{{ ghc -c -O3 -fllvm -ddump-stg -ddump-cmm -ddump-llvm -ddump-to-file Fib.hs }}}
I rename the generated LLVM file with a `.ll` extension and run `llc -O3` on it to generate the attached `.s` file (LLVM version 3.4.2).
The actual work gets done in `Fib_zdwloop_info`. It begins as:
{{{ Fib_zdwloop_info: # @Fib_zdwloop_info # BB#0: # %css pushq %rax movq %r13, (%rsp) movq %rbp, -8(%rsp) movq %r12, -16(%rsp) movq %rbx, -24(%rsp) movq %r14, -32(%rsp) movq %rsi, -40(%rsp) movq %rdi, -48(%rsp) movq %r8, -56(%rsp) movq %r9, -64(%rsp) movq %r15, -72(%rsp) movss %xmm1, -76(%rsp) movss %xmm2, -80(%rsp) movss %xmm3, -84(%rsp) movss %xmm4, -88(%rsp) movsd %xmm5, -96(%rsp) movsd %xmm6, -104(%rsp) }}}
Later on when the function makes a recursive call to itself the code generated is
{{{ movq -8(%rsp), %rbp movq -16(%rsp), %r12 movq -24(%rsp), %rbx movq -32(%rsp), %r14 movq -40(%rsp), %rsi movq -72(%rsp), %r15 popq %rax jmp Fib_zdwloop_info # TAILCALL }}}
Given that the values are being loaded from the stack to the registers in the second excerpt, it is redundant to re-store the same values into the stack at the entry of the function.
It seems to me that this is happening because LLVM uses a general function call instruction to translate a function calling itself. I was wondering if it is possible to recognize this special case in the LLVM backed or earlier so that instead of a function call we can translate this as the LLVM level as a jump to some point after the function entry.
New description: It seems that the LLVM backed does not recognize when a function calls itself. As a result it generates redundant load and store instructions. I am compiling the following simple recursive Fibonacci function {{{ fib::Int->Int fib n = loop n 0 1 where loop !n !a !b = if n==1 then a else loop (n-1) b (a+b) }}} I am attaching the files produced by {{{ ghc -c -O3 -fllvm -ddump-stg -ddump-cmm -ddump-llvm -ddump-to-file Fib.hs }}} I rename the generated LLVM file with a `.ll` extension and run `llc -O3` on it to generate the attached `.s` file (LLVM version 3.4.2). The actual work gets done in `Fib_zdwloop_info`. It begins as: {{{ Fib_zdwloop_info: # @Fib_zdwloop_info # BB#0: # %css pushq %rax movq %r13, (%rsp) movq %rbp, -8(%rsp) movq %r12, -16(%rsp) movq %rbx, -24(%rsp) movq %r14, -32(%rsp) movq %rsi, -40(%rsp) movq %rdi, -48(%rsp) movq %r8, -56(%rsp) movq %r9, -64(%rsp) movq %r15, -72(%rsp) movss %xmm1, -76(%rsp) movss %xmm2, -80(%rsp) movss %xmm3, -84(%rsp) movss %xmm4, -88(%rsp) movsd %xmm5, -96(%rsp) movsd %xmm6, -104(%rsp) }}} Later on when the function makes a recursive call to itself the code generated is {{{ movq -8(%rsp), %rbp movq -16(%rsp), %r12 movq -24(%rsp), %rbx movq -32(%rsp), %r14 movq -40(%rsp), %rsi movq -72(%rsp), %r15 popq %rax jmp Fib_zdwloop_info # TAILCALL }}} Given that the values are being loaded from the stack to the registers in the second excerpt, it is redundant to re-store the same values into the stack at the entry of the function. It seems to me that this is happening because LLVM uses a general function call instruction to translate a function calling itself. I was wondering if it is possible to recognize this special case in the LLVM backed or earlier so that instead of a function call we can translate this at the LLVM level as a jump to some point after the function entry. -- -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9387#comment:1 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9387: LLVM backend: redundant code for functions calling themselves -------------------------------------+------------------------------------- Reporter: jmoy | Owner: Type: feature | Status: new request | Milestone: Priority: normal | Version: 7.6.3 Component: Compiler | Keywords: (LLVM) | Architecture: x86_64 (amd64) Resolution: | Difficulty: Unknown Operating System: Linux | Blocked By: Type of failure: | Related Tickets: None/Unknown | Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by jmoy): * cc: jyotirmoy@… (added) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9387#comment:2 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9387: LLVM backend: redundant code for functions calling themselves -------------------------------------+------------------------------------- Reporter: jmoy | Owner: Type: feature | Status: new request | Milestone: Priority: normal | Version: 7.6.3 Component: Compiler | Keywords: (LLVM) | Architecture: x86_64 (amd64) Resolution: | Difficulty: Unknown Operating System: Linux | Blocked By: Type of failure: | Related Tickets: None/Unknown | Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Comment (by carter): on the mailing list, its mentioned that if opt is invoked before llc, this performance issue goes away. http://www.haskell.org/pipermail/haskell-cafe/2014-July/115325.html -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9387#comment:3 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler

#9387: LLVM backend: redundant code for functions calling themselves -------------------------------------+------------------------------------- Reporter: jmoy | Owner: Type: feature | Status: closed request | Milestone: Priority: normal | Version: 7.6.3 Component: Compiler | Keywords: (LLVM) | Architecture: x86_64 (amd64) Resolution: invalid | Difficulty: Unknown Operating System: Linux | Blocked By: Type of failure: | Related Tickets: None/Unknown | Test Case: | Blocking: | Differential Revisions: | -------------------------------------+------------------------------------- Changes (by rwbarton): * status: new => closed * resolution: => invalid Comment: [http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-June/051355.html This mailing list thread] has more information on `opt` vs `llc`, with a similar example. Use `ghc -v` to see the steps GHC goes through to compile your file. It runs `opt` before `llc`, and you can see from inspecting `Fib.o` that the stack spilling is not present. (You can also inspect the actual intermediate files with `ghc -v -keep-tmp-files`.) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9387#comment:4 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler
participants (1)
-
GHC