[GHC] #14359: C-- pipeline/NCG fails to optimize simple repeated addition

#14359: C-- pipeline/NCG fails to optimize simple repeated addition
-------------------------------------+-------------------------------------
Reporter: bgamari | Owner: (none)
Type: bug | Status: new
Priority: low | Milestone:
Component: Compiler | Version: 8.2.1
Keywords: | Operating System: Unknown/Multiple
Architecture: | Type of failure: Runtime
Unknown/Multiple | performance bug
Test Case: | Blocked By:
Blocking: | Related Tickets:
Differential Rev(s): | Wiki Page:
-------------------------------------+-------------------------------------
While debugging #14346 I noticed some rather abhorrent code in a
disassembly of the `newPinnedByteArray#` primop:
{{{
Dump of assembler code for function stg_newPinnedByteArrayzh:
0x00000000004a8518 <+0>: mov 0x378(%r13),%rax
0x00000000004a851f <+7>: cmpq $0x0,0x10(%rax)
0x00000000004a8524 <+12>: je 0x4a8593

Description changed by bgamari:

Old description:
While debugging #14346 I noticed some rather abhorrent code in a
disassembly of the `newPinnedByteArray#` primop:
{{{
Dump of assembler code for function stg_newPinnedByteArrayzh:
0x00000000004a8518 <+0>: mov 0x378(%r13),%rax
0x00000000004a851f <+7>: cmpq $0x0,0x10(%rax)
0x00000000004a8524 <+12>: je 0x4a8593
0x00000000004a8526 <+14>: mov 0x4f5730,%rax
0x00000000004a852e <+22>: mov 0x38(%rax),%rax
0x00000000004a8532 <+26>: cmp 0x4f5718,%rax
0x00000000004a853a <+34>: jae 0x4a8593
0x00000000004a853c <+36>: mov %rbx,%rax
0x00000000004a853f <+39>: lea 0x7(%rax),%rcx
0x00000000004a8543 <+43>: shr $0x3,%rcx
0x00000000004a8547 <+47>: add $0x10,%rax <--- starts here
0x00000000004a854b <+51>: add $0xf,%rax
0x00000000004a854f <+55>: add $0x7,%rax
0x00000000004a8553 <+59>: shr $0x3,%rax
0x00000000004a8557 <+63>: mov $0x49d820,%ecx
}}}
That is three successive `add` instructions; surely those should be
collapsed into one by the Cmm-to-Cmm pipeline.
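To illustrate the kind of folding the description asks for, here is a minimal Python sketch of a reassociating constant folder over a toy expression type. The names `Reg`, `Lit`, `Add` and `fold_adds` are hypothetical; they are not GHC's Cmm data types or its actual optimisation pass. The sketch only shows that `((rax + 0x10) + 0xf) + 0x7` can be rewritten as a single addition of `0x26` (38).

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical mini-IR for illustration only; this is not GHC's Cmm AST.


@dataclass(frozen=True)
class Reg:
    name: str


@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Reg, Lit, Add]

MASK64 = (1 << 64) - 1  # model 64-bit wrap-around arithmetic


def fold_adds(e: Expr) -> Expr:
    """Rewrite (x + c1) + c2 into x + (c1 + c2), bottom-up."""
    if not isinstance(e, Add):
        return e
    left = fold_adds(e.left)
    right = fold_adds(e.right)
    # Two literals: evaluate the addition outright.
    if isinstance(left, Lit) and isinstance(right, Lit):
        return Lit((left.value + right.value) & MASK64)
    # (x + c1) + c2: merge the two constants into one.
    if isinstance(left, Add) and isinstance(left.right, Lit) and isinstance(right, Lit):
        return Add(left.left, Lit((left.right.value + right.value) & MASK64))
    return Add(left, right)


# The three successive adds from the disassembly, as one nested expression.
chain = Add(Add(Add(Reg("rax"), Lit(0x10)), Lit(0xF)), Lit(0x7))
folded = fold_adds(chain)
print(folded)
```

A real pass would also have to handle commuted operands, subtraction and differing operand widths; the sketch deliberately ignores those cases.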
New description:
While debugging #14346 I noticed some rather abhorrent code in a
disassembly of the `newPinnedByteArray#` primop:
{{{
Dump of assembler code for function stg_newPinnedByteArrayzh:
0x00000000004a8518 <+0>: mov 0x378(%r13),%rax
0x00000000004a851f <+7>: cmpq $0x0,0x10(%rax)
0x00000000004a8524 <+12>: je 0x4a8593

Comment (by alexbiehl):

Actually compiling `PrimOps.cmm` with `-O` already results in the desired
constant folding:
{{{
$ ghc -ddump-asm -c rts/PrimOps.cmm | less
...
stg_newPinnedByteArrayzh:
_cv:
    movq 888(%r13),%rax
    cmpq $0,16(%rax)
    je _cl
_cn:
    movq g0@GOTPCREL(%rip),%rax
    movq (%rax),%rax
    movq 56(%rax),%rax
    movq large_alloc_lim@GOTPCREL(%rip),%rcx
    cmpq (%rcx),%rax
    jae _cl
_co:
    subq $8,%rsp
    leaq -24(%r13),%rax
    leaq 38(%rbx),%rsi <- see here
    shrq $3,%rsi
    movq %rax,%rdi
    xorl %eax,%eax
    call allocatePinned
    addq $8,%rsp
    testq %rax,%rax
...
}}}

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14359#comment:2
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler
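As a sanity check on the folded constant: 0x10 + 0xf + 0x7 = 0x26 = 38, which matches the `leaq 38(%rbx),%rsi` in the optimized output. The sketch below emulates both instruction sequences on 64-bit values, on the assumption that the three-`add` chain in the original disassembly and the single `leaq` compute the same quantity from `%rbx`. The function names `unfolded` and `folded` are made up for this illustration.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def unfolded(rbx: int) -> int:
    """Emulate: mov %rbx,%rax; add $0x10; add $0xf; add $0x7; shr $0x3."""
    rax = rbx & MASK64
    rax = (rax + 0x10) & MASK64
    rax = (rax + 0xF) & MASK64
    rax = (rax + 0x7) & MASK64
    return rax >> 3


def folded(rbx: int) -> int:
    """Emulate: leaq 38(%rbx),%rsi; shrq $3,%rsi."""
    rsi = (rbx + 38) & MASK64
    return rsi >> 3


# Includes values near 2**64 to exercise wrap-around.
samples = [0, 1, 7, 8, 10, 4096, MASK64 - 37, MASK64]
assert 0x10 + 0xF + 0x7 == 38
assert all(unfolded(n) == folded(n) for n in samples)
print("both sequences agree on", len(samples), "samples")
```

The two agree for every input because addition modulo 2^64 is associative, so merging the constants cannot change the result.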

Comment (by bgamari):

Well, that is a relief. I guess this might just be an artifact of the
fact that I was using a `validate` build. I'll have to check this.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14359#comment:3