 
#8905: Function arguments are always spilled/reloaded if scrutinee is already in WHNF
--------------------------------------------+------------------------------
        Reporter:  tibbe                    |        Owner:
            Type:  bug                      |       Status:  new
        Priority:  normal                   |    Milestone:
       Component:  Compiler                 |      Version:  7.9
      Resolution:                           |     Keywords:
Operating System:  Unknown/Multiple         | Architecture:
 Type of failure:  Runtime performance bug  |  Unknown/Multiple
       Test Case:                           |   Difficulty:  Unknown
        Blocking:                           |   Blocked By:
 Related Tickets:                           |
--------------------------------------------+------------------------------

Comment (by tibbe):

To provide some motivation, I've included the full Cmm of the '''common'''
path below. Every line that starts with `>` is what I would consider an
unnecessary spill/load for this path:

{{{
c2wQ: // stack check
    if ((Sp + -72) < SpLim) goto c2wR; else goto c2wS;
c2wS: // stack check success
>   I64[Sp - 40] = PicBaseReg + block_c2my_info; // return addr for eval
    R1 = R6; // t
>   I64[Sp - 32] = R2; // spill: h
>   I64[Sp - 24] = R3; // spill: k
>   P64[Sp - 16] = R4; // spill: x
>   I64[Sp - 8] = R5; // spill: s
    Sp = Sp - 40;
    if (R1 & 7 != 0) goto c2my; else goto c2mz; // eval check of t
c2my: // eval check succeeded
>   _s2b1::I64 = I64[Sp + 8]; // reload: h
>   _s2b2::I64 = I64[Sp + 16]; // reload: k
>   _s2b3::P64 = P64[Sp + 24]; // reload: x
>   _s2b4::I64 = I64[Sp + 32]; // reload: s
    switch [0 .. 4] (R1 & 7 - 1) {
        case 0 : goto c2wK;
        case 1 : goto c2wL;
        case 2 : goto c2wM;
        case 3 : goto c2wN;
        case 4 : goto c2wO;
    }
c2wN: // Full
    _s2cx::P64 = P64[R1 + 4]; // ary
    _s2cy::I64 = (_s2b1::I64 >> _s2b4::I64) & 15; // i
    _s2cC::P64 = P64[(_s2cx::P64 + 24) + (_s2cy::I64 << 3)]; // st
    I64[Sp] = PicBaseReg + block_c2nJ_info; // return addr
    R6 = _s2cC::P64; // arg: st
    R5 = _s2b4::I64 + 4; // arg: s + bitsPerSubkey
    R4 = _s2b3::P64; // arg: x
    R3 = _s2b2::I64; // arg: k
    R2 = _s2b1::I64; // arg: h
    P64[Sp + 8] = _s2cC::P64; // spill: st
    I64[Sp + 16] = _s2cy::I64; // spill: i
    P64[Sp + 24] = _s2cx::P64; // spill: ary
    P64[Sp + 32] = R1; // spill: t (only used in uncommon branch)
    call $wpoly_go_info(R6, R5, R4, R3, R2) returns to c2nJ, args: 8, res: 8, upd: 8;
c2nJ:
>   _s2b6::P64 = P64[Sp + 32]; // reload: t (only used in uncommon branch)
    _s2cx::P64 = P64[Sp + 24]; // reload: ary
    _s2cy::I64 = I64[Sp + 16]; // reload: i
    _s2cE::P64 = R1;
    _s2cF::I64 = R1 == P64[Sp + 8];
    if (_s2cF::I64 != 1) goto c2nR; else goto c2AE;
c2nR: // heap check
    Hp = Hp + 176;
    if (Hp > I64[BaseReg + 856]) goto c2AB; else goto c2AA;
c2AA: // heap check success
    I64[Hp - 168] = I64[PicBaseReg + stg_MUT_ARR_PTRS_DIRTY_info@GOTPCREL];
    I64[Hp - 160] = 16;
    I64[Hp - 152] = 17;
    _c2nT::I64 = Hp - 168;
    call MO_Memcpy(_c2nT::I64 + 24, _s2cx::P64 + 24, 128, 8);
    P64[(_c2nT::I64 + 24) + (_s2cy::I64 << 3)] = _s2cE::P64;
    I64[_c2nT::I64] = I64[PicBaseReg + stg_MUT_ARR_PTRS_DIRTY_info@GOTPCREL];
    I8[(_c2nT::I64 + 24) + ((I64[_c2nT::I64 + 8] << 3) + (_s2cy::I64 >> 7))] = 1 :: W8;
    I64[_c2nT::I64] = I64[PicBaseReg + stg_MUT_ARR_PTRS_FROZEN0_info@GOTPCREL];
    I64[Hp - 8] = PicBaseReg + Full_con_info;
    I64[Hp] = _c2nT::I64;
    R1 = Hp - 4;
    Sp = Sp + 40;
    call (P64[Sp])(R1) args: 8, res: 0, upd: 8; // return
}}}

The main body of this function, except for the recursive call, is in the
last block, `c2AA`. Quite a bit of time is spent on "administrative" things.

Also note that there are a bunch of static arguments passed around (`x`,
`k`, and `h`). I will try to see what the Cmm looks like if I manually
perform a static argument transformation on this code.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8905#comment:6
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler