
#9660: unnecessary indirect jump when returning a case scrutinee -------------------------------------+------------------------------------- Reporter: rwbarton | Owner: Type: bug | Status: new Priority: normal | Milestone: Component: Compiler | Version: 7.8.3 (CodeGen) | Operating System: Keywords: | Unknown/Multiple Architecture: Unknown/Multiple | Type of failure: Runtime Difficulty: Unknown | performance bug Blocked By: | Test Case: Related Tickets: | Blocking: | Differential Revisions: -------------------------------------+------------------------------------- I happened to be looking at the Cmm for this code (ghc 7.8.3, -O2) {{{#!hs f :: Int -> Int f x = if x < 0 then x else x+1 }}} and I noticed something a bit funny about it: {{{ c12e: if ((Sp + -8) < SpLim) goto c12z; else goto c12A; c12z: R2 = R2; R1 = Test.f_closure; call (stg_gc_fun)(R2, R1) args: 8, res: 0, upd: 8; c12A: I64[Sp - 8] = c12b; R1 = R2; Sp = Sp - 8; if (R1 & 7 != 0) goto c12b; else goto c12c; c12c: call (I64[R1])(R1) returns to c12b, args: 8, res: 8, upd: 8; c12b: Hp = Hp + 16; if (Hp > HpLim) goto c12y; else goto c12x; c12y: HpAlloc = 16; R1 = R1; call stg_gc_unpt_r1(R1) returns to c12b, args: 8, res: 8, upd: 8; c12x: _s11Q::I64 = I64[R1 + 7]; if (%MO_S_Lt_W64(_s11Q::I64, 0)) goto c12u; else goto c12v; c12u: Hp = Hp - 16; R1 = R1 & (-8); /* <--- */ Sp = Sp + 8; call (I64[R1])(R1) args: 8, res: 0, upd: 8; /* <--- */ c12v: I64[Hp - 8] = GHC.Types.I#_con_info; I64[Hp] = _s11Q::I64 + 1; R1 = Hp - 7; Sp = Sp + 8; call (P64[Sp])(R1) args: 8, res: 0, upd: 8; }}} On the two marked lines, we untag R1 (which is `x`) and enter it. However, we know at this point that `x` is already in WHNF so we could simply return it by replacing the two lines with `call (P64[Sp])(R1)`, if I'm not mistaken. That will save a load and an indirect jump (which we actually know is to `I#_con_info`, which would just retag R1 and return to the address on the stack anyways). I think the same optimization should be available any time we do an algebraic `case` and in a branch simply return the scrutinee. I looked at what it would take to fix this. It looks almost easy: if we add a new `LambdaFormInfo` constructor `LFUnknownCon` meaning that we know the identifier is bound to a saturated application of an unknown constructor, then we could set the `cg_lf` of the case binder variable of an algebraic case statement to `LFUnknownCon`, and return `ReturnIt` for `LFUnknownCon` variables in `getCallMethod`. I think that would do it. Does that sound right? Is there a better way? (In my original example we actually know the constructor has to be `I#`. But if the case was on a type with more than one constructor we wouldn't know statically which one we got, just that it has to be one of them.) -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/9660 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler