Re: [GHC] #8590: Reduce code size of CAFs

6 Dec 2013

      #8590: Reduce code size of CAFs
-------------------------------------+------------------------------------
        Reporter:  parcs             |            Owner:  parcs
            Type:  feature request   |           Status:  closed
        Priority:  normal            |        Milestone:
       Component:  Compiler (NCG)    |          Version:  7.7
      Resolution:  fixed             |         Keywords:
Operating System:  Unknown/Multiple  |     Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown      |       Difficulty:  Unknown
       Test Case:                    |       Blocked By:
        Blocking:                    |  Related Tickets:
-------------------------------------+------------------------------------

Comment (by parcs):

 I am attempting to further reduce the code size of CAFs by moving the CAF-
 updating code out from each CAF and into a shared "CAF" info-table. For
 example, consider the module

 {{{
 #!haskell
 module CAF where
 a = "test"
 }}}

 Currently, the Cmm outputted for this module looks like

 {{{
 #!c
 [section "data" {
      CAF.a_closure:
          const CAF.a_info;
          const 0;
          const 0;
          const 0;
  },
  section "readonly" {
      cCM_str:
          I8[] [116,101,115,116]
  },
  CAF.a_entry() //  [R1]
          { info_tbl: [(cCN,
                        label: CAF.a_info
                        rep:HeapRep static { Thunk })]
            stack_info: arg_space: 8 updfr_space: Just 8
          }
      {offset
        cCN:
            if ((Sp + -16) < SpLim) goto cCO; else goto cCP;
        cCO:
            R1 = R1;
            call (stg_gc_enter_1)(R1) args: 8, res: 0, upd: 8;
        cCP:
            (_cCJ::I64) = call "ccall" arg hints:  [PtrHint,
                                                    PtrHint]  result hints:
 [PtrHint] newCAF(BaseReg, R1);
            if (_cCJ::I64 == 0) goto cCL; else goto cCK;
        cCL:
            call (I64[R1])() args: 8, res: 0, upd: 8;
        cCK:
            I64[Sp - 16] = stg_bh_upd_frame_info;
            I64[Sp - 8] = _cCJ::I64;
            R2 = cCM_str;
            Sp = Sp - 16;
            call GHC.CString.unpackCString#_info(R2) args: 24, res: 0, upd:
 24;
      }
  }]
 }}}

 Each CAF is augmented with code that handles the updating of the CAF
 itself. This is the `newCAF()` stuff shown above. It should be possible to
 refactor the CAF-updating out of the entry code of a CAF and into a
 special "CAF" info table. For example, the above code could look like

 {{{
 #!c
 [section "data" {
      CAF.a_closure:
          const stg_CAF_info; // XXX new info table
          const 0;
          const 0;
          const CAF.a_info; // pointer to the "real" info table
                            // this field corresponds to the saved_info
 field of an StgIndstatic
  },
  section "readonly" {
      cCK_str:
          I8[] [116,101,115,116]
  },
  CAF.a_entry() //  [R1]
          { info_tbl: [(cCL,
                        label: CAF.a_info
                        rep:HeapRep static { Thunk })]
            stack_info: arg_space: 8 updfr_space: Just 8
          }
      {offset
        cCL: // no CAF boilerplate here!
            R2 = cCK_str;
            call GHC.CString.unpackCString#_info(R2) args: 8, res: 0, upd:
 8;
      }
  }]
 }}}

 where `stg_CAF_info` is the info table that encapsulates the CAF-specific
 code:

 {{{
 #!c
 INFO_TABLE(stg_CAF, 0, 0, THUNK_STATIC, "CAF", "CAF")
     (P_ node)
 {
     P_ bh;
     W_ info;

     STK_CHK_GEN();

     info = StgIndStatic_saved_info(node);

     ("ptr" bh) = ccall newCAF(BaseReg "ptr", node "ptr");
     if (bh == 0) {
         jump %GET_ENTRY(node) ();
     } else {
         push (stg_bh_upd_frame_info, bh) {
             jump (%ENTRY_CODE(info)) ();
         }
     }
 }
 }}}

 Firstly, I wonder whether this approach is feasible. Is there a reason why
 CAF updates are not implemented this way in the first place?

 I have a tentative patch that implements this approach, and it does
 produce correct results given contrived input -- but when I try bootstrap
 GHC with the patch, the dll-split program (which is built against ghc-
 stage1) segfaults. So there is strong reason to believe that there is
 something flawed about the idea and/or the implementation.

 My STG-fu is still weak, and I am failing to figure out the source of the
 problem. dll-split does not have any CAFs Does anybody have any insights
 on

 Does anyone have any insights on this approach of updating CAFs and on
 what may be causing the segfault?

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8590#comment:15
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler