
#13569: Optimise code-gen for field-updates of large products -------------------------------------+------------------------------------- Reporter: hvr | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.0.1 (CodeGen) | Keywords: | Operating System: Unknown/Multiple Architecture: | Type of failure: Other Unknown/Multiple | Test Case: | Blocked By: Blocking: | Related Tickets: Differential Rev(s): | Wiki Page: -------------------------------------+------------------------------------- Here's something we talked about last year which I wanted to look into but then got distracted (but I'm writing this one up now so it doesn't get lost again): Consider large field updates such as {{{#!hs module BigRec where data T = C { fld01 :: !Int , fld02 :: !Int , fld03 :: Int , fld04 :: Int , fld05 :: Int , fld06 :: Int , fld07 :: Int , fld08 :: Int , fld09 :: Int , fld10 :: Int , fld11 :: Int , fld12 :: Int , fld13 :: Int , fld14 :: Int , fld15 :: Int , fld16 :: Int , fld17 :: Int , fld18 :: Int , fld19 :: Int } inc01 x = x { fld01 = (fld01 x) + 1 } inc0102 x = x { fld01 = (fld01 x) + 1, fld02 = (fld02 x) + 1 } }}} These get translated into long chains of assignments, which in turn result in respectively long chains of register<->memory move instructions being generated by the NCG {{{#!c BigRec.inc0102_entry() // [R2] { [(c1em, block_c1em_info: const 0; const 31;), (c1ep, BigRec.inc0102_info: const 4294967301; const 0; const 15;)] } {offset c1ep: if ((Sp + -8) < SpLim) goto c1ez; else goto c1eA; c1ez: // nop R1 = BigRec.inc0102_closure; call (I64[BaseReg - 8])(R2, R1) args: 8, res: 0, upd: 8; c1eA: I64[Sp - 8] = block_c1em_info; R1 = R2; Sp = Sp - 8; if (R1 & 7 != 0) goto c1em; else goto c1en; c1en: call (I64[R1])(R1) returns to c1em, args: 8, res: 8, upd: 8; c1em: Hp = Hp + 160; if (Hp > I64[BaseReg + 856]) goto c1eD; else goto c1eC; c1eD: I64[BaseReg + 904] = 160; // nop call stg_gc_unpt_r1(R1) returns to c1em, args: 8, res: 8, upd: 8; c1eC: _s15F::P64 = P64[R1 + 7]; _s15G::P64 = P64[R1 + 15]; _s15H::P64 = P64[R1 + 23]; _s15I::P64 = P64[R1 + 31]; _s15J::P64 = P64[R1 + 39]; _s15K::P64 = P64[R1 + 47]; _s15L::P64 = P64[R1 + 55]; _s15M::P64 = P64[R1 + 63]; _s15N::P64 = P64[R1 + 71]; _s15O::P64 = P64[R1 + 79]; _s15P::P64 = P64[R1 + 87]; _s15Q::P64 = P64[R1 + 95]; _s15R::P64 = P64[R1 + 103]; _s15S::P64 = P64[R1 + 111]; _s15T::P64 = P64[R1 + 119]; _s15U::P64 = P64[R1 + 127]; _s15V::P64 = P64[R1 + 135]; _s15X::I64 = I64[R1 + 151] + 1; _s15W::I64 = I64[R1 + 143] + 1; I64[Hp - 152] = BigRec.C_con_info; P64[Hp - 144] = _s15F::P64; P64[Hp - 136] = _s15G::P64; P64[Hp - 128] = _s15H::P64; P64[Hp - 120] = _s15I::P64; P64[Hp - 112] = _s15J::P64; P64[Hp - 104] = _s15K::P64; P64[Hp - 96] = _s15L::P64; P64[Hp - 88] = _s15M::P64; P64[Hp - 80] = _s15N::P64; P64[Hp - 72] = _s15O::P64; P64[Hp - 64] = _s15P::P64; P64[Hp - 56] = _s15Q::P64; P64[Hp - 48] = _s15R::P64; P64[Hp - 40] = _s15S::P64; P64[Hp - 32] = _s15T::P64; P64[Hp - 24] = _s15U::P64; P64[Hp - 16] = _s15V::P64; I64[Hp - 8] = _s15W::I64; I64[Hp] = _s15X::I64; R1 = Hp - 151; }}} This happens in a few places inside GHC's source-tree (where we have large-ish products, such as DynFlags where we update single fields), which is where I noticed this while looking at generated ASM output. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/13569 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler