
#8885: Add inline versions of clone array primops
-------------------------------------+------------------------------------
        Reporter:  tibbe             |            Owner:  simonmar
            Type:  feature request   |           Status:  patch
        Priority:  normal            |        Milestone:
       Component:  Compiler          |          Version:  7.9
      Resolution:                    |         Keywords:
Operating System:  Unknown/Multiple  |     Architecture:  Unknown/Multiple
 Type of failure:  None/Unknown      |       Difficulty:  Unknown
       Test Case:                    |       Blocked By:
        Blocking:                    |  Related Tickets:
-------------------------------------+------------------------------------

Comment (by tibbe):

 I did some investigation to figure out why the old implementation holds on
 to all memory (assuming that it's not an accounting error).

 First, here's what the inner loop looks like for the old implementation:

 {{{
 $wa_entry() //  [R3, R2]
         { [(c3i2,
             $wa_info:
                 const 12884901901;
                 const 0;
                 const 15;)]
         }
     {offset
       c3i2:
           if (R2 != 0) goto c3i0; else goto c3i1;
       c3i0:
           _s3f9::P64 = R3;
           (_c3hS::I64) = call "ccall" arg hints:  [PtrHint,]  result hints:  [PtrHint] allocate(BaseReg - 24, 20);
           I64[_c3hS::I64] = I64[PicBaseReg + stg_MUT_ARR_PTRS_FROZEN0_info@GOTPCREL];
           I64[_c3hS::I64 + 8] = 16;
           I64[_c3hS::I64 + 16] = 17;
           _c3i8::I64 = _c3hS::I64 + 24;
           call MO_Memcpy(_c3i8::I64, _s3f9::P64 + 24, 128, 8);
           call MO_Memset(_c3i8::I64 + 128, 1, 1, 8);
           R3 = _s3f9::P64;
           R2 = R2 - 1;
           call $wa_info(R3, R2) args: 8, res: 0, upd: 8;
       c3i1:
           R1 = PicBaseReg + (()_closure+1);
           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
     }
 }
 }}}

 The first thing I did was to use the correct info table (`FROZEN`, not
 `FROZEN0`). That didn't help.

 The second thing I did was to use `-fno-omit-yields` to try to make sure
 the GC was run, as there's no heap check in this loop (and I don't know if
 `allocate` ever invokes the GC). That didn't have any effect.
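
 As a cross-check on the constants in the listing above (`allocate(..., 20)`,
 the stored values 16 and 17, and the 128-byte `MO_Memcpy`), here is a small
 sketch of the size arithmetic. It is not taken from the patch: the
 three-word header, the 8-byte word, and the one card-table byte per 128
 elements are assumptions about the RTS array layout, and all function names
 are hypothetical.

```haskell
module Main where

-- Assumed constants (not stated in the ticket): 8-byte words, a
-- three-word array header (info pointer, ptrs, size), and one
-- card-table byte per 2^7 = 128 elements.
wordBytes, headerWords, cardBits :: Int
wordBytes   = 8
headerWords = 3
cardBits    = 7

-- Card-table bytes needed for n elements (one byte per card).
cardBytes :: Int -> Int
cardBytes n = (n + 2 ^ cardBits - 1) `div` 2 ^ cardBits

-- Card table rounded up to whole words.
cardWords :: Int -> Int
cardWords n = (cardBytes n + wordBytes - 1) `div` wordBytes

-- Value of the size field: element words plus card-table words.
sizeField :: Int -> Int
sizeField n = n + cardWords n

-- Total number of words requested from allocate.
allocWords :: Int -> Int
allocWords n = headerWords + sizeField n

main :: IO ()
main = do
  let n = 16                 -- element count stored at offset 8
  print (allocWords n)       -- 20, the second argument to allocate
  print (sizeField n)        -- 17, the value stored at offset 16
  print (n * wordBytes)      -- 128, the byte count given to MO_Memcpy
  print (cardBytes n)        -- 1, the byte count given to MO_Memset
```

 Under these assumptions the numbers in the Cmm are self-consistent for a
 16-element array, so the allocation size itself does not look like the
 source of the problem.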

 For reference, here's what the code with `-fno-omit-yields` looks like:

 {{{
 $wa_entry() //  [R3, R2]
         { [(c3i7,
             $wa_info:
                 const 12884901901;
                 const 0;
                 const 15;)]
         }
     {offset
       c3i7:
           if (I64[BaseReg + 856] == 0) goto c3i8; else goto c3ia;
       c3i8:
           // nop
           // nop
           R1 = PicBaseReg + $wa_closure;
           call (I64[BaseReg - 8])(R3, R2, R1) args: 8, res: 0, upd: 8;
       c3ia:
           if (R2 != 0) goto c3i5; else goto c3i6;
       c3i5:
           _s3f9::P64 = R3;
           (_c3hX::I64) = call "ccall" arg hints:  [PtrHint,]  result hints:  [PtrHint] allocate(BaseReg - 24, 20);
           I64[_c3hX::I64] = I64[PicBaseReg + stg_MUT_ARR_PTRS_FROZEN_info@GOTPCREL];
           I64[_c3hX::I64 + 8] = 16;
           I64[_c3hX::I64 + 16] = 17;
           _c3ie::I64 = _c3hX::I64 + 24;
           call MO_Memcpy(_c3ie::I64, _s3f9::P64 + 24, 128, 8);
           call MO_Memset(_c3ie::I64 + 128, 1, 1, 8);
           R3 = _s3f9::P64;
           R2 = R2 - 1;
           call $wa_info(R3, R2) args: 8, res: 0, upd: 8;
       c3i6:
           R1 = PicBaseReg + (()_closure+1);
           call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
     }
 }
 }}}

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/8885#comment:2
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler