
#8321: improve basic block layout on LLVM backend by predicting stack/heap checks
------------------------------------+-------------------------------------
Reporter: rwbarton | Owner:
Type: feature request | Status: new
Priority: normal | Milestone: 7.10.1
Component: Compiler (LLVM) | Version: 7.7
Keywords: | Operating System: Unknown/Multiple
Architecture: Unknown/Multiple | Type of failure: None/Unknown
Difficulty: Unknown | Test Case:
Blocked By: | Blocking:
Related Tickets: |
------------------------------------+-------------------------------------
Currently we give the LLVM optimizer no information about which branch of
an `if` is likely to be taken. Without that information, the optimizer can
easily produce a suboptimal basic block layout. Improving the layout can
improve performance through better instruction cache usage and better
branch prediction by the hardware.

We can control LLVM's idea of what is likely with the `llvm.expect`
intrinsic function. Some obvious branches which we can predict accurately
are the stack and heap checks that appear near the entry of many
functions.
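
As a rough illustration of the idea in C (not GHC code): as far as I know,
Clang lowers `__builtin_expect` to the `llvm.expect` intrinsic when
optimizing, so a heap check hinted this way could look like the sketch
below. The `heap_t` struct, its fields and `heap_check` are hypothetical
stand-ins for Hp, HpLim and HpAlloc, not names taken from the RTS.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the STG registers Hp, HpLim and HpAlloc. */
typedef struct {
    uintptr_t hp;       /* current heap pointer */
    uintptr_t hp_lim;   /* heap limit for the current block */
    size_t    hp_alloc; /* size requested when the check fails */
} heap_t;

/* Tell the compiler that the condition is expected to be false. */
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/*
 * Bump the heap pointer by `bytes`.
 * Returns 1 if the allocation fits, or 0 if the caller must enter the
 * garbage collector (the rarely taken path).
 */
int heap_check(heap_t *h, size_t bytes)
{
    h->hp += bytes;
    if (UNLIKELY(h->hp > h->hp_lim)) {
        /* Cold path: record the request size for the GC. */
        h->hp_alloc = bytes;
        return 0;
    }
    /* Hot path: ideally laid out as the fall-through. */
    return 1;
}
```

With the hint in place, the optimizer is free to move the GC path out of
line so that the common case falls straight through.
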

Here's a small example of some Cmm code, followed by the output of the
LLVM optimizer/compiler for it.
{{{
block_c2Lc_entry() // [R1, R2]
        { info_tbl: [(c2Lc,
                      label: block_c2Lc_info
                      rep:StackRep [])]
          stack_info: arg_space: 0 updfr_space: Nothing
        }
    {offset
      c2Lc:
          Hp = Hp + 24;
          if (Hp > HpLim) goto c2Lm; else goto c2Ll;
      c2Lm:
          HpAlloc = 24;
          R2 = R2;
          R1 = R1;
          call stg_gc_pp(R2, R1) args: 8, res: 8, upd: 24;
      c2Ll:
          I64[Hp - 16] = :_con_info;
          P64[Hp - 8] = R1;
          P64[Hp] = R2;
          R1 = Hp - 14;
          Sp = Sp + 8;
          call (P64[Sp])(R1) args: 24, res: 0, upd: 24;
    }
}]
}}}
{{{
00000000000002b8