
#14672: Make likelyhood of branches/conditions available throughout the compiler. -------------------------------------+------------------------------------- Reporter: AndreasK | Owner: (none) Type: task | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.2 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Phab:D4316 Wiki Page: | Phab:D4324 Phab:D4327 -------------------------------------+------------------------------------- Comment (by AndreasK): I did some work on this for the Cmm/STG stages. But so far the results still don't justify the complexity. Although it's pretty clear which things should be tried before further judgment can be made. I did implement a rudimentary version which detected recursion as likely and bottoming branches as unlikely. Overall performance was underwhelming. While most nofib programs got slightly faster (0-1%) very few got a lot (order of ~5%) slower. What I did not do yet is propagte the likelyhood info into the asm passes which is probably required. Without this we get some awkward interactions. For example we might have code of the sort: `if checkError() then {goto panic} else {goto foo};` Assuming another block also jumps to `foo` there are two ways to compile this block: {{{ check: cmp R1, 0 jnz panic jmp foo check2: cmp R1, 0 jz foo jmp panic }}} Performance of jumps to panic is essentially meaningless so we want variant two which saves an instruction in each loop. Blocklayout however turns this into a pessimization. We often get something like this: {{{ check: cmp R1, 0 jnz panic jmp foo bar: #<other block jumping to foo> few ins here #<shortcut>jmp foo foo: ins do things panic: some more instructions }}} This seems reasonable. If we are lucky bar is small enough that when we jump to foo we won't even miss the cache. But if we optimize the check block we suddenly get: {{{ check2: cmp R1, 0 jz foo #<shortcut> jmp panic #block layout assumes we will take this path panic: some more instructions bar: #<other block jumping to foo> few ins here #<shortcut>jmp foo foo: ins do things }}} Clearly we don't want this because it pulls check2 and foo far apart. But block layout assumes the last jump is the likely one so tries to place `panic` right after `check2`. If this is a loop there is a good chance that this causes a cache miss on each jump to foo. I did play around with the block layout code but without actually having likelyhood info I couldn't make it work better than what we have know. I do want to give lowering likelyhood info into the asm stage a try sooner or (probably) later since I would expect that to lead to much better code.. However for now I ran out of steam before I came up with a good way to tackle this. Things that need to be still done. - Find a good layout algorithm which can make use of partial information. There are descriptions for algorithms which work with full information about block entry counts. But Implementing one which works well on partial information is something I couldn't find anything on yet. Rolling my own is not out of the question but seems like something for a few weeks and not a few days. * Chose a design for lowering likelyhood information to the asm level. How isn't yet clear. Annotate the instructions? Add information about blocks in a sidechannel? It's also not static since things like the register allocator also generate new blocks on the fly. Shortcutting might remove blocks and so on. * Besides block layout if we lower that information it should also be used by the register allocators. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14672#comment:14 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler