
#14644: Improve cmm/assembly for pattern matches with two constants.
-------------------------------------+-------------------------------------
        Reporter:  AndreasK          |                Owner:  AndreasK
            Type:  task              |               Status:  patch
        Priority:  normal            |            Milestone:
       Component:  Compiler          |              Version:  8.2.2
  (CodeGen)                          |             Keywords:  Codegen, CMM,
      Resolution:                    |  Patterns, Pattern Matching
Operating System:  Unknown/Multiple  |         Architecture:
                                     |  Unknown/Multiple
 Type of failure:  None/Unknown      |            Test Case:
      Blocked By:                    |             Blocking:
 Related Tickets:                    |  Differential Rev(s):  Phab:D4294
       Wiki Page:                    |
-------------------------------------+-------------------------------------

Comment (by AndreasK):

 Replying to [comment:3 svenpanne]:
 > Additional things to consider: Performance in tight loops is often
 > vastly different, because branch prediction/caching will most likely
 > kick in visibly. Correctly predicted branches will cost you almost
 > nothing, while unknown/incorrectly predicted branches will be much
 > more costly. In the absence of more information from their branch
 > predictor, quite a few processors assume that backward branches are
 > taken and forward branches are not taken. So code layout has a
 > non-trivial performance impact.

 I went over Agner's guide and it seems like this only applies to
 Netburst CPUs, the last of which was released in 2001, so I'm not too
 worried about these. And even if you have one of these, according to
 Agner:
 > It is rarely worth the effort to take static prediction into account.
 > Almost any branch that is executed sufficiently often for its timing
 > to have any significant effect is likely to stay in the BTB so that
 > only the dynamic prediction counts.
 All other architectures he lists default to not taken, if they use
 static prediction at all.

 ----

 What might help explain the difference is that jumps which are not
 taken should be faster than taken jumps on both modern Intel and AMD
 CPUs.

 If someone wants to dig deeper, Agner's guides probably have enough
 information to explain the change completely based on the generated
 assembly. But I don't think that is necessary.

--
Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14644#comment:11
GHC http://www.haskell.org/ghc/
The Glasgow Haskell Compiler