
Poorly predicted branches can indeed be expensive. However, I think here we are just taking about jumps which, as far as I know, are quite cheap assuming they don't boot you out of $I since they can be predicted
#14372: CMM contains a bunch of tail-merging opportunities -------------------------------------+------------------------------------- Reporter: heisenbug | Owner: (none) Type: feature request | Status: new Priority: normal | Milestone: Component: Compiler | Version: 8.2.1 Resolution: | Keywords: Operating System: Unknown/Multiple | Architecture: | Unknown/Multiple Type of failure: None/Unknown | Test Case: Blocked By: | Blocking: Related Tickets: | Differential Rev(s): Wiki Page: | -------------------------------------+------------------------------------- Comment (by AndreasK): Replying to [comment:3 bgamari]: perfectly. I've run the code: {{{ {-# NOINLINE f_test #-} f_test :: Int ->Int f_test a = case a of 1 -> 11001 2 -> 11002 3 -> 11003 4 -> 11004 5 -> 11005 6 -> 11006 7 -> 11007 8 -> 11008 main = print . sum . map (f_test) $ (concat . replicate 9000000) [1..45::Int] }}} I did the transformation manually into: {{{ .Lc48I: cmpq $9,%r14 jge .Lc48z .Lu48L: cmpq $1,%r14 jl .Lc48z .Lu48M: movl .Ln48P-8(,%r14,8), %ebx jmp *(%rbp) .Lc48z: movq $-1,%rbx jmp *(%rbp) .section .rodata .Ln48P: .quad 11001 # 11002 .. 11007 .quad 11008 }}} Turning the jumps into a indirect mov instruction.
I wonder what prior art exists in this area; I'm sure other compilers have considered this in the past.
Gcc/clang do the same thing for switch statements. It improved speed, but not by much and depending on the codelayout I did it was possible to get worse performance in specific cases. But thats true for everything that changes the code layout and isn't a major win I assume. -- Ticket URL: http://ghc.haskell.org/trac/ghc/ticket/14372#comment:17 GHC http://www.haskell.org/ghc/ The Glasgow Haskell Compiler