Em domingo, 19 de abril de 2015, Joachim Breitner <mail@joachim-breitner.de> escreveu:
Dear devs,
in https://ghc.haskell.org/trac/ghc/ticket/10124 bgamari suggested that
code like
f :: Int -> Bool
f a = case a of
1 -> True
5 -> True
8 -> True
9 -> True
11 -> True
19 -> True
_ -> False
{-# NOINLINE f #-}
should not compile to a series of conditional jumps, but rather a
branchless code akin to
let !p = a ==# 1 `orI#` a ==# 5 `orI#` a ==# 8 `orI#` a ==# 9
`orI#` a ==# 11 `orI#` a ==# 19
case p of
1 -> do_something
0 -> do_something_else
I'd suggest to compile this particular case as a bittest in a 32-bit word.
Gabor
Subsequently, I refactored the way we produce Cmm code from STG, opening
the possibility to implement this optimization at that stage¹.
But when I then added this implementation, I could not measure any
runtime improvement, see
https://ghc.haskell.org/trac/ghc/ticket/10124#comment:15
So my question to the general audience: Is such branchless code really
better than the current, branching code? Can someone provide us with an
example that shows that it is better? Do I need to produce different
branchless assembly?
If it is not better, I can again refactor the switch generation and
simplify it a bit again.
Hmm, only now I see that rwbarton has replied there. Not sure why I
missed that. Anyways, more voices are welcome :-)
Greetings,
Joachim
¹ This should not preclude an implementation on the Core level, which
SPJ preferred. But I improved other aspects of the code generation as
well, so this is worthwhile on its own.
--
Joachim “nomeata” Breitner
mail@joachim-breitner.de • http://www.joachim-breitner.de/
Jabber: nomeata@joachim-breitner.de • GPG-Key: 0xF0FBF51F
Debian Developer: nomeata@debian.org