Re: Branchless implementation for literal case – is it worth it?

19 Apr 2015

      Joachim Breitner  writes:
...
Dear devs,
in  https://ghc.haskell.org/trac/ghc/ticket/10124 bgamari suggested that
code like
f :: Int -> Bool
        f a = case a of
                1  -> True
                5  -> True
                8  -> True
                9  -> True
                11 -> True
                19 -> True
                _  -> False
        {-# NOINLINE f #-}
should not compile to a series of conditional jumps, but rather a
branchless code akin to
let !p = a ==# 1 `orI#` a ==# 5 `orI#` a ==# 8 `orI#` a ==# 9
                         `orI#` a ==# 11 `orI#` a ==# 19
        case p of 
          1 -> do_something
          0 -> do_something_else
Hi Joachim,

Thanks for trying this and I'm sorry to hear that the optimization
hasn't yielded the improvement I would have expected. I've been a bit
silent on the issue as I've been working on finishing up my
PhD. Hopefully in a couple of months after I'm settled in Germany I'll
have time to delve into this more deeply.
...
Subsequently, I refactored the way we produce Cmm code from STG, opening
the possibility to implement this optimization at that stage¹.
But when I then added this implementation, I could not measure any
runtime improvement, see
https://ghc.haskell.org/trac/ghc/ticket/10124#comment:15
So my question to the general audience: Is such branchless code really
better than the current, branching code? Can someone provide us with an
example that shows that it is better? Do I need to produce different
branchless assembly?
Have you had a look at the Intel Optimization manual? I'll admit that I
haven't looked myself but I'd imagine they'd probably have a few things
to say about this.

Cheers,

- Ben

Re: Branchless implementation for literal case – is it worth it?

Ben Gamari