Re: panic when compiling SHA

13 Jan 2014

      agreed,
this level of unrolling would cause problems even in most C compilers! When
I write unrolled simd C code, I use a fixed number of variables that
corresponds to the # of registers that are live on my target arch (yes,
internally the turn into SSA which then does an ANF/CPS style and such),
but by shadowing/resuing a fixed number of  names, i can make it
"syntactically" clear what the lifetimes of my variables is.

But yeah, branch predictors are pretty good on modern hardware, a loop is
worth considering.

On Sun, Jan 12, 2014 at 10:06 PM, Ben Lippmeier  wrote:
...
On 10/01/2014, at 6:17 , Adam Wick  wrote:
...
That’s the problem with SHA, then. The implementation (and the spec,
really) is essentially a long combination of the form:
let x_n5 = small_computation x_n1 x_n2 x_n3 x_n4
     x_n6 = small_computation x_n2 x_n3 x_n4 x_n5
     …
Which has ~70 entries. The actual number of live variables alive at any
time should be relatively small, but if slots aren’t getting reused there’s
going to be some significant blowup. (To be honest, I had figured — and
thought I had validated — that doing it this way would give the compiler
the best chance at generating optimal code, but it appears I merely set
myself up to hit this limitation several years later.)
If this [1] is the current version then I don't think there is any
performance reason to manually unroll the loops like that. If you write a
standard tail-recursive loop then the branch predictor in the processor
should make the correct prediction for all iterations except the last one.
You'll get one pipeline stall at the end due to a mis-predicted backward
branch, but it won't matter in terms of absolute percentage of execution
time. You generally only need to worry about branches if the branch flag
flips between True and False frequently.
If you care deeply about performance then on some processors it can be
helpful to unfold this sort of code so that the SHA constants are
represented as literals in the instruction stream instead of in static data
memory -- but that ability is very processor specific and you'd need to
really stare at the emitted assembly code to see if it's worthwhile.
Ben.
[1] https://github.com/GaloisInc/SHA/blob/master/Data/Digest/Pure/SHA.hs
_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs

Re: panic when compiling SHA

Carter Schonwald