
I was wondering if anyone has investigated the alignment of doubles in ghc.
On x86 machines, doubles can be aligned on 4-byte boundaries, but there is a performance improvement if they have 8-byte alignment. As far as I can tell, ghc uses 4-byte alignment for doubles.
I started to look into the changes needed to go from 4 to 8 byte alignment...
It's quite a difficult task. You would need to arrange that the stack pointer and heap pointer are always 8-byte aligned, which is something we don't do at the moment: they're always 4-byte aligned only. This would mean changes to the code generator to add alignment padding to stack frames, and to pad the heap pointer to an 8-byte boundary after each allocation. We haven't looked into it, but you're welcome to try!
We *have* seen examples where gcc spilled some intermediate double values to the stack, and it made a big difference whether the stack address used was 8-byte aligned or not. I seem to recall this was the mysterious cause of a 40% or so difference in speed between two runs of the same binary at different times of the day :-)
Cheers,
Simon
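
To make the "pad the heap pointer to an 8-byte boundary after each allocation" step concrete, here is a minimal sketch in C. It is not GHC's code generator or runtime: `align_up`, `BumpHeap` and `bump_alloc8` are hypothetical names, and the heap is modelled as a plain byte offset that starts out 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of align.
 * Assumption: align is a non-zero power of two (checked by the assert). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Hypothetical bump allocator state: hp is the next free byte offset. */
typedef struct {
    uintptr_t hp;
} BumpHeap;

/* Hand out size bytes at the current hp, then pad hp so that the next
 * object starts on an 8-byte boundary. Assumes hp was 8-byte aligned
 * on entry, so the returned offset is 8-byte aligned as well. */
static uintptr_t bump_alloc8(BumpHeap *heap, size_t size)
{
    uintptr_t obj = heap->hp;
    heap->hp = align_up(obj + size, 8);
    return obj;
}
```

With this scheme a 12-byte object costs 16 bytes of heap. That wasted space is the price of keeping every later double on an 8-byte boundary, and the same rounding would have to be applied to stack frame sizes.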

I've recently browsed some assembly code generated by GHC (via gcc). It appeared that most values were accessed via memory, because the stack is managed by GHC explicitly. Even intermediate results seem to be accessed via memory, probably due to shortcomings of aliasing analysis in gcc. (?)
This brings up the topic of eval/apply vs push/enter. In the eval/apply model, the stack management can (possibly) be left to an underlying compiler, which would remove most memory accesses and leave most of the alignment issue to the back-end. (http://research.microsoft.com/Users/simonpj/papers/eval-apply/)
From that article, I can only assume that GHC will switch to the eval/apply model. Can we expect that soon? The "alignment" thing would become easier to deal with...
Cheers,
Jean-Philippe.
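
The difference between an explicitly managed stack and one left to the underlying compiler can be illustrated with a small C sketch. This is not code produced by GHC: `ExplicitStack`, `sum_sq_explicit` and `sum_sq_locals` are hypothetical names chosen for the example. Both functions compute the same value, but the first keeps its intermediates in memory that the caller can see, while the second leaves their placement to the C compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical explicit stack of doubles, managed by the program itself. */
typedef struct {
    double slots[16];
    size_t sp;              /* index of the next free slot */
} ExplicitStack;

/* Intermediates are pushed to and popped from the explicit stack.
 * The stores go to memory visible to the caller, so the compiler
 * generally has to perform them unless it can see the whole program. */
static double sum_sq_explicit(ExplicitStack *st, double x, double y)
{
    assert(st->sp + 2 <= 16);          /* room for two intermediates */
    st->slots[st->sp++] = x * x;       /* push first intermediate */
    st->slots[st->sp++] = y * y;       /* push second intermediate */
    double b = st->slots[--st->sp];    /* pop second intermediate */
    double a = st->slots[--st->sp];    /* pop first intermediate */
    return a + b;
}

/* Intermediates are plain locals: the C compiler is free to keep them
 * in registers and, if it spills them, to choose the slot alignment. */
static double sum_sq_locals(double x, double y)
{
    double a = x * x;
    double b = y * y;
    return a + b;
}
```

Comparing the assembly for the two functions (for example with `gcc -O2 -S`) is one way to check how much of the memory traffic goes away in the second form on a given compiler.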
Simon Marlow
participants (2)
- JP Bernardy
- Simon Marlow