
I was wondering if anyone has investigated the alignment of doubles in ghc.
On x86 machines, doubles can be aligned on 4-byte boundaries, but there is a performance improvement if they have 8-byte alignment. As far as I can tell, ghc uses 4-byte alignment for doubles.
I started to look into the changes needed to go from 4 to 8 byte alignment...
It's quite a difficult task. You would need to arrange that the stack pointer and heap pointer are always 8-byte aligned, which is something we don't do at the moment: they're always 4-byte aligned only. This would mean changes to the code generator to add alignment padding to stack frames, and to pad the heap pointer to an 8-byte boundary after each allocation. We haven't looked into it, but you're welcome to try!

We *have* seen examples where gcc spilled some intermediate double values to the stack, and it made a big difference whether the stack address used was 8-byte aligned or not. I seem to recall this was the mysterious cause of a 40% or so difference in speed between two runs of the same binary at different times of the day :-)

Cheers,
Simon
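
For anyone who wants to experiment, the padding arithmetic described above can be sketched in C roughly as follows. This is only an illustration, not how ghc's code generator or runtime is actually written: the names align_up, padding_for and bump_alloc8 are made up for this example, and the heap pointer is modelled as a plain integer that is assumed to start out 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of align.
 * align must be a power of two (4, 8, 16, ...). */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Number of padding bytes needed to bring addr up to the alignment. */
size_t padding_for(uintptr_t addr, uintptr_t align)
{
    return (size_t)(align_up(addr, align) - addr);
}

/* Hypothetical bump allocator step: hand out the current heap pointer,
 * then advance it past the object and pad it to an 8-byte boundary,
 * so the next allocation also starts 8-byte aligned.
 * Assumes *hp is already 8-byte aligned on entry. */
uintptr_t bump_alloc8(uintptr_t *hp, size_t size)
{
    uintptr_t obj = *hp;
    *hp = align_up(obj + size, 8);
    return obj;
}
```

With a heap pointer at 0x1000, allocating a 12-byte object returns 0x1000 and leaves the pointer at 0x1010 rather than 0x100c, so 4 bytes are spent on padding. The same rounding would apply to stack frame sizes. The cost is the wasted space, which is part of why this is a trade-off rather than an obvious win.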