
On 27 October 2005 01:33, John Meacham wrote:
I think I might have found why (or partially why) ghc is so slow on x86-64..
section 5.10 of the optimization manual
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25112.PDF
(which has a whole lot of good info for any processor, including a whole chapter on how to write C code that optimizes well independent of the CPU)
"don't place code and data on the same cache line"
I'd be surprised if this is an issue. GHC doesn't normally touch the info tables during execution (with one exception - getting the tag from a constructor in a datatype with >8 constructors). It touches the info tables during GC, but it doesn't touch the code during GC. So we might push some code out of the cache on a GC, but that shouldn't have a large effect.

It could be an alignment issue, I suppose. Or passing arguments in registers (we don't, at the moment, on x86_64). If you have any handy test programs, can you try fiddling with the alignment of code blocks and see if you get a measurable difference?

(I'm still digesting your other message; I'll reply in due course.)

Cheers,
Simon
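
For anyone trying the alignment experiment suggested above, here is a minimal C sketch of the arithmetic involved: checking whether two addresses fall on the same cache line, and rounding an address up to the next line boundary. The function names are hypothetical and are not part of GHC, and the 64-byte line size is an assumption that should be checked against the target CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed cache line size in bytes; verify this for the CPU under test. */
#define CACHE_LINE_BYTES ((uintptr_t)64)

/* Hypothetical helper: true if line_bytes is a non-zero power of two. */
static bool is_power_of_two(uintptr_t line_bytes)
{
    return line_bytes != 0 && (line_bytes & (line_bytes - 1)) == 0;
}

/* Hypothetical helper: true if addresses a and b lie on the same cache line. */
static bool same_cache_line(uintptr_t a, uintptr_t b, uintptr_t line_bytes)
{
    assert(is_power_of_two(line_bytes));
    return (a / line_bytes) == (b / line_bytes);
}

/* Hypothetical helper: round addr up to the next multiple of line_bytes.
 * An address that is already aligned is returned unchanged. */
static uintptr_t align_up(uintptr_t addr, uintptr_t line_bytes)
{
    assert(is_power_of_two(line_bytes));
    return (addr + line_bytes - 1) & ~(line_bytes - 1);
}
```

With these helpers, the last byte of a data block at 0x1038 and code starting at 0x1039 share a line under the 64-byte assumption, whereas code moved to align_up(0x1039, CACHE_LINE_BYTES), that is 0x1040, does not. Whether that separation makes a measurable difference is exactly what the experiment would have to show.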