
Simon Marlow wrote: ...
I have seen strange artifacts like this before that turned out to be caused by one of two things:
- bad cache interactions, e.g. we just happen to have laid out the code in such a way that frequently accessed code sequences push each other out of the cache, or the relative position of the heap and stack has a bad interaction. This happens less frequently these days with 4-way and 8-way associative caches on most CPUs.
- alignment issues, such as storing or loading a lot of misaligned Doubles
In the second case, I've seen the same program run +/- 50% in performance from run to run, just based on random alignment of the stack. But it's not likely to be the issue here, I'm guessing. If it is code misalignment, that's something we can easily fix (but I don't *think* we're doing anything wrong in that respect).
I have an Opteron box here that regularly gives +/- 20% from run to run of the same program with no other load on the machine. I have no idea why...
... This got me wondering how I could test for code misalignment problems. I expect there's a cleverer way, but how about a single executable containing several copies of the code under test, plus a loop that runs and times each copy? A consistently higher or lower runtime for one copy would indicate a misalignment problem. (I'm assuming the different copies would load at fairly random alignments; random padding could be added if necessary.) It might have to run the copies in a different order each time round the loop, so that external periodic events can't consistently affect one particular copy.

Richard.
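
For what it's worth, here is a rough sketch of the timing loop described above. It is written in Python only to show the measurement logic (shuffled order each round, per-copy medians). Python gives no control over machine-code alignment, so the "copies" here are stand-ins; a real test would link several natively compiled copies of the routine into one executable. The names `make_copy`, `time_copies` and `report` are made up for illustration.

```python
import random
import statistics
import time


def make_copy(n):
    """Return a fresh function doing identical work (stand-in for one code copy)."""
    def work():
        total = 0
        for i in range(n):
            total += i * i
        return total
    return work


def time_copies(copies, rounds=20, seed=None):
    """Time every copy once per round, in a freshly shuffled order each round.

    Returns one list of elapsed times (seconds) per copy.
    """
    rng = random.Random(seed)
    samples = [[] for _ in copies]
    order = list(range(len(copies)))
    for _ in range(rounds):
        # Reshuffle so a periodic external event cannot always hit the same copy.
        rng.shuffle(order)
        for idx in order:
            start = time.perf_counter()
            copies[idx]()
            samples[idx].append(time.perf_counter() - start)
    return samples


def report(samples):
    """Return (copy index, median seconds, ratio to the fastest median) per copy."""
    medians = [statistics.median(s) for s in samples]
    best = min(medians)
    return [(i, m, m / best) for i, m in enumerate(medians)]


if __name__ == "__main__":
    copies = [make_copy(20000) for _ in range(4)]
    for idx, median, ratio in report(time_copies(copies)):
        print(f"copy {idx}: median {median * 1e3:.3f} ms ({ratio:.2f}x fastest)")
```

A copy whose ratio stays well above 1.0 across repeated runs of the whole program would be the one to look at. Medians are used rather than means so that a single interrupted run does not skew a copy's result.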